EdgeL3: Compressing L3-Net for mote scale urban noise monitoring

Sangeeta Kumari, Dhrubojyoti Roy, Mark Cartwright, Juan Pablo Bello, Anish Arora

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Urban noise sensing in deeply embedded devices at the edge of the Internet of Things (IoT) is challenging not only because of the lack of sufficiently labeled training data but also because device resources are quite limited. Look, Listen, and Learn (L3), a recently proposed state-of-the-art transfer learning technique, mitigates the first challenge by training self-supervised deep audio embeddings through binary Audio-Visual Correspondence (AVC), and the resulting embeddings can be used to train a variety of downstream audio classification tasks. However, with close to 4.7 million parameters, the multi-layer L3-Net CNN is still prohibitively expensive to be run on small edge devices, such as 'motes' that use a single microcontroller and limited memory to achieve long-lived self-powered operation. In this paper, we comprehensively explore the feasibility of compressing the L3-Net for mote-scale inference. We use pruning, ablation, and knowledge distillation techniques to show that the originally proposed L3-Net architecture is substantially overparameterized, not only for AVC but for the target task of sound classification as evaluated on two popular downstream datasets. Our findings demonstrate the value of fine-tuning and knowledge distillation in regaining the performance lost through aggressive compression strategies. Finally, we present EdgeL3, the first L3-Net reference model compressed by 1-2 orders of magnitude for real-time urban noise monitoring on resource-constrained edge devices, that can fit in just 0.4 MB of memory through half-precision floating point representation.
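The size figures quoted in the abstract can be sanity-checked with a back-of-envelope calculation. The sketch below is not taken from the paper. It assumes that the memory footprint is dominated by the weights, that 1 MB means 10^6 bytes, and that single and half precision use 4 and 2 bytes per parameter respectively. The function names are hypothetical, and the parameter budget it prints is an upper bound implied by the 0.4 MB figure, not a reported EdgeL3 parameter count.

```python
# Back-of-envelope weight-storage arithmetic (illustrative assumptions only:
# weights-only footprint, 1 MB = 10**6 bytes, no activations or code size).
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_footprint_mb(num_params: int, dtype: str) -> float:
    """Storage needed for the weights alone, in MB (10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


def params_that_fit(budget_mb: float, dtype: str) -> int:
    """Largest number of parameters whose weights fit in the given budget."""
    return round(budget_mb * 1e6) // BYTES_PER_PARAM[dtype]


L3_PARAMS = 4_700_000  # "close to 4.7 million parameters" (from the abstract)
EDGE_BUDGET_MB = 0.4   # "just 0.4 MB of memory" (from the abstract)

full_fp32_mb = weight_footprint_mb(L3_PARAMS, "float32")
full_fp16_mb = weight_footprint_mb(L3_PARAMS, "float16")
budget_params = params_that_fit(EDGE_BUDGET_MB, "float16")
reduction = L3_PARAMS / budget_params

print(f"L3-Net weights at float32: {full_fp32_mb:.1f} MB")
print(f"L3-Net weights at float16: {full_fp16_mb:.1f} MB")
print(f"Parameters that fit in {EDGE_BUDGET_MB} MB at float16: {budget_params}")
print(f"Implied parameter reduction: about {reduction:.1f}x")
```

Under these assumptions, half precision alone only halves the storage of the full network, so most of the reduction has to come from removing parameters. The implied reduction factor falls between one and two orders of magnitude, which is consistent with the compression range stated in the abstract.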

Original language: English (US)
Title of host publication: Proceedings - 2019 IEEE 33rd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 877-884
Number of pages: 8
ISBN (Electronic): 9781728135106
State: Published - May 2019
Event: 33rd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2019 - Rio de Janeiro, Brazil
Duration: May 20 2019 to May 24 2019

Publication series

Name: Proceedings - 2019 IEEE 33rd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2019

Conference

Conference: 33rd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2019
Country: Brazil
City: Rio de Janeiro
Period: 5/20/19 to 5/24/19

Keywords

  • Audio embedding
  • Convolutional neural nets
  • Deep learning
  • Edge network
  • Finetuning
  • Knowledge distillation
  • Pruning
  • Transfer learning

ASJC Scopus subject areas

  • Information Systems and Management
  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Control and Optimization
