Smart audio signal classification for tracking of construction tasks

Karunakar Reddy Mannem, Eyob Mengiste, Saed Hasan, Borja García de Soto, Rafael Sacks

Research output: Contribution to journal › Article › peer-review


This paper presents a model for sound classification in construction that leverages a unique combination of Mel spectrograms and Mel-Frequency Cepstral Coefficient (MFCC) values. The model combines deep neural networks, namely Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks, into CNN-LSTM and MFCCs-LSTM architectures, enabling the extraction of both spectral and temporal features from audio data. Audio data generated from construction activities in a real-time closed environment was used to evaluate the proposed model, which achieved an overall Precision, Recall, and F1-score of 91%, 89%, and 91%, respectively. This performance surpasses that of other established models, including Deep Neural Networks (DNN), CNN, and Recurrent Neural Networks (RNN), as well as hybrid combinations of these models such as CNN-DNN, CNN-RNN, and CNN-LSTM. These results underscore the potential of combining Mel spectrograms and MFCC values to provide a more informative representation of sound data, thereby enhancing sound classification in noisy environments.
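The abstract's two input representations, Mel spectrograms and MFCCs, are both derived from a short-time power spectrum of the audio signal. The paper's own implementation details are not given here; the following is a minimal NumPy-only sketch of how the two feature types are typically computed, with illustrative parameter values (16 kHz sample rate, 512-sample FFT, 40 mel bands, 13 cepstral coefficients) that are assumptions, not values from the paper.

```python
import numpy as np

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale, mapped to FFT bins
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fb[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[m - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mel_spectrogram(x, sr=16000, n_fft=512, hop=256, n_mels=40):
    # Frame the signal, window each frame, take the FFT power spectrum,
    # then project onto the mel filterbank and take the log
    frames = []
    for start in range(0, len(x) - n_fft + 1, hop):
        frame = x[start:start + n_fft] * np.hanning(n_fft)
        frames.append(np.abs(np.fft.rfft(frame)) ** 2)
    power = np.array(frames).T                      # (n_fft//2 + 1, n_frames)
    mel = mel_filterbank(n_mels, n_fft, sr) @ power
    return np.log(mel + 1e-10)                      # log-mel spectrogram

def mfcc(log_mel, n_mfcc=13):
    # DCT-II across mel bands decorrelates them into cepstral coefficients
    n_mels = log_mel.shape[0]
    n = np.arange(n_mels)
    basis = np.cos(np.pi / n_mels * (n + 0.5)[None, :]
                   * np.arange(n_mfcc)[:, None])
    return basis @ log_mel

# Synthetic 1-second tone standing in for a construction-site audio clip
sr = 16000
t = np.linspace(0.0, 1.0, sr, endpoint=False)
x = np.sin(2 * np.pi * 440.0 * t)
logmel = mel_spectrogram(x, sr=sr)   # input for the CNN-LSTM branch
coeffs = mfcc(logmel)                # input for the MFCCs-LSTM branch
print(logmel.shape, coeffs.shape)    # (40, 61) (13, 61)
```

In an architecture like the one described, the log-mel spectrogram (a 2-D time-frequency image) would feed a CNN front end, while the frame-wise MFCC sequence would feed an LSTM directly, before the two branches are combined for classification.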

Original language: English (US)
Article number: 105485
Journal: Automation in Construction
State: Published - Sep 2024


Keywords

  • Activity tracking
  • Audio
  • CNN
  • LSTM
  • MFCC
  • Mel spectrograms
  • Sound

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Civil and Structural Engineering
  • Building and Construction


