DuFCALF: Instilling Sentience in Computerized Song Analysis

Himadri Mukherjee, Matteo Marciano, Ankita Dhar, Kaushik Roy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Music recommendation systems have evolved significantly in recent years and have become extremely popular with the advancement of Artificial Intelligence (AI). Such systems categorize songs from various perspectives, such as genre, artist, and tempo. However, there have been fewer developments in song categorization based on emotion. Songs amplify a person's mood, and listeners often choose what to listen to according to how they feel, so modeling the emotional content of songs is important. It is also challenging, because a song combines multiple instruments and vocals in a single recording, each of which is often modulated differently to improve the listening experience. This is very common in present-day songs, and songs conveying different emotions often share very similar instrumentation and chord progressions, which adds to the challenge. In this paper, a system is presented to distinguish song clips by their emotion. The clips were parameterized using two features, which were fed to separate deep networks. A dual feature cross architecture late fusion (DuFCALF) strategy was then used to distinguish the moods. Experiments were performed with numerous sections of songs to test the system's efficacy on limited data; sadness was captured with over 80% accuracy, and DuFCALF improved overall emotion distinction by 4.43% over the best-performing baseline system.
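The abstract does not specify the exact features, networks, or fusion rule, so the following is only a minimal sketch of the general late-fusion idea it alludes to: two feature-specific networks each score a clip per emotion class, and their outputs are combined before the final decision. The function names, class count, and weighting below are illustrative assumptions, not the paper's actual DuFCALF design.

```python
# Minimal late-fusion sketch (assumed design, not the paper's exact DuFCALF).
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Convert raw per-class scores into a probability distribution."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def late_fusion(logits_feat_a: np.ndarray,
                logits_feat_b: np.ndarray,
                weight_a: float = 0.5) -> int:
    """Fuse class scores from two feature-specific networks and return
    the index of the predicted emotion class."""
    probs = (weight_a * softmax(logits_feat_a)
             + (1.0 - weight_a) * softmax(logits_feat_b))
    return int(np.argmax(probs))

# Hypothetical 4-emotion outputs from the two networks for one clip.
scores_a = np.array([1.2, 0.3, -0.5, 0.1])   # network trained on feature set A
scores_b = np.array([0.9, 0.6, -0.2, 0.0])   # network trained on feature set B
print(late_fusion(scores_a, scores_b))        # predicted emotion index
```

In such a scheme, the fusion weight would typically be tuned on validation data; equal weighting is used here purely for illustration.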

Original language: English (US)
Title of host publication: Speech and Computer - 26th International Conference, SPECOM 2024, Proceedings
Editors: Alexey Karpov, Vlado Delić
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 277-292
Number of pages: 16
ISBN (Print): 9783031780134
DOIs
State: Published - 2025
Event: 26th International Conference on Speech and Computer, SPECOM 2024 - Belgrade, Serbia
Duration: Nov 25 2024 - Nov 28 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15300 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 26th International Conference on Speech and Computer, SPECOM 2024
Country/Territory: Serbia
City: Belgrade
Period: 11/25/24 - 11/28/24

Keywords

  • Audioscape
  • Deep learning
  • Emotion
  • Music analysis

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
