A comparison of classifiers for detecting emotion from speech

Izhak Shafran, Mehryar Mohri

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Accurate detection of emotion from speech has clear benefits for the design of more natural human-machine speech interfaces and for the extraction of useful information from large quantities of speech data. The task consists of assigning an emotion category from a fixed set, e.g., anger, fear, or satisfaction, to a speech utterance. In recent work, several classifiers have been proposed for automatic detection of a speaker's emotion using spoken words as the input. These classifiers were designed independently and tested on separate corpora, making it difficult to compare their performance. This paper compares three classifiers. Two are popular classifiers from the literature that model the word content via n-gram sequences, one based on an interpolated language model and the other on a mutual information-based feature-selection approach. The third is a discriminant kernel-based technique that we recently adopted. We have implemented these three classification algorithms and evaluated their performance by applying them to a corpus collected from a spoken-dialog system that was widely deployed across the US. The results show that our kernel-based classifier achieves an accuracy of 80.6% and outperforms both the interpolated language model classifier, which achieved a classification accuracy of 70.1%, and the classifier using mutual information-based feature selection, which achieved 78.8%.

Original language: English (US)
Title of host publication: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Proceedings - Image and Multidimensional Signal Processing Multimedia Signal Processing
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: I341-I344
ISBN (Print): 0780388747, 9780780388741
State: Published - 2005
Event: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Philadelphia, PA, United States
Duration: Mar 18 2005 to Mar 23 2005

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: I
ISSN (Print): 1520-6149

Other

Other: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05
Country: United States
City: Philadelphia, PA
Period: 3/18/05 to 3/23/05

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

