Comprehensive modulation representation for automatic speech recognition

Yadong Wang, Steven Greenberg, Jayaganesh Swaminathan, Ramdas Kumaresan, David Poeppel

Research output: Contribution to conference (Paper)

Abstract

We present a new feature representation for speech recognition based on both amplitude modulation spectra (AMS) and frequency modulation spectra (FMS). A comprehensive modulation spectral (CMS) approach is defined and analyzed based on a modulation model of the band-pass signal. The speech signal is first processed by a bank of specially designed auditory band-pass filters. CMS are then extracted from the filter outputs and used as the features for automatic speech recognition (ASR). The approach yields a significant improvement in performance on noisy speech. On the Aurora 2 task, the new features result in an improvement of 23.43% relative to traditional mel-cepstrum front-end features using a 3 GMM HMM back-end. Although the improvements are relatively modest, the novelty of the method and its potential for performance enhancement warrant serious attention for future-generation ASR applications.
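To make the idea of amplitude and frequency modulation spectra concrete, the following is a minimal sketch, not the authors' method. It assumes the input is already the output of a single band-pass filter (the paper's specially designed auditory filters are not reproduced here) and uses the generic Hilbert analytic-signal decomposition as a stand-in for the paper's modulation model. All function names are hypothetical, and the signal length is assumed to be a power of two.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(X):
    """Inverse FFT computed through the forward FFT of the conjugate."""
    n = len(X)
    return [v.conjugate() / n for v in fft([c.conjugate() for c in X])]


def analytic_signal(x):
    """Zero the negative frequencies and double the positive ones."""
    n = len(x)
    X = fft([complex(v) for v in x])
    gain = [0.0] * n
    gain[0] = gain[n // 2] = 1.0
    for k in range(1, n // 2):
        gain[k] = 2.0
    return ifft([X[k] * gain[k] for k in range(n)])


def magnitude_spectrum(sig):
    """One-sided magnitude spectrum of a mean-removed real sequence."""
    mean = sum(sig) / len(sig)
    spec = fft([s - mean for s in sig])
    return [abs(v) / len(sig) for v in spec[: len(sig) // 2]]


def modulation_spectra(band, fs):
    """Return (AMS, FMS) for one band-pass signal sampled at fs Hz.

    Assumption: AMS is taken as the spectrum of the Hilbert envelope and
    FMS as the spectrum of the instantaneous frequency.
    """
    z = analytic_signal(band)
    envelope = [abs(v) for v in z]
    # Instantaneous frequency (Hz) from the phase step between samples.
    inst_freq = [
        cmath.phase(z[i + 1] * z[i].conjugate()) * fs / (2 * math.pi)
        for i in range(len(z) - 1)
    ]
    inst_freq.append(inst_freq[-1])  # pad back to the original length
    return magnitude_spectrum(envelope), magnitude_spectrum(inst_freq)


# Synthetic band: 200 Hz carrier, 4 Hz amplitude modulation,
# 8 Hz frequency modulation with a 20 Hz deviation.
fs = n = 1024  # one second of signal, so spectrum bins are 1 Hz apart
band = [
    (1 + 0.5 * math.cos(2 * math.pi * 4 * i / fs))
    * math.cos(2 * math.pi * 200 * i / fs
               + (20 / 8) * math.sin(2 * math.pi * 8 * i / fs))
    for i in range(n)
]
ams, fms = modulation_spectra(band, fs)
print("AMS peak (Hz):", max(range(len(ams)), key=ams.__getitem__))
print("FMS peak (Hz):", max(range(len(fms)), key=fms.__getitem__))
```

In this toy case the two spectra separate the 4 Hz amplitude modulation from the 8 Hz frequency modulation of the same carrier, which is the kind of complementary information a combined AMS and FMS feature set is meant to capture. A real front end would repeat this per filter-bank channel and over short frames.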

Original language: English (US)
Pages: 3025-3028
Number of pages: 4
State: Published - 2005
Event: 9th European Conference on Speech Communication and Technology - Lisbon, Portugal
Duration: Sep 4 2005 to Sep 8 2005

ASJC Scopus subject areas

  • Engineering(all)

  • Cite this

    Wang, Y., Greenberg, S., Swaminathan, J., Kumaresan, R., & Poeppel, D. (2005). Comprehensive modulation representation for automatic speech recognition. 3025-3028. Paper presented at 9th European Conference on Speech Communication and Technology, Lisbon, Portugal.