Abstract
We present a new feature representation for speech recognition based on both amplitude modulation spectra (AMS) and frequency modulation spectra (FMS). A comprehensive modulation spectral (CMS) approach is defined and analyzed based on a modulation model of the band-pass signal. The speech signal is first processed by a bank of specially designed auditory band-pass filters. CMS features are then extracted from the filter outputs for use in automatic speech recognition (ASR). A significant improvement in performance on noisy speech is demonstrated. On the Aurora 2 task, the new features yield an improvement of 23.43% relative to traditional mel-cepstrum front-end features using a 3 GMM HMM back-end. Although the improvements are relatively modest, the novelty of the method and its potential for performance enhancement warrant serious attention for future-generation ASR applications.
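The pipeline outlined in the abstract (band-pass filtering, then amplitude and frequency modulation spectra per band) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the auditory filter design or the demodulation method, so the sketch assumes an ideal FFT-domain band-pass filter as a stand-in and assumes analytic-signal (Hilbert-style) demodulation. The function names `analytic_band` and `am_fm_modulation_spectra`, and all signal parameters, are hypothetical.

```python
import numpy as np

def analytic_band(x, fs, f_lo, f_hi):
    """Analytic signal of x restricted to [f_lo, f_hi] Hz.

    Assumption: an ideal (brick-wall) FFT-domain band-pass stands in for
    the paper's auditory filters, whose design the abstract does not give.
    Requires 0 < f_lo < f_hi < fs / 2.
    """
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(len(x), d=1.0 / fs)
    # Keep only positive frequencies inside the band; doubling them yields
    # the analytic signal of the band-passed real signal.
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return np.fft.ifft(2.0 * spectrum * mask)

def am_fm_modulation_spectra(x, fs, f_lo, f_hi):
    """Return (AMS, FMS) magnitude spectra for one band (illustrative only)."""
    z = analytic_band(x, fs, f_lo, f_hi)
    envelope = np.abs(z)                               # amplitude modulation
    phase = np.unwrap(np.angle(z))
    inst_freq = np.diff(phase) * fs / (2.0 * np.pi)    # frequency modulation, Hz
    # Remove the means so the DC bin does not dominate either spectrum.
    ams = np.abs(np.fft.rfft(envelope - envelope.mean()))
    fms = np.abs(np.fft.rfft(inst_freq - inst_freq.mean()))
    return ams, fms

# Synthetic test tone (hypothetical values): 1 kHz carrier,
# 4 Hz amplitude modulation, 8 Hz frequency modulation (50 Hz deviation).
fs = 8000
t = np.arange(fs) / fs                                 # one second of samples
x = (1.0 + 0.5 * np.cos(2 * np.pi * 4 * t)) * np.cos(
    2 * np.pi * 1000 * t + (50.0 / 8.0) * np.sin(2 * np.pi * 8 * t)
)
ams, fms = am_fm_modulation_spectra(x, fs, 800.0, 1200.0)
am_peak_bin = int(np.argmax(ams))   # bin spacing is 1 Hz for this signal length
fm_peak_bin = int(np.argmax(fms))   # bin spacing is approximately 1 Hz
```

With a one-second signal the spectral bins are spaced about 1 Hz apart, so the peak bin indices can be read directly as modulation frequencies. A full front end would repeat this for each band of the filter bank and on short frames rather than on the whole utterance.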
Original language | English (US) |
---|---|
Pages | 3025-3028 |
Number of pages | 4 |
State | Published - 2005 |
Event | 9th European Conference on Speech Communication and Technology - Lisbon, Portugal. Duration: Sep 4 2005 → Sep 8 2005 |
Other
Other | 9th European Conference on Speech Communication and Technology |
---|---|
Country/Territory | Portugal |
City | Lisbon |
Period | 9/4/05 → 9/8/05 |
ASJC Scopus subject areas
- General Engineering