Hierarchical Spike Coding of sound

Yan Karklin, Chaitanya Ekanadham, Eero P. Simoncelli

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Natural sounds exhibit complex statistical regularities at multiple scales. Acoustic events underlying speech, for example, are characterized by precise temporal and frequency relationships, but they can also vary substantially according to the pitch, duration, and other high-level properties of speech production. Learning this structure from data while capturing the inherent variability is an important first step in building auditory processing systems, as well as understanding the mechanisms of auditory perception. Here we develop Hierarchical Spike Coding, a two-layer probabilistic generative model for complex acoustic structure. The first layer consists of a sparse spiking representation that encodes the sound using kernels positioned precisely in time and frequency. Patterns in the positions of first layer spikes are learned from the data: on a coarse scale, statistical regularities are encoded by a second-layer spiking representation, while fine-scale structure is captured by recurrent interactions within the first layer. When fit to speech data, the second layer acoustic features include harmonic stacks, sweeps, frequency modulations, and precise temporal onsets, which can be composed to represent complex acoustic events. Unlike spectrogram-based methods, the model gives a probability distribution over sound pressure waveforms. This allows us to use the second-layer representation to synthesize sounds directly, and to perform model-based denoising, on which we demonstrate a significant improvement over standard methods.

Original languageEnglish (US)
Title of host publicationAdvances in Neural Information Processing Systems 25
Subtitle of host publication26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012
Pages3032-3040
Number of pages9
StatePublished - 2012
Event26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012 - Lake Tahoe, NV, United States
Duration: Dec 3 2012Dec 6 2012

Publication series

NameAdvances in Neural Information Processing Systems
Volume4
ISSN (Print)1049-5258

Other

Other26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012
Country/TerritoryUnited States
CityLake Tahoe, NV
Period12/3/1212/6/12

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Fingerprint

Dive into the research topics of 'Hierarchical Spike Coding of sound'. Together they form a unique fingerprint.

Cite this