Kalman filters for audio-video source localization

Tobias Gehrig, Kai Nickel, Hazim Kemal Ekenel, Ulrich Klee, John McDonough

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In prior work, we proposed using an extended Kalman filter to update position estimates directly in a speaker localization system based on time delays of arrival. We found that this scheme provided better tracking quality than conventional closed-form approximation methods. In this work, we enhance our audio localizer with video information. We propose an algorithm that incorporates face positions detected in different camera views into the Kalman filter without any explicit triangulation. This approach yields a robust source localizer that functions reliably both in segments where the speaker is silent, which would be detrimental to an audio-only tracker, and in segments where many faces appear, which would confuse a video-only tracker. We tested our algorithm on a data set of seminars held by actual speakers. Our experiments showed that the audio-video localizer performed better than a localizer based solely on audio or solely on video features.
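The fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a random-walk motion model, ideal pinhole cameras, uncorrelated measurement noise (so that each scalar observation can be applied on its own), numerically differentiated measurement functions, and made-up room geometry and noise values. All function names are hypothetical. Each time delay of arrival and each detected face coordinate updates the 3D position estimate directly, so no triangulation step is needed, and a silent frame simply contributes no audio observations.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def tdoa_model(mic_a, mic_b):
    """Predicted time delay of arrival (s) between two microphones."""
    def h(x):
        return (math.dist(x, mic_a) - math.dist(x, mic_b)) / SPEED_OF_SOUND
    return h

def camera_model(center, right, up, forward, focal, axis):
    """Predicted image coordinate (pixels) of a 3D point, ideal pinhole camera."""
    side = right if axis == "u" else up
    def h(x):
        d = [x[i] - center[i] for i in range(3)]
        depth = sum(d[i] * forward[i] for i in range(3))
        return focal * sum(d[i] * side[i] for i in range(3)) / depth
    return h

def jacobian_row(h, x, eps=1e-6):
    """Central-difference gradient of a scalar measurement function."""
    row = []
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        row.append((h(xp) - h(xm)) / (2 * eps))
    return row

def scalar_update(x, P, h, z, r):
    """Extended Kalman update for one scalar observation z with variance r."""
    H = jacobian_row(h, x)
    PHt = [sum(P[i][j] * H[j] for j in range(3)) for i in range(3)]
    s = sum(H[i] * PHt[i] for i in range(3)) + r  # innovation variance
    K = [PHt[i] / s for i in range(3)]            # Kalman gain
    innovation = z - h(x)
    x_new = [x[i] + K[i] * innovation for i in range(3)]
    P_new = [[P[i][j] - K[i] * PHt[j] for j in range(3)] for i in range(3)]
    return x_new, P_new

def track_step(x, P, observations, q=0.01):
    """Random-walk prediction followed by sequential scalar updates."""
    P = [[P[i][j] + (q if i == j else 0.0) for j in range(3)] for i in range(3)]
    for h, z, r in observations:
        x, P = scalar_update(x, P, h, z, r)
    return x, P

# Hypothetical setup: two microphone pairs and two cameras looking along +y.
truth = [2.0, 3.0, 1.7]
mics = [tdoa_model((0.0, 0.0, 1.0), (0.3, 0.0, 1.0)),
        tdoa_model((0.0, 4.0, 1.0), (0.0, 4.3, 1.0))]
cams = [camera_model(c, (1, 0, 0), (0, 0, 1), (0, 1, 0), 800.0, axis)
        for c in [(0.0, 0.0, 1.5), (6.0, 0.0, 1.5)] for axis in "uv"]
audio = [(h, h(truth), 4e-10) for h in mics]  # noise-free TDOAs, assumed variance
video = [(h, h(truth), 25.0) for h in cams]   # noise-free pixels, assumed variance

start = [2.5, 2.5, 1.5]
x = list(start)
P = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
for k in range(30):
    # On odd frames the speaker is treated as silent: video observations only.
    x, P = track_step(x, P, video + (audio if k % 2 == 0 else []))
```

Processing observations one scalar at a time avoids a matrix inversion and lets either modality drop out on any frame. A real system would also need noisy detections, an association step for multiple faces, and calibrated camera parameters, none of which are modeled here.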

Original language: English (US)
Title of host publication: 2005 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
Pages: 118-121
Number of pages: 4
State: Published - 2005
Event: 2005 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics - New Paltz, NY, United States
Duration: Oct 16 2005 → Oct 19 2005

Publication series

Name: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Conference

Conference: 2005 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
Country/Territory: United States
City: New Paltz, NY
Period: 10/16/05 → 10/19/05

ASJC Scopus subject areas

  • Signal Processing
