1228 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 18, NO. 6, AUGUST 2010

Acoustic Source Localization and Tracking Using Track Before Detect

Maurice F. Fallon, Member, IEEE, and Simon Godsill, Member, IEEE

Abstract: Particle Filter-based Acoustic Source Localization algorithms attempt to track the position of a sound source (one or more people speaking in a room) based on the current data from a microphone array as well as all previous data up to that point. This paper first discusses some of the inherent behavioral traits of the steered beamformer localization function. Using conclusions drawn from that study, a multitarget methodology for acoustic source tracking based on the Track Before Detect (TBD) framework is introduced. The algorithm also implicitly evaluates source activity using a variable appended to the state vector. Using the TBD methodology avoids the need to identify a set of source measurements and also allows for a vast increase in the number of particles used for a comparable computational load, which results in increased tracking stability in challenging recording environments. An evaluation of tracking performance is given using a set of real speech recordings with two simultaneously active speech sources.

Index Terms: Acoustic source localization, multi-target tracking, particle filtering, sequential estimation, tracking filters.

I. INTRODUCTION

Localization and tracking of speech sources, known as Acoustic Source Tracking (AST) or Localization, has become an increasingly active area of research with applications in the fields of video conferencing and speech. The aim is to use an array of distributed microphones, with no specific arrangement, to track a speaking person as they move around a room based on the path delays between the source and microphones as determined from the sound recordings at the microphones.
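As a rough illustration of the path delays mentioned above, the following sketch computes the propagation time from a candidate source position to each microphone, and the time difference of arrival between a microphone pair. It is not taken from the paper: the function names, the coordinates, and the speed of sound (343 m/s, an assumed room-temperature value) are all hypothetical choices made for this example.

```python
import math

# Assumed speed of sound in air at room temperature (m/s); not from the paper.
SPEED_OF_SOUND = 343.0


def path_delays(source, mics, c=SPEED_OF_SOUND):
    """Return the propagation time (seconds) from source to each microphone.

    source: (x, y, z) position in metres.
    mics: list of (x, y, z) microphone positions in metres.
    """
    return [math.dist(source, mic) / c for mic in mics]


def tdoa(source, mic_a, mic_b, c=SPEED_OF_SOUND):
    """Return the time difference of arrival (seconds) between two microphones.

    Positive when the sound reaches mic_b before mic_a.
    """
    return (math.dist(source, mic_a) - math.dist(source, mic_b)) / c


if __name__ == "__main__":
    # Hypothetical four-microphone layout in a small room.
    mics = [(0.0, 0.0, 1.5), (4.0, 0.0, 1.5), (4.0, 3.0, 1.5), (0.0, 3.0, 1.5)]
    speaker = (1.0, 1.0, 1.5)
    for mic, delay in zip(mics, path_delays(speaker, mics)):
        print(f"mic at {mic}: {delay * 1000:.3f} ms")
    print(f"TDOA between mics 0 and 1: {tdoa(speaker, mics[0], mics[1]) * 1000:.3f} ms")
```

A tracker works in the opposite direction: it observes delay-related quantities in the recordings and infers the position that best explains them, which is harder because of the complicating factors the introduction goes on to list.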
Tracking speech sources is, however, complicated by several factors:
1) background noise due to the environment;
2) other active sound sources;
3) reverberation of the source signal itself.

Manuscript received February 01, 2009; revised July 07, 2009. First published September 09, 2009; current version published July 14, 2010. This work was supported by Microsoft Research through the European Ph.D. Scholarship Program. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Jingdong Chen.

M. F. Fallon is with the Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139 USA (e-mail: [email protected]).

S. Godsill is with Signal Processing and Communications Laboratory, Cambridge University Engineering Department, Cambridge, CB2 1PZ, U.K. (e-mail: [email protected])....
This note was uploaded on 10/01/2010 for the course ELEC 6111 taught by Professor Brown during the Spring '10 term at E. Illinois.