Wednesday, January 21, 2015

Real Time Sound Source Localization


Introduction

Humans can usually perceive the position of a sound source using only two ears, in theory with an accuracy of ±10 degrees when the source is in front of the listener and ±15 degrees when it is behind. A sound source localization system works on the same idea: the time difference or phase difference between the signals that each microphone detects is analyzed, and the direction of the sound source is obtained from it.

In cross-correlation methods, the microphone outputs are cross-correlated, and the time lag that corresponds to the maximum peak of the output is taken as the estimated time delay between the microphones. The most popular technique for Time Delay Estimation (TDE) adds a weighting function that increases the accuracy by making the peaks sharper. This generalized cross-correlation has the advantage of keeping the peak distinct at low SNR.
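
As a minimal sketch of the plain cross-correlation step described above, the following Matlab snippet delays a noise signal by a known number of samples and recovers that delay from the correlation peak. The noise test signal, the frame length N and the 12-sample delay are assumptions made for illustration, not values from this project.

```matlab
% Sketch: estimate the delay between two signals from the peak of their
% cross-correlation. Test signal and delay are made up for the demo.
fs = 48000;                       % sample rate in Hz (matches the post)
N  = 4096;                        % frame length in samples (assumed)
trueDelay = 12;                   % delay in samples (assumed)

s  = randn(N, 1);                 % broadband noise as a stand-in sound source
x1 = s;                           % signal at microphone 1
x2 = [zeros(trueDelay, 1); s(1:N - trueDelay)];   % delayed copy at microphone 2

% Linear cross-correlation computed through zero-padded FFTs.
nfft = 2 * N;
X1 = fft(x1, nfft);
X2 = fft(x2, nfft);
r  = real(ifft(X2 .* conj(X1)));  % r(k+1) = sum over n of x2(n+k) * x1(n)

% Reorder so that the lags run from -(N-1) to N-1.
r    = [r(nfft - N + 2:nfft); r(1:N)];
lags = (-(N - 1):(N - 1)).';

[~, idx] = max(r);
delaySamples = lags(idx);         % positive: microphone 2 lags microphone 1
delaySeconds = delaySamples / fs;
```

With a broadband source the peak sits at the true lag; with real speech and room noise the peak is broader, which is what the weighting function is meant to improve.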

Therefore, I used the generalized cross-correlation method to estimate the time delay. The sound source localization system acquires the sound signal in real time. The overall system is composed of a hardware part and a software part. The hardware part consists of two USB microphones, which capture the sound signal. Because the microphones are USB devices, the sound signal is sent directly to the PC. The software part is written in Matlab; it filters the sound signals and gives the sound source direction as output.

In this project, the sound source localization system consists of three parts: front-end processing, time delay estimation, and direction calculation.
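
To illustrate the direction calculation part, here is a small sketch that turns a time delay into an angle. It assumes a far-field source (plane wave), so that sin(theta) = c * tau / d. The microphone spacing d, the speed of sound c and the example delay are hypothetical values chosen for illustration; the actual spacing used in this project is not stated here.

```matlab
% Sketch: convert a time delay into a direction angle (far-field model, assumed).
c   = 343;            % speed of sound in air in m/s, at roughly 20 degrees C
d   = 0.20;           % microphone spacing in metres (assumed)
fs  = 48000;          % sample rate in Hz (matches the post)
delaySamples = 14;    % example delay in samples (assumed)

tau   = delaySamples / fs;            % delay in seconds
ratio = c * tau / d;                  % equals sin(theta) under the far-field model
ratio = max(-1, min(1, ratio));       % clamp: noise can push the ratio past +/-1
thetaDeg = asin(ratio) * 180 / pi;    % angle from broadside, in degrees
```

With these numbers the angle comes out at roughly 30 degrees. At 48 kHz and a 0.20 m spacing the delay can only take values of about -28 to +28 samples, so the angular resolution is limited by the sample rate and the spacing.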



Front-end processing

Front-end processing is the first stage of the overall system and consists of data reading, filtering, and voice detection. The USB microphones capture the sound signals and send them directly to the PC. The signals are filtered with a Butterworth filter implemented in Matlab. The filter passes only the 100 Hz to 3500 Hz range, because only human voice should be detected: the full human voice output normally ranges up to 20 kHz, but spoken words lie between 100 Hz and 3500 Hz. Voice detection is used to eliminate DC noise from the environment.

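
The filtering step could look roughly like the sketch below, which designs a Butterworth band-pass for the 100 Hz to 3500 Hz band and applies it to a test frame. The filter order and the three test tones are assumptions for illustration. The butter function comes from the Signal Processing Toolbox (or the signal package in Octave).

```matlab
% Sketch: band-pass a frame to the 100 Hz - 3500 Hz voice band.
fs    = 48000;                    % sample rate in Hz (matches the post)
band  = [100 3500];               % pass band edges in Hz (from the post)
order = 2;                        % assumed; a band-pass design has order 2*order

% A two-element frequency vector makes butter design a band-pass filter.
[b, a] = butter(order, band / (fs / 2));

% Test frame: 50 Hz hum + 1 kHz voice-band tone + 10 kHz tone (all made up).
t = (0:fs - 1).' / fs;            % one second of samples
x = sin(2*pi*50*t) + sin(2*pi*1000*t) + sin(2*pi*10000*t);

y = filter(b, a, x);              % causal filtering, usable frame by frame

% Gain of the filter at a single frequency f in Hz, for a quick check.
gainAt = @(f) abs(polyval(b, exp(1i*2*pi*f/fs)) / polyval(a, exp(1i*2*pi*f/fs)));
g50  = gainAt(50);                % below the pass band: attenuated
g1k  = gainAt(1000);              % inside the pass band: close to 1
g10k = gainAt(10000);             % above the pass band: attenuated
```

A low order keeps the filter cheap and numerically well behaved at this sample rate, but the roll-off is gentle; a higher order would reject out-of-band sound more strongly at the cost of more computation.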
Data reading

I used USB microphones, so the data can easily be read in Matlab. The following Matlab code can be used to read sound signals from the USB microphones.

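% Open the two USB microphones as winsound analog-input devices.
% The device IDs (3 and 1) depend on the PC's audio configuration.
ai = analoginput('winsound','3');
addchannel(ai, 1);
bi = analoginput('winsound','1');
addchannel(bi, 1);
% Sample both devices at 48 kHz, start only on a manual trigger,
% and keep acquiring until stopped.
set([ai bi],'SampleRate',48000);
set([ai bi],'TriggerType','Manual');
set([ai bi],'SamplesPerTrigger',Inf);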

Here the sample rate is set to 48 kHz, and the trigger type is set to Manual to reduce the time delay between the start of the two microphones.

start([ai bi]);
trigger([ai bi]);

Using this manual triggering, you can reduce the starting time delay to the order of 10^-6 seconds.

Time Delay Estimation

Because the system runs in real time and needs low complexity, the generalized cross-correlation method is used to estimate the time delay. In addition, generalized cross-correlation has the advantage of keeping the peak distinct at a low signal-to-noise ratio. The following diagram shows the algorithm of the generalized cross-correlation method.
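
A minimal sketch of a generalized cross-correlation is given below. It uses the PHAT weighting, which is an assumption on my part for the sake of the example, since the weighting function is not named above. The noisy test signals, the frame length, the 12-sample delay and the search window maxLag are also made up for the demo.

```matlab
% Sketch: generalized cross-correlation with the PHAT weighting (assumed choice).
fs = 48000;                       % sample rate in Hz (matches the post)
N  = 4096;                        % frame length in samples (assumed)
trueDelay = 12;                   % delay in samples (made up for the demo)

s  = randn(N, 1);                 % broadband stand-in for the sound source
x1 = s + 0.5 * randn(N, 1);       % microphone 1 with independent noise
x2 = [zeros(trueDelay, 1); s(1:N - trueDelay)] + 0.5 * randn(N, 1);

nfft = 2 * N;                     % zero-pad to avoid circular wrap-around
X1 = fft(x1, nfft);
X2 = fft(x2, nfft);

G = X2 .* conj(X1);               % cross-power spectrum
G = G ./ max(abs(G), eps);        % PHAT: keep the phase, drop the magnitude
r = real(ifft(G));                % generalized cross-correlation

maxLag = 64;                      % search window in samples (assumed)
r    = [r(nfft - maxLag + 1:nfft); r(1:maxLag + 1)];
lags = (-maxLag:maxLag).';

[~, idx] = max(r);
delaySamples = lags(idx);         % positive: microphone 2 hears the source later
delaySeconds = delaySamples / fs;
```

Dividing by the magnitude flattens the spectrum, so the correlation peak becomes narrow. Restricting the search to maxLag keeps the estimate within the delays that the microphone spacing can physically produce.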



