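IPC classification information
Country / Type | United States (US) Patent, granted
International Patent Classification (IPC 7th edition) |
Application number | US-0190241 (2005-07-26)
Registration number | US-7254241 (2007-08-07)
Inventor / Address |
Applicant / Address |
Agent / Address |
Citation information | Times cited: 8; patents cited: 4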
Abstract
A system and process for finding the location of a sound source using direct approaches having weighting factors that mitigate the effect of both correlated and reverberation noise is presented. When more than two microphones are used, the traditional time-delay-of-arrival (TDOA) based sound source localization (SSL) approach involves two steps. The first step computes TDOA for each microphone pair, and the second step combines these estimates. This two-step process discards relevant information in the first step, thus degrading the SSL accuracy and robustness. In the present invention, direct, one-step, approaches are employed. Namely, a one-step TDOA SSL approach and a steered beam (SB) SSL approach are employed. Each of these approaches provides an accuracy and robustness not available with the traditional two-step approaches.
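The first step of the traditional two-step approach described above, estimating the TDOA for one microphone pair, can be illustrated with a minimal sketch. It assumes whole-sample lags and a plain, unweighted cross correlation; the function name `estimate_tdoa` and its parameters are hypothetical and do not come from the patent.

```python
def estimate_tdoa(x_r, x_s, max_lag):
    """Estimate the delay of x_r relative to x_s, in whole samples.

    Tries every lag in [-max_lag, max_lag] and returns the one with the
    largest plain (unweighted) cross correlation. A positive result means
    the sound reached sensor r later than sensor s.
    """
    def correlation(lag):
        # Sum x_r[n + lag] * x_s[n] over the samples where both indices are valid.
        return sum(
            x_r[n + lag] * x_s[n]
            for n in range(len(x_s))
            if 0 <= n + lag < len(x_r)
        )

    return max(range(-max_lag, max_lag + 1), key=correlation)
```

Only the single best lag per pair is passed on to the second step. The rest of each pair's correlation curve is dropped, which is the information loss the abstract attributes to the two-step process.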
Representative claims
Wherefore, what is claimed is:

1. A computer-implemented sound source localization process for finding the location of a sound source using signals output by a microphone array having a plurality of audio sensors, comprising the following process actions:
inputting the signal generated by each audio sensor of the microphone array; and
selecting as the location of the sound source, a location that maximizes a sum of weighted cross correlations between the input signal from a first sensor and the input signal from the second sensor for pairs of array sensors, wherein the weighted cross correlations are weighted using a weighting function that enhances the robustness of the selected location of the sound source by mitigating an effect of uncorrelated noise and/or reverberation.

2. The process of claim 1, wherein the weighted cross correlations are computed in the frequency domain by using a frequency transform.

3. The process of claim 1, wherein the weighted cross correlations are computed in one of (i) the FFT domain or (ii) the MCLT domain.

4. The process of claim 1, wherein the weighted cross correlations are computed in the time domain.

5. The process of claim 1, wherein the sum of the weighted cross correlations is computed only for a set of pre-defined, candidate points.

6. The process of claim 1, wherein the location that maximizes the sum of the weighted cross correlations is computed with a gradient descendent procedure.

7. The process of claim 6, wherein the gradient descendent procedure is computed in a hierarchical manner.

8. A computer-readable medium having computer-executable instructions for finding the location of a sound source using signals output by a microphone array having a plurality of audio sensors, said computer-executable instructions comprising:
(a) computing a N-point FFT of the input signal from each sensor;
(b) establishing a set of candidate sound source locations;
(c) selecting a previously unselected one of the candidate sound source locations;
(d) selecting a previously unselected pair of sensors in the microphone array;
(e) estimating the energy across a prescribed range of frequencies ($f$) associated with the sound coming from the selected candidate sound source location to the selected pair of sensors via the equation, $|W_{rs}(f)X_r(f)X_s^*(f)\exp(-j2\pi f(\tau_r-\tau_s))|^2$, where $r$ and $s$ refer to a first and second sensor, respectively, of the selected pair of array sensors, $X_r(f)$ is the N-point FFT of the input signal from the first sensor in the selected sensor pair, $X_s(f)$ is the N-point FFT of the input signal from the second sensor in the selected sensor pair, $\tau_r$ is the time it takes sound to travel from the selected sound source location to the first sensor of the selected sensor pair, $\tau_s$ is the time it takes sound to travel from the selected sound source location to the second sensor of the selected sensor pair, and $W_{rs}$ is a weighting function for mitigating the effect of both correlated and reverberation noise defined by the equation, where $|N_r(f)|^2$ is the noise power spectrum associated with the signal from the first sensor of the selected sensor pair, $|N_s(f)|^2$ is noise power spectrum associated with the signal from the second sensor of the selected sensor pair, and $q$ is a prescribed proportion factor set to an estimated ratio between the energy of the reverberation and total signal at the selected sensors;
(f) repeating actions (d) and (e) until all sensor pairs of interest have been selected;
(g) summing the energy of the sound coming from the selected candidate sound source location estimated for each of the microphone array sensor pairs;
(h) repeating actions (c) through (g) until all the candidate sound source locations have been selected; and
(i) designating the candidate sound source location associated with the highest total estimated energy as the location of the sound source.

9. A computer-implemented sound source localization process for finding the location of a sound source using signals output by a microphone array having a plurality of audio sensors, comprising the following process actions:
inputting the signal generated by each audio sensor of the microphone array;
selecting as the location of the sound source, a location that maximizes a sum of the energy of a weighted input signal from each sensor of the microphone array, wherein the input signals are weighted using a weighting function that enhances the robustness of the selected location of the sound source by mitigating an effect of uncorrelated noise and/or reverberation.

10. The process of claim 9, wherein the input signal from each sensor of the microphone array is converted to a frequency domain using a frequency transform prior to weighting the signal.

11. The process of claim 9, wherein the input signal from each sensor of the microphone array is converted using a FFT prior to weighting the signal.

12. The process of claim 9, wherein the sum of the energy of the weighted input signal from each sensor of the microphone array is computed only for a set of pre-defined, candidate points.

13. A computer-readable medium having computer-executable instructions for finding the location of a sound source using signals output by a microphone array having a plurality of audio sensors, said computer-executable instructions comprising:
(a) computing a N-point FFT of the input signal from each sensor;
(b) establishing a set of candidate sound source locations;
(c) selecting a previously unselected one of the candidate sound source locations;
(d) selecting a previously unselected sensor in the microphone array;
(e) estimating the energy across a prescribed range of frequencies ($f$) associated with the sound coming from the selected candidate sound source location to the selected sensor via the equation, $|V_m(f)X_m(f)\exp(-j2\pi f\tau_m)|^2$, where $m$ refers the selected sensor, $X_m(f)$ is the N-point FFT of the input signal from the selected sensor, $\tau_m$ is the time it takes sound to travel from the selected sound source location to the selected sensor, and $V_m$ is a weighting function for mitigating the effect of both correlated and reverberation noise defined by the equation, where $|N_m(f)|$ is the N-point FFT of the noise portion of the input signal from the selected sensor, and $q$ is a prescribed proportion factor set to an estimated ratio between the energy of the reverberation and total signal at the selected sensor;
(f) repeating actions (d) and (e) until all the sensors have been selected;
(g) summing the energy of the sound coming from the selected candidate sound source location estimated for each of the microphone array sensors;
(h) repeating actions (c) through (g) until all the candidate sound source locations have been selected; and
(i) designating the candidate sound source location associated with the highest total estimated energy as the location of the sound source.
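
The candidate search recited in claims 1, 4 and 5 (a time-domain sum of pairwise cross correlations, evaluated only at pre-defined candidate points) might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the weighting is taken as uniform because the weighting equation is not reproduced in this record, delays are rounded to whole samples, the speed of sound is assumed to be 343 m/s, and all function and parameter names are hypothetical.

```python
import math
from itertools import combinations

SPEED_OF_SOUND = 343.0  # metres per second; an assumed room-temperature value


def delay_samples(source, mic, fs):
    """Travel time from source to mic, rounded to a whole number of samples."""
    return round(math.dist(source, mic) / SPEED_OF_SOUND * fs)


def aligned_correlation(x_r, x_s, d_r, d_s):
    """Cross correlation of two sensor signals after advancing each by its delay.

    Uniform weighting is assumed here; the patent's weighting function is
    not reproduced in the source text.
    """
    n_max = min(len(x_r) - d_r, len(x_s) - d_s)
    return sum(x_r[n + d_r] * x_s[n + d_s] for n in range(n_max))


def locate_source(signals, mics, candidates, fs):
    """Return the candidate point with the largest sum of pairwise correlations."""
    best, best_score = None, -math.inf
    for point in candidates:
        delays = [delay_samples(point, mic, fs) for mic in mics]
        # One score per candidate: add up every sensor pair's aligned correlation.
        score = sum(
            aligned_correlation(signals[r], signals[s], delays[r], delays[s])
            for r, s in combinations(range(len(mics)), 2)
        )
        if score > best_score:
            best, best_score = point, score
    return best
```

Here `signals` is a list of per-sensor sample lists, `mics` and `candidates` are lists of coordinate tuples in metres, and `fs` is the sampling rate in hertz. Every sensor pair contributes to the score of every candidate, so no pair is reduced to a single delay estimate before the pairs are combined.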