[Patent] System and process for robust sound source localization

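IPC classification information
Country / Type	United States (US) Patent, granted
International Patent Classification (IPC 7th edition)	H04R-003/00
Application number	US-0190241 (2005-07-26)
Registration number	US-7254241 (2007-08-07)
Inventor / Address	Rui, Yong; Florencio, Dinei
Applicant / Address	Microsoft Corporation
Agent / Address	Lyon & Harr, LLP
Citation information	Times cited: 8; Patents cited: 4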

Abstract

A system and process for finding the location of a sound source using direct approaches having weighting factors that mitigate the effect of both correlated and reverberation noise is presented. When more than two microphones are used, the traditional time-delay-of-arrival (TDOA) based sound source localization (SSL) approach involves two steps. The first step computes TDOA for each microphone pair, and the second step combines these estimates. This two-step process discards relevant information in the first step, thus degrading the SSL accuracy and robustness. In the present invention, direct, one-step, approaches are employed. Namely, a one-step TDOA SSL approach and a steered beam (SB) SSL approach are employed. Each of these approaches provides an accuracy and robustness not available with the traditional two-step approaches.
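
The one-step TDOA idea described above can be illustrated with a minimal sketch: instead of first estimating a delay per microphone pair and then combining the estimates, every candidate location is scored directly by a sum of weighted cross correlations over all pairs. Everything concrete here is an assumption made for illustration and is not taken from the patent: the 2-D geometry, the sampling rate, the speed of sound, the function names, and in particular the weighting, which is a plain phase-transform (PHAT) stand-in rather than the patent's weighting function. The signals are simulated as noise-free pure delays, and the sign of the steering exponent follows the usual forward-DFT convention.

```python
import cmath
import math
import random
from itertools import combinations

C = 343.0     # assumed speed of sound in m/s
FS = 16000.0  # assumed sampling rate in Hz
N = 64        # assumed transform length


def dft(x):
    """Naive N-point DFT; a stand-in for an FFT that is adequate for small N."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def travel_times(point, mics):
    """Time in seconds for sound to travel from `point` to each microphone."""
    return [math.dist(point, m) / C for m in mics]


def one_step_score(spectra, taus, bins):
    """Sum of weighted cross correlations over all sensor pairs for one candidate."""
    total = 0.0
    for r, s in combinations(range(len(spectra)), 2):
        for k in bins:
            cross = spectra[r][k] * spectra[s][k].conjugate()
            weight = 1.0 / (abs(cross) + 1e-12)  # PHAT stand-in, not the patent's weighting
            # Positive exponent undoes the delay under the e^{-j...} forward-DFT convention.
            steer = cmath.exp(2j * math.pi * (k * FS / N) * (taus[r] - taus[s]))
            total += (weight * cross * steer).real
    return total


def localize(spectra, mics, candidates, bins):
    """Return the candidate location with the highest one-step score."""
    return max(candidates,
               key=lambda p: one_step_score(spectra, travel_times(p, mics), bins))


# Hypothetical 2-D setup: four microphones and a coarse grid of candidate points.
mics = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2)]
candidates = [(x, y) for x in (-1.0, 0.0, 1.0, 2.0) for y in (1.0, 2.0, 3.0)]
true_source = (1.0, 2.0)
bins = range(1, N // 2)  # positive-frequency bins only; DC and Nyquist are skipped

# Simulate ideal propagation: each sensor sees the source spectrum with a pure delay.
# Only the bins listed in `bins` are used for scoring.
rng = random.Random(0)
source_spectrum = dft([rng.uniform(-1.0, 1.0) for _ in range(N)])
spectra = [[source_spectrum[k] * cmath.exp(-2j * math.pi * (k * FS / N) * tau)
            for k in range(N)]
           for tau in travel_times(true_source, mics)]

estimate = localize(spectra, mics, candidates, bins)
print(estimate)
```

Because no per-pair delay is ever committed to, the full cross-correlation information of every pair contributes to the final decision, which is the property the abstract attributes to the direct approaches.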

Representative claims

Wherefore, what is claimed is:

1. A computer-implemented sound source localization process for finding the location of a sound source using signals output by a microphone array having a plurality of audio sensors, comprising the following process actions: inputting the signal generated by each audio sensor of the microphone array; and selecting as the location of the sound source, a location that maximizes a sum of weighted cross correlations between the input signal from a first sensor and the input signal from the second sensor for pairs of array sensors, wherein the weighted cross correlations are weighted using a weighting function that enhances the robustness of the selected location of the sound source by mitigating an effect of uncorrelated noise and/or reverberation.

2. The process of claim 1, wherein the weighted cross correlations are computed in the frequency domain by using a frequency transform.

3. The process of claim 1, wherein the weighted cross correlations are computed in one of (i) the FFT domain or (ii) the MCLT domain.

4. The process of claim 1, wherein the weighted cross correlations are computed in the time domain.

5. The process of claim 1, wherein the sum of the weighted cross correlations is computed only for a set of pre-defined, candidate points.

6. The process of claim 1, wherein the location that maximizes the sum of the weighted cross correlations is computed with a gradient descendent procedure.

7. The process of claim 6, wherein the gradient descendent procedure is computed in a hierarchical manner.

8. A computer-readable medium having computer-executable instructions for finding the location of a sound source using signals output by a microphone array having a plurality of audio sensors, said computer-executable instructions comprising:
(a) computing a N-point FFT of the input signal from each sensor;
(b) establishing a set of candidate sound source locations;
(c) selecting a previously unselected one of the candidate sound source locations;
(d) selecting a previously unselected pair of sensors in the microphone array;
(e) estimating the energy across a prescribed range of frequencies (f) associated with the sound coming from the selected candidate sound source location to the selected pair of sensors via the equation, $|W_{rs}(f)\,X_r(f)\,X_s^*(f)\,\exp(-j2\pi f(\tau_r-\tau_s))|^2$, where $r$ and $s$ refer to a first and second sensor, respectively, of the selected pair of array sensors, $X_r(f)$ is the N-point FFT of the input signal from the first sensor in the selected sensor pair, $X_s(f)$ is the N-point FFT of the input signal from the second sensor in the selected sensor pair, $\tau_r$ is the time it takes sound to travel from the selected sound source location to the first sensor of the selected sensor pair, $\tau_s$ is the time it takes sound to travel from the selected sound source location to the second sensor of the selected sensor pair, and $W_{rs}$ is a weighting function for mitigating the effect of both correlated and reverberation noise defined by the equation [equation not reproduced in this record], where $|N_r(f)|^2$ is the noise power spectrum associated with the signal from the first sensor of the selected sensor pair, $|N_s(f)|^2$ is noise power spectrum associated with the signal from the second sensor of the selected sensor pair, and $q$ is a prescribed proportion factor set to an estimated ratio between the energy of the reverberation and total signal at the selected sensors;
(f) repeating actions (d) and (e) until all sensor pairs of interest have been selected;
(g) summing the energy of the sound coming from the selected candidate sound source location estimated for each of the microphone array sensor pairs;
(h) repeating actions (c) through (g) until all the candidate sound source locations have been selected; and
(i) designating the candidate sound source location associated with the highest total estimated energy as the location of the sound source.

9. A computer-implemented sound source localization process for finding the location of a sound source using signals output by a microphone array having a plurality of audio sensors, comprising the following process actions: inputting the signal generated by each audio sensor of the microphone array; selecting as the location of the sound source, a location that maximizes a sum of the energy of a weighted input signal from each sensor of the microphone array, wherein the input signals are weighted using a weighting function that enhances the robustness of the selected location of the sound source by mitigating an effect of uncorrelated noise and/or reverberation.

10. The process of claim 9, wherein the input signal from each sensor of the microphone array is converted to a frequency domain using a frequency transform prior to weighting the signal.

11. The process of claim 9, wherein the input signal from each sensor of the microphone array is converted using a FFT prior to weighting the signal.

12. The process of claim 9, wherein the sum of the energy of the weighted input signal from each sensor of the microphone array is computed only for a set of pre-defined, candidate points.

13. A computer-readable medium having computer-executable instructions for finding the location of a sound source using signals output by a microphone array having a plurality of audio sensors, said computer-executable instructions comprising:
(a) computing a N-point FFT of the input signal from each sensor;
(b) establishing a set of candidate sound source locations;
(c) selecting a previously unselected one of the candidate sound source locations;
(d) selecting a previously unselected sensor in the microphone array;
(e) estimating the energy across a prescribed range of frequencies (f) associated with the sound coming from the selected candidate sound source location to the selected sensor via the equation, $|V_m(f)\,X_m(f)\,\exp(-j2\pi f\tau_m)|^2$, where $m$ refers the selected sensor, $X_m(f)$ is the N-point FFT of the input signal from the selected sensor, $\tau_m$ is the time it takes sound to travel from the selected sound source location to the selected sensor, and $V_m$ is a weighting function for mitigating the effect of both correlated and reverberation noise defined by the equation [equation not reproduced in this record], where $|N_m(f)|$ is the N-point FFT of the noise portion of the input signal from the selected sensor, and $q$ is a prescribed proportion factor set to an estimated ratio between the energy of the reverberation and total signal at the selected sensor;
(f) repeating actions (d) and (e) until all the sensors have been selected;
(g) summing the energy of the sound coming from the selected candidate sound source location estimated for each of the microphone array sensors;
(h) repeating actions (c) through (g) until all the candidate sound source locations have been selected; and
(i) designating the candidate sound source location associated with the highest total estimated energy as the location of the sound source.
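
The candidate-by-candidate search in the last claim can be sketched as a conventional steered-beam (delay-and-sum) power scan. This is an illustration under stated assumptions, not the patented method: the sketch sums the weighted, delay-compensated sensor spectra coherently before taking the squared modulus, because the modulus of a single phase-shifted term does not depend on the candidate location; the per-sensor weight is a simple whitening stand-in for $V_m$, whose defining equation is not available in this record; and the geometry, sampling rate, speed of sound, and all names are hypothetical. The steering exponent is positive because the sketch uses the usual forward-DFT sign convention.

```python
import cmath
import math
import random

C = 343.0     # assumed speed of sound in m/s
FS = 16000.0  # assumed sampling rate in Hz
N = 64        # assumed transform length


def dft(x):
    """Naive N-point DFT; a stand-in for an FFT that is adequate for small N."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def steered_beam_energy(spectra, taus, bins):
    """Energy of the weighted, delay-compensated sum of all sensor spectra."""
    energy = 0.0
    for k in bins:
        beam = 0j
        for spectrum, tau in zip(spectra, taus):
            v = 1.0 / (abs(spectrum[k]) + 1e-12)  # whitening stand-in, not the patent's V_m
            beam += v * spectrum[k] * cmath.exp(2j * math.pi * (k * FS / N) * tau)
        energy += abs(beam) ** 2
    return energy


# Hypothetical 2-D setup: four microphones and a coarse grid of candidate points.
mics = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2)]
candidates = [(x, y) for x in (-1.0, 0.0, 1.0, 2.0) for y in (1.0, 2.0, 3.0)]
true_source = (2.0, 1.0)
bins = range(1, N // 2)  # positive-frequency bins only; DC and Nyquist are skipped

# Simulate noise-free propagation: each sensor sees the source with a pure delay.
rng = random.Random(1)
source_spectrum = dft([rng.uniform(-1.0, 1.0) for _ in range(N)])
spectra = [[source_spectrum[k]
            * cmath.exp(-2j * math.pi * (k * FS / N) * math.dist(true_source, m) / C)
            for k in range(N)]
           for m in mics]

best_location, best_energy = None, -1.0
for candidate in candidates:                           # loop over candidate locations
    taus = [math.dist(candidate, m) / C for m in mics]
    energy = steered_beam_energy(spectra, taus, bins)  # accumulate over sensors and bins
    if energy > best_energy:                           # keep the highest total energy
        best_location, best_energy = candidate, energy

print(best_location)
```

With whitened, perfectly aligned spectra the beam at the true location reaches its upper bound of (number of sensors) squared per bin, so any misaligned candidate scores lower; real recordings with noise and reverberation would need a weighting designed for those conditions.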

Patents cited by this patent (4)

Pi Sheng Chang; Aidong Ning; Michael G. Lambert; Wayne J. Haas, Acoustic source location using a microphone array.
Chan, David S. K. (Bethel, CT), Dipmeter data processing technique.
Benesty, Jacob; Elko, Gary Wayne; Huang, Yiteng, Method and apparatus for passive acoustic source localization for video camera steering applications.
Rui, Yong; Florencio, Dinei A., System and process for robust sound source localization.

Patents citing this patent (8)

Jones, David K.; Payne, Jamie; Holstege, Cody Jay; Schneider, Mark Andrew, Floor power distribution system.
Jones, David K.; Payne, Jamie; Holstege, Cody Jay; Schneider, Mark Andrew, Floor power distribution system.
Zhang, Cha; Florencio, Dinei; Zhang, Zhengyou, Multi-sensor sound source localization.
Tashev, Ivan; Acero, Alejandro; Yoon, Byung-Jun, Robust adaptive beamforming with enhanced noise suppression.
Tashev, Ivan; Acero, Alejandro, Sensor array beamformer post-processor.
Shimada, Osamu; Sugiyama, Akihiko, Signal processing system, apparatus and method used on the system, and program thereof.
Wilson, Scott Edward, Sound gathering system.
Lovitt, Andrew William; Cooper, Kenneth Harry, Utilizing spoken cues to influence response rendering for virtual assistants.


