[Patent] Identifying far-end sound


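IPC classification information
Country / Type	United States (US) Patent, granted
International Patent Classification (IPC, 7th edition)	G06F-015/00 G10L-011/00 G10L-019/12 G10L-021/02 G10L-017/00
Application number	US-0953764 (2007-12-10)
Registration number	US-8219387 (2012-07-10)
Inventors / Address	Cutler, Ross; Sun, Xinding; Velayutham, Senthil
Applicant / Address	Microsoft Corporation
Citation information	Times cited: 3; Patents cited: 30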

Abstract

Frames containing audio data may be received, the audio data having been derived from a microphone array, at least some of the frames containing residual acoustic echo after having acoustic echo partially removed therefrom. Probability distribution functions are determined from the frames of audio data. A probability distribution function comprises likelihoods that respective directions are directions of sources of sounds. An active speaker may be identified in frames of video data based on the video data and based on audio information derived from the audio data, where use of the audio information as a basis for identifying the active speaker is controlled by determining whether the probability distribution functions indicate that corresponding audio data includes residual acoustic echo.
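The gating described in the abstract can be illustrated with a minimal sketch. It assumes the probability distribution function and the video evidence are both expressed as one score per direction bin, and that the two are fused by multiplication. The function names (`fuse_scores`, `active_speaker_bin`), the `looks_like_echo` callback, and the fusion rule are all hypothetical; the abstract does not specify how the audio and video cues are combined.

```python
from typing import Callable, List, Sequence


def fuse_scores(
    video_scores: Sequence[float],
    pdf: Sequence[float],
    looks_like_echo: Callable[[Sequence[float]], bool],
) -> List[float]:
    """Combine per-direction video scores with the audio PDF for one frame.

    If the PDF is flagged as residual echo, the audio cue is ignored and the
    video scores are returned unchanged. The multiplicative fusion below is
    an assumption made for illustration only.
    """
    if len(video_scores) != len(pdf):
        raise ValueError("video_scores and pdf must cover the same direction bins")
    if looks_like_echo(pdf):
        # Audio is suspected to be far-end sound: rely on video alone.
        return list(video_scores)
    # Assumed fusion rule: weight each video score by the audio likelihood.
    return [v * p for v, p in zip(video_scores, pdf)]


def active_speaker_bin(scores: Sequence[float]) -> int:
    """Index of the direction bin with the highest fused score."""
    return max(range(len(scores)), key=lambda i: scores[i])
```

With three direction bins, a frame whose PDF is accepted can shift the choice toward the direction the audio favors, while a frame flagged as echo falls back to the video scores alone.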

Representative claims

1. One or more volatile and/or non-volatile physical computer readable media storing information to enable one or more devices to perform a process, the process comprising: receiving frames containing audio data, the audio data having been derived from a microphone array, at least some of the frames containing residual acoustic echo after having acoustic echo partially removed therefrom; determining, from the frames of audio data, probability distribution functions, a probability distribution function comprising likelihoods that respective directions are directions of sources of sounds; and identifying an active speaker in frames of video data based on the video data and based on audio information derived from the audio data, where use of the audio information as a basis for identifying the active speaker is controlled by determining whether the probability distribution functions indicate that corresponding audio data includes residual acoustic echo.
2. One or more volatile and/or non-volatile physical computer readable media storing information to enable one or more devices to perform a process according to claim 1, wherein the determining whether the probability distribution functions indicate that corresponding audio data includes residual acoustic echo comprises: identifying a plurality of local maximums of a probability distribution function.
3. One or more volatile and/or non-volatile physical computer readable media storing information to enable one or more devices to perform a process according to claim 2, the process further comprising determining whether the local maximums are substantially at pre-determined locations in the probability distribution functions.
4. One or more volatile and/or non-volatile physical computer readable media storing information to enable one or more devices to perform a process according to claim 3, the process further comprising finding a difference between a maximal local maximum and a minimal local maximum of the probability distribution function.
5. One or more volatile and/or non-volatile physical computer readable media storing information to enable one or more devices to perform a process according to claim 2, the process further comprising determining whether the identified local maximums are similar to local maximums that occur when substantially all of the sound being received by the microphone array is sound from a loudspeaker.
6. One or more volatile and/or non-volatile physical computer readable media storing information to enable one or more devices to perform a process according to claim 1, wherein the determining whether the probability distribution functions indicate that corresponding audio data includes residual acoustic echo comprises: determining whether characteristics of a probability distribution function are sufficiently similar to predetermined characteristics.
7. One or more volatile and/or non-volatile physical computer readable media storing information to enable one or more devices to perform a process according to claim 6, wherein the predetermined characteristics comprise characteristics of a probability distribution function that would occur if the microphone array was receiving sound predominantly from the loudspeaker.
8. One or more volatile and/or non-volatile physical computer readable media storing information to enable one or more devices to perform a process according to claim 1, wherein the determining whether the probability distribution functions indicate that corresponding audio data includes residual acoustic echo comprises: determining whether the probability distribution functions have local maximums near predetermined directions.
9. A method performed by one or more devices that comprise one or more processors and storage, the method comprising: receiving, in the storage, frames containing audio data, the audio data having been derived from a microphone array, at least some of the frames containing residual acoustic echo after having acoustic echo partially removed therefrom; determining by the one or more processors, from the frames of audio data, probability distribution functions, a probability distribution function comprising likelihoods that respective directions are directions of sources of sounds, and storing the probability distribution functions in the storage; and identifying, by the one or more processors, an active speaker in frames of video data in the storage based on the video data and based on audio information derived from the audio data by the one or more processors, where use of the audio information as a basis for identifying the active speaker is controlled by the one or more processors determining whether the probability distribution functions indicate that corresponding audio data includes residual acoustic echo.
10. A method according to claim 9, further comprising determining whether characteristics of the probability distribution functions are similar to characteristics of a probability distribution function that corresponds to the microphone array primarily receiving sound from a loudspeaker.
11. A method according to claim 10, further comprising: receiving frames of audio data from a far-end source and using the frames to produce sound with a loudspeaker co-located with the microphone array, where the sound received at the microphone includes the sound produced with the loudspeaker; generating audio frames from the sound received at the microphone array, performing echo cancellation on the audio frames, wherein the probability distribution functions are computed from the audio frames after the echo cancellation; and allowing a probability distribution function to be used in the active speaker detection process when characteristics of the probability distribution function are determined to be not similar to characteristics of a probability distribution function that corresponds to the microphone array primarily receiving sound from a loudspeaker.
12. A method according to claim 9, further comprising identifying and analyzing local maximums of the probability distribution functions.
13. A method according to claim 12, wherein the analyzing the local maximums comprises comparing them to direction(s) of one or more loudspeakers.
14. A method according to claim 13, wherein the analyzing further comprises identifying a maximal local maximum and a minimal local maximum.
15. A method according to claim 14, further comprising subtracting the magnitude of the minimal local maximum from the magnitude of a maximal local maximum and dividing by the magnitude of the minimal local maximum.
16. A method according to claim 15, further comprising subtracting the magnitude of the minimal local maximum from the magnitude of a maximal local maximum and dividing by the magnitude of the minimal local maximum.
17. A method according to claim 9, further comprising analyzing the probability distribution functions to determine whether the probability functions are to be used to detect an active speaker.
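The peak analysis recited in claims 2 to 4 and 12 to 15 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the probability distribution function is assumed to be a list of likelihoods over equally spaced direction bins that wrap around a full circle, "substantially at pre-determined locations" is approximated by a bin tolerance, and every function name and parameter (`local_maxima`, `peaks_near_directions`, `peak_contrast`, `tolerance`) is hypothetical. The claims do not state how the resulting ratio is thresholded, so the sketch only computes it.

```python
from typing import List, Sequence


def local_maxima(pdf: Sequence[float]) -> List[int]:
    """Indices of strict local maxima, treating the bins as a circle (assumption)."""
    n = len(pdf)
    return [
        i for i in range(n)
        if pdf[i] > pdf[(i - 1) % n] and pdf[i] > pdf[(i + 1) % n]
    ]


def circular_distance(a: int, b: int, n: int) -> int:
    """Shortest distance in bins between two indices on a circle of n bins."""
    d = abs(a - b) % n
    return min(d, n - d)


def peaks_near_directions(
    peaks: Sequence[int], expected: Sequence[int], n: int, tolerance: int = 1
) -> bool:
    """True if every peak lies within `tolerance` bins of some expected direction.

    `expected` stands in for the pre-determined (e.g. loudspeaker) directions;
    requiring *every* peak to match is an assumption of this sketch.
    """
    if not peaks:
        return False
    return all(
        any(circular_distance(p, e, n) <= tolerance for e in expected)
        for p in peaks
    )


def peak_contrast(pdf: Sequence[float], peaks: Sequence[int]) -> float:
    """(maximal peak - minimal peak) / minimal peak, as recited in claim 15."""
    if not peaks:
        raise ValueError("no local maxima to compare")
    heights = [pdf[i] for i in peaks]
    lowest, highest = min(heights), max(heights)
    if lowest <= 0:
        raise ValueError("minimal local maximum must be positive")
    return (highest - lowest) / lowest
```

For example, with twelve 30-degree bins, `local_maxima` could feed both `peaks_near_directions` (to compare the peaks against known loudspeaker directions) and `peak_contrast` (to measure how much the strongest peak dominates the weakest).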

Patents cited by this patent (30)

Yoo, Jae Ha, Acoustic echo control system and double talk control method thereof.
Ning, Aidong, Adaptive thresholds in acoustic echo canceller for use during double talk.
Nagata Yoshifumi, JPX, Apparatus for detecting position of object capable of simultaneously detecting plural objects and detection method therefor.
Addeo Eric J. (Long Valley NJ) Desmarias Joseph J. (Morris Plains NJ) Shtirmer Gennady (Morris Plains NJ), Audio processing system for teleconferencing system.
Feng, Albert S.; Liu, Chen; Bilger, Robert C.; Jones, Douglas L.; Lansing, Charissa R.; O'Brien, William D.; Wheeler, Bruce C., Binaural signal processing techniques.
Jones, Michael J.; Viola, Paul A., Detecting arbitrarily oriented objects in images.
Viola, Paul A.; Jones, Michael J., Detecting pedestrians using patterns of motion and appearance in videos.
Armbrüster, Werner, Digital adaptive filter and acoustic echo canceller using the same.
Miller William J. (N. Miami FL) Chiu Ran F. (Los Altos CA) Joerger Richard B. (Pembroke Pines FL) Newdeck Frank W. (Hatboro PA), Digital voice compression having a digitally controlled AGC circuit and means for including the true gain in the compres.
Genter Roland E. (Falls Church VA), Echo canceler with subband attenuation and noise injection control.
Okuda, Kozo, Echo canceling method, echo canceller and voice switch.
Romesburg Eric Douglas, Echo canceller for non-linear circuits.
Marton, Trygve Frederik; Aarnes, Ingvar Flaten, Echo canceller with reduced requirement for processing power.
Terada, Yasuhiro; Matsui, Minoru, Echo sound signal suppressing apparatus.
Velardo ; Jr. Patrick M. (Belleville NJ) Wynn Woodson D. (Basking Ridge NJ), Method and apparatus for reducing residual far-end echo in voice communication networks.
Martinez Tony R. ; Moncur R. Brian ; Shepherd D. Lynn ; Parr Randall J. ; Wilson D. Randall ; Hansen Carl Hal, Method and apparatus for signal classification using a multilayer network.
Jones, Michael J.; Viola, Paul, Method and system for object detection in digital images.
Julstrom Stephen D. (Chicago IL), Microphone actuation control system suitable for teleconference systems.
Sih Gilbert C. (San Diego CA), Network echo canceller.
Ebenezer, Samuel Ponvarma, Noise reduction and comfort noise gain control using bark band weiner filter and linear attenuation.
Viola, Paul A.; Jones, Michael J., Object recognition system.
Visser, Erik; Lee, Te Won, Separation of target acoustic signals in a multi-transducer arrangement.
Robert W Series GB, Speech analysis using multiple noise compensation.
Greg C. Burnett ; John F. Holzrichter ; Lawrence C. Ng, System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech.
Viola, Paul A.; Jones, Michael J., System and method for detecting objects in images.
Tashev, Ivan, System and method for improving the precision of localization estimates.
Rui, Yong, System and process for locating a speaker using 360 degree sound source localization.
Rui, Yong; Florencio, Dinei, System and process for robust sound source localization.
Etter, Walter, Systems and methods for far-end noise reduction and near-end noise compensation in a mixed time-frequency domain compander to improve signal quality in communications systems.
Alexander Osovets, Voice conferencing system having local sound amplification.

Patents citing this patent (3)

Jones, David K.; Payne, Jamie; Holstege, Cody Jay; Schneider, Mark Andrew, Floor power distribution system.
Jones, David K.; Payne, Jamie; Holstege, Cody Jay; Schneider, Mark Andrew, Floor power distribution system.
Wilson, Scott Edward, Sound gathering system.

