IPC Classification Information
Country / Type | United States (US) Patent, Granted
International Patent Classification (IPC 7th ed.) |
Application No. | UP-0836207 (2004-05-03)
Registration No. | US-7567678 (2009-08-05)
Priority | KR-10-2003-0028340 (2003-05-02); KR-10-2004-0013029 (2004-02-26)
Inventors / Address |
- Kong, Dong geon
- Choi, Chang kyu
- Bang, Seok won
- Lee, Bon young
Applicant / Address |
- Samsung Electronics Co., Ltd.
Agent / Address |
Citation Info | Cited by: 10; Citing patents: 6
Abstract
A microphone array system including an input unit to receive sound signals using a plurality of microphones; a frequency splitter splitting each sound signal received into a plurality of narrowband signals; an average spatial covariance matrix estimator using spatial smoothing to obtain a spatial covariance matrix for each frequency component of the sound signal, by which spatial covariance matrices for a plurality of virtual sub-arrays, which are configured in the plurality of microphones, are obtained with respect to each frequency component of the sound signal and an average spatial covariance matrix is calculated; a signal source location detector to detect an incidence angle of the sound signal according to the average spatial covariance matrix calculated; a signal distortion compensator to calculate a weight for each frequency component of the sound signal based on the incidence angle of the sound signal and multiply the calculated weight by each frequency component.
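The averaging step the abstract describes (spatial smoothing over virtual sub-arrays of the microphone array) can be sketched as below. The equation image itself did not survive extraction, so this code assumes the standard forward spatial smoothing average, Rk = (1/p) Σᵢ xk(i) xk(i)ᴴ, using the symbols the claims define (p virtual sub-arrays, xk(i) the i-th sub-array snapshot at frequency bin k); function name and array shapes are illustrative, not the patent's exact formulation.

```python
import numpy as np

def average_spatial_covariance(x_k, L):
    """Average spatial covariance matrix Rk for one narrowband frequency
    component k, via forward spatial smoothing over virtual sub-arrays.

    x_k : (M, T) complex array -- snapshots of the M-microphone signal
          at frequency bin k (one column per DFT frame).
    L   : size of each virtual sub-array (L <= M).
    """
    M, T = x_k.shape
    p = M - L + 1                      # number of virtual sub-arrays
    R = np.zeros((L, L), dtype=complex)
    for i in range(p):
        sub = x_k[i:i + L, :]          # i-th sub-array snapshot vectors
        R += sub @ sub.conj().T / T    # sample covariance of sub-array i
    return R / p                       # average over the p sub-arrays
```

Smoothing trades aperture (L < M) for de-correlation of coherent sources such as echoes, which is what lets the subspace method below work on reverberant speech.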
Representative Claims
What is claimed is:

1. A microphone array system comprising: an input unit to receive sound signals using a plurality of microphones; a frequency splitter to split each sound signal received through the input unit into a plurality of narrowband signals; an average spatial covariance matrix estimator which uses spatial smoothing to obtain a spatial covariance matrix for each frequency component of the sound signal, by which spatial covariance matrices for a plurality of virtual sub-arrays, which are configured in the plurality of microphones, are obtained with respect to each frequency component of the sound signal processed by the frequency splitter and an average spatial covariance matrix is calculated; a signal source location detector to detect an incidence angle of the sound signal according to the average spatial covariance matrix calculated using the spatial smoothing; a signal distortion compensator to calculate a weight for each frequency component of the sound signal based on the incidence angle of the sound signal and multiply the calculated weight by each frequency component, thereby compensating for distortion of each frequency component; and a signal restoring unit to restore a sound signal using the distortion compensated frequency components, wherein the spatial smoothing is performed according to an equation where "p" indicates a number of the virtual sub-arrays, xk(i) indicates a vector of an i-th sub-array microphone input signal, "k" indicates a k-th frequency component in a narrowband, and Rk indicates an average spatial covariance matrix.

2. The microphone array system of claim 1, wherein the frequency splitter uses discrete Fourier transform to split each sound signal into the plurality of narrowband signals, and the signal restoring unit uses inverse discrete Fourier transform to restore the sound signal.

3.
The microphone array system of claim 1, wherein the incidence angle θ1 of the sound signal is calculated using the Rk and a multiple signal classification (MUSIC) algorithm, and the calculated incidence angle is applied to calculate a weight to be multiplied by each frequency component of the sound signal.

4. The microphone array system of claim 1, wherein the signal source location detector splits each sound signal received from the input unit into the frequency components, into which the frequency splitter splits the sound signal, and applies a multiple signal classification algorithm only to frequency components selected according to a predetermined reference from among the split frequency components, thereby determining the incidence angle of the sound signal.

5. The microphone array system of claim 4, wherein the signal source location detector comprises: a speech signal detector to split each sound signal received from the input unit into the frequency components, into which the frequency splitter further splits the sound signal, to group the sound signals having the same frequency component, thereby generating a plurality of groups for the respective frequency components, and to measure a speech presence probability in each group; a group selector to select a predetermined number of groups in descending order of speech presence probability from among the plurality of groups; and an arithmetic unit to perform the multiple signal classification algorithm with respect to frequency components corresponding to the respective selected groups.

6.
A speech recognition system comprising: a microphone array system; a feature extractor to extract a feature of a sound signal received from the microphone array system; a reference pattern storage unit to store reference patterns to be compared with the extracted feature; a comparator to compare the extracted feature with the reference patterns stored in the reference pattern storage unit; and a determiner to determine whether a speech is recognized based on the compared result, wherein the microphone array system comprises: an input unit to receive sound signals using a plurality of microphones; a frequency splitter to split each sound signal received through the input unit into a plurality of narrowband signals; an average spatial covariance matrix estimator which uses spatial smoothing to obtain a spatial covariance matrix for each frequency component of the sound signal, by which spatial covariance matrices for a plurality of virtual sub-arrays, which are configured in the plurality of microphones, are obtained with respect to each frequency component of the sound signal processed by the frequency splitter and then an average spatial covariance matrix is calculated; a signal source location detector to detect an incidence angle of the sound signal according to the average spatial covariance matrix calculated using the spatial smoothing; a signal distortion compensator to calculate a weight for each frequency component of the sound signal based on the incidence angle of the sound signal and multiply the calculated weight by each frequency component, thereby compensating for distortion of each frequency component; and a signal restoring unit to restore a sound signal using the distortion compensated frequency components, wherein the spatial smoothing is performed according to an equation where "p" indicates a number of the virtual sub-arrays, xk(i) indicates a vector of an i-th sub-array microphone input signal, "k" indicates a k-th frequency component in a 
narrowband, and Rk indicates an average spatial covariance matrix.

7. The speech recognition system of claim 6, wherein the incidence angle θ1 of the sound signal is calculated using the Rk and a multiple signal classification (MUSIC) algorithm, and the calculated incidence angle is applied to calculate a weight to be multiplied by each frequency component of the sound signal.

8. The speech recognition system of claim 6, wherein the signal source location detector splits each sound signal received from the input unit into the frequency components, into which the frequency splitter splits the sound signal, and applies a multiple signal classification algorithm only to frequency components selected according to a predetermined reference from among the split frequency components, thereby determining the incidence angle of the sound signal.

9. The speech recognition system of claim 8, wherein the signal source location detector comprises: a speech signal detector to split each sound signal received from the input unit into the frequency components, into which the frequency splitter further splits the sound signal, to group the sound signals having the same frequency component, thereby generating a plurality of groups for the respective frequency components, and to measure a speech presence probability in each group; a group selector to select a predetermined number of groups in descending order of speech presence probability from among the plurality of groups; and an arithmetic unit to perform the multiple signal classification algorithm with respect to frequency components corresponding to the respective selected groups.

10.
A microphone array method comprising: receiving a plurality of wideband sound signals from an array having a plurality of microphones; splitting each wideband sound signal into a plurality of narrowbands; obtaining spatial covariance matrices for a plurality of virtual sub-arrays, which include a plurality of microphones constituting the array of the plurality of microphones, with respect to each narrowband using a predetermined scheme and averaging the obtained spatial covariance matrices, thereby obtaining an average spatial covariance matrix for each narrowband; calculating an incidence angle of each wideband sound signal using the average spatial covariance matrix for each narrowband and a predetermined algorithm; calculating weights to be respectively multiplied with the narrowbands according to the incidence angle of the wideband sound signal and multiplying the weights by the respective narrowbands; and restoring a wideband sound signal using the narrowbands after being multiplied by the weights respectively, wherein the obtaining of the spatial covariance matrices comprises performing the spatial smoothing according to an equation: where "p" indicates a number of the virtual sub-arrays, xk(i) indicates a vector of an i-th sub-array microphone input signal, "k" indicates a k-th frequency component in a narrowband, and Rk indicates an average spatial covariance matrix.

11. The microphone array method of claim 10, wherein the splitting is based on discrete Fourier transform, and the restoring is based on inverse discrete Fourier transform.

12. The microphone array method of claim 10, wherein the calculating of the incidence angle θ1 of the sound signal comprises calculating using the Rk and a multiple signal classification (MUSIC) algorithm, and the calculating and multiplying of the weights comprises applying the calculated incidence angle to calculate a weight to be multiplied by each frequency component of the sound signal.

13.
The microphone array method of claim 10, wherein the calculating of the incidence angle comprises: splitting each sound signal received from the array having the plurality of microphones into the frequency components of the split sound signal; and performing a multiple signal classification algorithm with respect to only frequency components selected according to a predetermined reference from among the split frequency components, thereby determining the incidence angle of the sound signal.

14. The microphone array method of claim 13, wherein the calculating of the incidence angle further comprises: splitting each sound signal received from the array having the plurality of microphones into the frequency components of the split sound signal; grouping the sound signals having the same frequency component, thereby generating a plurality of groups for the respective frequency components to measure a speech presence probability in each group; selecting a predetermined number of groups in descending order of speech presence probability from among the plurality of groups; and performing the multiple signal classification algorithm with respect to frequency components corresponding to the respective selected groups.

15.
A microphone array method comprising: receiving wideband sound signals from an array having a plurality of microphones; splitting each wideband sound signal into a plurality of narrowbands; obtaining spatial covariance matrices for a plurality of virtual sub-arrays, which include a plurality of microphones constituting the array of the plurality of microphones, with respect to each narrowband using a predetermined scheme, and averaging the obtained spatial covariance matrices, thereby obtaining an average spatial covariance matrix for each narrowband; calculating an incidence angle of each wideband sound signal using the average spatial covariance matrix for each narrowband and a predetermined algorithm; calculating weights to be respectively multiplied with the narrowbands based on the incidence angle of the wideband sound signal and multiplying the weights by the respective narrowbands; restoring a wideband sound signal using the narrowbands after being multiplied by the weights respectively; extracting a feature of a sound signal received from the microphone array system; storing reference patterns to be compared with the extracted feature; comparing the extracted feature with the reference patterns stored; and determining based on a comparison result whether a speech is recognized, wherein the obtaining of the spatial covariance matrices comprises performing the spatial smoothing according to an equation: where "p" indicates a number of the virtual sub-arrays, xk(i) indicates a vector of an i-th sub-array microphone input signal, "k" indicates a k-th frequency component in a narrowband, and Rk indicates an average spatial covariance matrix.

16. The microphone array method of claim 15, wherein the splitting is based on discrete Fourier transform, and the restoring is based on inverse discrete Fourier transform.

17.
The microphone array method of claim 15, wherein the calculating of the incidence angle θ1 of the sound signal comprises calculating using the Rk and a multiple signal classification (MUSIC) algorithm, and the calculating and multiplying of the weights comprises applying the calculated incidence angle to calculate a weight to be multiplied by each frequency component of the sound signal.

18. The microphone array method of claim 15, wherein the calculating step of the incidence angle comprises: splitting each sound signal received from the array having the plurality of microphones into the frequency components of the split sound signal; and performing a multiple signal classification algorithm with respect to only frequency components selected according to a predetermined reference from among the split frequency components, thereby determining the incidence angle of the sound signal.

19. The microphone array method of claim 18, wherein the calculating step of the incidence angle further comprises: splitting each sound signal received from the array having the plurality of microphones into the frequency components of the split sound signal; grouping the sound signals having the same frequency component, thereby generating a plurality of groups for the respective frequency components and measuring a speech presence probability in each group; selecting a predetermined number of groups in descending order of speech presence probability from among the plurality of groups; and performing the MUSIC algorithm with respect to frequency components corresponding to the respective selected groups.

20.
A microphone array input type speech recognition system using spatial filtering and having a microphone array to receive sound signals, the system comprising: an average spatial covariance matrix estimator which uses spatial smoothing to produce a spatial covariance matrix for each frequency component of the received sound signals, by which spatial covariance matrices for a plurality of virtual sub-arrays, which are configured in the microphone array, are obtained with respect to each frequency component of the sound signals and an average spatial covariance matrix is calculated; a signal source location detector to detect a source location of each of the sound signals using the average spatial covariance matrices; a signal distortion compensator to calculate a weight matrix to be multiplied by each frequency component using the detected source location of each of the sound signals in order to compensate for distortion due to noise and an echo of a sound signal; and an input unit to receive each of the sound signals, the input unit having an array of M microphones and a plurality of virtual sub-arrays of L microphones, wherein the spatial smoothing is performed according to an equation where "p" indicates a number of the virtual sub-arrays, xk(i) indicates a vector of an i-th sub-array microphone input signal, "k" indicates a k-th frequency component in a narrowband, and Rk indicates an average spatial covariance matrix.

21. The microphone array input type speech recognition system of claim 20, further comprising a signal restoring unit to restore each of the sound signals using the distortion compensated frequency components.

22. The microphone array input type speech recognition system of claim 21, further comprising a speech recognition module to obtain a speech recognition result by comparing a feature of each of the restored sound signals with a plurality of reference patterns to determine a sound most similar to the restored sound signal.

23.
The microphone array input type speech recognition system of claim 22, wherein the speech recognition module further comprises: a feature extractor unit to extract a feature vector of each of the restored sound signals; a reference pattern storage unit to store the reference patterns for a plurality of sounds; and a determination unit to compare the extracted feature vector with the reference patterns stored to search for a sound similar to the restored sound signal, wherein the reference pattern with a highest correlation value exceeding a predetermined value is recognized as the sound signal.

24. The microphone array input type speech recognition system of claim 20, further comprising a frequency splitter to split each of the sound signals received through the input unit into a plurality of narrowband frequency signals.

25. The microphone array input type speech recognition system of claim 20, wherein the frequency splitter uses a discrete Fourier transform to split each of the sound signals received into narrowband frequency signals.

26. The microphone array input type speech recognition system of claim 25, wherein the signal source location detector splits each of the sound signals received from the input unit into the frequency components, into which the frequency splitter splits each of the sound signals, and applies a multiple signal classification algorithm only to frequency components selected according to a predetermined reference from among the split frequency components, thereby determining the location of each of the sound signals.

27. The microphone array input type speech recognition system of claim 26, wherein the signal source location detector detects the location of each of the sound signals using a respective incidence angle.

28.
The microphone array input type speech recognition system of claim 20, further comprising a signal restoring unit to restore each of the sound signals using the distortion compensated frequency components from the signal distortion compensator.

29. The microphone array input type speech recognition system of claim 28, wherein the signal restoring unit uses an inverse discrete Fourier transform to restore each of the sound signals.

30. The microphone array input type speech recognition system of claim 20, wherein the incidence angle θ1 of each of the sound signals is calculated using the Rk and a multiple signal classification algorithm, and the calculated incidence angle is applied to calculate a weight to be multiplied by each frequency component of each of the sound signals.

31. The microphone array input type speech recognition system of claim 20, wherein the signal source location detector is a wideband multiple signal classification unit and the signal distortion compensator is a wideband minimum variance unit.

32. The microphone array input type speech recognition system of claim 20, further comprising a frequency bin selector to select frequency bins likely to include a speech signal according to a predetermined reference such that the signal source location detector performs the multiple signal classification algorithm with respect to only frequency components corresponding to the respective selected frequency bins.

33. The microphone array input type speech recognition system of claim 32, further comprising a discrete Fourier transformer to perform a fast Fourier transform on each of the input sound signals.

34. The microphone array input type speech recognition system of claim 32, wherein the signal source detector further comprises a peak detector to determine a direction of each of the sound signals.

35.
A microphone array input type speech recognition method of receiving sound signals and using spatial filtering to acquire a high-quality speech signal for recognizing speech, the method comprising: obtaining a spatial covariance matrix for each frequency component of the received sound signals, using spatial smoothing, by which spatial covariance matrices for a plurality of virtual sub-arrays, which are configured in the microphone array, are obtained with respect to each frequency component of the sound signals and an average spatial covariance matrix is calculated; detecting a source location of each of the sound signals using the average spatial covariance matrices; and calculating a weight matrix to be multiplied by each frequency component using the detected source location of each of the sound signals in order to compensate for distortion due to noise and an echo of a sound signal, wherein the spatial smoothing is performed according to an equation where "p" indicates a number of the virtual sub-arrays, xk(i) indicates a vector of an i-th sub-array microphone input signal, "k" indicates a k-th frequency component in a narrowband, and Rk indicates an average spatial covariance matrix.

36. The microphone array input type speech recognition method of claim 35, further comprising restoring each of the sound signals using the distortion compensated frequency components.

37. The microphone array input type speech recognition method of claim 36, further comprising obtaining a speech recognition result by comparing a feature of each of the restored sound signals with a plurality of reference patterns to determine a sound most similar to the restored sound signal.

38.
The microphone array input type speech recognition method of claim 37, wherein the speech recognition module further comprises: extracting a feature vector of each of the restored sound signals; storing the reference patterns for a plurality of sounds; comparing the extracted feature vector with the reference patterns stored to search for a sound similar to the restored sound signal, wherein the reference pattern with a highest correlation value exceeding a predetermined value is recognized as the sound signal.

39. The microphone array input type speech recognition method of claim 35, further comprising splitting each of the sound signals received into a plurality of narrowband frequency signals.

40. The microphone array input type speech recognition method of claim 39, further comprising receiving each of the sound signals through an array of M microphones and a plurality of virtual sub-arrays of L microphones.

41. The microphone array input type speech recognition method of claim 40, further comprising using a discrete Fourier transform to split each of the sound signals into narrowband frequency signals.

42. The microphone array input type speech recognition method of claim 39, wherein the detecting the source location of each of the sound signals comprises: splitting each of the sound signals received into the frequency components of each of the split sound signals; and performing a multiple signal classification algorithm with respect to only frequency components selected according to a predetermined reference from among the split frequency components, thereby determining the source location of each of the sound signals.

43.
The microphone array input type speech recognition method of claim 42, wherein the detecting the source location of each of the sound signals further comprises: splitting each of the sound signals received into the frequency components of each of the split sound signals; grouping each of the sound signals having the same frequency component, thereby generating a plurality of groups for the respective frequency components to measure a speech presence probability in each group; selecting a predetermined number of groups in descending order of speech presence probability from among the plurality of groups; and performing the multiple signal classification algorithm with respect to frequency components corresponding to the respective selected groups.

44. The microphone array input type speech recognition method of claim 35, further comprising restoring each of the sound signals using the distortion compensated frequency components.

45. The microphone array input type speech recognition method of claim 35, wherein the restoring is calculated using a discrete Fourier transform.

46. The microphone array input type speech recognition method of claim 35, wherein the incidence angle θ1 of each of the sound signals is calculated using the Rk and a multiple signal classification algorithm, and the calculated incidence angle is applied to calculate a weight to be multiplied by each frequency component of each of the sound signals.

47. The microphone array input type speech recognition method of claim 35, further comprising selecting frequency bins likely to include a speech signal according to a predetermined reference such that the multiple signal classification algorithm is performed with respect to only frequency components corresponding to the respective selected frequency bins.

48. The microphone array input type speech recognition method of claim 47, further comprising performing a fast Fourier transform on each of the input sound signals.

49.
The microphone array input type speech recognition method of claim 47, further comprising detecting a peak of each of the sound signals to determine a direction of each of the sound signals.
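Claims 3, 7, 12, 17, and 30 calculate the incidence angle θ1 from the averaged matrix Rk with the MUSIC (multiple signal classification) algorithm. A minimal narrowband MUSIC sketch for a uniform linear array follows; the microphone spacing d, speed of sound c, single-source assumption, and one-degree search grid are illustrative assumptions, not values from the patent.

```python
import numpy as np

def music_incidence_angle(R_k, freq, d=0.05, c=343.0, n_sources=1,
                          angles=np.linspace(-90.0, 90.0, 181)):
    """Estimate the incidence angle (degrees) of a narrowband source at
    frequency `freq` from the averaged spatial covariance matrix R_k."""
    L = R_k.shape[0]
    # eigh returns eigenvalues ascending: the noise subspace spans the
    # eigenvectors of the smallest L - n_sources eigenvalues.
    _, V = np.linalg.eigh(R_k)
    En = V[:, :L - n_sources]
    spectrum = []
    for theta in angles:
        tau = d * np.sin(np.deg2rad(theta)) / c               # inter-mic delay
        a = np.exp(-2j * np.pi * freq * tau * np.arange(L))   # steering vector
        denom = np.linalg.norm(En.conj().T @ a) ** 2
        spectrum.append(1.0 / max(denom, 1e-12))              # MUSIC spectrum
    return angles[int(np.argmax(spectrum))]                   # peak location
```

The peak of the MUSIC pseudo-spectrum marks the direction whose steering vector is most nearly orthogonal to the noise subspace, which is why the spatially smoothed (de-correlated) Rk is computed first.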
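Claims 20 and 31 describe the signal distortion compensator as a wideband minimum variance unit that derives a per-frequency weight from the detected source direction. A per-bin minimum-variance (MVDR-style) weight can be sketched as below; the diagonal loading term and the steering-vector interface are assumptions for numerical robustness, not claim language.

```python
import numpy as np

def mvdr_weight(R_k, a_k, loading=1e-3):
    """Minimum-variance weight for one frequency bin: passes the look
    direction a_k undistorted (w^H a = 1) while minimizing the output
    power measured by the covariance R_k."""
    L = R_k.shape[0]
    # Diagonal loading scaled by the average eigenvalue (an assumption).
    R = R_k + loading * np.trace(R_k).real / L * np.eye(L)
    Ri_a = np.linalg.solve(R, a_k)          # R^{-1} a without explicit inverse
    return Ri_a / (a_k.conj() @ Ri_a)       # normalize the look direction
```

Multiplying each narrowband component by its weight, then inverse-transforming, yields the restored wideband signal the claims describe.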
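Claims 4-5, 32, and 47 restrict MUSIC to frequency bins selected by speech presence probability, in descending order. The patent does not reproduce its probability measure here, so the sketch below uses a-posteriori SNR (bin power over an estimated noise floor) as a hypothetical stand-in ranking statistic.

```python
import numpy as np

def select_speech_bins(bin_power, noise_floor, n_select):
    """Return the indices of the n_select frequency bins most likely to
    contain speech, ranked by a hypothetical speech-presence proxy
    (bin power divided by an estimated noise floor)."""
    snr = np.asarray(bin_power) / np.maximum(noise_floor, 1e-12)
    order = np.argsort(snr)[::-1]        # descending "speech presence"
    return np.sort(order[:n_select])     # selected bin indices, ascending
```

Running the subspace search only on these bins keeps the per-frame cost proportional to the selected bin count rather than the full DFT size, which is the point of the group selector in claim 5.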