IPC Classification Information
Country / Type | United States (US) Patent, Granted
International Patent Classification (IPC 7th ed.) |
Application No. | UP-0836207 (2004-05-03)
Registration No. | US-7567678 (2009-08-05)
Priority | KR-10-2003-0028340 (2003-05-02); KR-10-2004-0013029 (2004-02-26)
Inventors / Address |
- Kong, Dong geon
- Choi, Chang kyu
- Bang, Seok won
- Lee, Bon young
Applicant / Address |
- Samsung Electronics Co., Ltd.
Agent / Address |
Citation Info | Cited by: 10; Citing patents: 6
Abstract
A microphone array system including an input unit to receive sound signals using a plurality of microphones; a frequency splitter splitting each sound signal received into a plurality of narrowband signals; an average spatial covariance matrix estimator using spatial smoothing to obtain a spatial covariance matrix for each frequency component of the sound signal, by which spatial covariance matrices for a plurality of virtual sub-arrays, which are configured in the plurality of microphones, are obtained with respect to each frequency component of the sound signal and an average spatial covariance matrix is calculated; a signal source location detector to detect an incidence angle of the sound signal according to the average spatial covariance matrix calculated; a signal distortion compensator to calculate a weight for each frequency component of the sound signal based on the incidence angle of the sound signal and multiply the calculated weight by each frequency component.
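The averaging step the abstract describes (spatial smoothing over virtual sub-arrays of the microphone array) can be sketched as below. The equation image itself did not survive extraction, so this code assumes the standard forward spatial smoothing average, Rk = (1/p) Σᵢ xk(i) xk(i)ᴴ, using the symbols the claims define (p virtual sub-arrays, xk(i) the i-th sub-array snapshot at frequency bin k); function name and array shapes are illustrative, not the patent's exact formulation.

```python
import numpy as np

def average_spatial_covariance(x_k, L):
    """Average spatial covariance matrix Rk for one narrowband frequency
    component k, via forward spatial smoothing over virtual sub-arrays.

    x_k : (M, T) complex array -- snapshots of the M-microphone signal
          at frequency bin k (one column per DFT frame).
    L   : size of each virtual sub-array (L <= M).
    """
    M, T = x_k.shape
    p = M - L + 1                      # number of virtual sub-arrays
    R = np.zeros((L, L), dtype=complex)
    for i in range(p):
        sub = x_k[i:i + L, :]          # i-th sub-array snapshot vectors
        R += sub @ sub.conj().T / T    # sample covariance of sub-array i
    return R / p                       # average over the p sub-arrays
```

Smoothing trades aperture (L < M) for de-correlation of coherent sources such as echoes, which is what lets the subspace method below work on reverberant speech.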
Representative Claims
What is claimed is:

1. A microphone array system comprising: an input unit to receive sound signals using a plurality of microphones; a frequency splitter to split each sound signal received through the input unit into a plurality of narrowband signals; an average spatial covariance matrix estimator which uses spatial smoothing to obtain a spatial covariance matrix for each frequency component of the sound signal, by which spatial covariance matrices for a plurality of virtual sub-arrays, which are configured in the plurality of microphones, are obtained with respect to each frequency component of the sound signal processed by the frequency splitter and an average spatial covariance matrix is calculated; a signal source location detector to detect an incidence angle of the sound signal according to the average spatial covariance matrix calculated using the spatial smoothing; a signal distortion compensator to calculate a weight for each frequency component of the sound signal based on the incidence angle of the sound signal and multiply the calculated weight by each frequency component, thereby compensating for distortion of each frequency component; and a signal restoring unit to restore a sound signal using the distortion compensated frequency components, wherein the spatial smoothing is performed according to an equation where "p" indicates a number of the virtual sub-arrays, xk(i) indicates a vector of an i-th sub-array microphone input signal, "k" indicates a k-th frequency component in a narrowband, and Rk indicates an average spatial covariance matrix.

2. The microphone array system of claim 1, wherein the frequency splitter uses discrete Fourier transform to split each sound signal into the plurality of narrowband signals, and the signal restoring unit uses inverse discrete Fourier transform to restore the sound signal.

3.
The microphone array system of claim 1, wherein the incidence angle θ1 of the sound signal is calculated using the Rk and a multiple signal classification (MUSIC) algorithm, and the calculated incidence angle is applied to calculate a weight to be multiplied by each frequency component of the sound signal.

4. The microphone array system of claim 1, wherein the signal source location detector splits each sound signal received from the input unit into the frequency components, into which the frequency splitter splits the sound signal, and applies a multiple signal classification algorithm only to frequency components selected according to a predetermined reference from among the split frequency components, thereby determining the incidence angle of the sound signal.

5. The microphone array system of claim 4, wherein the signal source location detector comprises: a speech signal detector to split each sound signal received from the input unit into the frequency components, into which the frequency splitter further splits the sound signal, to group the sound signals having the same frequency component, thereby generating a plurality of groups for the respective frequency components, and to measure a speech presence probability in each group; a group selector to select a predetermined number of groups in descending order of speech presence probability from among the plurality of groups; and an arithmetic unit to perform the multiple signal classification algorithm with respect to frequency components corresponding to the respective selected groups.

6.
A speech recognition system comprising: a microphone array system; a feature extractor to extract a feature of a sound signal received from the microphone array system; a reference pattern storage unit to store reference patterns to be compared with the extracted feature; a comparator to compare the extracted feature with the reference patterns stored in the reference pattern storage unit; and a determiner to determine whether a speech is recognized based on the compared result, wherein the microphone array system comprises: an input unit to receive sound signals using a plurality of microphones; a frequency splitter to split each sound signal received through the input unit into a plurality of narrowband signals; an average spatial covariance matrix estimator which uses spatial smoothing to obtain a spatial covariance matrix for each frequency component of the sound signal, by which spatial covariance matrices for a plurality of virtual sub-arrays, which are configured in the plurality of microphones, are obtained with respect to each frequency component of the sound signal processed by the frequency splitter and then an average spatial covariance matrix is calculated; a signal source location detector to detect an incidence angle of the sound signal according to the average spatial covariance matrix calculated using the spatial smoothing; a signal distortion compensator to calculate a weight for each frequency component of the sound signal based on the incidence angle of the sound signal and multiply the calculated weight by each frequency component, thereby compensating for distortion of each frequency component; and a signal restoring unit to restore a sound signal using the distortion compensated frequency components, wherein the spatial smoothing is performed according to an equation where "p" indicates a number of the virtual sub-arrays, xk(i) indicates a vector of an i-th sub-array microphone input signal, "k" indicates a k-th frequency component in a 
narrowband, and Rk indicates an average spatial covariance matrix.

7. The speech recognition system of claim 6, wherein the incidence angle θ1 of the sound signal is calculated using the Rk and a multiple signal classification (MUSIC) algorithm, and the calculated incidence angle is applied to calculate a weight to be multiplied by each frequency component of the sound signal.

8. The speech recognition system of claim 6, wherein the signal source location detector splits each sound signal received from the input unit into the frequency components, into which the frequency splitter splits the sound signal, and applies a multiple signal classification algorithm only to frequency components selected according to a predetermined reference from among the split frequency components, thereby determining the incidence angle of the sound signal.

9. The speech recognition system of claim 8, wherein the signal source location detector comprises: a speech signal detector to split each sound signal received from the input unit into the frequency components, into which the frequency splitter further splits the sound signal, to group the sound signals having the same frequency component, thereby generating a plurality of groups for the respective frequency components, and to measure a speech presence probability in each group; a group selector to select a predetermined number of groups in descending order of speech presence probability from among the plurality of groups; and an arithmetic unit to perform the multiple signal classification algorithm with respect to frequency components corresponding to the respective selected groups.

10.
A microphone array method comprising: receiving a plurality of wideband sound signals from an array having a plurality of microphones; splitting each wideband sound signal into a plurality of narrowbands; obtaining spatial covariance matrices for a plurality of virtual sub-arrays, which include a plurality of microphones constituting the array of the plurality of microphones, with respect to each narrowband using a predetermined scheme and averaging the obtained spatial covariance matrices, thereby obtaining an average spatial covariance matrix for each narrowband; calculating an incidence angle of each wideband sound signal using the average spatial covariance matrix for each narrowband and a predetermined algorithm; calculating weights to be respectively multiplied with the narrowbands according to the incidence angle of the wideband sound signal and multiplying the weights by the respective narrowbands; and restoring a wideband sound signal using the narrowbands after being multiplied by the weights respectively, wherein the obtaining of the spatial covariance matrices comprises performing the spatial smoothing according to an equation: where "p" indicates a number of the virtual sub-arrays, xk(i) indicates a vector of an i-th sub-array microphone input signal, "k" indicates a k-th frequency component in a narrowband, and Rk indicates an average spatial covariance matrix.

11. The microphone array method of claim 10, wherein the splitting is based on discrete Fourier transform, and the restoring is based on inverse discrete Fourier transform.

12. The microphone array method of claim 10, wherein the calculating of the incidence angle θ1 of the sound signal comprises calculating using the Rk and a multiple signal classification (MUSIC) algorithm, and the calculating and multiplying of the weights comprises applying the calculated incidence angle to calculate a weight to be multiplied by each frequency component of the sound signal.

13.
The microphone array method of claim 10, wherein the calculating of the incidence angle comprises: splitting each sound signal received from the array having the plurality of microphones into the frequency components of the split sound signal; and performing a multiple signal classification algorithm with respect to only frequency components selected according to a predetermined reference from among the split frequency components, thereby determining the incidence angle of the sound signal.

14. The microphone array method of claim 13, wherein the calculating of the incidence angle further comprises: splitting each sound signal received from the array having the plurality of microphones into the frequency components of the split sound signal; grouping the sound signals having the same frequency component, thereby generating a plurality of groups for the respective frequency components to measure a speech presence probability in each group; selecting a predetermined number of groups in descending order of speech presence probability from among the plurality of groups; and performing the multiple signal classification algorithm with respect to frequency components corresponding to the respective selected groups.

15.
A microphone array method comprising: receiving wideband sound signals from an array having a plurality of microphones; splitting each wideband sound signal into a plurality of narrowbands; obtaining spatial covariance matrices for a plurality of virtual sub-arrays, which include a plurality of microphones constituting the array of the plurality of microphones, with respect to each narrowband using a predetermined scheme, and averaging the obtained spatial covariance matrices, thereby obtaining an average spatial covariance matrix for each narrowband; calculating an incidence angle of each wideband sound signal using the average spatial covariance matrix for each narrowband and a predetermined algorithm; calculating weights to be respectively multiplied with the narrowbands based on the incidence angle of the wideband sound signal and multiplying the weights by the respective narrowbands; restoring a wideband sound signal using the narrowbands after being multiplied by the weights respectively; extracting a feature of a sound signal received from the microphone array system; storing reference patterns to be compared with the extracted feature; comparing the extracted feature with the reference patterns stored; and determining based on a comparison result whether a speech is recognized, wherein the obtaining of the spatial covariance matrices comprises performing the spatial smoothing according to an equation: where "p" indicates a number of the virtual sub-arrays, xk(i) indicates a vector of an i-th sub-array microphone input signal, "k" indicates a k-th frequency component in a narrowband, and Rk indicates an average spatial covariance matrix.

16. The microphone array method of claim 15, wherein the splitting is based on discrete Fourier transform, and the restoring is based on inverse discrete Fourier transform.

17.
The microphone array method of claim 15, wherein the calculating of the incidence angle θ1 of the sound signal comprises calculating using the Rk and a multiple signal classification (MUSIC) algorithm, and the calculating and multiplying of the weights comprises applying the calculated incidence angle to calculate a weight to be multiplied by each frequency component of the sound signal.

18. The microphone array method of claim 15, wherein the calculating step of the incidence angle comprises: splitting each sound signal received from the array having the plurality of microphones into the frequency components of the split sound signal; and performing a multiple signal classification algorithm with respect to only frequency components selected according to a predetermined reference from among the split frequency components, thereby determining the incidence angle of the sound signal.

19. The microphone array method of claim 18, wherein the calculating step of the incidence angle further comprises: splitting each sound signal received from the array having the plurality of microphones into the frequency components of the split sound signal; grouping the sound signals having the same frequency component, thereby generating a plurality of groups for the respective frequency components and measuring a speech presence probability in each group; selecting a predetermined number of groups in descending order of speech presence probability from among the plurality of groups; and performing the MUSIC algorithm with respect to frequency components corresponding to the respective selected groups.

20.
A microphone array input type speech recognition system using spatial filtering and having a microphone array to receive sound signals, the system comprising: an average spatial covariance matrix estimator which uses spatial smoothing to produce a spatial covariance matrix for each frequency component of the received sound signals, by which spatial covariance matrices for a plurality of virtual sub-arrays, which are configured in the microphone array, are obtained with respect to each frequency component of the sound signals and an average spatial covariance matrix is calculated; a signal source location detector to detect a source location of each of the sound signals using the average spatial covariance matrices; a signal distortion compensator to calculate a weight matrix to be multiplied by each frequency component using the detected source location of each of the sound signals in order to compensate for distortion due to noise and an echo of a sound signal; and an input unit to receive each of the sound signals, the input unit having an array of M microphones and a plurality of virtual sub-arrays of L microphones, wherein the spatial smoothing is performed according to an equation where "p" indicates a number of the virtual sub-arrays, xk(i) indicates a vector of an i-th sub-array microphone input signal, "k" indicates a k-th frequency component in a narrowband, and Rk indicates an average spatial covariance matrix.

21. The microphone array input type speech recognition system of claim 20, further comprising a signal restoring unit to restore each of the sound signals using the distortion compensated frequency components.

22. The microphone array input type speech recognition system of claim 21, further comprising a speech recognition module to obtain a speech recognition result by comparing a feature of each of the restored sound signals with a plurality of reference patterns to determine a sound most similar to the restored sound signal.

23.
The microphone array input type speech recognition system of claim 22, wherein the speech recognition module further comprises: a feature extractor unit to extract a feature vector of each of the restored sound signals; a reference pattern storage unit to store the reference patterns for a plurality of sounds; and a determination unit to compare the extracted feature vector with the reference patterns stored to search for a sound similar to the restored sound signal, wherein the reference pattern with a highest correlation value exceeding a predetermined value is recognized as the sound signal.

24. The microphone array input type speech recognition system of claim 20, further comprising a frequency splitter to split each of the sound signals received through the input unit into a plurality of narrowband frequency signals.

25. The microphone array input type speech recognition system of claim 20, wherein the frequency splitter uses a discrete Fourier transform to split each of the sound signals received into narrowband frequency signals.

26. The microphone array input type speech recognition system of claim 25, wherein the signal source location detector splits each of the sound signals received from the input unit into the frequency components, into which the frequency splitter splits each of the sound signals, and applies a multiple signal classification algorithm only to frequency components selected according to a predetermined reference from among the split frequency components, thereby determining the location of each of the sound signals.

27. The microphone array input type speech recognition system of claim 26, wherein the signal source location detector detects the location of each of the sound signals using a respective incidence angle.

28.
The microphone array input type speech recognition system of claim 20, further comprising a signal restoring unit to restore each of the sound signals using the distortion compensated frequency components from the signal distortion compensator.

29. The microphone array input type speech recognition system of claim 28, wherein the signal restoring unit uses an inverse discrete Fourier transform to restore each of the sound signals.

30. The microphone array input type speech recognition system of claim 20, wherein the incidence angle θ1 of each of the sound signals is calculated using the Rk and a multiple signal classification algorithm, and the calculated incidence angle is applied to calculate a weight to be multiplied by each frequency component of each of the sound signals.

31. The microphone array input type speech recognition system of claim 20, wherein the signal source location detector is a wideband multiple signal classification unit and the signal distortion compensator is a wideband minimum variance unit.

32. The microphone array input type speech recognition system of claim 20, further comprising a frequency bin selector to select frequency bins likely to include a speech signal according to a predetermined reference such that the signal source location detector performs the multiple signal classification algorithm with respect to only frequency components corresponding to the respective selected frequency bins.

33. The microphone array input type speech recognition system of claim 32, further comprising a discrete Fourier transformer to perform a fast Fourier transform on each of the input sound signals.

34. The microphone array input type speech recognition system of claim 32, wherein the signal source detector further comprises a peak detector to determine a direction of each of the sound signals.

35.
A microphone array input type speech recognition method of receiving sound signals and using spatial filtering to acquire a high-quality speech signal for recognizing speech, the method comprising: obtaining a spatial covariance matrix for each frequency component of the received sound signals, using spatial smoothing, by which spatial covariance matrices for a plurality of virtual sub-arrays, which are configured in the microphone array, are obtained with respect to each frequency component of the sound signals and an average spatial covariance matrix is calculated; detecting a source location of each of the sound signals using the average spatial covariance matrices; and calculating a weight matrix to be multiplied by each frequency component using the detected source location of each of the sound signals in order to compensate for distortion due to noise and an echo of a sound signal, wherein the spatial smoothing is performed according to an equation where "p" indicates a number of the virtual sub-arrays, xk(i) indicates a vector of an i-th sub-array microphone input signal, "k" indicates a k-th frequency component in a narrowband, and Rk indicates an average spatial covariance matrix.

36. The microphone array input type speech recognition method of claim 35, further comprising restoring each of the sound signals using the distortion compensated frequency components.

37. The microphone array input type speech recognition method of claim 36, further comprising obtaining a speech recognition result by comparing a feature of each of the restored sound signals with a plurality of reference patterns to determine a sound most similar to the restored sound signal.

38.
The microphone array input type speech recognition method of claim 37, wherein the speech recognition module further comprises: extracting a feature vector of each of the restored sound signals; storing the reference patterns for a plurality of sounds; comparing the extracted feature vector with the reference patterns stored to search for a sound similar to the restored sound signal, wherein the reference pattern with a highest correlation value exceeding a predetermined value is recognized as the sound signal.

39. The microphone array input type speech recognition method of claim 35, further comprising splitting each of the sound signals received into a plurality of narrowband frequency signals.

40. The microphone array input type speech recognition method of claim 39, further comprising receiving each of the sound signals through an array of M microphones and a plurality of virtual sub-arrays of L microphones.

41. The microphone array input type speech recognition method of claim 40, further comprising using a discrete Fourier transform to split each of the sound signals into narrowband frequency signals.

42. The microphone array input type speech recognition method of claim 39, wherein the detecting the source location of each of the sound signals comprises: splitting each of the sound signals received into the frequency components of each of the split sound signals; and performing a multiple signal classification algorithm with respect to only frequency components selected according to a predetermined reference from among the split frequency components, thereby determining the source location of each of the sound signals.

43.
The microphone array input type speech recognition method of claim 42, wherein the detecting the source location of each of the sound signals further comprises: splitting each of the sound signals received into the frequency components of each of the split sound signals; grouping each of the sound signals having the same frequency component, thereby generating a plurality of groups for the respective frequency components to measure a speech presence probability in each group; selecting a predetermined number of groups in descending order of speech presence probability from among the plurality of groups; and performing the multiple signal classification algorithm with respect to frequency components corresponding to the respective selected groups.

44. The microphone array input type speech recognition method of claim 35, further comprising restoring each of the sound signals using the distortion compensated frequency components.

45. The microphone array input type speech recognition method of claim 35, wherein the restoring is calculated using a discrete Fourier transform.

46. The microphone array input type speech recognition method of claim 35, wherein the incidence angle θ1 of each of the sound signals is calculated using the Rk and a multiple signal classification algorithm, and the calculated incidence angle is applied to calculate a weight to be multiplied by each frequency component of each of the sound signals.

47. The microphone array input type speech recognition method of claim 35, further comprising selecting frequency bins likely to include a speech signal according to a predetermined reference such that the multiple signal classification algorithm is performed with respect to only frequency components corresponding to the respective selected frequency bins.

48. The microphone array input type speech recognition method of claim 47, further comprising performing a fast Fourier transform on each of the input sound signals.

49.
The microphone array input type speech recognition method of claim 47, further comprising detecting a peak of each of the sound signals to determine a direction of each of the sound signals.
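Claims 3, 7, 12, 17, and 30 calculate the incidence angle θ1 from the averaged matrix Rk with the MUSIC (multiple signal classification) algorithm. A minimal narrowband MUSIC sketch for a uniform linear array follows; the microphone spacing d, speed of sound c, single-source assumption, and one-degree search grid are illustrative assumptions, not values from the patent.

```python
import numpy as np

def music_incidence_angle(R_k, freq, d=0.05, c=343.0, n_sources=1,
                          angles=np.linspace(-90.0, 90.0, 181)):
    """Estimate the incidence angle (degrees) of a narrowband source at
    frequency `freq` from the averaged spatial covariance matrix R_k."""
    L = R_k.shape[0]
    # eigh returns eigenvalues ascending: the noise subspace spans the
    # eigenvectors of the smallest L - n_sources eigenvalues.
    _, V = np.linalg.eigh(R_k)
    En = V[:, :L - n_sources]
    spectrum = []
    for theta in angles:
        tau = d * np.sin(np.deg2rad(theta)) / c               # inter-mic delay
        a = np.exp(-2j * np.pi * freq * tau * np.arange(L))   # steering vector
        denom = np.linalg.norm(En.conj().T @ a) ** 2
        spectrum.append(1.0 / max(denom, 1e-12))              # MUSIC spectrum
    return angles[int(np.argmax(spectrum))]                   # peak location
```

The peak of the MUSIC pseudo-spectrum marks the direction whose steering vector is most nearly orthogonal to the noise subspace, which is why the spatially smoothed (de-correlated) Rk is computed first.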
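Claims 20 and 31 describe the signal distortion compensator as a wideband minimum variance unit that derives a per-frequency weight from the detected source direction. A per-bin minimum-variance (MVDR-style) weight can be sketched as below; the diagonal loading term and the steering-vector interface are assumptions for numerical robustness, not claim language.

```python
import numpy as np

def mvdr_weight(R_k, a_k, loading=1e-3):
    """Minimum-variance weight for one frequency bin: passes the look
    direction a_k undistorted (w^H a = 1) while minimizing the output
    power measured by the covariance R_k."""
    L = R_k.shape[0]
    # Diagonal loading scaled by the average eigenvalue (an assumption).
    R = R_k + loading * np.trace(R_k).real / L * np.eye(L)
    Ri_a = np.linalg.solve(R, a_k)          # R^{-1} a without explicit inverse
    return Ri_a / (a_k.conj() @ Ri_a)       # normalize the look direction
```

Multiplying each narrowband component by its weight, then inverse-transforming, yields the restored wideband signal the claims describe.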
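Claims 4-5, 32, and 47 restrict MUSIC to frequency bins selected by speech presence probability, in descending order. The patent does not reproduce its probability measure here, so the sketch below uses a-posteriori SNR (bin power over an estimated noise floor) as a hypothetical stand-in ranking statistic.

```python
import numpy as np

def select_speech_bins(bin_power, noise_floor, n_select):
    """Return the indices of the n_select frequency bins most likely to
    contain speech, ranked by a hypothetical speech-presence proxy
    (bin power divided by an estimated noise floor)."""
    snr = np.asarray(bin_power) / np.maximum(noise_floor, 1e-12)
    order = np.argsort(snr)[::-1]        # descending "speech presence"
    return np.sort(order[:n_select])     # selected bin indices, ascending
```

Running the subspace search only on these bins keeps the per-frame cost proportional to the selected bin count rather than the full DFT size, which is the point of the group selector in claim 5.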