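IPC Classification Information
Country / Type | United States (US) Patent, Registered |
International Patent Classification (IPC 7th edition) | |
Application Number | US-0734716 (2000-12-13) |
Priority Information | JP-0354182 (1999-12-14) |
Inventor / Address | |
Applicant / Address | Matsushita Electric Industrial Co., Ltd. |
Agent / Address | |
Citation Information | Times cited: 27; Cited patents: 7 |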
Abstract
A method and apparatus enabling information including respective angular directions to be obtained for one or more sound sources includes a sound source direction estimation section for frequency-domain and time-domain processing of sets of output signals from a microphone array to derive successive estimated angular directions of each of the sound sources. The estimated directions can be utilized by a passage detection section to detect when a sound source is currently moving past the microphone array and the direction of the sound source at the time point when such passage detection is achieved, and a motion velocity detection section which is triggered by such passage detection to calculate the velocity of the passing sound source by using successively obtained estimated directions. In addition it becomes possible to produce directivity of the microphone array, oriented along the direction of a sound source which is moving past the microphone array, enabling accurate monitoring of sound levels of respective sound sources.
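The velocity step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the microphone array sits at a known perpendicular distance from a straight motion path and that the source is close to the broadside direction, so that linear speed is approximately angular velocity times distance. The function name `linear_velocity` and its parameters are hypothetical and are not taken from the patent.

```python
import math


def linear_velocity(directions_deg, window_period_s, distance_m):
    """Approximate linear speed (m/s) from successive estimated directions.

    Assumption: the source is near broadside, so v ~= omega * distance.
    The sign of the result follows the sign of the angular change.
    """
    if len(directions_deg) < 2:
        raise ValueError("need at least two estimated directions")
    # Time spanned by the first and last estimates (one estimate per window).
    elapsed_s = (len(directions_deg) - 1) * window_period_s
    angle_change_rad = math.radians(directions_deg[-1] - directions_deg[0])
    angular_velocity = angle_change_rad / elapsed_s  # rad/s
    return angular_velocity * distance_m


# Example: five estimates, one every 0.1 s, array 10 m from the path.
speed = linear_velocity([-5.0, -2.5, 0.0, 2.5, 5.0], 0.1, 10.0)
```

Far from broadside the small-angle approximation degrades, which is one reason to evaluate the velocity around the moment of passage detection.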
Representative Claims
1. A method of estimating a direction of a sound source, as an angular value in relation to a fixed position, comprising steps of:in a fixed-length time window, operating on respective microphone output signals resulting from reception of sound emitted from said sound source during said time window, said microphone output signals produced from an array of M microphones, where M is a plural integer, to thereby extract from each of said microphone output signals a time-axis signal portion and thereby obtain a set of M audio signal portions with said set corresponding to said time window; applying frequency analysis to separate each said signal portion into a plurality of components corresponding to respectively different ones of a fixed set of frequencies; for each frequency of said fixed set, processing said components to obtain data expressing a frequency-based direction of a sound source with respect to a position in said microphone array, and calculating an average of respective frequency-based directions obtained for all frequencies of said fixed set, to thereby obtain an estimated direction corresponding to one time window; and successively repeating said succession of steps for each of a plurality of time windows that are of respectively identical time duration, to obtain a plurality of estimated directions respectively corresponding to said plurality of time windows. 2. The method according to claim 1, further comprising a step of:for each of said time windows, calculating an average direction as an average of an estimated direction corresponding to said each time window and respective estimated directions corresponding to a fixed plurality of time windows which directly precede said each time window, and outputting said average direction as a finally obtained estimated direction corresponding to said each time window. 3. 
The method according to claim 1, wherein said processing applied for each frequency of said set of frequencies comprises deriving a plurality of values of received signal power with said values corresponding to respectively different directions in relation to said position in the microphone array, and finding a one of said directions for which said received signal power has a maximum value, and wherein said method further comprises a step of:judging said direction for which said signal power has a maximum value, to determine whether said direction is within a predetermined range, and when said direction is found to be outside said range, excluding said direction from calculations performed to obtain said estimated direction of said sound source. 4. The method according to claim 1, further comprising a step of:judging when a sound source has passed through a specific direction, by comparing said successive estimated directions obtained for said sound source with a predetermined passage detection range of directions, and generating data expressing a passage detection result when said sound source is found to have passed through said specific direction. 5. The method according to claim 4, wherein said judgement step is based upon:detecting a number of times for which estimated directions obtained for said sound source are within said passage detection range of directions; and, determining that said sound source has passed through at least an initial direction of said passage detection range of directions when it is found that said number of times attains a predetermined threshold number within a fixed time interval which commences after said sound source has entered said passage detection range of directions. 6. 
The method according to claim 5, wherein said judgement step is performed by successive steps of:detecting an initial time window as a time window at which an estimated direction obtained for said sound source is within a predetermined initial part of said passage detection range of directions; thereafter, while obtaining successive count values of said time windows, obtaining successive count values of occurrences of said estimated directions obtained for said sound source being within said passage detection range of directions and comparing each said occurrence count value with said threshold number; when said occurrence count values are found to attain said threshold number before said time window count values attain a predetermined maximum count value, generating output data as a passage detection result, to indicate that said sound source has passed through at least said initial part of said passage detection range of directions. 7. The method according to claim 4, further comprising a step of initiating recording of a microphone output signal from at least one of said microphones when a sound source is detected as having passed through said specific directions as indicated by generation of a passage detection result.8. The method according to claim 7, wherein a time-axis portion of said microphone output signal which commenced prior to the time at which said sound source passed through said specific direction is recorded.9. The method according to claim 8, comprising steps of:temporarily storing each of successively obtained sets of audio data derived from an audio output signal of at least one of said microphones; and, when a passage detection result is generated, reading out a currently stored one of said sets of audio data and recording said set of audio data. 10. 
The method according to claim 4, further comprising steps of:judging when a sound source has passed through a specific direction, by comparing said successive estimated directions obtained for said sound source with a predetermined passage detection range of directions, and generating data expressing a passage detection result when said sound source is found to have passed through said specific direction; and when said passage detection result is generated, judging a direction of motion of said sound source, based upon successively obtained estimated directions obtained for said sound source. 11. The method according to claim 10, wherein said judgement of direction is performed based upon a difference between an estimated direction obtained prior to a time of generating said passage detection result and an estimated direction obtained at or subsequent to said time of generating the passage detection result.12. The method according to claim 11, wherein said step of judging direction comprises:temporarily registering each of successively obtained sets of said estimated directions in a buffer; when a passage detection result is generated, reading out from said buffer a first estimated direction which was obtained at a point in time preceding a time of generating said passage detection result; calculating the sign of the difference between said first estimated direction and an estimated direction obtained subsequent to said first estimated direction, with said direction of motion being indicated by said sign. 13. The method according to claim 10, wherein said judgement of direction is performed based upon a difference between an estimated direction obtained at a time of generating said passage detection result and an estimated direction obtained subsequent to said time of generating the passage detection result.14. 
The method according to claim 13, wherein said step of judging direction comprises:when a passage detection result is generated, temporarily registering a first estimated direction, which is obtained at that time; and, after a predetermined number of said time windows have elapsed following generation of said passage detection result, calculating the sign of a difference between said first estimated direction and a currently obtained one of said estimated directions, with said direction of motion being indicated by said sign. 15. The method according to claim 1, further comprising a step of judging whether a sound source is stationary, based upon successively obtained ones of said estimated directions of said sound source.16. The method according to claim 15, wherein said step of judging whether a sound source is stationary comprises calculating the variance of said successively obtained estimated directions of said sound source within each of respective fixed observation intervals, and judging that the sound source is stationary if said variance is found to be lower than a predetermined threshold value.17. The method according to claim 16, further comprising:calculating an average of said estimated directions within each of said observation intervals; and judging that the sound source is stationary if said variance is found to be lower than a predetermined threshold value and also said average direction is within a predetermined range of directions. 18. 
The method according to claim 1 wherein said microphone array is disposed at a known distance from a motion path of said sound source, further comprising steps of:judging when a sound source has passed through a specific direction, by comparing said successive estimated directions obtained for said sound source with a predetermined passage detection range of directions, and generating data expressing a passage detection result when said sound source is found to have passed through said specific direction; when said passage detection result is generated, judging the linear velocity of said sound source based upon successively obtained estimated directions obtained for said sound source. 19. The method according to claim 18, wherein said step of judgement of linear velocity comprises:measuring an amount of time required for successive estimated directions obtained for said sound source to change by a predetermined angular amount; calculating the angular velocity of said sound source based on said amount of time and said predetermined angular amount; and calculating an approximate value of linear velocity of said sound source based on said angular velocity and said known distance of said microphone array from said motion path. 20. The method according to claim 19, wherein said amount of time is measured from a time point preceding the generation of said passage detection result up to the time point at which said passage detection result is generated.21. The method according to claim 19, wherein said amount of time is measured from the time point at which said passage detection result is generated up to a subsequent time point.22. The method according to claim 19, wherein said amount of time is measured from a time point preceding the generation of said passage detection result up to a time point subsequent to the time point at which said passage detection result is generated.23. 
The method according to claim 18, wherein said step of judgement of linear velocity comprises:measuring an amount of change of successive estimated directions obtained for said sound source, expressed as an angular amount, which occurs within a predetermined time interval; calculating the angular velocity of said sound source based on the duration of said predetermined time interval and said angular amount; and calculating an approximate value of linear velocity of said sound source based on said angular velocity and said known distance of said microphone array from said motion path. 24. The method according to claim 23, wherein said amount of change of estimated directions is measured from an estimated direction obtained prior to the time point at which said passage detection result is generated up to an estimated direction obtained at the time point at which said passage detection result is generated.25. The method according to claim 23, wherein said amount of change of estimated directions is measured from an estimated direction obtained at the time point when said passage detection result is generated up to an estimated direction obtained at a time point subsequent to that at which said passage detection result is generated.26. The method according to claim 23, wherein said amount of change of estimated directions is measured from an estimated direction obtained prior to the time point at which said passage detection result is generated up to an estimated direction obtained subsequent to the time point at which said passage detection result is generated.27. The method according to claim 1, further comprising a step of utilizing said estimated directions obtained for a sound source to orient a directivity of said microphone array along a current direction of said sound source.28. 
The method according to claim 27, wherein a single directivity of said microphone array is oriented along said current direction of said sound source by applying specific degrees of phase shift processing to respective output signals produced from said microphones and summing resultant phase-shifted signals.29. The method according to claim 27, comprising steps of:judging when a sound source has passed through a specific direction, based on said successive estimated directions obtained for said sound source, and generating data expressing a passage detection result when said sound source is found to have passed through said specific direction; orienting said microphone array directivity along a specific one of said estimated directions, said specific estimated direction being obtained at a time point substantially close to a time point at which said passage detection result is generated; and obtaining a monitoring signal expressing a sound being emitted from said sound source, as a combination of said microphone output signals with said directivity applied. 30. The method according to claim 1, further comprising steps of:establishing a plurality of fixedly predetermined directivities for said microphone array; judging when a sound source has passed through a specific direction, based on said successive estimated directions obtained for said sound source, and generating data expressing a passage detection result when said sound source is found to have passed through said specific direction; when said passage detection result is obtained for said sound source, selecting one of said plurality of directivities based upon an estimated direction obtained for said sound source at a time point substantially close to a time point at which said passage detection result is generated; and, obtaining a monitoring signal expressing a sound being emitted from said sound source, as a combination of said microphone output signals with said selected one of the directivities applied. 
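As a rough illustration of the per-window procedure in claims 1 and 3, the sketch below extracts frequency components from each microphone signal, scans candidate directions for the maximum received power at each frequency, discards out-of-range results, and averages what remains. The claims do not specify how the power values are derived; the delay-and-sum scan, the uniform linear array geometry, the speed of sound, the one-degree candidate grid, and all function names here are assumptions made for illustration only.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not from the claims


def dft_bin(samples, k):
    """Return the k-th DFT coefficient of one time window."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(samples))


def direction_at_frequency(coeffs, freq_hz, spacing_m, candidates_deg):
    """Scan candidate angles; return the one with maximum steered power."""
    best_angle, best_power = None, -1.0
    for angle in candidates_deg:
        # Inter-microphone delay for a plane wave from this angle
        # (0 degrees = broadside of an assumed uniform linear array).
        delay = spacing_m * math.sin(math.radians(angle)) / SPEED_OF_SOUND
        steered = sum(c * cmath.exp(2j * math.pi * freq_hz * m * delay)
                      for m, c in enumerate(coeffs))
        power = abs(steered) ** 2
        if power > best_power:
            best_angle, best_power = angle, power
    return best_angle


def estimate_direction(window, sample_rate, bins, spacing_m,
                       valid_range=(-80.0, 80.0)):
    """Average the per-frequency directions for one window of M signals.

    Directions outside valid_range are excluded (as in claim 3).
    Returns None when every frequency was excluded.
    """
    n = len(window[0])
    kept = []
    for k in bins:
        coeffs = [dft_bin(signal, k) for signal in window]
        angle = direction_at_frequency(coeffs, k * sample_rate / n,
                                       spacing_m, range(-90, 91))
        if valid_range[0] <= angle <= valid_range[1]:
            kept.append(angle)
    return sum(kept) / len(kept) if kept else None


def simulate_window(angle_deg, n_mics, n_samples, sample_rate, bins,
                    spacing_m):
    """Synthetic plane wave built from tones lying exactly on the DFT bins."""
    delay = spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
    return [[sum(math.cos(2 * math.pi * (k * sample_rate / n_samples)
                          * (i / sample_rate - m * delay)) for k in bins)
             for i in range(n_samples)]
            for m in range(n_mics)]
```

Calling `estimate_direction` once per successive fixed-length window yields the sequence of estimated directions that the later claims operate on. The 5 cm spacing used in testing keeps the highest analysed frequency below the spatial-aliasing limit; a real array would need the same check.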
31. An apparatus for estimating a direction of a sound source, comprising:waveform extraction means (103) for operating during a fixed-length time window on respective microphone output signals produced from an array of M microphones, where M is a plural integer, to extract from each of said microphone output signals a time-axis signal portion within said time window, and thereby obtain a set of M audio signal portions corresponding to said time window; frequency analyzer means (104) for applying frequency analysis to said set of M audio signal portions to separate each said signal portion into a plurality of components corresponding to respectively different ones of a fixed set of frequencies; and processing means (107, 108, 109, 110, 106) for operating on said components corresponding to said set of M audio signal portions to obtain, for each frequency of said fixed set of frequencies, data expressing an estimated direction of said sound source with respect to a position in said microphone array, to thereby obtain an estimated direction of said sound source corresponding to said time window; wherein said apparatus operates during successive plurality of time windows that are of respectively identical duration, to obtain a plurality of estimate directions of said sound source respectively corresponding to said plurality of time windows. 32. The apparatus according to claim 31, further comprising: frequency-based averaging means (114) for obtaining an average of respective estimated directions obtained for said fixed set of frequencies within each of said time windows, to thereby obtain successive frequency-average estimated directions of said sound source corresponding to respective ones of said time windows.33. 
The apparatus according to claim 32, further comprising means for obtaining respective averages of fixed-length sets of said frequency-averaged estimated directions obtained in successive time windows, to thereby obtain successive time-averaged estimated directions of said sound source.34. The apparatus according to claim 31, wherein said processing applied by said processing means for each frequency of said set of frequencies comprises deriving a plurality of values of received signal power with said values corresponding to respectively different directions in relation to said position in the microphone array, and finding a one of said directions for which said received signal power has a maximum value, and wherein said processing means further comprises out-of range value exclusion means (112, 111) for:judging said direction for which said signal power has a maximum value, to determine whether said direction is within a predetermined range, and when said direction is found to be outside said range, excluding said direction from calculations performed to obtain said estimated direction of said sound source. 35. The apparatus according to claim 31, further comprising passage detection means (216) including judgement means for operating on said successive estimated directions obtained for a sound source in relation to a predetermined passage detection range of directions, to generate data expressing a passage detection result when said sound source is found to have passed through a specific direction.36. 
The apparatus according to claim 35, wherein said passage detection means comprises:direction range setting means (211) for specifying said passage detection range of directions; in-range occurrence number calculation means (212) for detecting a number of times for which estimated directions obtained for said sound source are within said passage detection range of directions; and, passage detection judgement means (213) for determining that said sound source has passed through at least an initial direction of said passage detection range of directions when said number of times attains a predetermined threshold number within a fixed time interval which commences after said sound source has entered said passage detection range of directions. 37. The apparatus according to claim 36, wherein said passage detection judgement means (213) comprises means for:detecting an initial time window as a time window at which an estimated direction obtained for said sound source is within a predetermined initial part of said passage detection range of directions; thereafter, while obtaining successive count values of said time windows, obtaining successive count values of occurrences of said estimated directions obtained for said sound source being within said passage detection range of directions and comparing each said occurrence count value with said threshold number; when said occurrence count values are found to attain said threshold number before said time window count values attain a predetermined maximum count value, generating output data as a passage detection result, to indicate that said sound source has passed through at least said initial direction of said passage detection range of directions. 38. 
The apparatus according to claim 35, further comprising means for initiating recording of a microphone output signal from at least one of said microphones when a sound source is detected as having passed through said specific direction, as indicated by generation of a passage detection result.39. The apparatus according to claim 38, comprising:buffer means (307) for temporarily storing each of successively obtained sets of audio data derived from an output signal of at least one of said microphones; data extraction means (308) responsive to generation of a passage detection result for reading out a currently stored one of said sets of audio data; and, recording means (309) for recording said sets of audio data. 40. The apparatus according to claim 31, further comprising means for determining whether a sound source is stationary, based upon successively obtained ones of said estimated directions of said sound source.41. The apparatus according to claim 40, wherein said means for determining whether a sound source is stationary comprises:variance calculating means (406) for calculating the variance of respective sets of said successively obtained estimated directions within each of fixed observation intervals; and, stationary sound source detection means (407) for judging said variances, and for determining that a sound source is stationary when a variance of estimated directions obtained for said sound source is found to be lower than a predetermined threshold value. 42. The apparatus according to claim 41, further comprising moving average calculation means (405) for calculating respective averages of said sets of estimated directions within each of said observation intervals;wherein said stationary sound source detection means (407) judges that said sound source is stationary when said variance is found to be lower than said predetermined threshold value and also said average of the estimated directions is within a predetermined range of directions. 43. 
The apparatus according to claim 31, further comprising:passage detection means (216) including judgement means for operating on said successive estimated directions obtained for a sound source in relation to a predetermined passage detection range of directions, to generate data expressing a passage detection result when said sound source is found to have passed through a specific direction; and, motion direction derivation means (509) responsive to generation of said passage detection result in relation to a sound source for determining a direction of motion of a sound source, based upon successively obtained estimated directions obtained for said sound source. 44. The apparatus according to claim 43, wherein said motion direction derivation means (509) comprises:buffer means (505) for temporarily registering each of successively obtained sets of said estimated directions; prior-to-passage direction derivation means (506) responsive to generation of said passage detection result in relation to a sound source for reading out from said buffer means a one of said estimated directions which had been registered in said buffer means at a point in time preceding a time point of generating said passage detection result, as a first estimated direction; subsequent-to-passage direction derivation means (507) responsive to said generation of a passage detection result in relation to said sound source for selecting a one of said estimated directions which is obtained at a time point identical to or subsequent to a time point at which said passage detection result is generated, as a second estimated direction; and motion direction detection means (508) for calculating the sign of a difference between said first estimated direction and second estimated direction, with said direction of motion being indicated by said sign of the difference. 45. 
The apparatus according to claim 31 wherein said microphone array is disposed at a known distance from a motion path of said sound source, further comprising:passage detection means (216) including judgement means for operating on said successive estimated directions obtained for a sound source in relation to a predetermined passage detection range of directions, to generate data expressing a passage detection result when said sound source is found to have passed through a specific direction; and velocity derivation means (609) responsive to generation of said passage detection result in relation to a sound source for estimating the linear velocity of said sound source, based upon successively obtained estimated directions obtained for said sound source. 46. The apparatus according to claim 45, wherein said velocity derivation means (609) comprises:buffer means(605) for temporarily registering each of successively obtained sets of said estimated directions; angular amount determining means (607) for specifying a predetermined angular amount; motion interval calculation means (606) responsive to generation of said passage detection result in relation to a sound source for reading out a set of estimated directions currently held in said buffer means and calculating, based on said set of estimated directions, an amount of time required for said sound source to move through a range of directions equal to said predetermined angular amount, and velocity detection means (608) for calculating the angular velocity of said sound source based on said amount of time and said predetermined angular amount, and for calculating an approximate value of linear velocity of said sound source, based upon said angular velocity and said known distance of said microphone array from said motion path. 47. 
The apparatus according to claim 31, further comprising directivity control means (706) for orienting a directivity of said microphone array along an estimated direction obtained for said sound source to thereby derive, as a combination of said microphone output signals with said directivity applied, a monitoring signal expressing a sound being emitted from said sound source.48. The apparatus according to claim 47, further comprising passage detection means (216) for detecting that a sound source has passed through a specific direction, based on said successive estimated directions obtained for said sound source, and generating data expressing a passage detection result when said sound source is found to have passed through said specific direction, and wherein said directivity control means (706) comprises:directivity setting means (704, 703) responsive to generation of said passage detection result in relation to a sound source for orienting said microphone array directivity along a specific one of said estimated directions, said specific estimated direction being obtained at a time point substantially close to a time point at which said passage detection result is generated. 49. 
The apparatus according to claim 31, further comprising:passage detection means (216) for detecting that a sound source has passed through a specific direction, based on said successive estimated directions obtained for said sound source, and generating data expressing a passage detection result when said sound source is found to have passed through said specific direction; directivity control means (706A, 706B) for concurrently establishing a plurality of fixedly predetermined directivities for said microphone array; and selection control means (814, 817) responsive to generation of a passage detection result for selecting one of said plurality of directivities, with said selection based upon an estimated direction obtained at a time point substantially close to a time point at which said passage detection result is generated. 50. The apparatus according to claim 49, further comprising a plurality of data buffers (813, 816) respectively corresponding to said plurality of directivities, each such data buffer being adapted to store successive time-axis portions of a monitoring signal which is obtained with the directivity corresponding to said data buffer, wherein said selection control means (814, 817) responds to generation of a passage detection result by reading out the current contents of a data buffer corresponding to said selected one of the plurality of directivities.
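The counting logic of the passage detection judgement (claims 5 and 6, mirrored by claims 36 and 37) can be sketched as follows. This is a simplified reading under stated assumptions: one estimated direction per time window, the in-range count includes the initial window, and the names `detect_passage`, `detection_range`, `initial_part`, `threshold`, and `max_windows` are hypothetical.

```python
def detect_passage(directions_deg, detection_range, initial_part,
                   threshold, max_windows):
    """Return the index of the window at which passage is detected, or None.

    directions_deg  : one estimated direction per successive time window
    detection_range : (low, high) passage detection range of directions
    initial_part    : (low, high) initial part of that range
    threshold       : in-range occurrences needed to declare a passage
    max_windows     : maximum number of windows counted after the trigger
    """
    low, high = detection_range
    init_low, init_high = initial_part
    for start, direction in enumerate(directions_deg):
        # Look for an initial window: estimate inside the initial part.
        if not (init_low <= direction <= init_high):
            continue
        hits = 0
        end = min(start + max_windows, len(directions_deg))
        # Assumption: counting starts with the initial window itself.
        for j in range(start, end):
            if low <= directions_deg[j] <= high:
                hits += 1
                if hits >= threshold:
                    return j  # passage detection result generated here
        # Threshold not reached in time; keep scanning for a new trigger.
    return None


# Example: a source sweeping from negative to positive angles.
track = [-40.0, -30.0, -12.0, -8.0, -3.0, 2.0, 7.0, 20.0]
detected_at = detect_passage(track, (-15.0, 15.0), (-15.0, -5.0), 4, 6)
```

Requiring several in-range estimates within a bounded number of windows makes the detector less sensitive to a single spurious estimate that happens to fall inside the range.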