IPC Classification Information

Country / Type | United States (US) Patent, Granted
IPC (7th edition) | (not listed)
Application No. | UP-0381725 (2006-05-04)
Registration No. | US-7783061 (2010-09-13)
Inventors | Zalewski, Gary M.; Marks, Richard L.; Mao, Xiadong
Applicant | Sony Computer Entertainment Inc.
Agent | (not listed)
Citations | Times cited: 25 / Patents cited: 31
Abstract
Targeted sound detection methods and apparatus are disclosed. A microphone array has two or more microphones M0 . . . MM. Each microphone is coupled to a plurality of filters. The filters are configured to filter input signals corresponding to sounds detected by the microphones thereby generating a filtered output. One or more sets of filter parameters for the plurality of filters are pre-calibrated to determine one or more corresponding pre-calibrated listening zones. Each set of filter parameters is selected to detect portions of the input signals corresponding to sounds originating within a given listening zone and filter out sounds originating outside the given listening zone. A particular pre-calibrated listening zone is selected at a runtime by applying to the plurality of filters a set of filter coefficients corresponding to the particular pre-calibrated listening zone. As a result, the microphone array may detect sounds originating within the particular listening sector and filter out sounds originating outside the particular listening zone.
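The filter-bank architecture the abstract describes can be sketched in code. The following is a minimal, hypothetical illustration (all class and function names are invented here, not taken from the patent): each microphone's signal passes through a FIR filter whose coefficients were pre-calibrated for one listening zone, the filtered outputs are summed, and selecting a different zone at runtime amounts to swapping in a different pre-calibrated coefficient set.

```python
def fir_filter(signal, coeffs):
    """Direct-form FIR filter: y[n] = sum_k b[k] * x[n-k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, b in enumerate(coeffs):
            if n - k >= 0:
                acc += b * signal[n - k]
        out.append(acc)
    return out

class ZoneSelectiveArray:
    """Hypothetical sketch of a zone-selective microphone array.

    zone_coeffs maps a zone id to a list of FIR coefficient vectors,
    one vector per microphone, pre-calibrated offline for that zone.
    """
    def __init__(self, zone_coeffs):
        self.zone_coeffs = zone_coeffs
        self.active_zone = None

    def select_zone(self, zone_id):
        # Runtime zone switch: just load a pre-calibrated coefficient set.
        self.active_zone = zone_id

    def process(self, mic_signals):
        coeffs = self.zone_coeffs[self.active_zone]
        # Filter each microphone's signal with its zone-specific FIR,
        # then sum across microphones (filter-and-sum).
        filtered = [fir_filter(x, b) for x, b in zip(mic_signals, coeffs)]
        return [sum(samples) for samples in zip(*filtered)]
```

With pass-through coefficients the array simply sums its inputs; real pre-calibrated coefficients would instead pass sounds from inside the zone and attenuate sounds from outside it.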
Representative Claims
What is claimed is:

1. A method for targeted sound detection using a microphone array having two or more microphones M0 . . . MM, each microphone being coupled to a plurality of filters, the filters being configured to filter input signals corresponding to sounds detected by the microphones thereby generating a filtered output, the method comprising: pre-calibrating a plurality of sets of filter parameters for the plurality of filters to determine a corresponding plurality of pre-calibrated listening zones, wherein each set of filter parameters is selected to detect portions of the input signals corresponding to sounds originating within a given listening zone and filter out sounds originating outside the given listening zone; and selecting a particular pre-calibrated listening zone at a runtime by applying to the plurality of filters sets of filter parameters corresponding to two or more different pre-calibrated listening zones, determining a value of an attenuation of the input signals for the two or more different pre-calibrated listening zones and selecting a particular zone of the two or more different pre-calibrated listening zones for which the attenuation is closest to an optimum value, and applying the filter parameters for the particular zone to the plurality of filters, whereby the microphone array may detect sounds originating within the particular listening zone and filters out sounds originating outside the particular listening zone.

2. The method of claim 1 wherein pre-calibrating the plurality of sets of the filter parameters includes using blind source separation to determine sets of finite impulse response (FIR) filter parameters.

3.
The method of claim 1 wherein the plurality of listening zones includes a listening zone that corresponds to a field of view of an image capture unit, whereby the microphone array may detect sounds originating within the field of view of the image capture unit and filter out sounds originating outside the field of view of the image capture unit.

4. The method of claim 1 wherein the plurality of listening zones include a plurality of different listening zones.

5. The method of claim 4 wherein the plurality of pre-calibrated listening zones includes about 18 sectors, wherein each sector has an angular width of about 20 degrees, whereby the plurality of pre-calibrated sectors encompasses about 360 degrees surrounding the microphone array.

6. The method of claim 1 wherein selecting a particular pre-calibrated listening zone at a runtime includes selecting a pre-calibrated listening zone that contains a source of sound.

7. The method of claim 1 wherein selecting a particular pre-calibrated listening zone at a runtime includes selecting an initial zone of a plurality of listening zones; determining whether a source of sound lies within the initial zone or on a particular side of the initial zone; and, if the source of sound does not lie within the initial zone, selecting a different listening zone on the particular side of the initial zone, wherein the different listening zone is characterized by an attenuation of the input signals that is closest to an optimum value.

8. The method of claim 7 wherein determining whether a source of sound lies within the initial zone or on a particular side of the initial zone includes calculating from the input signals and the output signal an attenuation of the input signals and comparing the attenuation to the optimum value.

9. The method of claim 1 wherein selecting a particular pre-calibrated listening zone at a runtime includes determining whether, for a given listening zone, an attenuation of the input signals is below a threshold.
10. The method of claim 1 wherein selecting a particular pre-calibrated listening zone at a runtime includes selecting a pre-calibrated listening zone that contains a source of sound, the method further comprising robotically pointing an image capture unit toward the pre-calibrated listening zone that contains the source of sound.

11. The method of claim 1 wherein the electronic device is a video game unit having a joystick controller, the method further comprising generating at least one control signal for the purpose of controlling at least one aspect of the video game unit if it is determined that the sound or the source of sound has one or more predetermined characteristics; and generating one or more additional control signals with the joystick controller.

12. The method of claim 11 wherein generating one or more additional control signals with the joystick controller includes generating an optical signal with one or more light sources located on the joystick controller and receiving the optical signal with an image capture unit.

13. The method of claim 12 wherein receiving an optical signal includes capturing one or more images containing one or more light sources and analyzing the one or more images to determine a position or an orientation of the joystick controller and/or decode a telemetry signal from the joystick controller.

14. The method of claim 11, wherein generating one or more additional control signals with the joystick controller includes generating a position and/or orientation signal with an inertial sensor located on the joystick controller.

15. The method of claim 14, further comprising compensating for a drift in a position and/or orientation determined from the position and/or orientation signal.

16. The method of claim 15 wherein compensating for a drift includes setting a value of an initial position to a value of a current calculated position determined from the position and/or orientation signal.

17.
The method of claim 15 wherein compensating for a drift includes capturing an image of the joystick controller with an image capture unit, analyzing the image to determine a position of the joystick controller and setting a current value of the position of the joystick controller to the position of the joystick controller determined from analyzing the image.

18. The method of claim 15, further comprising compensating for spurious data in a signal from the inertial sensor.

19. A targeted sound detection apparatus comprising: a microphone array having two or more microphones M0 . . . MM; a plurality of filters coupled to each microphone, the filters being configured to filter input signals corresponding to sounds detected by the microphones and generate a filtered output; a processor coupled to the microphone array and the plurality of filters; a memory coupled to the processor; one or more sets of the filter parameters embodied in the memory, corresponding to one or more pre-calibrated listening zones, wherein each set of filter parameters is selected to detect portions of the input signals corresponding to sounds originating within a given listening zone and filters out sounds originating outside the given listening zone; the memory containing a set of processor executable instructions that, when executed, cause the apparatus to select a particular pre-calibrated listening zone at a runtime by applying to the plurality of filters sets of filter parameters corresponding to two or more different pre-calibrated listening zones, determining a value of an attenuation of the input signals for the two or more different pre-calibrated listening zones and selecting a particular zone of the two or more different pre-calibrated listening zones for which the attenuation is closest to an optimum value, and applying the filter parameters for the particular zone to the plurality of filters, whereby the apparatus may detect sounds originating within the particular pre-calibrated listening zone and filter out sounds originating outside the particular pre-calibrated listening zone.

20. The apparatus of claim 19 wherein the plurality of pre-calibrated listening zones includes about 18 sectors, wherein each sector has an angular width of about 20 degrees, whereby the plurality of pre-calibrated sectors encompasses about 360 degrees surrounding the microphone array.

21. The apparatus of claim 19 wherein the set of processor executable instructions includes one or more instructions which, when executed, cause the apparatus to select a pre-calibrated listening zone that contains a source of sound.

22. The apparatus of claim 19 wherein the set of processor executable instructions includes one or more instructions which, when executed, cause the apparatus to determine whether a source of sound lies within an initial listening zone or on a particular side of the initial listening zone; and, if the source of sound does not lie within the initial listening zone, select a different listening zone on the particular side of the initial listening zone, wherein the different listening zone is characterized by an attenuation of the input signals that is closest to an optimum value.

23. The apparatus of claim 22, wherein the one or more instructions which, when executed, cause the apparatus to determine whether a source of sound lies within the initial listening zone or on a particular side of the initial listening zone include one or more instructions which, when executed, calculate from the input signals and the output signal an attenuation of the input signals and compare the attenuation to the optimum value.

24. The apparatus of claim 19 wherein the set of processor executable instructions includes one or more instructions which, when executed, cause the apparatus to determine a value of an attenuation of the input signals for one or more zones and select a listening zone for which the attenuation is closest to an optimum value.

25.
The apparatus of claim 19 wherein the set of processor executable instructions includes one or more instructions which, when executed, cause the apparatus to determine whether, for a given listening zone, an attenuation of the input signals is below a threshold.

26. The apparatus of claim 19, further comprising an image capture unit coupled to the processor, wherein the one or more listening zones include a listening zone that corresponds to a field of view of the image capture unit.

27. The apparatus of claim 19, further comprising an image capture unit coupled to the processor, and one or more pointing actuators coupled to the processor, the pointing actuators being adapted to point the image capture unit in a viewing direction in response to signals generated by the processor, the memory containing a set of processor executable instructions that, when executed, cause the actuators to point the image capture unit in a direction of the particular pre-calibrated listening zone.

28. The apparatus of claim 19 wherein the instructions that cause the apparatus to characterize the sound or the source of the sound include instructions which, when executed, cause the apparatus to analyze the sound to determine whether or not it has one or more predetermined characteristics.

29. The method of claim 28 wherein the set of processor executable instructions further include one or more instructions which, when executed, cause the apparatus to generate at least one control signal for the purpose of controlling at least one aspect of the apparatus if it is determined that the sound does have one or more predetermined characteristics.

30. The apparatus of claim 29 wherein the apparatus is a video game controller and the control signal causes the video game controller to execute game instructions in response to sounds from the source of sound.

31. The apparatus of claim 19 wherein the apparatus is a baby monitor.

32.
The apparatus of claim 19, further comprising a joystick controller coupled to the processor.

33. The apparatus of claim 32 wherein the joystick controller includes an inertial sensor coupled to the processor.

34. The apparatus of claim 33 wherein the processor executable instructions include one or more instructions which, when executed, compensate for spurious data in a signal from the inertial sensor.

35. The apparatus of claim 33 wherein signals from the inertial sensor and signals generated from the image capture unit from tracking one or more light sources mounted to the joystick controller are used as inputs to a game system.

36. The apparatus of claim 33 wherein the inertial sensor includes an accelerometer or gyroscope.

37. The apparatus of claim 36 wherein the processor executable instructions include one or more instructions which, when executed, compensate for a drift in a position and/or orientation determined from a position and/or orientation signal from the inertial sensor.

38. The apparatus of claim 37 wherein compensating for a drift includes setting a value of an initial position to a value of a current calculated position determined from the position and/or orientation signal.

39. The apparatus of claim 38 wherein compensating for a drift includes capturing an image of the joystick controller with an image capture unit, analyzing the image to determine a position of the joystick controller and setting a current value of the position of the joystick controller to the position of the joystick controller determined from analyzing the image.

40.
The apparatus of claim 39 wherein the joystick controller includes one or more light sources, the apparatus further comprising an image capture unit, wherein the processor executable instructions include one or more instructions which, when executed, cause the image capture unit to monitor a field of view in front of the image capture unit, identify the light source within the field of view, detect a change in light emitted from the light source, and, in response to detecting the change, trigger an input command to the processor.

41. The apparatus of claim 32 wherein the joystick controller includes one or more light sources, the apparatus further comprising an image capture unit, wherein the processor executable instructions include one or more instructions which, when executed, cause the image capture unit to capture one or more images containing the light sources and analyze the image to determine a position or an orientation of the joystick controller and/or decode a telemetry signal from the joystick controller.

42. The apparatus of claim 41 wherein the light sources include two or more light sources in a linear array.

43. The apparatus of claim 41 wherein the light sources include a rectangular or arcuate configuration of a plurality of light sources.

44. The apparatus of claim 41 wherein the light sources are disposed on two or more different sides of the joystick controller to facilitate viewing of the light sources by the image capture unit.

45. The apparatus of claim 41, further comprising an inertial sensor mounted to the joystick controller, wherein a signal from the inertial sensor provides part of a tracking information input and signals generated from the image capture unit from tracking the one or more light sources provide another part of the tracking information input.

46.
A computer-readable medium having embodied therein computer executable instructions for performing a method for targeted sound detection using a microphone array having two or more microphones M0 . . . MM, each microphone being coupled to a plurality of filters, the filters being configured to filter input signals corresponding to sounds detected by the microphones thereby generating a filtered output, the method comprising: pre-calibrating one or more sets of filter parameters for the plurality of filters to determine one or more corresponding pre-calibrated listening zones, wherein each set of filter parameters is selected to detect portions of the input signals corresponding to sounds originating within a given listening zone and filter out sounds originating outside the given listening zone; and selecting a particular pre-calibrated listening zone at a runtime by applying to the plurality of filters sets of filter parameters corresponding to two or more different pre-calibrated listening zones, determining a value of an attenuation of the input signals for the two or more different pre-calibrated listening zones and selecting a particular zone of the two or more different pre-calibrated listening zones for which the attenuation is closest to an optimum value, and applying the filter parameters for the particular zone to the plurality of filters, whereby the microphone array may detect sounds originating within the particular listening zone and filters out sounds originating outside the particular listening zone.
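The runtime selection step recited in claims 1, 19, and 46 (measure the input-signal attenuation for each candidate pre-calibrated zone, then keep the zone whose attenuation is closest to an optimum value) reduces to a nearest-to-optimum search. The sketch below is schematic and hypothetical; the patent does not specify how the attenuation is measured, so that is left to a caller-supplied function here.

```python
def select_zone(candidate_zones, attenuation_of, optimum):
    """Pick the pre-calibrated zone whose measured input-signal
    attenuation is closest to the optimum value.

    attenuation_of is a caller-supplied measurement function
    (an assumption of this sketch, not part of the claims).
    """
    return min(candidate_zones, key=lambda z: abs(attenuation_of(z) - optimum))
```

In practice the measurements would come from applying each zone's filter coefficients to live input, as the claims describe; here a lookup table stands in for them.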
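Claims 5 and 20 describe a concrete zone layout: about 18 sectors of about 20 degrees each, covering the full 360 degrees around the array. Mapping a source azimuth to its sector index is then simple modular arithmetic, sketched here with invented names:

```python
def sector_for_azimuth(azimuth_deg, num_sectors=18):
    """Map an azimuth angle (degrees) to the index of the pre-calibrated
    sector containing it, for num_sectors equal sectors covering 360
    degrees (18 sectors of 20 degrees each in claims 5 and 20)."""
    width = 360.0 / num_sectors
    return int((azimuth_deg % 360.0) // width)
```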
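Claims 15-17 and 37-39 describe drift compensation for the inertial sensor: a position integrated from inertial data accumulates error over time, so the running value is periodically reset, either to the current calculated position or to a position obtained by analyzing a captured image of the controller. A one-dimensional schematic (the class and method names are invented for illustration):

```python
class DriftCompensator:
    """Hypothetical 1-D sketch of inertial tracking with drift resets."""

    def __init__(self, initial_position=0.0):
        self.position = initial_position

    def integrate(self, velocity, dt):
        # Dead-reckoning update from the inertial signal; errors
        # in velocity accumulate in position over time (drift).
        self.position += velocity * dt

    def reset_to(self, reference_position):
        # Drift compensation per claims 17/39: overwrite the drifted
        # estimate with a reference, e.g. a position determined by
        # analyzing an image of the controller.
        self.position = reference_position
```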