Off-axis audio suppressions in an automobile cabin
IPC classification information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G10L-021/02
H04B-015/00
Application number
US-0194120
(2011-07-29)
Registration number
US-8818800
(2014-08-26)
Inventors / Address
Fallat, Mark Ryan
Hetherington, Phillip Alan
Percy, Michael Andrew
Applicant / Address
2236008 Ontario Inc.
Agent / Address
Gowling Lafleur Henderson LLP
Citation information
Times cited: 1
Patents cited: 8
Abstract
The suppression of off-axis audio in an audio environment is provided. Off-axis audio may be considered audio that does not originate from a region of interest. The off-axis audio is suppressed by comparing a phase difference between signals from two microphones to a target slope of the phase difference between signals originating from the region of interest. The target slope can be adapted to allow the region of interest to move with the location of a human speaker such as a driver.
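The comparison described in the abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the target slope is expressed in radians per hertz, so that the expected phase difference at frequency f is target_slope * f, and it assumes the direction error is the mean absolute wrapped deviation from that line (the record does not define the metric). All function names, including the synth_frame test helper, are hypothetical.

```python
import cmath
import math


def wrap_to_pi(angle):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def phase_difference(spec_a, spec_b):
    """Per-bin phase difference between two frequency-domain frames."""
    return [cmath.phase(a * b.conjugate()) for a, b in zip(spec_a, spec_b)]


def direction_error(phase_diff, freqs_hz, target_slope):
    """Mean absolute wrapped deviation from the line target_slope * f.

    Assumption: this particular error metric is chosen for illustration only.
    """
    deviations = [
        abs(wrap_to_pi(p - target_slope * f))
        for p, f in zip(phase_diff, freqs_hz)
    ]
    return sum(deviations) / len(deviations)


def synth_frame(freqs_hz, delay_s):
    """Hypothetical helper: unit-magnitude spectra for a pure delay between microphones."""
    spec_a = [complex(1.0, 0.0) for _ in freqs_hz]
    spec_b = [cmath.exp(-2j * math.pi * f * delay_s) for f in freqs_hz]
    return spec_a, spec_b
```

With bins from 100 Hz to 4 kHz, a pure delay of 0.1 ms between the microphones, and a target slope of 2π × 1e-4 rad/Hz, the error is close to zero; reversing the sign of the delay raises it well above 1 rad, which is the kind of separation a threshold could act on.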
Representative claims
1. A method of off-axis audio suppression in an audio environment comprising: receiving first and second audio signals from first and second microphones positioned within the audio environment; calculating a phase difference between the first and second audio signals; adjusting a target slope based on the calculated phase difference between the first and second audio signals to adapt the region of interest based on a location of a human speaker within the audio environment, the target slope defining a desired phase difference between signals from the first and second microphones corresponding to audio originating from a region of interest, and where adjusting the target slope includes: unwrapping the calculated phase difference; calculating a slope of the unwrapped phase difference; calculating a difference between the slope and the target slope; determining if the calculated difference is larger than a defined tolerance; adjusting the target slope based on the slope of the unwrapped phase difference; calculating a direction error between the calculated phase difference and the target slope; and processing the first and second audio signals based on the calculated direction error to suppress off-axis audio relative to the positions of the first and second microphones and the region of interest.
2. The method of claim 1, further comprising: calculating a slope confidence value when unwrapping the calculated phase difference, the slope confidence value determined as a sum of a signal-to-noise ratio of each of a plurality of frequency ranges of the calculated phase difference; and adjusting the target slope based on the slope of the unwrapped phase difference and the slope confidence value.
3. The method of claim 1, further comprising: smoothing the calculated slope; determining if an initial value for the target slope has been set; determining if the smoothed slope has been stable for a time interval; determining if the smoothed slope is in a desired direction based on the sign of the smoothed slope and the location of the human speaker in the audio environment; determining if the first and second audio signals correspond to voice audio; and setting an initial value for the target slope based to the smoothed slope when the initial value has not been set, the smoothed slope has been stable for the time interval, the smoothed slope is in the desired direction and the first and second audio signals correspond to voice audio.
4. The method of claim 1, further comprising: smoothing the calculated slope; determining if the smoothed slope has been stable for a time interval; determining if the smoothed slope is in a desired direction based on the sign of the smoothed slope and the location of the human speaker in the audio environment; determining if the first and second audio signals correspond to voice audio; adjusting the target slope based on the slope of the unwrapped phase difference using a leaky integrator when the smoothed slope has been stable for the time interval, the smoothed slope is in the desired direction and the first and second audio signals correspond to voice audio; and keeping the target slope unchanged when the smoothed slope has been stable for the time interval or the smoothed slope is not in the desired direction or the first and second audio signals do not correspond to voice audio.
5. The method of claim 1, wherein unwrapping comprises: calculating a moving average of the phase difference; locating zero-crossings of the moving average; confirming the zero crossing actually represent direction changes; and unwrapping the phase difference based on the located confirmed zero-crossings and the direction of a low-frequency phase difference.
6. The method of claim 1, wherein unwrapping comprises: determining if a difference between the phase difference and the target slope is greater than pi or less than −pi; subtracting n*pi from the phase difference when the difference between the phase difference and the target slope is greater than pi, where: target slope+pi>phase difference−n*pi>target slope−pi; and adding m*pi to the phase difference when the difference between the phase difference and the target slope is less than −pi, where: target slope+pi>phase difference+m pi>target slope−pi.
7. The method of claim 1, wherein the first and second audio signals are frequency domain representations of a frame of audio received at the corresponding microphone over a time interval, and wherein the method is repeated for subsequent frames of audio.
8. The method of claim 7, wherein processing the first and second audio signals comprises: determining if the direction error is less than an on-axis threshold, indicating that the frame of audio represented by the first and second audio signals corresponds to voice audio originating from the region of interest; and combining the first and second audio signals to enhance the frame of audio when the direction error is less than the on-axis threshold.
9. The method of claim 7, wherein processing the first and second audio signals comprises: determining if the direction error is greater than an off-axis threshold, indicating that the frame of audio represented by the first and second audio signals corresponds to noise audio or to voice audio originating from outside the region of interest; and combining the first and second audio signals to suppress the frame of audio when the direction error is greater than the off-axis threshold.
10. The method of claim 7, wherein processing the first and second audio signals comprises: determining if the direction error is between an off-axis threshold and an on-axis threshold, indicating that the frame of audio represented by the first and second audio signals corresponds to a combination of voice audio originating from the region of interest and noise audio or to voice audio originating from outside the region of interest; calculating a mixing mask as a function of frequency; and combining the first and second audio signals using the mixing mask when the direction error is between than the off-axis threshold and the on-axis threshold.
11. An apparatus for performing off-axis audio suppression in an audio environment comprising: a processor and memory configuring the apparatus to provide: a target slope stored in memory defining a desired phase difference between signals from first and second microphones corresponding to audio originating from a region of interest; a target adaptation component adjusting the target slope based on the calculated phase difference between the first and second audio signals to adapt the region of interest based on a location of a human speaker within the audio environment; a source-locating component calculating a direction error between the target slope and a phase difference between first and second audio signals received from the first and second microphones; and an audio mixer processing the first and second audio signals based on the calculated direction error to suppress off-axis audio relative to the positions of the first and second microphones and the region of interest; wherein the target adaptation component unwraps the calculated phase difference, calculates a slope of the unwrapped phase difference, calculates a difference between the slope and the target slope, determines if the calculated difference is larger than a defined tolerance, and adjusts the target slope based on the slope of the unwrapped phase difference.
12. The apparatus of claim 11, further comprising: calculating a slope confidence value when unwrapping the calculated phase difference, the slope confidence value determined as a sum of a signal-to-noise ratio of each of a plurality of frequency ranges in the calculated phase difference; and adjusting the target slope based on the slope of the unwrapped phase difference and the slope confidence value.
13. The apparatus of claim 11, wherein the target adaptation component further sets an initial value for the target slope by: smoothing the calculated slope; determining if an initial value for the target slope has been set; determining if the smoothed slope has been stable for a time interval; determining if the smoothed slope is in a desired direction based on the sign of the smoothed slope and the location of the human speaker in the audio environment; determining if the first and second audio signals correspond to voice audio; and setting the initial value for the target slope based on the smoothed slope when the initial value has not been set, the smoothed slope has been stable for the time interval, the smoothed slope is in the desired direction and the first and second audio signals correspond to voice audio.
14. The apparatus of claim 11, wherein the target adaptation component adjusts the target slope by: smoothing the calculated slope; determining if the smoothed slope has been stable for a time interval; determining if the smoothed slope is in a desired direction based on the sign of the smoothed slope and the location of the human speaker in the audio environment; determining if the first and second audio signals correspond to voice audio; adjusting the target slope based on the slope of the unwrapped phase difference using a leaky integrator when the smoothed slope has been stable for the time interval, the smoothed slope is in the desired direction and the first and second audio signals correspond to voice audio; and keeping the target slope unchanged when the smoothed slope has not been stable for the time interval or the smoothed slope is not in the desired direction or the first and second audio signals do not correspond to voice audio.
15. The apparatus of claim 11, wherein unwrapping comprises: calculating a moving average of the phase difference; locating zero-crossings of the moving average; confirming the zero crossing actually represent direction changes; and unwrapping the phase difference based on the located confirmed zero-crossings and a direction of the low-frequency phase difference.
16. The apparatus of claim 11, wherein unwrapping comprises: determining if a difference between the phase difference and the target slope is greater than pi or less than −pi; subtracting n*pi from the phase difference when the difference between the phase difference and the target slope is greater than pi, where: target slope+pi>phase difference−n*pi>target slope−pi; and adding m*pi to the phase difference when the difference between the phase difference and the target slope is less than −pi, where: target slope+pi>phase difference+m*pi>target slope−pi.
17. The apparatus of claim 11, further comprising a signal processing component to convert the first and second audio signals to frequency domain representations of a frame of audio received at the corresponding microphone over a time interval.
18. The apparatus of claim 17, wherein the audio mixer determines if the direction error is less than an on-axis threshold, indicating that the frame of audio represented by the first and second audio signals corresponds to voice audio originating from the region of interest and combines the first and second audio signals to enhance the frame of audio when the direction error is less than the on-axis threshold.
19. The apparatus of claim 17, wherein the audio mixer determines if the direction error is greater than an off-axis threshold, indicating that the frame of audio represented by the first and second audio signals corresponds to noise audio or to voice audio originating from outside the region of interest and combines the first and second audio signals to suppress the frame of audio when the direction error is greater than the off-axis threshold.
20. The apparatus of claim 17, wherein the audio mixer determines if the direction error is between an off-axis threshold and an on-axis threshold, indicating that the frame of audio represented by the first and second audio signals corresponds to a combination of voice audio originating from the region of interest and noise audio or to voice audio originating from outside the region of interest and combines the first and second audio signals using a mixing mask calculated as a function of frequency when the direction error is between the off-axis threshold and the on-axis threshold.
21. A computer readable non-transitory memory containing instructions which when executed by a processor perform a method of off-axis audio suppression in an audio environment comprising: receiving first and second audio signals from first and second microphones positioned within the audio environment; calculating a phase difference between the first and second audio signals; adjusting a target slope based on the calculated phase difference between the first and second audio signals to adapt the region of interest based on a location of a human speaker within the audio environment, the target slope defining a desired phase difference between signals from the first and second microphones corresponding to audio originating from a region of interest, and where adjusting the target slope includes: unwrapping the calculated phase difference; calculating a slope of the unwrapped phase difference; calculating a difference between the slope and the target slope; determining if the calculated difference is larger than a defined tolerance; adjusting the target slope based on the unwrapped phase difference; calculating a direction error between the calculated phase difference and a the target slope; and processing the first and second audio signals based on the calculated direction error to suppress off-axis audio relative to the positions of the first and second microphones and the region of interest.
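The target-slope adaptation recited in claims 1, 4, and 6 (unwrap the phase difference relative to the target, fit a slope, then update the target with a leaky integrator when the gating conditions hold) might look roughly like the Python sketch below. Several choices here are assumptions rather than details from the record: the slope is taken in radians per hertz, unwrapping shifts by whole turns of 2π (even multiples of pi in the claim's n*pi wording), the slope is fitted by least squares through the origin, and the stability, direction, and voice detectors are reduced to booleans supplied by the caller. The tolerance comparison of claim 1 is left out because the claim does not say what follows from it. All names are hypothetical.

```python
import math


def unwrap_toward_target(phase_diff, freqs_hz, target_slope):
    """Shift each bin by whole turns so it lies within pi of target_slope * f."""
    unwrapped = []
    for p, f in zip(phase_diff, freqs_hz):
        expected = target_slope * f
        turns = round((p - expected) / (2.0 * math.pi))
        unwrapped.append(p - 2.0 * math.pi * turns)
    return unwrapped


def fit_slope(unwrapped, freqs_hz):
    """Least-squares slope of a line through the origin (assumed estimator)."""
    numerator = sum(p * f for p, f in zip(unwrapped, freqs_hz))
    denominator = sum(f * f for f in freqs_hz)
    return numerator / denominator


def adapt_target(target_slope, measured_slope, leak,
                 is_stable, in_desired_direction, is_voice):
    """Leaky-integrator update, applied only when all gating flags are true.

    leak is an assumed smoothing factor in (0, 1]; larger values track faster.
    """
    if not (is_stable and in_desired_direction and is_voice):
        return target_slope  # keep the target unchanged
    return (1.0 - leak) * target_slope + leak * measured_slope
```

A small leak value makes the region of interest follow the talker slowly and resist brief off-axis bursts; a value near 1 would make the target jump to each new measurement.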
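The three-way decision in claims 8 to 10 (enhance below the on-axis threshold, suppress above the off-axis threshold, and apply a frequency-dependent mixing mask in between) can be illustrated with the following Python sketch. The claims do not say how the two signals are combined or how the mask is computed, so everything specific here is an assumption: the signals are combined by plain averaging, suppression uses a fixed floor gain, the mask falls linearly from 1 to 0 between the two thresholds, and the on-axis threshold is taken to be smaller than the off-axis one. The names mixing_mask, mix_frame, and floor_gain are hypothetical.

```python
def mixing_mask(bin_errors, on_axis, off_axis):
    """Per-bin weight: 1 at or below on_axis, 0 at or above off_axis, linear between."""
    span = off_axis - on_axis
    return [min(1.0, max(0.0, (off_axis - e) / span)) for e in bin_errors]


def mix_frame(spec_a, spec_b, frame_error, bin_errors,
              on_axis, off_axis, floor_gain=0.1):
    """Combine two frequency-domain frames according to the direction error."""
    # Assumption: plain averaging stands in for the unspecified combination.
    averaged = [(a + b) / 2.0 for a, b in zip(spec_a, spec_b)]
    if frame_error < on_axis:
        # On-axis frame: pass the combined signal through unattenuated.
        return averaged
    if frame_error > off_axis:
        # Off-axis frame: attenuate every bin to the floor gain.
        return [floor_gain * x for x in averaged]
    # In between: per-bin gain interpolated between floor_gain and 1.
    mask = mixing_mask(bin_errors, on_axis, off_axis)
    return [(floor_gain + (1.0 - floor_gain) * m) * x
            for m, x in zip(mask, averaged)]
```

Here frame_error is a single direction error for the frame and bin_errors holds a per-bin deviation from the target slope, so the intermediate case can keep bins that agree with the target while attenuating those that do not.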
Patents cited by this patent (8)
Mathews Lemuel P. (Rancho Palos Verdes CA) Lohman Charles A. (Fullerton CA) Armstrong Paul R. (Yorba Linda CA), Acoustical detection and tracking system.
Soli Sigfrid D. (Sierra Madre CA) Jayaraman Sriram (Los Angeles CA) Gao Shawn (Cerritos CA) Sullivan Jean (Murrieta CA), Method of signal processing for maintaining directional hearing with hearing aids.
Feng, Albert S.; Lockwood, Michael E.; Jones, Douglas L.; Bilger, legal representative, Carolyn J.; Lansing, Charissa R.; O'Brien, William D.; Wheeler, Bruce C.; Bilger, Robert C., Systems and methods for interference suppression with directional sensing patterns.