[Patent] Systems, methods, and apparatus for speech feature detection

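IPC classification information
Country / Type	United States (US) Patent, Granted
International Patent Classification (IPC, 7th edition)	G10L-021/00; G10L-025/78
Application number	US-0092502 (2011-04-22)
Registration number	US-9165567 (2015-10-20)
Inventor / Address	Visser, Erik; Liu, Ian Ernan; Shin, Jongwon
Applicant / Address	QUALCOMM Incorporated
Agent / Address	Barker, Scott A.
Citation information	Times cited: 9; Patents cited: 14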

Abstract

Implementations and applications are disclosed for detection of a transition in a voice activity state of an audio signal, based on a change in energy that is consistent in time across a range of frequencies of the signal. For example, such detection may be based on a time derivative of energy for e
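
As a rough illustration of the idea in the abstract, the following Python sketch flags a transition when the energy of enough frequency components rises or falls together between two consecutive segments. It also shows, following the logic of claim 1, how a detected offset can end the period during which the output keeps indicating activity after the primary detector has gone inactive. The function names, the use of a simple first difference as the time derivative, the tie-breaking rule, and the threshold values are all assumptions made for illustration; none of them is taken from the patent.

```python
from typing import List, Optional, Sequence


def detect_transition(
    prev_energy: Sequence[float],
    curr_energy: Sequence[float],
    deriv_threshold: float,
    count_threshold: int,
) -> Optional[str]:
    """Return "onset", "offset", or None for the current segment.

    prev_energy / curr_energy hold one energy value per frequency
    component for two consecutive segments (hypothetical inputs).
    """
    # Assumption: the time derivative is approximated by a first difference.
    deltas = [c - p for p, c in zip(prev_energy, curr_energy)]

    # Per-component indications: which components changed strongly enough.
    rising = sum(1 for d in deltas if d > deriv_threshold)
    falling = sum(1 for d in deltas if d < -deriv_threshold)

    # A transition is declared only when enough components agree.
    # Assumption: ties between rising and falling resolve to "onset".
    if rising >= count_threshold and rising >= falling:
        return "onset"
    if falling >= count_threshold:
        return "offset"
    return None


def vad_signal(
    primary_active: Sequence[bool],
    offset_detected: Sequence[bool],
) -> List[bool]:
    """Combine a primary per-segment decision with offset detections.

    After the primary detector goes inactive, the output keeps indicating
    activity until a segment in which an offset is detected; later inactive
    segments indicate no activity. Assumption: the segment containing the
    offset itself is still reported as active, and there is no maximum
    hold time.
    """
    out: List[bool] = []
    holding = False
    for active, offset in zip(primary_active, offset_detected):
        if active:
            holding = True
            out.append(True)
        else:
            out.append(holding)
            if offset:
                holding = False
    return out
```

With primary decisions `[True, True, False, False, False, False]` and an offset detected in the fourth segment, `vad_signal` returns `[True, True, True, True, False, False]`: the two inactive segments up to and including the offset still indicate activity, and the segments after it do not.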

Representative claims

1. A method of processing an audio signal, said method comprising:
for each of a first plurality of consecutive segments of the audio signal, determining that voice activity is present in the segment;
for each of a second plurality of consecutive segments of the audio signal that occurs immediately after the first plurality of consecutive segments in the audio signal, determining that voice activity is not present in the segment;
using at least one array of logic elements, detecting that a transition in a voice activity state of the audio signal occurs during one among the second plurality of consecutive segments that is not the first segment to occur among the second plurality; and
producing a voice activity detection signal that has, for each segment in the first plurality and for each segment in the second plurality, a corresponding value that indicates one among activity and lack of activity,
wherein, for each of the first plurality of consecutive segments, the corresponding value of the voice activity detection signal indicates activity, and
wherein, for each of the second plurality of consecutive segments that occurs before the segment in which the detected transition occurs, and based on said determining, for at least one segment of the first plurality, that voice activity is present in the segment, the corresponding value of the voice activity detection signal indicates activity, and
wherein, for each of the second plurality of consecutive segments that occurs after the segment in which the detected transition occurs, and in response to said detecting that a transition in the speech activity state of the audio signal occurs, the corresponding value of the voice activity detection signal indicates a lack of activity.

2.
The method according to claim 1, wherein said method comprises calculating a time derivative of energy for each of a plurality of different frequency components of the audio signal during said one among the second plurality of segments, and wherein said detecting that the transition occurs during said one among the second plurality of segments is based on the calculated time derivatives of energy.

3. The method according to claim 2, wherein said detecting that the transition occurs includes, for each of the plurality of different frequency components, and based on the corresponding calculated time derivative of energy, producing a corresponding indication of whether the frequency component is active, and wherein said detecting that the transition occurs is based on a relation between the number of said indications that indicate that the corresponding frequency component is active and a first threshold value.

4. The method according to claim 3, wherein said method comprises, for a segment that occurs prior to the first plurality of consecutive segments in the audio signal:
calculating a time derivative of energy for each of a plurality of different frequency components of the audio signal during the segment;
for each of the plurality of different frequency components, and based on the corresponding calculated time derivative of energy, producing a corresponding indication of whether the frequency component is active; and
determining that a transition in a voice activity state of the audio signal does not occur during the segment, based on a relation between (A) the number of said indications that indicate that the corresponding frequency component is active and (B) a second threshold value that is higher than said first threshold value.

5.
The method according to claim 3, wherein said method comprises, for a segment that occurs prior to the first plurality of consecutive segments in the audio signal:
calculating, for each of a plurality of different frequency components of the audio signal during the segment, a second derivative of energy with respect to time;
for each of the plurality of different frequency components, and based on the corresponding calculated second derivative of energy with respect to time, producing a corresponding indication of whether the frequency component is impulsive; and
determining that a transition in a voice activity state of the audio signal does not occur during the segment, based on a relation between the number of said indications that indicate that the corresponding frequency component is impulsive and a threshold value.

6. The method according to claim 3, wherein said method comprises, for a segment that occurs prior to the first plurality of consecutive segments in the audio signal:
calculating, for each of a plurality of different frequency components of the audio signal during the segment, a second-order derivative of energy with respect to time;
for each of the plurality of different frequency components, and based on the corresponding calculated second-order derivative of energy with respect to time, producing a corresponding indication of whether the frequency component is impulsive; and
determining that a transition in a voice activity state of the audio signal does not occur during the segment, based on a relation between the number of said indications that indicate that the corresponding frequency component is impulsive and a threshold value.

7.
The method according to claim 1, wherein, for each of the first plurality of consecutive segments of the audio signal, said determining that voice activity is present in the segment is based on a difference between a first channel of the audio signal during the segment and a second channel of the audio signal during the segment, and wherein, for each of the second plurality of consecutive segments of the audio signal, said determining that voice activity is not present in the segment is based on a difference between a first channel of the audio signal during the segment and a second channel of the audio signal during the segment. 8. The method according to claim 7, wherein, for each segment of said first plurality and for each segment of said second plurality, said difference is a difference between a level of the first channel and a level of the second channel during the segment. 9. The method according to claim 7, wherein, for each segment of said first plurality and for each segment of said second plurality, said difference is a difference in time between an instance of a signal in the first channel during the segment and an instance of said signal in the second channel during the segment. 10. 
The method according to claim 7, wherein, for each segment of said first plurality, said determining that voice activity is present in the segment comprises calculating, for each of a first plurality of different frequency components of the audio signal during the segment, a difference between a phase of the frequency component in the first channel and a phase of the frequency component in the second channel, wherein said difference between the first channel during the segment and the second channel during the segment is one of said calculated phase differences, and wherein, for each segment of said second plurality, said determining that voice activity is not present in the segment comprises calculating, for each of the first plurality of different frequency components of the audio signal during the segment, a difference between a phase of the frequency component in the first channel and a phase of the frequency component in the second channel, wherein said difference between the first channel during the segment and the second channel during the segment is one of said calculated phase differences.

11. The method according to claim 10, wherein said method comprises calculating a time derivative of energy for each of a second plurality of different frequency components of the first channel during said one among the second plurality of segments, and wherein said detecting that the transition occurs during said one among the second plurality of segments is based on the calculated time derivatives of energy, and
wherein a frequency band that includes the first plurality of frequency components is separate from a frequency band that includes the second plurality of frequency components.

12.
The method according to claim 10, wherein, for each segment of said first plurality, said determining that voice activity is present in the segment is based on a corresponding value of a coherency measure that indicates a degree of coherence among the directions of arrival of at least the plurality of different frequency components, wherein said value is based on information from the corresponding plurality of calculated phase differences, and wherein, for each segment of said second plurality, said determining that voice activity is not present in the segment is based on a corresponding value of the coherency measure that indicates a degree of coherence among the directions of arrival of at least the plurality of different frequency components, wherein said value is based on information from the corresponding plurality of calculated phase differences.

13. The method according to claim 1, wherein said method comprises:
calculating a time derivative of energy for each of a plurality of different frequency components of the audio signal during a segment of one of the first and second pluralities of segments; and
producing a voice activity detection indication for said segment of one of the first and second pluralities,
wherein said producing the voice activity detection indication includes comparing a value of a test statistic for the segment to a value of a threshold, and
wherein said producing the voice activity detection indication includes modifying a relation between the test statistic and the threshold, based on said calculated plurality of time derivatives of energy, and
wherein a value of said voice activity detection signal for said segment of one of the first and second pluralities is based on said voice activity detection indication.

14. The method according to claim 1, wherein said method is performed by a communications device.

15.
An apparatus for processing an audio signal, said apparatus comprising:
means for determining, for each of a first plurality of consecutive segments of the audio signal, that voice activity is present in the segment;
means for determining, for each of a second plurality of consecutive segments of the audio signal that occurs immediately after the first plurality of consecutive segments in the audio signal, that voice activity is not present in the segment;
means for detecting that a transition in a voice activity state of the audio signal occurs during one among the second plurality of consecutive segments; and
means for producing a voice activity detection signal that has, for each segment in the first plurality and for each segment in the second plurality, a corresponding value that indicates one among activity and lack of activity, and
wherein, for each of the first plurality of consecutive segments, the corresponding value of the voice activity detection signal indicates activity, and
wherein, for each of the second plurality of consecutive segments that occurs before the segment in which the detected transition occurs, and based on said determining, for at least one segment of the first plurality, that voice activity is present in the segment, the corresponding value of the voice activity detection signal indicates activity, and
wherein, for each of the second plurality of consecutive segments that occurs after the segment in which the detected transition occurs, and in response to said detecting that a transition in the speech activity state of the audio signal occurs, the corresponding value of the voice activity detection signal indicates a lack of activity.

16.
The apparatus according to claim 15, wherein said apparatus comprises means for calculating a time derivative of energy for each of a plurality of different frequency components of the audio signal during said one among the second plurality of segments, and wherein said means for detecting that the transition occurs during said one among the second plurality of segments is configured to detect the transition based on the calculated time derivatives of energy. 17. The apparatus according to claim 16, wherein said means for detecting that the transition occurs includes means for producing, for each of the plurality of different frequency components, and based on the corresponding calculated time derivative of energy, a corresponding indication of whether the frequency component is active, and wherein said means for detecting that the transition occurs is configured to detect the transition based on a relation between the number of said indications that indicate that the corresponding frequency component is active and a first threshold value. 18. 
The apparatus according to claim 17, wherein said apparatus comprises:
means for calculating, for a segment that occurs prior to the first plurality of consecutive segments in the audio signal, a time derivative of energy for each of a plurality of different frequency components of the audio signal during the segment;
means for producing, for each of said plurality of different frequency components of said segment that occurs prior to the first plurality of consecutive segments in the audio signal, and based on the corresponding calculated time derivative of energy, a corresponding indication of whether the frequency component is active; and
means for determining that a transition in a voice activity state of the audio signal does not occur during said segment that occurs prior to the first plurality of consecutive segments in the audio signal, based on a relation between (A) the number of said indications that indicate that the corresponding frequency component is active and (B) a second threshold value that is higher than said first threshold value.

19.
The apparatus according to claim 17, wherein said apparatus comprises:
means for calculating, for a segment that occurs prior to the first plurality of consecutive segments in the audio signal, a second derivative of energy with respect to time for each of a plurality of different frequency components of the audio signal during the segment;
means for producing, for each of the plurality of different frequency components of said segment that occurs prior to the first plurality of consecutive segments in the audio signal, and based on the corresponding calculated second derivative of energy with respect to time, a corresponding indication of whether the frequency component is impulsive; and
means for determining that a transition in a voice activity state of the audio signal does not occur during said segment that occurs prior to the first plurality of consecutive segments in the audio signal, based on a relation between the number of said indications that indicate that the corresponding frequency component is impulsive and a threshold value.

20. The apparatus according to claim 15, wherein, for each of the first plurality of consecutive segments of the audio signal, said means for determining that voice activity is present in the segment is configured to perform said determining based on a difference between a first channel of the audio signal during the segment and a second channel of the audio signal during the segment, and wherein, for each of the second plurality of consecutive segments of the audio signal, said means for determining that voice activity is not present in the segment is configured to perform said determining based on a difference between a first channel of the audio signal during the segment and a second channel of the audio signal during the segment.

21.
The apparatus according to claim 20, wherein, for each segment of said first plurality and for each segment of said second plurality, said difference is a difference between a level of the first channel and a level of the second channel during the segment.

22. The apparatus according to claim 20, wherein, for each segment of said first plurality and for each segment of said second plurality, said difference is a difference in time between an instance of a signal in the first channel during the segment and an instance of said signal in the second channel during the segment.

23. The apparatus according to claim 20, wherein said means for determining that voice activity is present in the segment comprises means for calculating, for each segment of said first plurality and for each segment of said second plurality, and for each of a first plurality of different frequency components of the audio signal during the segment, a difference between a phase of the frequency component in the first channel and a phase of the frequency component in the second channel, wherein said difference between the first channel during the segment and the second channel during the segment is one of said calculated phase differences.

24. The apparatus according to claim 23, wherein said apparatus comprises means for calculating a time derivative of energy for each of a second plurality of different frequency components of the first channel during said one among the second plurality of segments, and wherein said means for detecting that the transition occurs during said one among the second plurality of segments is configured to detect that the transition occurs based on the calculated time derivatives of energy, and
wherein a frequency band that includes the first plurality of frequency components is separate from a frequency band that includes the second plurality of frequency components.

25.
The apparatus according to claim 23, wherein said means for determining, for each segment of said first plurality, that voice activity is present in the segment is configured to determine that said voice activity is present based on a corresponding value of a coherency measure that indicates a degree of coherence among the directions of arrival of at least the plurality of different frequency components, wherein said value is based on information from the corresponding plurality of calculated phase differences, and wherein said means for determining, for each segment of said second plurality, that voice activity is not present in the segment is configured to determine that voice activity is not present based on a corresponding value of the coherency measure that indicates a degree of coherence among the directions of arrival of at least the plurality of different frequency components, wherein said value is based on information from the corresponding plurality of calculated phase differences.

26. The apparatus according to claim 15, wherein said apparatus comprises:
means for calculating a time derivative of energy for each of a plurality of different frequency components of the audio signal during a segment of one of the first and second pluralities of segments; and
means for producing a voice activity detection indication for said segment of one of the first and second pluralities,
wherein said means for producing the voice activity detection indication includes means for comparing a value of a test statistic for the segment to a threshold value, and
wherein said means for producing the voice activity detection indication includes means for modifying a relation between the test statistic and the threshold, based on said calculated plurality of time derivatives of energy, and
wherein a value of said voice activity detection signal for said segment of one of the first and second pluralities is based on said voice activity detection indication.

27.
An apparatus for processing an audio signal, said apparatus comprising:
a first voice activity detector configured to determine:
for each of a first plurality of consecutive segments of the audio signal, that voice activity is present in the segment, and
for each of a second plurality of consecutive segments of the audio signal that occurs immediately after the first plurality of consecutive segments in the audio signal, that voice activity is not present in the segment;
a second voice activity detector configured to detect that a transition in a voice activity state of the audio signal occurs during one among the second plurality of consecutive segments; and
a signal generator configured to produce a voice activity detection signal that has, for each segment in the first plurality and for each segment in the second plurality, a corresponding value that indicates one among activity and lack of activity,
wherein, for each of the first plurality of consecutive segments, the corresponding value of the voice activity detection signal indicates activity, and
wherein, for each of the second plurality of consecutive segments that occurs before the segment in which the detected transition occurs, and based on said determining, for at least one segment of the first plurality, that voice activity is present in the segment, the corresponding value of the voice activity detection signal indicates activity, and
wherein, for each of the second plurality of consecutive segments that occurs after the segment in which the detected transition occurs, and in response to said detecting that a transition in the speech activity state of the audio signal occurs, the corresponding value of the voice activity detection signal indicates a lack of activity.

28.
The apparatus according to claim 27, wherein said apparatus comprises a calculator configured to calculate a time derivative of energy for each of a plurality of different frequency components of the audio signal during said one among the second plurality of segments, and wherein said second voice activity detector is configured to detect said transition based on the calculated time derivatives of energy.

29. The apparatus according to claim 28, wherein said second voice activity detector includes a comparator configured to produce, for each of the plurality of different frequency components, and based on the corresponding calculated time derivative of energy, a corresponding indication of whether the frequency component is active, and wherein said second voice activity detector is configured to detect the transition based on a relation between the number of said indications that indicate that the corresponding frequency component is active and a first threshold value.

30. The apparatus according to claim 29, wherein said apparatus comprises:
a calculator configured to calculate, for a segment that occurs prior to the first plurality of consecutive segments in the audio signal, a time derivative of energy for each of a plurality of different frequency components of the audio signal during the segment; and
a comparator configured to produce, for each of said plurality of different frequency components of said segment that occurs prior to the first plurality of consecutive segments in the audio signal, and based on the corresponding calculated time derivative of energy, a corresponding indication of whether the frequency component is active,
wherein said second voice activity detector is configured to determine that a transition in a voice activity state of the audio signal does not occur during said segment that occurs prior to the first plurality of consecutive segments in the audio signal, based on a relation between (A) the number of said indications that indicate that the corresponding frequency component is active and (B) a second threshold value that is higher than said first threshold value.

31. The apparatus according to claim 29, wherein said apparatus comprises:
a calculator configured to calculate, for a segment that occurs prior to the first plurality of consecutive segments in the audio signal, a second derivative of energy with respect to time for each of a plurality of different frequency components of the audio signal during the segment; and
a comparator configured to produce, for each of the plurality of different frequency components of said segment that occurs prior to the first plurality of consecutive segments in the audio signal, and based on the corresponding calculated second derivative of energy with respect to time, a corresponding indication of whether the frequency component is impulsive,
wherein said second voice activity detector is configured to determine that a transition in a voice activity state of the audio signal does not occur during said segment that occurs prior to the first plurality of consecutive segments in the audio signal, based on a relation between the number of said indications that indicate that the corresponding frequency component is impulsive and a threshold value.

32.
The apparatus according to claim 27, wherein said first voice activity detector is configured to determine, for each of the first plurality of consecutive segments of the audio signal, that voice activity is present in the segment, based on a difference between a first channel of the audio signal during the segment and a second channel of the audio signal during the segment, and wherein said first voice activity detector is configured to determine, for each of the second plurality of consecutive segments of the audio signal, that voice activity is not present in the segment, based on a difference between a first channel of the audio signal during the segment and a second channel of the audio signal during the segment. 33. The apparatus according to claim 32, wherein, for each segment of said first plurality and for each segment of said second plurality, said difference is a difference between a level of the first channel and a level of the second channel during the segment. 34. The apparatus according to claim 32, wherein, for each segment of said first plurality and for each segment of said second plurality, said difference is a difference in time between an instance of a signal in the first channel during the segment and an instance of said signal in the second channel during the segment. 35. The apparatus according to claim 32, wherein said first voice activity detector includes a calculator configured to calculate, for each segment of said first plurality and for each segment of said second plurality, and for each of a first plurality of different frequency components of the audio signal during the segment, a difference between a phase of the frequency component in the first channel and a phase of the frequency component in the second channel, wherein said difference between the first channel during the segment and the second channel during the segment is one of said calculated phase differences. 36. 
The apparatus according to claim 35, wherein said apparatus comprises a calculator configured to calculate a time derivative of energy for each of a second plurality of different frequency components of the first channel during said one among the second plurality of segments, and wherein said second voice activity detector is configured to detect that the transition occurs based on the calculated time derivatives of energy, and
wherein a frequency band that includes the first plurality of frequency components is separate from a frequency band that includes the second plurality of frequency components.

37. The apparatus according to claim 35, wherein said first voice activity detector is configured to determine, for each segment of said first plurality, that said voice activity is present in the segment based on a corresponding value of a coherency measure that indicates a degree of coherence among the directions of arrival of at least the plurality of different frequency components, wherein said value is based on information from the corresponding plurality of calculated phase differences, and wherein said first voice activity detector is configured to determine, for each segment of said second plurality, that voice activity is not present in the segment based on a corresponding value of the coherency measure that indicates a degree of coherence among the directions of arrival of at least the plurality of different frequency components, wherein said value is based on information from the corresponding plurality of calculated phase differences.

38.
The apparatus according to claim 27, wherein said apparatus comprises:
a third voice activity detector configured to calculate a time derivative of energy for each of a plurality of different frequency components of the audio signal during a segment of one of the first and second pluralities of segments; and
a fourth voice activity detector configured to produce a voice activity detection indication for said segment of one of the first and second pluralities, based on a result of comparing a value of a test statistic for the segment to a threshold value,
wherein said fourth voice activity detector is configured to modify a relation between the test statistic and the threshold, based on said calculated plurality of time derivatives of energy, and
wherein a value of said voice activity detection signal for said segment of one of the first and second pluralities is based on said voice activity detection indication.

39. The apparatus according to claim 38, wherein the fourth voice activity detector is the first voice activity detector, and wherein said determining that voice activity is present or not present in the segment includes producing said voice activity detection indication.

40.
A non-transitory computer-readable medium that stores machine-executable instructions that when executed by one or more processors cause the one or more processors to:
determine, for each of a first plurality of consecutive segments of a multichannel signal, and based on a difference between a first channel of the multichannel signal during the segment and a second channel of the multichannel signal during the segment, that voice activity is present in the segment;
determine, for each of a second plurality of consecutive segments of the multichannel signal that occurs immediately after the first plurality of consecutive segments in the multichannel signal, and based on a difference between a first channel of the multichannel signal during the segment and a second channel of the multichannel signal during the segment, that voice activity is not present in the segment;
detect that a transition in a voice activity state of the multichannel signal occurs during one among the second plurality of consecutive segments that is not the first segment to occur among the second plurality; and
produce a voice activity detection signal that has, for each segment in the first plurality and for each segment in the second plurality, a corresponding value that indicates one among activity and lack of activity,
wherein, for each of the first plurality of consecutive segments, the corresponding value of the voice activity detection signal indicates activity, and
wherein, for each of the second plurality of consecutive segments that occurs before the segment in which the detected transition occurs, and based on said determining, for at least one segment of the first plurality, that voice activity is present in the segment, the corresponding value of the voice activity detection signal indicates activity, and
wherein, for each of the second plurality of consecutive segments that occurs after the segment in which the detected transition occurs, and in response to said detecting that a transition in the speech activity state of the multichannel signal occurs, the corresponding value of the voice activity detection signal indicates a lack of activity.

41. The medium according to claim 40, wherein said instructions when executed by the one or more processors cause the one or more processors to calculate a time derivative of energy for each of a plurality of different frequency components of the first channel during said one among the second plurality of segments, and wherein said detecting that the transition occurs during said one among the second plurality of segments is based on the calculated time derivatives of energy.

42. The medium according to claim 41, wherein said detecting that the transition occurs includes, for each of the plurality of different frequency components, and based on the corresponding calculated time derivative of energy, producing a corresponding indication of whether the frequency component is active, and wherein said detecting that the transition occurs is based on a relation between the number of said indications that indicate that the corresponding frequency component is active and a first threshold value.

43.
The medium according to claim 42, wherein said instructions when executed by one or more processors cause the one or more processors, for a segment that occurs prior to the first plurality of consecutive segments in the multichannel signal: to calculate a time derivative of energy for each of a plurality of different frequency components of the first channel during the segment;to produce, for each of the plurality of different frequency components, and based on the corresponding calculated time derivative of energy, a corresponding indication of whether the frequency component is active; andto determine that a transition in a voice activity state of the multichannel signal does not occur during the segment, based on a relation between (A) the number of said indications that indicate that the corresponding frequency component is active and (B) a second threshold value that is higher than said first threshold value. 44. The medium according to claim 42, wherein said instructions when executed by one or more processors cause the one or more processors, for a segment that occurs prior to the first plurality of consecutive segments in the multichannel signal: to calculate, for each of a plurality of different frequency components of the first channel during the segment, a second derivative of energy with respect to time;to produce, for each of the plurality of different frequency components, and based on the corresponding calculated second derivative of energy with respect to time, a corresponding indication of whether the frequency component is impulsive; andto determine that a transition in a voice activity state of the multichannel signal does not occur during the segment, based on a relation between the number of said indications that indicate that the corresponding frequency component is impulsive and a threshold value. 45. 
The medium according to claim 40, wherein, for each of the first plurality of consecutive segments of the audio signal, said determining that voice activity is present in the segment is based on a difference between a first channel of the audio signal during the segment and a second channel of the audio signal during the segment, and wherein, for each of the second plurality of consecutive segments of the audio signal, said determining that voice activity is not present in the segment is based on a difference between a first channel of the audio signal during the segment and a second channel of the audio signal during the segment. 46. The medium according to claim 45, wherein, for each segment of said first plurality and for each segment of said second plurality, said difference is a difference between a level of the first channel and a level of the second channel during the segment. 47. The medium according to claim 45, wherein, for each segment of said first plurality and for each segment of said second plurality, said difference is a difference in time between an instance of a signal in the first channel during the segment and an instance of said signal in the second channel during the segment. 48. 
The medium according to claim 45, wherein, for each segment of said first plurality, said determining that voice activity is present in the segment comprises calculating, for each of a first plurality of different frequency components of the multichannel signal during the segment, a difference between a phase of the frequency component in the first channel and a phase of the frequency component in the second channel, wherein said difference between the first channel during the segment and the second channel during the segment is one of said calculated phase differences, and wherein, for each segment of said second plurality, said determining that voice activity is not present in the segment comprises calculating, for each of the first plurality of different frequency components of the multichannel signal during the segment, a difference between a phase of the frequency component in the first channel and a phase of the frequency component in the second channel, wherein said difference between the first channel during the segment and the second channel during the segment is one of said calculated phase differences. 49. The medium according to claim 48, wherein said instructions when executed by one or more processors cause the one or more processors to calculate a time derivative of energy for each of a second plurality of different frequency components of the first channel during said one among the second plurality of segments, and wherein said detecting that the transition occurs during said one among the second plurality of segments is based on the calculated time derivatives of energy, andwherein a frequency band that includes the first plurality of frequency components is separate from a frequency band that includes the second plurality of frequency components. 50. 
The medium according to claim 48, wherein, for each segment of said first plurality, said determining that voice activity is present in the segment is based on a corresponding value of a coherency measure that indicates a degree of coherence among the directions of arrival of at least the plurality of different frequency components, wherein said value is based on information from the corresponding plurality of calculated phase differences, and wherein, for each segment of said second plurality, said determining that voice activity is not present in the segment is based on a corresponding value of the coherency measure that indicates a degree of coherence among the directions of arrival of at least the plurality of different frequency components, wherein said value is based on information from the corresponding plurality of calculated phase differences.
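
Illustrative note (not part of the patent text): the behavior recited in claims 40 to 42 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The function names, the use of a frame-to-frame energy difference as the "time derivative of energy", the choice to treat an energy drop as the per-component indication, and the decision to mark the transition segment itself as inactive are all assumptions made for illustration; the claims do not fix any of them.

```python
from typing import List, Sequence


def count_active_components(prev_energy: Sequence[float],
                            cur_energy: Sequence[float],
                            drop_threshold: float) -> int:
    """Count frequency components whose energy fell by more than
    drop_threshold since the previous segment.

    Assumption: the time derivative of energy is approximated by the
    difference between consecutive segments, one value per component.
    """
    return sum(1 for p, c in zip(prev_energy, cur_energy)
               if (c - p) < -drop_threshold)


def offset_detected(prev_energy: Sequence[float],
                    cur_energy: Sequence[float],
                    drop_threshold: float,
                    count_threshold: int) -> bool:
    """Declare a transition when enough components change together
    (the 'first threshold value' of claim 42, hypothetical here)."""
    n = count_active_components(prev_energy, cur_energy, drop_threshold)
    return n >= count_threshold


def vad_signal(primary_vad: Sequence[bool],
               energies: Sequence[Sequence[float]],
               drop_threshold: float,
               count_threshold: int) -> List[bool]:
    """Combine a primary per-segment decision (for example, one based on
    an inter-channel difference) with transition detection.

    Activity is held after the primary decision turns inactive and is
    released only once a transition is detected.
    """
    out: List[bool] = []
    holding = False  # True while activity is being indicated or extended
    for i, active in enumerate(primary_vad):
        if active:
            holding = True
        elif holding and i > 0 and offset_detected(
                energies[i - 1], energies[i], drop_threshold, count_threshold):
            # Assumption: the segment containing the transition is
            # itself reported as inactive.
            holding = False
        out.append(holding)
    return out
```

For example, if the primary decision is active for two segments and inactive for the next four, and the per-component energies only collapse in the third of those inactive segments, `vad_signal` keeps indicating activity for the first two inactive segments and indicates a lack of activity from the collapse onward.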

Patents cited by this patent (14)

Ramabadran, Tenkasi, Distributed speech recognition with back-end voice activity detection apparatus and method.
Wu, Wen-Rong; Lin, Shih-Chen; Chen, Po-Cheng; Kuo, Chun-Hung, Double-talk detector.
Benyassine, Adil; Shlomot, Eyal, Method and apparatus for generating frame voicing decisions of an incoming speech signal.
Fanty, Mark; Phillips, Michael S., Segmentation approach for speech recognition systems.
Preuss, Robert David; Fabbri, Darren Ross; Cruthirds, Daniel Ramsay, Speech analyzing system with speech codebook.
Epstein, Edward; Lewis, Burn; Marcheret, Etienne, Speech recognition in noisy environments.
Muroi, Tetsuya (JPX), Speech segment detection and word recognition.
Chan, Kwok-Leung; Visser, Erik; Park, Hyun Jin; Toman, Jeremy, Systems, methods, and apparatus for multi-microphone based speech enhancement.
Rajendran, Vivek; Kandhadai, Ananthapadmanabhan A., Systems, methods, and apparatus for wideband encoding and decoding of inactive frames.
Visser, Erik; Liu, Ian Ernan, Systems, methods, apparatus, and computer-readable media for coherence detection.
Unno, Takahiro; Kragh, Jesper Gormsen; Ober, Fabien; Ertan, Ali Erdem, Voice activity detector and method.
Gupta, Prabhat K. (Germantown MD); Jangi, Shrirang (Germantown MD); Lamkin, Allan B. (Arlington VA); Kepley, III, W. Robert (Gaithersburg MD); Morris, Adrian J. (Gaithersburg MD), Voice activity detector for speech signals in variable background noise.
Boland, Simon Daniel, Voice-activity detection using energy ratios and periodicity.
Choi, Yong Soo, Voiced/unvoiced information estimation system and method therefor.

Patents citing this patent (9)

Disch, Sascha; Geiger, Ralf; Helmrich, Christian; Multrus, Markus; Schmidt, Konstantin, Apparatus and method for generating a frequency enhanced signal using shaping of the enhancement signal.
Disch, Sascha; Geiger, Ralf; Helmrich, Christian; Multrus, Markus; Schmidt, Konstantin, Apparatus and method for generating a frequency enhanced signal using temporal smoothing of subbands.
Disch, Sascha; Geiger, Ralf; Helmrich, Christian; Multrus, Markus; Schmidt, Konstantin, Apparatus and method for generating a frequency enhancement signal using an energy limitation operation.
Kim, Mi-young; Porov, Anton Victorovich; Oh, Eun-mi, Bit allocating, audio encoding and decoding.
Kim, Mi-young; Porov, Anton Victorovich; Oh, Eun-mi, Bit allocating, audio encoding and decoding.
Kim, Mi-young; Porov, Anton Victorovich; Oh, Eun-mi, Bit allocating, audio encoding and decoding.
Kim, Mi-young; Oh, Eun-mi, Noise filling and audio decoding.
Chen, Jixu; Tu, Peter Henry; Chang, Ming-Ching; Kim, Yelin; Lyu, Siwei, Systems and methods for analyzing time series data based on event transitions.
Thomsen, Henrik; Nandy, Dibyendu, VAD detection apparatus and method of operation the same.

