System and method for identifying audio command prompts for use in a voice response environment
IPC classification information
Country / Type
United States(US) Patent
Granted
International Patent Classification (IPC 7th edition)
G10L-015/00
G10L-015/06
G10L-017/00
H04M-001/64
Application number
US-0252185 (2011-10-03)
Registration number
US-8265932 (2012-09-11)
Inventor / Address
Dunsmuir, Martin R. M.
Applicant / Address
Intellisist, Inc.
Agent / Address
Inouye, Patrick J. S.
Citation information
Times cited: 5
Cited patents: 49
Abstract
A system and method for identifying audio command prompts for use in a voice response environment is provided. A signature is generated for audio samples each having preceding audio, reference phrase audio, and trailing audio segments. The trailing segment is removed and each of the preceding and reference phrase segments is divided into buffers. The buffers are transformed into discrete Fourier transform buffers. One of the discrete Fourier transform buffers from the reference phrase segment that is dissimilar to each of the discrete Fourier transform buffers from the preceding segment is selected as the signature. Audio command prompts are processed to generate a discrete Fourier transform. Each discrete Fourier transform for the audio command prompts is compared with each of the signatures and a correlation value is determined. One such audio command prompt matches one such signature when the correlation value for that audio command prompt satisfies a threshold.
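The signature selection summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the non-overlapping buffer layout, the use of magnitude spectra, and the Pearson correlation coefficient are all assumptions made for illustration. In particular, the record only says that a distance is derived from the two maximum correlation values; the Euclidean distance from the point (1, 1) used below is an assumed formula. The trailing segment is simply not passed in, which stands in for its removal.

```python
import cmath
import math


def dft_magnitudes(buffer):
    """Magnitude spectrum of one buffer via a direct O(n^2) DFT."""
    n = len(buffer)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(buffer)))
        for k in range(n // 2 + 1)
    ]


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def split_into_buffers(samples, size):
    """Cut samples into non-overlapping buffers, dropping a short tail."""
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]


def select_signature(preceding, reference, size):
    """Return (index, spectrum) of the most distinctive reference buffer."""
    pre = [dft_magnitudes(b) for b in split_into_buffers(preceding, size)]
    ref = [dft_magnitudes(b) for b in split_into_buffers(reference, size)]
    if not ref:
        raise ValueError("reference segment is shorter than one buffer")
    best_index, best_distance = None, -1.0
    for i, spectrum in enumerate(ref):
        # Highest similarity to any buffer of the preceding audio.
        max_pre = max((correlation(spectrum, p) for p in pre), default=0.0)
        # Highest similarity to any other buffer of the reference phrase.
        max_ref = max((correlation(spectrum, r)
                       for j, r in enumerate(ref) if j != i), default=0.0)
        # ASSUMED distance: how far the pair of maxima lies from (1, 1).
        distance = math.hypot(1.0 - max_pre, 1.0 - max_ref)
        if distance > best_distance:
            best_index, best_distance = i, distance
    return best_index, ref[best_index]


def tone(bin_index, size, count=1):
    """Synthetic signal: `count` buffers of a sine at one DFT bin."""
    return [math.sin(2 * math.pi * bin_index * t / size)
            for t in range(size * count)]


# The preceding audio and two of the reference buffers share one tone;
# only the middle reference buffer differs, so it becomes the signature.
preceding = tone(2, 16, count=2)
reference = tone(2, 16) + tone(5, 16) + tone(2, 16)
index, signature = select_signature(preceding, reference, size=16)
print(index)  # 1
```

In this toy input the selected buffer is the only one whose spectrum differs from both the preceding audio and the rest of the reference phrase. A real system would likely use an FFT and longer buffers; the direct DFT here only keeps the sketch dependency-free.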
Representative claims
1. A system for identifying audio command prompts for use in a voice response environment, comprising:
a signature module to generate a signature for one or more received audio samples each having preceding audio, reference phrase audio, and trailing audio segments, comprising:
a removal module to remove the trailing audio segment and to divide each of the preceding audio and reference phrase audio segments into buffers;
a transformation module to transform the buffers into discrete Fourier transform buffers; and
a selection module to select one of the discrete Fourier transform buffers from the reference phrase audio segment that is least like any of the discrete Fourier transform buffers from the preceding audio segment as the signature that identifies an audio phrase under the reference phrase audio segment for that audio sample, comprising:
a preceding audio correlation module to determine a preceding audio correlation coefficient between each of the discrete Fourier transform buffers from the reference phrase audio segment and each of the discrete Fourier transform buffers from the preceding audio segment, and to select for each of the discrete Fourier transform buffers from the reference phrase audio segment, a maximum value of the preceding audio correlation coefficients;
a reference audio correlation module to determine a reference audio correlation coefficient between each of the discrete Fourier transform buffers in the reference phrase audio segment and the remaining discrete Fourier transform buffers in the reference phrase audio segment, and to select for each of the discrete Fourier transform buffers from the reference phrase audio segment, a maximum value of the reference audio correlation coefficients;
a distance module to determine a distance for each of the discrete Fourier transform buffers in the reference phrase audio segment based on the maximum value for the preceding audio correlation coefficient and the maximum value for the reference audio correlation coefficient; and
a selection module to select the one discrete Fourier transform buffer from the reference phrase audio segment with the greatest distance as the signature;
an audio command processor to receive audio command prompts and to process each of the audio command prompts to generate a discrete Fourier transform;
a comparison module to compare each discrete Fourier transform for the audio command prompts with each of the signatures and to determine a correlation value of each comparison;
a determination module to determine that one such audio command prompt matches one such signature when the correlation value for that audio command prompt and signature satisfies a threshold; and
a processor to execute the modules.

2. A system according to claim 1, further comprising:
an identification module to identify a host script associated with the matching signature, wherein the host script comprises at least one action; and
an action module to perform the action.

3. A system according to claim 2, wherein the action comprises one of initiating a telephone call, inputting a password, playing a message, returning messages, terminating the telephone call, recording a message, and saving a message.

4. A system according to claim 1, further comprising:
a phrase selection module to select the audio phrase represented by the signature by reviewing similar audio samples and by identifying a distinguished portion of at least one of the audio samples.

5. A system according to claim 4, wherein the phrase selection module selects for a remaining similar audio sample, a common portion of the similar audio samples that occurs later than the distinguished portion of the at least one audio sample as the audio phrase represented by the signature.

6. A system according to claim 1, further comprising:
a signature generator to generate multiple signatures for a common audio sample.

7. A system according to claim 1, wherein the discrete Fourier transforms in the signature and the discrete Fourier transforms of the audio command prompts are based on samples of comparable size.

8. A system according to claim 1, further comprising:
a reference phrase audio segment module to receive the reference phrase audio segment from a user.

9. A system according to claim 1, further comprising:
a naming module to generate a host name for the signature.

10. A system according to claim 1, wherein the discrete Fourier transform buffer from the reference phrase audio segment relates to each of the other discrete Fourier transform buffers in the reference phrase audio segment.

11. A method for identifying audio command prompts for use in a voice response environment, comprising:
generating a signature for one or more received audio samples each having preceding audio, reference phrase audio, and trailing audio segments, comprising:
removing the trailing audio segment and dividing each of the preceding audio and reference phrase audio segments into buffers;
transforming the buffers into discrete Fourier transform buffers; and
selecting one of the discrete Fourier transform buffers from the reference phrase audio segment that is least like any of the discrete Fourier transform buffers from the preceding audio segment as the signature that identifies an audio phrase under the reference phrase audio segment for that audio sample, comprising:
determining a preceding audio correlation coefficient between each of the discrete Fourier transform buffers from the reference phrase audio segment and each of the discrete Fourier transform buffers from the preceding audio segment;
selecting for each of the discrete Fourier transform buffers from the reference phrase audio segment, a maximum value of the preceding audio correlation coefficients;
determining a reference audio correlation coefficient between each of the discrete Fourier transform buffers in the reference phrase audio segment and the remaining discrete Fourier transform buffers in the reference phrase audio segment;
selecting for each of the discrete Fourier transform buffers from the reference phrase audio segment, a maximum value of the reference audio correlation coefficients;
determining a distance for each of the discrete Fourier transform buffers in the reference phrase audio segment based on the maximum values of the preceding audio correlation coefficient and the maximum value of the reference audio correlation coefficient; and
selecting the one discrete Fourier transform buffer from the reference phrase audio segment with the greatest distance as the signature;
receiving audio command prompts and processing each of the audio command prompts to generate a discrete Fourier transform;
comparing each discrete Fourier transform for the audio command prompts with each of the signatures and determining a correlation value of each comparison; and
determining that one such audio command prompt matches one such signature when the correlation value for that audio command prompt and signature satisfies a threshold.

12. A method according to claim 11, further comprising:
identifying a host script associated with the matching signature, wherein the host script comprises at least one action; and
performing the action.

13. A method according to claim 12, wherein the action comprises one of initiating a telephone call, inputting a password, playing a message, returning messages, terminating the telephone call, recording a message, and saving a message.

14. A method according to claim 11, further comprising:
selecting the audio phrase represented by the signature for similar audio samples, comprising:
reviewing the similar audio samples; and
identifying a distinguished portion of at least one of the audio samples.

15. A method according to claim 14, further comprising:
selecting for a remaining similar audio sample, a common portion of the similar audio samples that occurs later than the distinguished portion of the at least one audio sample as the audio phrase represented by the signature.

16. A method according to claim 11, further comprising:
generating multiple signatures for a common audio sample.

17. A method according to claim 11, wherein the discrete Fourier transforms in the signature and the discrete Fourier transforms of the audio command prompts are based on audio samples of comparable size.

18. A method according to claim 11, further comprising:
receiving the reference phrase audio segment from a user.

19. A method according to claim 11, further comprising:
generating a host name for the signature.

20. A method according to claim 11, wherein the discrete Fourier transform buffer from the reference phrase audio segment relates to each of the other discrete Fourier transform buffers in the reference phrase audio segment.
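The matching and host-script steps in the claims (comparing a prompt's discrete Fourier transform with each signature, testing the correlation value against a threshold, then performing the associated action) might look roughly like the following Python sketch. Everything named here is hypothetical: the signature names, the example spectra, the 0.9 threshold, and the choice to return the best-scoring signature when several satisfy the threshold are illustrative assumptions, not details given in the record. The action strings borrow from the actions listed in claims 3 and 13.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def match_prompt(prompt_spectrum, signatures, threshold=0.9):
    """Return the name of the best signature meeting the threshold, or None.

    `signatures` maps a hypothetical signature name to a spectrum of the
    same length as `prompt_spectrum` (cf. the "comparable size" claims).
    """
    best_name, best_value = None, threshold
    for name, spectrum in signatures.items():
        value = correlation(prompt_spectrum, spectrum)
        # ASSUMPTION: "satisfies a threshold" is read as value >= threshold.
        if value >= best_value:
            best_name, best_value = name, value
    return best_name


def actions_for(prompt_spectrum, signatures, host_scripts, threshold=0.9):
    """Look up the host script actions for a matching prompt, if any."""
    name = match_prompt(prompt_spectrum, signatures, threshold)
    return host_scripts.get(name, [])


# Hypothetical signatures and host scripts for illustration only.
SIGNATURES = {
    "enter_password": [0.0, 0.0, 8.0, 0.0, 0.0, 0.0],
    "main_menu": [0.0, 0.0, 0.0, 0.0, 8.0, 0.0],
}
HOST_SCRIPTS = {
    "enter_password": ["input password"],
    "main_menu": ["play message", "save message"],
}

noisy_prompt = [0.1, 0.0, 7.5, 0.2, 0.1, 0.0]
print(actions_for(noisy_prompt, SIGNATURES, HOST_SCRIPTS))
```

A prompt whose spectrum correlates with no signature at or above the threshold yields an empty action list, so the caller can keep listening rather than act on a weak match.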
Patents cited by this patent (49)
Roland Kuhn ; Jean-Claude Junqua, Adaptation system and method for E-commerce and V-commerce applications.
Syed S. Ali ; Joseph M. Cannon ; James A. Johanson ; Joseph A. Sopko, Apparatus and method for grouping and prioritizing voice messages for convenient playback.
Emerson William D. (Boulder CO) Hill Deborah J. (Denver CO) Loeb Karen C. (Englewood CO) Mizrahi Albert (Boulder CO) Schlegel Charles T. (Boulder CO) Scott Lowell C. (Old Bridge NJ), Integrated message service system.
James R. Lewis ; Kerry A. Ortega ; Ronald E. Van Buskirk ; Huifang Wang ; Amado Nassiff ; Barbara E. Ballard, Method and apparatus for improving speech command recognition accuracy using event-based constraints.
Bruckner Markus (Basel CH) Guanella Gustav (Zurich CH) Vouga Claude Andre (Baden CH), Method and apparatus for the secret transmission of speech signals.
Julia Skladman ; Robert J. Thornberry, Jr. ; Bruce A. Chatterley ; Alexander Siu-Kay Ng CA; Bruce L. Peterson, Method and system for interfacing systems unified messaging with legacy systems located behind corporate firewalls.
Grajski, Kamil, Method of and apparatus for improving productivity of human reviewers of automatically transcribed documents generated by media conversion systems.
Matsuura Yoshihiro (Funabashi OR JPX) Skinner Toby (Beaverton OR), Speaker independent speech recognition system and method using neural network and DTW matching technique.
Suzuki Matsumi (Ebina JA) Morino Tetsuro (Ebina JA) Yokota Shozo (Ebina JA), Speech recognition method and apparatus adapted to a plurality of different speakers.
Cheston ; III Frank C. ; Hatton Patricia V., Voice mail system for obtaining forwarding number information from directory assistance systems having speech recognition.