A distributed voice user interface system includes a local device which receives speech input issued from a user. Such speech input may specify a command or a request by the user. The local device performs preliminary processing of the speech input and determines whether it is able to respond to the command or request by itself. If not, the local device initiates communication with a remote system for further processing of the speech input.
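The local-versus-remote decision described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the keyword table `LOCAL_COMMANDS`, the `Response` type, and the `ask_remote` callback are hypothetical names, and the speech input is modeled as an already tokenized list of words standing in for the device's preliminary processing.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical table of keywords the local device can act on by itself,
# each mapped to a made-up local control signal.
LOCAL_COMMANDS: Dict[str, str] = {
    "on": "POWER_ON",
    "off": "POWER_OFF",
    "louder": "VOLUME_UP",
}


@dataclass
class Response:
    handled_by: str       # "local" or "remote"
    control_signal: str   # signal passed on to the primary functionality component


def identify_keywords(words: List[str]) -> List[str]:
    """Return the words that match a known local keyword, in input order."""
    return [w for w in words if w in LOCAL_COMMANDS]


def handle_speech(words: List[str],
                  ask_remote: Callable[[List[str]], str]) -> Response:
    """Respond locally when a keyword is found, otherwise defer to the remote system."""
    keywords = identify_keywords(words)
    if keywords:
        # Assumption: the first recognized keyword decides the action.
        return Response("local", LOCAL_COMMANDS[keywords[0]])
    # No keyword recognized: hand the input to the remote system (stubbed here).
    return Response("remote", ask_remote(words))
```

For example, `handle_speech(["turn", "on"], stub)` is answered locally, while an input containing no known keyword is passed to whatever `stub` callable stands in for the remote system.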
Representative claim
What is claimed is:
1. A local device comprising: a primary functionality component; an input component configured to receive speech input; a processing component coupled to the input component, the processing component configured to: identify keywords in the speech input, determine whether the local device is capable of processing the speech input based on whether one or more keywords are identified in the speech input, if the local device is capable of processing the speech input, process the speech input, generate corresponding local control signals, and transmit the local control signals to the primary functionality component to direct an action in the primary functionality component, and if the local device is not capable of processing the speech input, extract feature parameters from the speech input for processing at a remote system, receive remote control signals from the remote system responsive to the remote system performing speech recognition on the feature parameters by storing an acoustic model of the feature parameters and recognizing a command based on a previously stored acoustic model associated with the local device to address specific characteristics of the feature parameters, and send the remote control signals to the primary functionality component; and a transceiver coupled to the processing component and configured to establish communications between the local device and the remote system, wherein the communications comprise: high bandwidth communications configured to return data supporting audio or video output at the local device, and low bandwidth communications configured to return data supporting the remote control signals.
2. The local device of claim 1, further comprising a manual input component configured to allow manual initiation of the communications.
3. The local device of claim 1, wherein the processing component is configured to transmit the feature parameters to the remote system for speech recognition processing.
4. The local device of claim 1, further comprising a recording component configured to record the speech input.
5. The local device of claim 4, wherein the recording component is configured to play back the recorded speech input for transmission to the remote system.
6. The local device of claim 1, wherein the processing component comprises a speech generation engine configured to generate speech output.
7. The local device of claim 6, wherein the speech output generated by the speech generation engine is consistent with speech output generated by the remote system.
8. The local device of claim 1, wherein the local device is configured to analyze the speech input to extract feature parameters and to send a signal based upon the extracted feature parameters to the remote system.
9. The local device of claim 1, wherein both the local device and the remote system analyze the speech input in order to extract feature parameters.
10. The local device of claim 1, wherein the local device comprises at least one of a personal digital assistant, a smart telephone, a remote control device, a household appliance, an entertainment system, a security system, and a climate control system.
11. The local device of claim 1, wherein the processing component is further configured to replace a keyword in the set of known keywords based on the update from the remote system.
12. The local device of claim 1, wherein the processing component is further configured to add a keyword to the set of known keywords based on the update from the remote system.
13. A method of providing a voice interface for a primary functionality component in a local device, comprising: receiving, at the local device, a speech input; identifying keywords in the speech input; establishing communications between the local device and a remote system, wherein the communications comprise: high bandwidth communications configured to return data supporting audio or video output at the local device, and low bandwidth communications configured to return data supporting the remote control signals; determining, at the local device, whether the local device is capable of processing the speech input based on whether one or more keywords are identified in the speech input; if the local device is capable of processing the speech input, then processing the speech input at the local device, generating corresponding local control signals, and transmitting the local control signals to the primary functionality component to direct an action in the primary functionality component; and if the local device is not capable of processing the speech input, then extracting feature parameters from the speech input for processing at the remote system, sending the feature parameters to the remote system for processing by storing an acoustic model of the feature parameters and recognizing a command based on a previously stored acoustic model associated with the local device to address specific characteristics of the feature parameters, receiving remote control signals from the remote system responsive to the feature parameters via the low bandwidth communications, and sending the remote control signals to the primary functionality component.
14. The method of claim 13, wherein establishing communications between the local device and the remote system comprises: establishing the communications between the local device and the remote system upon manual initiation.
15. The method of claim 13, wherein establishing communications between the local device and a remote system comprises: establishing the communications between the local device and the remote system upon identification of a wake up command.
16. The method of claim 13, further comprising: transmitting the feature parameters to the remote system for processing.
17. The method of claim 13, further comprising: recording the speech input.
18. The method of claim 17, further comprising: playing back the recorded speech input for transmission to the remote system.
19. The method of claim 13, further comprising: generating a first speech output.
20. The method of claim 19, further comprising: receiving a second speech output from the remote system, wherein the first speech output is consistent with the second speech output.
21. The method of claim 13, further comprising: analyzing the speech input for the feature parameters; and sending a signal based upon the feature parameters to the remote system.
22. The method of claim 13, wherein modifying the set of known keywords based at least in part on the update from the remote system comprises replacing a keyword in the set of known keywords.
23. The method of claim 13, wherein modifying the set of known keywords based at least in part on the update from the remote system comprises adding a keyword to the set of known keywords.
24. The method of claim 13, further comprising: storing a previous enunciation of a certain word.
25. A tangible computer readable medium having stored thereon computer-executable instructions that, if executed by a computing device, cause the computing device to perform a method comprising: receiving, at the local device, a speech input; identifying keywords in the speech input; establishing communications between the local device and a remote system, wherein the communications comprise: high bandwidth communications configured to return data supporting audio or video output at the local device, and low bandwidth communications configured to return data supporting the remote control signals; determining, at the local device, whether the local device is capable of processing the speech input based on whether one or more keywords are identified in the speech input; if the local device is capable of processing the speech input, then processing the speech input at the local device, generating corresponding local control signals, and transmitting the local control signals to the primary functionality component to direct an action in the primary functionality component; and if the local device is not capable of processing the speech input, then extracting feature parameters from the speech input for processing at the remote system, sending the feature parameters to the remote system for processing by storing an acoustic model of the feature parameters and recognizing a command based on a previously stored acoustic model associated with the local device to address specific characteristics of the feature parameter, receiving remote control signals from the remote system responsive to the feature parameters via the low bandwidth communications, and sending the remote control signals to the primary functionality component.
26. The computer program product of claim 25, wherein establishing communications between the local device and the remote system comprises: establishing the communications between the local device and the remote system upon manual initiation.
27. The computer program product of claim 25, wherein establishing communications between the local device and the remote system comprises: establishing the communications between the local device and the remote system upon identification of a wake up command.
28. The computer program product of claim 25, wherein the method further comprises: transmitting the feature parameters to the remote system for processing.
29. The computer program product of claim 25, wherein the method further comprises: recording the speech input.
30. The computer program product of claim 29, wherein the method further comprises: playing back the recorded speech input for transmission to the remote system.
31. The computer program product of claim 25, wherein the method further comprises: generating a first speech output.
32. The computer program product of claim 31, wherein the method further comprises: receiving a second speech output from the remote system, wherein the first speech output is consistent with the second speech output.
33. The computer program product of claim 25, wherein the method further comprises: analyzing the speech input in order to extract feature parameters; and sending a signal based upon the extracted feature parameters to the remote system.
34. The computer program product of claim 25, wherein modifying the set of known keywords based at least in part on the update from the remote system comprises replacing a keyword in the set of known keywords.
35. The computer program product of claim 25, wherein modifying the set of known keywords based at least in part on the update from the remote system comprises adding a keyword to the set of known keywords.
36. The computer program product of claim 25, wherein the method further comprises: storing a previous enunciation of a certain word.
37. A local device comprising: receiving means to receive a speech input; identifying means to identify keywords in the speech input; establishing means to establish communications between the local device and a remote system, wherein the communications comprise: high bandwidth communications configured to return data supporting audio or video output at the local device, and low bandwidth communications configured to return data supporting the remote control signals; determining means to determine whether a local device is capable of processing the speech input based on whether one or more keywords are identified in the speech input; first responding means to respond to the speech input if the determining means determines that the local device is capable of processing the speech input, comprising: processing means to process the speech input at the local device, generating means to generate corresponding local control signals, and transmitting means to transmit the local control signals to a primary functionality component to direct an action in the primary functionality component; and second responding means to respond to the speech input if the determining means determines that the local device is not capable of processing the speech input, comprising: extracting means to extract feature parameters from the speech input for processing at the remote system, first sending means to send the feature parameters to the remote system for processing by storing an acoustic model of the feature parameters and recognizing a command based on a previously stored acoustic model associated with the local device to address specific characteristics of the feature parameter, receiving means to receive remote control signals from the remote system responsive to the feature parameters via the low bandwidth communications, and second sending means to send the remote control signals to the primary functionality component.
38. A local device comprising: a primary functionality component; an input component configured to receive speech input; a processing component coupled to the input component, the processing component configured to: identify keywords in the speech input, determine whether the local device is capable of processing the speech input based on whether one or more keywords are identified in the speech input, if the local device is capable of processing the speech input, process the speech input, generate corresponding local control signals, and transmit the local control signals to the primary functionality component to direct an action in the primary functionality component, and if the local device is not capable of processing the speech input, extract feature parameters from the speech input for processing at a remote system, receive remote control signals from the remote system responsive to the remote system performing speech recognition on the feature parameters by storing an acoustic model of the feature parameters and recognizing a command based on a previously stored acoustic model associated with the local device to address specific characteristics of the feature parameters, and send the remote control signals to the primary functionality component; and a recording component configured to record the speech input and to play back the recorded speech input for transmission to the remote system.
39. A method of providing a voice interface for a primary functionality component in a local device, comprising: receiving, at the local device, a speech input; identifying keywords in the speech input; determining, at the local device, whether the local device is capable of processing the speech input based on whether one or more keywords are identified in the speech input; if the local device is capable of processing the speech input, then processing the speech input at the local device, generating corresponding local control signals, and transmitting the local control signals to the primary functionality component to direct an action in the primary functionality component; if the local device is not capable of processing the speech input, then extracting feature parameters from the speech input for processing at a remote system, sending the feature parameters to the remote system for processing by storing an acoustic model of the feature parameters and recognizing a command based on a previously stored acoustic model associated with the local device to address specific characteristics of the feature parameters, receiving remote control signals from the remote system responsive to the feature parameters, and sending the remote control signals to the primary functionality component; recording the speech input; and playing back the recorded speech input for transmission to the remote system.
40. A tangible computer readable medium having stored thereon computer-executable instructions that, if executed by a computing device, cause the computing device to perform a method comprising: receiving, at the local device, a speech input; identifying keywords in the speech input; determining, at the local device, whether the local device is capable of processing the speech input based on whether one or more keywords are identified in the speech input; if the local device is capable of processing the speech input, then processing the speech input at the local device, generating corresponding local control signals, and transmitting the local control signals to the primary functionality component to direct an action in the primary functionality component; if the local device is not capable of processing the speech input, then extracting feature parameters from the speech input for processing at a remote system, sending the feature parameters to the remote system for processing by storing an acoustic model of the feature parameters and recognizing a command based on a previously stored acoustic model associated with the local device to address specific characteristics of the feature parameter, receiving remote control signals from the remote system responsive to the feature parameters, and sending the remote control signals to the primary functionality component; recording the speech input; and playing back the recorded speech input for transmission to the remote system.
41. A local device comprising: receiving means to receive a speech input; identifying means to identify keywords in the speech input; determining means to determine whether a local device is capable of processing the speech input based on whether one or more keywords are identified in the speech input; first responding means to respond to the speech input if the determining means determines that the local device is capable of processing the speech input, comprising: processing means to process the speech input at the local device, generating means to generate corresponding local control signals, and transmitting means to transmit the local control signals to a primary functionality component to direct an action in the primary functionality component; second responding means to respond to the speech input if the determining means determines that the local device is not capable of processing the speech input, comprising: extracting means to extract feature parameters from the speech input for processing at a remote system, first sending means to send the feature parameters to the remote system for processing by storing an acoustic model of the feature parameters and recognizing a command based on a previously stored acoustic model associated with the local device to address specific characteristics of the feature parameter, receiving means to receive remote control signals from the remote system responsive to the feature parameters, and second sending means to send the remote control signals to the primary functionality component; recording means to record the speech input; and playback means to play back the recorded speech input for transmission to the remote system.
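The claims recite extracting feature parameters from the speech input when the local device cannot process it, but they do not say which parameters are used. The sketch below is therefore only an assumption for illustration: it treats the speech input as a list of audio samples and computes two simple per-frame values (log energy and zero-crossing count). The function name `extract_feature_parameters` and the `frame_len` parameter are hypothetical.

```python
import math
from typing import List, Tuple


def extract_feature_parameters(samples: List[float],
                               frame_len: int = 4) -> List[Tuple[float, int]]:
    """Split the samples into non-overlapping frames and describe each frame.

    Returns one (log_energy, zero_crossings) pair per complete frame.
    These two values are illustrative stand-ins for whatever feature
    parameters a real device would send to the remote system.
    """
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Sum of squared samples; a small floor avoids log(0) on silence.
        energy = sum(s * s for s in frame)
        log_energy = math.log(energy + 1e-10)
        # Count sign changes between consecutive samples in the frame.
        zero_crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        features.append((log_energy, zero_crossings))
    return features
```

Sending a compact list like this instead of raw audio is one way the "low bandwidth" path in the claims could be kept small, though the claims themselves do not prescribe it.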