Method for processing the output of a speech recognizer
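IPC classification information
Country / Type: United States (US) Patent, Registered
International Patent Classification (IPC 7th edition): G10L-015/18, G10L-015/22
Application number: US-0445096 (2014-07-29)
Registration number: US-9502027 (2016-11-22)
Inventors / Address: Roy, Philippe; Lagassey, Paul J.
Applicant / Address: Great Northern Research, LLC
Agent / Address: Hoffberg, Esq., Steven M.
Citation information
Times cited: 0
Cited patents: 126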
Abstract
A method for processing speech, comprising semantically parsing a received natural language speech input with respect to a plurality of predetermined command grammars in an automated speech processing system; determining if the parsed speech input unambiguously corresponds to a command and is sufficiently complete for reliable processing, then processing the command; if the speech input ambiguously corresponds to a single command or is not sufficiently complete for reliable processing, then prompting a user for further speech input to reduce ambiguity or increase completeness, in dependence on a relationship of previously received speech input and at least one command grammar of the plurality of predetermined command grammars, reparsing the further speech input in conjunction with previously parsed speech input, and iterating as necessary. The system also monitors abort, fail or cancel conditions in the speech input.
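The parse, determine, prompt and reparse cycle described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: plain text stands in for recognized speech, keyword overlap stands in for semantic parsing, and every name here (`Grammar`, `GRAMMARS`, `SLOT_MARKERS`, `ABORT_WORDS`, `parse`, `process`) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical abort vocabulary (assumption, not from the patent).
ABORT_WORDS = {"cancel", "abort", "stop"}
# Hypothetical slot markers: the word after the marker fills the slot.
SLOT_MARKERS = {"to": "recipient", "named": "filename"}


@dataclass(frozen=True)
class Grammar:
    command: str          # command outcome this toy grammar leads to
    keywords: frozenset   # words that indicate the command
    slots: tuple          # slots required before the command is complete


GRAMMARS = (
    Grammar("send_email", frozenset({"send", "email"}), ("recipient",)),
    Grammar("send_text", frozenset({"send", "text"}), ("recipient",)),
    Grammar("open_file", frozenset({"open", "file"}), ("filename",)),
)


def parse(words):
    """Return (best-matching grammars, filled slots) for all words so far."""
    scores = {g: len(g.keywords & set(words)) for g in GRAMMARS}
    best = max(scores.values())
    candidates = [g for g in GRAMMARS if best > 0 and scores[g] == best]
    slots = {}
    for marker, value in zip(words, words[1:]):
        if marker in SLOT_MARKERS:
            slots[SLOT_MARKERS[marker]] = value
    return candidates, slots


def process(utterances):
    """Iterate over utterances until a command is processed or aborted."""
    heard = []
    for utterance in utterances:
        words = utterance.lower().split()
        if ABORT_WORDS & set(words):
            return ("aborted", None)
        heard.extend(words)  # reparse new input together with earlier input
        candidates, slots = parse(heard)
        if len(candidates) == 1:
            grammar = candidates[0]
            missing = [s for s in grammar.slots if s not in slots]
            if not missing:
                return ("processed", (grammar.command, slots))
            print(f"{grammar.command}: please give the {missing[0]}")
        elif candidates:
            names = " or ".join(g.command for g in candidates)
            print(f"Did you mean {names}?")
        else:
            print("Please state a command.")
    return ("failed", None)
```

For example, `process(["send", "email", "to alice"])` first prompts to choose between the two "send" commands, then asks for the recipient, and finally returns the completed `send_email` command; `process(["send", "cancel"])` exits on the abort word.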
Representative claim
1. A method for processing speech, comprising:
  providing an automated speech processing system having a command processor and at least one memory;
  receiving a natural language speech input generated by a microphone;
  semantically parsing the received natural language speech input using an automated statistical processor with respect to a plurality of predetermined command grammars in the automated speech processing system, the plurality of predetermined grammars defining mutually inconsistent command outcomes from the command processor, said semantically parsing selectively excluding predetermined command grammars having respective command outcomes inconsistent with the previously received natural language speech;
  determining:
    if the semantic parsing of the received natural language speech input corresponds to a single non-excluded command outcome according to a respective command grammar, and is complete for reliable processing by the command processor, then processing the command with the command processor, according to the single non-excluded command grammar and exiting said determining;
    if the received natural language speech input corresponds to a plurality of command outcomes, or is not complete for reliable processing according to a plurality of non-excluded predetermined command grammars, then:
      prompting a user for further natural language speech input dependent on at least one of the plurality of non-excluded predetermined command grammars, the prompting comprising feedback representing an identification of at least one command type putatively recognized, and information required to reduce correspondence to a plurality of non-excluded predetermined command grammars or to increase completeness, in dependence on a relationship of the previously received natural language speech input and at least one command grammar of the plurality of non-excluded predetermined command grammars,
      reparsing the further natural language speech input with the automated statistical processor in conjunction with previously parsed natural language speech input, and
      iterating said determining; and
    if an abort, fail or cancel condition is detected in the natural language speech input, exiting said determining.
2. The method according to claim 1, wherein at least one predetermined command grammar is context sensitive, further comprising determining a context of the natural language speech input, and said determining comprises determining if the parsed natural language speech input corresponds to the single command within the determined context, and is complete for reliable processing.
3. The method according to claim 1, wherein the natural language speech input is processed with a statistical processor, and said determining is based on statistical probability of competing outcomes.
4. The method according to claim 1, wherein the automated speech processing system is adaptive.
5. The method according to claim 1, wherein the automated speech processing system employs both a hierarchal Markov model and the plurality of predetermined command grammars comprise a plurality of independent context free grammars.
6. The method according to claim 1, wherein said determining does not prompt a user for natural language speech input unnecessary for reliable processing.
7. The method according to claim 1, wherein said determining is responsive to at least one non-linguistic user input.
8. The method according to claim 1, wherein the natural language speech input is processed by concurrent execution in parallel in at least one virtual workspace.
9. The method according to claim 1, wherein the natural language speech input is processed by a distributed pool comprising multiple logical command processors.
10. The method according to claim 1, further comprising determining a dialect of the user, wherein the natural language speech input is selectively processed in dependence on the determined dialect of the user.
11. A system for processing speech, comprising:
  a port configured to receive a natural language speech input from a user through a microphone;
  a memory configured to store the received natural language speech input;
  an automated statistical speech processor configured to semantically parse the natural language speech input with respect to a plurality of predetermined command grammars defining mutually inconsistent command outcomes;
  an automated speech processor configured to:
    exclude predetermined command grammars of the plurality of predetermined grammars having respective command outcomes inconsistent with the previously received natural language speech;
    determine:
      if the semantically parsed natural language speech input corresponds to a single non-excluded command grammar and is complete for reliable processing, then storing the command grammar for execution and exiting the determining;
      if the previously received natural language speech input corresponds to a plurality of non-excluded predetermined command grammars or is not complete for reliable processing, then:
        prompting a user for further natural language speech input to further exclude predetermined command grammars based on inconsistency with the previously received natural language speech input or increase completeness with respect to reliable processing,
        present the further natural language speech input to the automated statistical speech processor for reparsing in conjunction with previously parsed natural language speech input, and
        iterate the determination; and
      if an abort, fail or cancel condition is detected in the natural language speech input, then exiting the determination.
12. The system according to claim 11, wherein at least one predetermined command grammar is context sensitive, and the automated speech processor is further configured to determine the context, and the automated statistical speech processor is configured to determine if the parsed natural language speech input corresponds to the single non-excluded command within the determined context, and is complete with respect to the single non-excluded command for reliable processing.
13. The system according to claim 11, wherein the automated speech processor employs both a hierarchal Markov model and a plurality of independent context free grammars.
14. The system according to claim 11, wherein the automated speech processor is further configured to receive at least one non-linguistic user input, and to process the natural language speech input selectively in dependence on the at least one non-linguistic user input.
15. The system according to claim 11, wherein the automated speech processor comprises a distributed pool of concurrently operative logical command processors.
16. The system according to claim 11, wherein the automated speech processor is further configured to automatically determine a dialect of a user, and to selectively process the natural language speech input in dependence on the determined dialect.
17. A method for processing speech, comprising:
  receiving a speech input from a user through a microphone, and a context of the received speech input;
  testing an output of a speech recognizer to determine:
    whether it probabilistically indicates that a command to be processed is present,
    if the command to be processed is probabilistically present, parsing the command with respect to a plurality of predetermined command grammars representing a plurality of alternate command grammars consistent with the received speech input, and
    whether the probabilistically present command to be processed is unambiguous and complete subject to the context with high statistical reliability;
  in dependence on the testing, either:
    processing the probabilistically present command if unambiguous and complete subject to the context with high statistical reliability or
    prompting the user to provide specific inputs to reduce the ambiguity or increase completeness subject to the context of the probabilistically present command, to increase the statistical reliability of the probabilistically present command; and
  determining whether the user seeks to abort processing of an incomplete or ambiguous command, and in dependence on said determining, halting processing of the probabilistically present command.
18. The method according to claim 17, wherein said testing is performed with a statistical processor, and said determining whether the probabilistically present command is based on comparison of statistical probabilities of competing outcomes.
19. The method according to claim 17, wherein said testing employs both a hierarchal Markov model.
20. The method according to claim 17, wherein said testing is further responsive to at least one non-linguistic user input.
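Claims 3, 17 and 18 turn on comparing the statistical probabilities of competing outcomes. One way to picture that test is the short Python sketch below. It is an illustration under stated assumptions, not the claimed method: the recognizer output is assumed to arrive as a mapping from command name to probability, and the function `decide` and its thresholds `present_threshold` and `margin` are hypothetical values chosen only for the example.

```python
def decide(hypotheses, present_threshold=0.5, margin=0.3):
    """Classify recognizer output as ("process" | "prompt" | "ignore", detail).

    hypotheses: dict mapping a command name to its estimated probability.
    present_threshold and margin are illustrative assumptions.
    """
    if not hypotheses:
        return ("ignore", None)
    ranked = sorted(hypotheses.items(), key=lambda kv: kv[1], reverse=True)
    best_cmd, best_p = ranked[0]
    # No command is probabilistically present: do nothing.
    if best_p < present_threshold:
        return ("ignore", None)
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    # The best outcome clearly beats its competitors: process it.
    if best_p - runner_up >= margin:
        return ("process", best_cmd)
    # Otherwise ask the user to choose among the close competitors.
    close = [cmd for cmd, p in ranked if best_p - p < margin]
    return ("prompt", close)
```

With these assumed thresholds, `decide({"send_email": 0.9, "send_text": 0.1})` selects `send_email` for processing, while `decide({"send_email": 0.55, "send_text": 0.45})` returns a prompt listing both commands because their probabilities are too close to separate reliably.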
Patents cited by this patent (126)
Julia, Luc; Voutsas, Dimitris; Cheyer, Adam, Accessing network-based electronic information through scripted online interfaces using spoken input.
Shieber Stuart M. ; Armstrong John ; Baptista Rafael Jose ; Bentz Bryan A. ; Ganong ; III William F. ; Selesky Donald Bryant, Command parsing and rewrite system.
Bennett, Ian M.; Babu, Bandi Ramesh; Morkhandikar, Kishor; Gururaj, Pallaki, Interactive speech based learning/training system formulating search queries based on natural language parsing of recognized user queries.
Colier Ronald L. (T/L Apartments ; 16 Cheshire Dr. ; Apt. 121 Pittsfield MA 01201), Method and apparatus adapted for an audibly-driven, handheld, keyless and mouseless computer for performing a user-cente.
Monaco Peter C. ; Ehrlich Steven C. ; Ghosh Debajit ; Klenk Mark ; Sinai Julian ; Thirumalai Madhavan ; Gupta Sundeep, Method and apparatus for creating modifiable and combinable speech objects for acquiring information from a speaker in an interactive voice response system.
Tal Peter (53 Driftwood Dr. Port Washington NY 11050), Method and apparatus for uniquely identifying individuals by particular physical characteristics and security system uti.
Stephanick, James; Eyraud, Richard; Kay, David Jon; Meurs, Pim Van; Bradford, Ethan; Longe, Michael R., Method and apparatus utilizing voice input to resolve ambiguous manually entered text input.
Michael S. Phillips ; Mark A. Fanty ; Krishna K. Govindarajan, Method and system of reviewing the behavior of an interactive speech recognition application.
Noyes Dallas B. (2500 George Washington Way ; #124 Richland WA 99352), Method for representation of knowledge in a computer as a network database system.
Hernandez-Abrego, Gustavo A., Methods and system for evaluating potential confusion within grammar structure for set of statements to be used in speech recognition during computing event.
Battle James Thomas ; Hung Andy C. ; Purcell Stephen C., Multimedia processor using variable length instructions with opcode specification of source operand as result of prior i.
Loatman Robert B. (Vienna VA) Post Stephen D. (McLean VA) Yang Chih-King (Rockville MD) Hermansen John C. (Catharpin VA), Natural language understanding system.
Janek Gabor,HUX ; Wutte Heribert,AUX ; Grabherr Manfred, Product including a speech recognition device and method of generating a command lexicon for a speech recognition device.
Morgan Scott Anthony ; Roberts David John,GBX ; Swearingen Craig Ardner ; Tannenbaum Alan Richard, Speech command input recognition system for interactive computer display with term weighting means used in interpreting potential commands from relevant speech terms.
Gagnon, Jean; Roy, Philippe; Lagassey, Paul J., Speech interface system and method for control and interaction with applications on a computing system.
Gagnon, Jean; Roy, Philippe; Lagassey, Paul J., Speech interface system and method for control and interaction with applications on a computing system.
Gagnon, Jean; Roy, Philippe; Lagassey, Paul J., Speech interface system and method for control and interaction with applications on a computing system.
Ohmori,Kumiko; Higashida,Masanobu; Mizusawa,Noriko, Speech recognition based interactive information retrieval scheme using dialogue control to reduce user stress.
Marx Matthew T. ; Carter Jerry K. ; Phillips Michael S. ; Holthouse Mark A. ; Seabury Stephen D. ; Elizondo-Cecenas Jose L. ; Phaneuf Brett D., System and method for developing interactive speech applications.
Dantzig,Paul M.; Filepp,Robert; Liu,Yew Huey, System and method for generating and presenting multi-modal applications from intent-based markup scripts.
Maes, Stephane Herman; Neti, Chalapathy Venkata, System and method for multi-modal focus detection, referential ambiguity resolution and mood classification using multi-modal input.
Coffman, Daniel M.; Hosn, Rafah A.; Kleindienst, Jan; Maes, Stephane H.; Raman, Thiruvilwamalai V., System and method for providing dialog management and arbitration in a multi-modal environment.
Eberle, Hannes; Leon, Christopher S.; Maass, Bodo; Patnaik, Anurag; Santa Ana, Alberto; Zirngibl, Michael, System and method for the creation and automatic deployment of personalized, dynamic and interactive inbound and outbound voice services, with real-time interactive voice database queries.
Maes, Stephane H.; Lubensky, David M.; Sakrajda, Andrzej, Universal IP-based and scalable architectures across conversational applications using web services for speech and audio processing resources.
Ackley H. Sprague ; Maltsev Pavel A. ; Ohanian Michael, Universal data input and processing device, such as universal point-of-sale device for inputting and processing bar code symbols, document images, and other data.
Ball, Thomas J.; Cox, Kenneth Charles; Grinter, Rebecca Elizabeth; Hibino, Stacie Lynn; Jagadeesan, Lalita Jategaonkar; Mantilla, David Alejandro, User interface for translating natural language inquiries into database queries and data presentations.
Brant Arthur ; Mandell Kenneth ; Rader R. Scott ; Walsh Alexander ; deJuan ; Jr. Eugene ; Greenberg Robert, Voice command and control medical care system.
Alpdemir, Ahmet; James, Arthur, Voice-interactive marketplace providing promotion and promotion tracking, loyalty reward and redemption, and other features.
Lim, Kang S.; Nguyen, Joseph A., Voice-recognition-based methods for establishing outbound communication through a unified messaging system including intelligent calendar interface.