Speech-enabled web content searching using a multimodal browser
IPC classification information
Country / Type
United States(US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G10L-015/22
G10L-015/00
G06F-017/30
Application number
US-0685350
(2007-03-13)
Registration number
US-8843376
(2014-09-23)
Inventor / Address
Cross, Jr., Charles W.
Applicant / Address
Nuance Communications, Inc.
Agent / Address
Wolf, Greenfield & Sacks, P.C.
Citation information
Times cited: 5
Cited patents: 184
Abstract
Speech-enabled web content searching using a multimodal browser implemented with one or more grammars in an automatic speech recognition (‘ASR’) engine, with the multimodal browser operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal browser operatively coupled to the ASR engine, includes: rendering, by the multimodal browser, web content; searching, by the multimodal browser, the web content for a search phrase, including yielding a matched search result, the search phrase specified by a first voice utterance received from a user and a search grammar; and performing, by the multimodal browser, an action in dependence upon the matched search result, the action specified by a second voice utterance received from the user and an action grammar.
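The two-utterance flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: speech recognition is stubbed out (the recognized phrase and command arrive as plain strings), the rendered web content is modeled as a list of text blocks, and the names `Match`, `find_match`, `ACTION_GRAMMAR`, and `perform_action` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Match:
    """A matched search result: the phrase and the block that contains it."""
    phrase: str
    index: int   # position of the containing text block
    text: str    # full text of the containing block


def find_match(blocks: List[str], phrase: str) -> Optional[Match]:
    """Return the first text block containing the phrase (case-insensitive)."""
    needle = phrase.lower()
    for i, block in enumerate(blocks):
        if needle in block.lower():
            return Match(phrase=phrase, index=i, text=block)
    return None


# Hypothetical action grammar: each entry maps a spoken command to a
# different action that operates on the same matched search result.
ACTION_GRAMMAR: Dict[str, Callable[[Match], str]] = {
    "read it": lambda m: f"SPEAK: {m.text}",
    "highlight it": lambda m: f"HIGHLIGHT block {m.index}",
}


def perform_action(recognized_command: str, match: Match) -> str:
    """Dispatch the recognized second utterance to its associated action."""
    action = ACTION_GRAMMAR.get(recognized_command)
    if action is None:
        raise ValueError(f"command not in action grammar: {recognized_command!r}")
    return action(match)
```

In this sketch, a first utterance recognized as "multimodal browser" would be passed to `find_match`, and a second utterance recognized as "highlight it" would be passed to `perform_action` together with the returned `Match`.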
Representative claims
1. A method of speech-enabled searching of web content using a multimodal browser, the method implemented with one or more grammars in an automatic speech recognition (‘ASR’) engine, with the multimodal browser operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal browser operatively coupled to the ASR engine, the method comprising: rendering, by the multimodal browser, web content; searching, by the multimodal browser, the rendered web content for a search phrase, including matching the search phrase to at least one portion of the rendered web content, yielding a matched search result, the search phrase specified by a first voice utterance received from a user and a search grammar; and in response to a second voice utterance received from the user: using an action grammar comprising one or more entries to recognize the second voice utterance as corresponding to a first entry of the one or more entries, the action grammar specifying, for the first entry of the one or more entries, an associated first action to be taken in dependence upon the matched search result, and for a second entry of the one or more entries, an associated second action to be taken in dependence upon the same matched search result, the second action being different from the first action, and performing, by the multimodal browser, the first action in dependence upon the matched search result associated with the first entry.
2. The method of claim 1 wherein searching, by the multimodal browser, the web content for a search phrase, including yielding a matched search result further comprises: creating the search grammar in dependence upon the web content; receiving the first voice utterance from a user; and determining, using the ASR engine, the search phrase in dependence upon the first voice utterance and the search grammar.
3. The method of claim 2 wherein matching the search phrase to at least one portion of the web content, yielding a matched search result further comprises identifying a node of a Document Object Model (‘DOM’) representing the web content that contains the search phrase.
4. The method of claim 1 wherein performing, by the multimodal browser, an action in dependence upon the matched search result further comprises: creating the action grammar in dependence upon the matched search result; receiving the second voice utterance from the user; determining, using the ASR engine, an action identifier in dependence upon the second voice utterance and the action grammar; and performing the specified action in dependence upon the action identifier.
5. The method of claim 1 further comprising augmenting, by the multimodal browser, the matched search result with additional web content.
6. The method of claim 5 wherein augmenting, by the multimodal browser, the matched search result with additional web content further comprises inserting the additional web content into a node of a Document Object Model (‘DOM’) representing the web content that contains the matched search result.
7. The method of claim 1 wherein the web content is not speech-enabled.
8. Apparatus for speech-enabled searching of web content using a multimodal browser operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal browser operatively coupled to an automatic speech recognition (‘ASR’) engine, the apparatus comprising: a computer processor; and a computer memory operatively coupled to the computer processor, the computer memory having stored thereon computer program instructions that, when executed by the computer processor, perform a method comprising acts of: rendering, by the multimodal browser, web content; searching, by the multimodal browser, the rendered web content for a search phrase, including matching the search phrase to at least one portion of the rendered web content, yielding a matched search result, the search phrase specified by a first voice utterance received from a user and a search grammar; and in response to a second voice utterance received from the user: using an action grammar comprising one or more entries to recognize the second voice utterance as corresponding to a first entry of the one or more entries, the action grammar specifying, for the first entry of the one or more entries, an associated first action to be taken in dependence upon the matched search result, and for a second entry of the one or more entries, an associated second action to be taken in dependence upon the same matched search result, the second action being different from the first action, and performing, by the multimodal browser, the first action in dependence upon the matched search result associated with the first entry.
9. The apparatus of claim 8 wherein searching, by the multimodal browser, the web content for a search phrase, including yielding a matched search result further comprises: creating the search grammar in dependence upon the web content; receiving the first voice utterance from a user; and determining, using the ASR engine, the search phrase in dependence upon the first voice utterance and the search grammar.
10. The apparatus of claim 9 wherein matching the search phrase to at least one portion of the web content, yielding a matched search result further comprises identifying a node of a Document Object Model (‘DOM’) representing the web content that contains the search phrase.
11. The apparatus of claim 8 wherein performing, by the multimodal browser, an action in dependence upon the matched search result further comprises: creating the action grammar in dependence upon the matched search result; receiving the second voice utterance from the user; determining, using the ASR engine, an action identifier in dependence upon the second voice utterance and the action grammar; and performing the specified action in dependence upon the action identifier.
12. The apparatus of claim 8 further comprising computer program instructions capable of augmenting, by the multimodal browser, the matched search result with additional web content.
13. The apparatus of claim 12 wherein augmenting, by the multimodal browser, the matched search result with additional web content further comprises inserting the additional web content into a node of a Document Object Model (‘DOM’) representing the web content that contains the matched search result.
14. A computer-readable recordable medium encoded with instructions that, when executed, perform a method for speech-enabled searching of web content using a multimodal browser operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal browser operatively coupled to an automatic speech recognition (‘ASR’) engine, the method comprising acts of: rendering, by the multimodal browser, web content; searching, by the multimodal browser, the rendered web content for a search phrase, including matching the search phrase to at least one portion of the rendered web content, yielding a matched search result, the search phrase specified by a first voice utterance received from a user and a search grammar; and in response to a second voice utterance received from the user: using an action grammar comprising one or more entries to recognize the second voice utterance as corresponding to a first entry of the one or more entries, the action grammar specifying, for the first entry of the one or more entries, an associated first action to be taken in dependence upon the matched search result, and for a second entry of the one or more entries, an associated second action to be taken in dependence upon the same matched search result, the second action being different from the first action, and performing, by the multimodal browser, the first action in dependence upon the matched search result associated with the first entry.
15. The computer-readable recordable medium of claim 14 wherein searching, by the multimodal browser, the web content for a search phrase, including yielding a matched search result further comprises: creating the search grammar in dependence upon the web content; receiving the first voice utterance from a user; and determining, using the ASR engine, the search phrase in dependence upon the first voice utterance and the search grammar.
16. The computer-readable recordable medium of claim 15 wherein matching the search phrase to at least one portion of the web content, yielding a matched search result further comprises identifying a node of a Document Object Model (‘DOM’) representing the web content that contains the search phrase.
17. The computer-readable recordable medium of claim 14 wherein performing, by the multimodal browser, an action in dependence upon the matched search result further comprises: creating the action grammar in dependence upon the matched search result; receiving the second voice utterance from the user; determining, using the ASR engine, an action identifier in dependence upon the second voice utterance and the action grammar; and performing the specified action in dependence upon the action identifier.
18. The computer-readable recordable medium of claim 14 further comprising computer program instructions capable of augmenting, by the multimodal browser, the matched search result with additional web content.
19. The computer-readable recordable medium of claim 18 wherein augmenting, by the multimodal browser, the matched search result with additional web content further comprises inserting the additional web content into a node of a Document Object Model (‘DOM’) representing the web content that contains the matched search result.
20. The computer-readable recordable medium of claim 14 wherein the web content is not speech-enabled.
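The dependent claims on creating a search grammar from the web content, identifying the DOM node that contains the search phrase, and inserting additional content into that node can be sketched roughly as follows. This is an illustration under stated assumptions, not the method as implemented by the inventors: the page is assumed to be well-formed XHTML parsed with Python's standard `xml.etree.ElementTree`, the "grammar" is simplified to a set of short word sequences rather than a real ASR grammar format, only each element's leading text (`el.text`) is examined, and the function names `build_search_grammar`, `find_node`, and `augment` are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Set


def build_search_grammar(root: ET.Element, max_words: int = 3) -> Set[str]:
    """Collect every sequence of 1..max_words consecutive words found in an
    element's text; these stand in for the phrases the ASR engine may accept."""
    phrases: Set[str] = set()
    for el in root.iter():
        words = (el.text or "").lower().split()
        for n in range(1, max_words + 1):
            for i in range(len(words) - n + 1):
                phrases.add(" ".join(words[i:i + n]))
    return phrases


def find_node(root: ET.Element, phrase: str) -> Optional[ET.Element]:
    """Return the first element whose own text contains the phrase."""
    needle = phrase.lower()
    for el in root.iter():
        if needle in (el.text or "").lower():
            return el
    return None


def augment(node: ET.Element, extra_text: str) -> None:
    """Insert additional content as a new child of the matched node."""
    child = ET.SubElement(node, "span")
    child.text = extra_text


page = ET.fromstring(
    "<html><body><h1>Speech search</h1>"
    "<p>A multimodal browser renders web content</p></body></html>"
)
grammar = build_search_grammar(page)
phrase = "multimodal browser"  # stand-in for the ASR result of the first utterance
if phrase in grammar:
    node = find_node(page, phrase)
    if node is not None:
        augment(node, "See also: voice mode")
```

Deriving the grammar from the page itself restricts recognition to phrases that can actually be matched, which is one plausible reading of "creating the search grammar in dependence upon the web content".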
Patents cited by this patent (184)
Knapp, John R.; Snyders, Edward K. E., APPARATUS FOR DISTRIBUTING INFORMATION OVER A NETWORK-BASED ENVIRONMENT, METHOD OF DISTRIBUTING INFORMATION TO USERS, AND METHOD FOR ASSOCIATING CONTENT OBJECTS WITH A DATABASE WHEREIN THE CONTENT OB.
Agapi, Ciprian; Bodin, William K.; Cross, Jr., Charles W.; Patel, Paritosh D., Adjusting a speech engine for a mobile computing device based on background noise.
Bodin, William K.; Jaramillo, David; Redman, Jerry W.; Thorson, Derral C., Aggregating content of disparate data types from disparate data sources for single point access.
Hanmann, Jonathan Lee; Sareem, Anil; Smith, Kenneth J., Caching advertising information in a mobile terminal to enhance remote synchronization and wireless internet browsing.
Bolduc Raymond L.; Rosen Kenneth H.; Salimando Steven Charles; Stuntebeck Peter H.; Weber Roy Philip, Cellular phone network that provides location-based information.
May, Darrell Reginald; Gagne, Alain R., Communications system providing text-to-speech message conversion features using audio filter parameters and related methods.
Jost, Uwe Helmut; Shao, Yuan, Control apparatus, method and computer readable memory medium for enabling a user to communicate by speech with a processor-controlled apparatus.
Giangarra Paul Placido; Taylor James Lynn; Tracey, II William Joseph, Data processing system and method for navigating a network using a voice command.
Kerimovska, Nercivan; Klinghult, Gunnar; Tomasson, Anna, Device for generating speech, apparatus connectable to or incorporating such a device, and computer program product therefor.
Agapi, Ciprian; Bodin, William K.; Cross, Jr., Charles W.; Wang, Fang, Dynamically publishing directory information for a plurality of interactive voice response systems.
Machiraju Nagabhushan Rao; Graves Michael James; Vemuri Sunil; Chandhok Ravinder Paul; Lofgren Catherine Abbott, FAQ link creation between user's questions and answers.
Appelt, Douglas E.; Arnold, James Frederick; Bear, John S.; Hobbs, Jerry Robert; Israel, David J.; Kameyama, Megumi; Martin, David L.; Myers, Karen Louise; Ravichandran, Gopalan; Stickel, Mark Edward, Information retrieval by natural language querying.
Gulau David B. (Livonia MI) Huget Jeffrey P. (Westland MI) Varilone Robert L. (Farmington Hills MI) Edwards Mark D. (Canton MI), Integrated microphone/pushbutton housing for voice activated cellular phone.
Slaughter, Gregory L.; Saulpaugh, Thomas E.; Traversat, Bernard A.; Abdelaziz, Mohamed M., Mechanism and apparatus for web-based searching of URI-addressable repositories in a distributed computing environment.
Granovetter, Randy Phyllis; Sinclair, Michael J.; Zhang, Zhengyou; Liu, Zicheng, Method and apparatus for multi-sensory speech enhancement on a mobile device.
De Moerloose, Jan; Godon, Marc; Overmeire, Luk; Westerhuis, Frans, Method and apparatus for providing a user of a mobile communication terminal or a group of users with an information message with an adaptive content.
Vora, Ashish; Sprague, Kara Lynn; Tuckey, Curtis; Gupta, Arvind, Method and apparatus for providing speech recognition resolution on an application server.
Görtz, Udo; Haberland Schlösser, Knut; Rateitschek, Klaus; Theimer, Wolfgang; Weingart, Peter; Serafat, Reza; Lück, Matthias; Mäkelä, Jakke, Method and device for automatically changing a digital content on a mobile device according to sensor data.
Robert Rennard; Sean Quan Du; Sami Fawzi Nasser; Yi-Chung Chao; Ruslan Adikovich Meshenberg; Haiping Jin; Chung Benjamin Yip, Method and system for an efficient operating environment in a real-time navigation system.
Harb, Joseph; George, David; Haven, Chris; Ferry, Dennis; Lee, Wen Hsin; Srinivasan, Jaya, Method and system for building/updating grammars in voice access systems.
Ativanichayaphong, Soonthorn; Cross, Jr., Charles W.; Muschett, Brien H., Method and system of building a grammar rule with baseforms generated dynamically from user utterances.
Jenniges, Nathaniel J.; Smith, Paul J.; Jobling, Jeremy T.; Gupta, Sanjay; Glintz, Michael A.; Ng, Scott H., Method and wireless device for establishing a communication interface for a communication session.
Low Colin, GBX; Seaborne Andrew Franklin, GBX; Bouthors Nicolas, FRX; Beyschlag Ulf, FRX; Raguideau Nicolas, FRX, Method of making available content resources to users of a telephone network.
Ball, Thomas J.; Danielsen, Peter John; Mataga, Peter Andrew; Rehor, Kenneth G., Method of providing transfer capability on web-based interactive voice response services.
Veluppillai, Mahinthan; Sangary, Nagula Tharma; Simmons, Sean Bartholomew; Jarmuszewski, Perry, Method, device and system for detecting the mobility of a mobile device.
Himanen, Teemu; Ylinen, Pasi, Method, system and network entity for providing text telephone enhancement for voice, tone and sound-based network services.
Bladon, Anthony; Giannini, David; Hofstatter, David F.; Kelley, Colin; McClintock, David C.; Smith, Robert F.; Trandal, David S.; Kirchhoff, Leland W., Methods and systems for managing telecommunications and for translating voice messages to text messages.
Lee, Won Young, Mobile terminal, method of managing schedule using the mobile terminal, and method of managing position information using the mobile terminal.
Vander Veen, Raymond; Klassen, Gerhard Dietrich, Motion-based disabling of messaging on a wireless communications device by differentiating a driver from a passenger.
Agapi, Ciprian; Bodin, William K.; Cross, Jr., Charles W.; Goodman, Brian D.; Jania, Frank L.; Shaw, Darren M., Signaling correspondence between a meeting agenda and a meeting discussion.
Calder, Gary James; Clelland, George Murdoch; Farrell, Anthony Timothy; Mann, Robert; Pickering, John Brian; Reilly, Paul, Speech encoding in a client server system.
Rissanen, Jussi; Tanskanene, Erkki; Makipaa, Mikko; Pakkala, Timo; Hannula, Esko, System and method for displaying information included in predetermined messages automatically.
Saylor, Michael J.; Trundle, Steven S; Zirngibl, Michael X.; Brown, Steven R.; Patnaik, Anurag; Garr, David A.; Lindsey, Benjamin M.; Mahowald, Josh; Inanoglu, Zeynap, System and method for generating voice pages with included audio files for use in a voice page delivery system.
Pasquali Sandro, System and method for providing a dynamic advertising content window within a window based content manifestation environment provided in a browser.
Dragosh, Pamela Leigh; Roe, David Bjorn; Sharp, Robert Douglas, System and method for providing remote automatic speech recognition and text-to-speech services via a packet network.
Lohtia, Sunit; James, Wilfred Martin; Hwang, Boon Chong, System and method for providing subscriber-initiated information over the short message service (SMS) or a microbrowser.
Gursahaney Suresh K. (Gaithersburg MD) Helm Daniel J. (McLean VA) Lee Dana R. (Laurel MD) Madrid Richard J. (Gaithersburg MD) McKenzie Valerie S. (Germantown MD) Miller Steven K. (Germantown MD), System for integrating telephony data with data processing systems.
Partovi, Hadi; Brathwaite, Roderick Steven; Davis, Angus Macdonald; McCue, Michael S.; Porter, Brandon William; Giannandrea, John; Walther, Eckart; Accardi, Anthony; Li, Zhe, System for providing personalized content over a telephone interface to a user according to the corresponding personalization profile including the record of user actions or the record of user behavior.
Pertrushin Valery A., System method and article of manufacture for detecting emotion in voice signals by utilizing statistics for voice signal parameters.
Alpdemir, Ahmet, System, method, and business model for speech-interactive information system having business self-promotion, audio coupon and rating features.
Agarwal, Sheetal K.; Chakraborty, Dipanjan; Kumar, Arun; Mukherjea, Sougata; Nanavati, Amit Anil; Rajput, Nitendra, Systems and methods to index and search voice sites.
Bandera Daniel Quinto; Bregman Mark F.; Gopal Ajei S.; Singhal Sandeep, Systems, methods and computer program products for providing time and location specific advertising via the internet.
Chutorash, Richard J.; Anderson, Elisabet; Eich, Rodger W.; Golden, Jeffrey; Vanderwall, Philip J.; Sims, Michael J., Vehicle user interface systems and methods.
Hinde, Stephen John; Brittan, Paul St John; Hickey, Marianne; Wilcock, Lawrence; Belrose, Guillaume; Thomas, Andrew, Voice communication concerning a local entity.