| Country / Status | United States (US), granted patent |
| --- | --- |
| IPC (7th edition) | |
| Application No. | US-0500723 (2014-09-29) |
| Registration No. | US-9263039 (2016-02-16) |
| Inventor / Address | |
| Applicant / Address | |
| Agent / Address | |
| Citation Info | Cited by: 9 / Patents cited: 448 |
Systems and methods are provided for receiving speech and non-speech communications of natural language questions and/or commands, transcribing the speech and non-speech communications to textual messages, and executing the questions and/or commands. The invention applies context, prior information, domain knowledge, and user-specific profile data to achieve a natural environment for one or more users presenting questions or commands across multiple domains. The systems and methods create, store, and use extensive personal profile information for each user, thereby improving the reliability of determining the context of the speech and non-speech communications and presenting the expected results for a particular question or command.
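The abstract's core idea is weighting context detection by stored per-user profile data. The sketch below illustrates one way such profile-weighted domain scoring could work; every name here (`UserProfile`, `score_context`, the keyword sets) is an illustrative assumption, not the patent's actual implementation.

```python
# Hypothetical sketch of profile-weighted context scoring.
# Names and structure are assumptions for illustration only.
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    # Per-user data the abstract says is stored to improve context detection.
    preferred_domains: dict = field(default_factory=dict)


def score_context(keywords, domain_keywords, profile):
    """Score each domain by keyword overlap, weighted by the user's profile prior."""
    scores = {}
    for domain, vocab in domain_keywords.items():
        overlap = len(set(keywords) & vocab)          # keyword matching
        prior = profile.preferred_domains.get(domain, 1.0)  # profile prior
        scores[domain] = overlap * prior
    return scores


profile = UserProfile(preferred_domains={"weather": 2.0, "stocks": 1.0})
scores = score_context(
    ["rain", "tomorrow"],
    {"weather": {"rain", "snow", "tomorrow"}, "stocks": {"price", "share"}},
    profile,
)
# "weather" scores highest: 2 overlapping keywords * 2.0 profile prior = 4.0
```

A real system would replace the flat keyword overlap with the context description grammar matching described in the claims; the profile prior plays the role of the "prior probabilities" mentioned in claims 8 and 18.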
1. A system for processing speech and non-speech communications, comprising:
a terminal device that receives the speech and the non-speech communications;
a transcription module that transcribes the speech and the non-speech communications to create a speech-based textual message and a non-speech-based textual message;
a merging module that merges the speech-based textual message and the non-speech-based textual message to generate a query;
a search module that searches the query for text combinations;
a comparison module that compares the text combinations to entries in a context description grammar;
a plurality of domain agents that are associated with the context description grammar;
a scoring module that provides relevance scores based on results from the comparison module;
a domain agent selector that selects domain agents based on results from the scoring module; and
a response generating module that communicates with the selected domain agents to obtain content that is gathered by the selected domain agents and that generates a response from the content, wherein the content is arranged in a selected order based on results from the scoring module.

2. The system according to claim 1, wherein the response generating module generates an aggregate response that includes the content gathered by the selected domain agents.

3. The system according to claim 1, wherein the terminal device includes (i) a personal digital assistant, (ii) a cellular telephone, (iii) a portable computer, (iv) a desktop computer, or any combination of (i) to (iv).

4. The system according to claim 1, wherein the terminal device receives follow-up speech and non-speech communications and wherein the transcription module transcribes the follow-up speech and non-speech communications to create a follow-up speech-based textual message and a follow-up non-speech-based textual message.

5. The system according to claim 4, wherein the merging module merges the follow-up speech-based textual message and the follow-up non-speech-based textual message to generate a follow-up query.

6. The system according to claim 1, further comprising a personality module that facilitates formatting the response.

7. The system according to claim 1, further comprising a context stack that includes one or more contexts that are selected based on the query.

8. The system according to claim 7, wherein the scoring module determines the one or more contexts based on at least applying prior probabilities or fuzzy possibilities to (i) keyword matching, (ii) user profiles, (iii) a dialog history, or any combination of (i) to (iii).

9. The system according to claim 1, wherein at least one of the domain agents creates and directs a request to at least one of a local information source and a network information source.

10. The system according to claim 1, wherein at least one of the domain agents creates and directs a command to a remote or local device.

11. The system according to claim 5, wherein at least one of the domain agents evaluates multiple queries from multiple sources.

12. The system according to claim 5, wherein the follow-up query is associated with a same context as the query.

13. A method of processing speech and non-speech communications, comprising:
receiving the speech and non-speech communications;
transcribing the speech and non-speech communications to create a speech-based textual message and a non-speech-based textual message;
merging the speech-based textual message and the non-speech-based textual message to generate a query;
searching the query for text combinations;
comparing the text combinations to entries in a context description grammar;
accessing a plurality of domain agents that are associated with the context description grammar;
generating a relevance score based on results from comparing the text combinations to entries in the context description grammar;
selecting one or more domain agents based on results from the relevance score;
obtaining content that is gathered by the selected domain agents; and
generating a response from the content, wherein the content is arranged in a selected order based on results from the relevance score.

14. The method according to claim 13, further comprising generating an aggregate response that includes the content that is gathered by the selected domain agents.

15. The method according to claim 13, further comprising:
receiving a follow-up speech and non-speech communications;
transcribing the follow-up speech and non-speech communications to create a follow-up speech-based textual message and a follow-up non-speech-based textual message; and
merging the follow-up speech-based textual message and the follow-up non-speech-based textual message to generate a follow-up query.

16. The method according to claim 13, further comprising a personality module that communicates the response to a user.

17. The method according to claim 13, further comprising generating a context stack that includes one or more contexts that are selected based on the query.

18. The method according to claim 17, wherein the one or more contexts are generated based on applying prior probabilities or fuzzy possibilities to (i) keyword matching, (ii) user profiles, (iii) a dialog history, or any combination of (i) to (iii).

19. A multimodal system for processing speech and non-speech communications, comprising:
a terminal device that receives one or more types of input;
a transcription module that transcribes the one or more types of input into one or more textual messages;
a merging module that merges the one or more textual messages to generate a query;
a search module that searches the query for text combinations;
a comparison module that compares the text combinations to entries in a context description grammar;
a plurality of domain agents that are associated with the context description grammar;
a scoring module that provides relevance scores based on results from the comparison module;
a domain agent selector that selects domain agents based on results from the scoring module; and
a response generating module that communicates with the selected domain agents to obtain content that is gathered by the selected domain agents and that generates a response from the content, wherein the content is arranged in a selected order based on results from the scoring module;
wherein the terminal device delivers the response using one or more types of output.

20. The multimodal system according to claim 19, wherein the one or more types of input includes (i) speech, (ii) text, (iii) digital audio files, or any combination of (i) to (iii).
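Claims 1 and 13 describe the same processing pipeline as a system and a method: transcribe, merge into a query, match text combinations against a context description grammar, score, select domain agents, and arrange their content by relevance. The following is a minimal end-to-end sketch of that flow; every function name and the toy grammar are hypothetical stand-ins, since the patent publishes no source code.

```python
# Hypothetical sketch of the claim 1 / claim 13 pipeline. All names are
# illustrative assumptions, not the patented implementation.

def transcribe(speech, non_speech):
    # Stand-in: a real system would run ASR on audio and decode other modalities.
    return speech.lower(), non_speech.lower()

def merge(speech_text, non_speech_text):
    # Merge both textual messages into a single query (claim element).
    return f"{speech_text} {non_speech_text}".strip()

def match_grammar(query, grammar):
    # Compare the query's text combinations to context-description-grammar
    # entries, yielding a relevance score per domain agent.
    tokens = set(query.split())
    return {agent: len(tokens & entries) for agent, entries in grammar.items()}

def respond(query, grammar):
    # Select agents with nonzero relevance; arrange content by score.
    scores = match_grammar(query, grammar)
    selected = sorted((a for a, s in scores.items() if s > 0),
                      key=lambda a: -scores[a])
    return [f"{agent}: content for '{query}'" for agent in selected]

grammar = {"weather_agent": {"forecast", "rain"},
           "traffic_agent": {"traffic", "route"}}
speech_text, typed_text = transcribe("What is the FORECAST", "for rain today")
print(respond(merge(speech_text, typed_text), grammar))
```

The follow-up-query claims (4, 5, 15) would re-run `merge` and `respond` on new input while reusing the same context, and the multimodal claim 19 generalizes the two fixed inputs to any number of transcribed input types.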
Copyright KISTI. All Rights Reserved.