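Country / Type | United States (US) Patent, Registered |
---|---|
International Patent Classification (IPC, 7th edition) | |
Application number | US-0855346 (2015-09-15) |
Registration number | US-9898459 (2018-02-20) |
Inventor / Address | |
Applicant / Address | |
Agent / Address | |
Citation information | Times cited: 0; cited patents: 524 |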
The invention relates to a system and method for integrating domain information into state transitions of a Finite State Transducer (“FST”) for natural language processing. A system may integrate semantic parsing and information retrieval from an information domain to generate an FST parser that represents the information domain. The FST parser may include a plurality of FST paths, at least one of which may be used to generate a meaning representation from a natural language input. As such, the system may perform domain-based semantic parsing of a natural language input, generating more robust meaning representations using domain information. The system may be applied to a wide range of natural language applications that use natural language input from a user such as, for example, natural language interfaces to computing systems, communication with robots in natural language, personalized digital assistants, question-answer query systems, and/or other natural language processing applications.
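The idea in the abstract, choosing among several FST paths to build a meaning representation, can be illustrated with a minimal sketch. It is not the patented implementation: each FST path is flattened to a plain tuple of domain tokens, and the names `FstPath`, `DOMAIN_PATHS`, `token_weights`, `score_path`, and `parse`, the sample business names, and the inverse-frequency weighting formula are all assumptions made for illustration. The source only says that a token's weight is based on how frequently it appears in the information domain.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FstPath:
    """One flattened path of a toy domain FST (hypothetical structure)."""
    intent: str              # action label attached to the path
    entry: str               # domain entry the path spells out
    tokens: Tuple[str, ...]  # domain tokens along the path


# Hypothetical information domain: a handful of business names.
DOMAIN_PATHS = [
    FstPath("search_business", "Best Buy", ("best", "buy")),
    FstPath("search_business", "Best Western", ("best", "western")),
    FstPath("search_business", "Burger King", ("burger", "king")),
]


def token_weights(paths):
    """Weight each domain token by its frequency in the domain.

    Assumption: rarer tokens are more informative, so the weight is
    log(number of paths / number of paths containing the token).
    """
    n = len(paths)
    doc_freq = {}
    for path in paths:
        for token in set(path.tokens):
            doc_freq[token] = doc_freq.get(token, 0) + 1
    return {token: math.log(n / count) for token, count in doc_freq.items()}


def score_path(path, input_tokens, weights):
    """Sum the weights of the path's domain tokens that occur in the input."""
    return sum(weights[t] for t in path.tokens if t in input_tokens)


def parse(utterance, paths=DOMAIN_PATHS):
    """Select the best-scoring path and return a toy meaning representation."""
    input_tokens = set(utterance.lower().split())
    weights = token_weights(paths)
    scored = [(score_path(p, input_tokens, weights), p) for p in paths]
    best_score, best_path = max(scored, key=lambda pair: pair[0])
    if best_score <= 0:
        return None  # no domain token matched the input
    return {"intent": best_path.intent, "entry": best_path.entry,
            "score": best_score}
```

Calling `parse("find best buy")` selects the "Best Buy" path, because "buy" is rarer in this toy domain than "best" and so contributes more to the path score. The word "find" matches no domain token and adds nothing to any score.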
1. A computer implemented method for integrating domain information and semantic parsing to generate meaning representations from natural language input, the method being implemented on a computer system having one or more physical processors programmed with computer program instructions to perform the method, the method comprising: receiving, by the computer system, a natural language input of a user comprising a natural language utterance in which at least a first input token has been uttered; providing, by the computer system, the natural language input to a speech-to-text recognizer; obtaining, by the computer system, one or more words of the natural language input as an output of the speech-to-text recognition recognizer, wherein the one or more words includes the first input token; obtaining, by the computer system, a semantic grammar that includes word combinations for intent processing and integrates a plurality of domain tokens relating to an information domain, wherein the semantic grammar integrates the plurality of domain tokens structured into a domain information Finite State Transducer (FST) parser that includes at least a first FST path comprising a first set of domain tokens and a second FST path comprising a second set of domain tokens; comparing, by the computer system, the plurality of domain tokens that match the first input token; generating, by the computer system, a first score for the first FST path and a second score for the second FST path based on the comparison; selecting, by the computer system, the first FST path based on the first score and the second score; determining, by the computer system, a semantic structure of the one or more words based on the selected first FST path; and generating, by the computer system, a representation of an intention of the user based on the semantic structure, wherein the representation is used to execute a natural language based search request or a natural language based command.

2. The method of claim 1, wherein the first score is based on a first sum of weights of each domain token among the first set of domain tokens and the second score is based on a second sum of weights of each domain token among the second set of domain tokens, and wherein a given weight for a domain token is based on a level of frequency that the domain token appears in the information domain.

3. The method of claim 2, the method further comprising: initializing an input FST based on the first token and the semantic structure; and composing the input FST based on the first FST path and the second FST path, wherein the first FST path and the second FST path are integrated with the input FST, and wherein the first FST is selected from the input FST.

4. The method of claim 3, wherein selecting the first FST path comprises selecting a shortest path in the input FST.

5. The method of claim 3, wherein comparing the plurality of domain tokens with the first input token comprises: performing fuzzy or exact matching between the plurality of domain tokens from the information domain and the first token, wherein the plurality of domain tokens comprises fuzzy or exact matches to the first token.

6. The method of claim 1, the method further comprising: identifying a second token that is relevant to the first token and the information domain, wherein the second token is not initially included in the natural language input; and adding the second token to the meaning representation.

7. The method of claim 1, wherein the natural language input comprises at least a second token, the method further comprising: determining that the second token is not relevant to the information domain; and omitting the second token from the meaning representation responsive to the determination that the second token is not relevant.

8. The method of claim 1, the method further comprising: obtaining a phoneme confusion matrix comprising at least two similar sounding words that are disambiguated based on previous training from one or more user utterances; and disambiguating the first token based on the phoneme confusion matrix.

9. The method of claim 1, the method further comprising: obtaining one or more dynamic data tokens from a dynamic data source; and integrating the one or more dynamic data tokens with the plurality of tokens from the information domain, wherein the meaning representation is determined based on the integrated dynamic data tokens.

10. The method of claim 9, wherein the plurality of domain tokens are structured into a domain information FST parser that includes at least a first FST path comprising a first set of domain tokens and a second FST path comprising a second set of domain tokens, and wherein integrating the one or more dynamic data tokens comprises: generating a dynamic FST based on the one or more dynamic data tokens; and inserting the dynamic FST into a slot of the domain information FST parser reserved for dynamic data.

11. The method of claim 1, wherein the computer executable action comprises an execution of: a natural language-based search request or a natural language-based command.

12. The method of claim 1, wherein the information domain comprises a plurality of entries of searchable information, and wherein retrieving the plurality of domain tokens that match the first token comprises: determining at least one entry, which includes the plurality of domain tokens, that is likely being searched for based on the first token.

13. A system for integrating domain information and semantic parsing to generate meaning representations from natural language input, the system comprising: a computer system comprising one or more physical processors programmed with computer program instructions to: receive a natural language input of a user comprising a natural language utterance in which at least a first input token has been uttered; provide the natural language input to a speech-to-text recognizer; obtain one or more words of the natural language input as an output of the speech-to-text recognizer, wherein the one or more words includes the first input token; obtain a semantic grammar that includes word combinations for intent processing and integrates a plurality of domain tokens relating to an information domain, wherein the semantic grammar integrates the plurality of domain tokens structured into a domain information Finite State Transducer (FST) parser that includes at least a first FST path comprising a first set of domain tokens and a second FST path comprising a second set of domain tokens; compare the plurality of domain tokens with the first input token; and generate a first score for the first FST path and a second score for the second FST path based on the comparison; select the first FST path based on the first score and the second score; determine a semantic structure of the one or more words based on the selected first FST path; generate a representation of an intention of the user based on the semantic structure, wherein the representation is used to execute a natural language based search request or a natural language based command.

14. The system of claim 13, wherein the first score is based on a first sum of weights of each domain token among the first set of domain tokens and the second score is based on a second sum of weights of each domain token among the second set of domain tokens, and wherein a given weight for a domain token is based on a level of frequency that the domain token appears in the information domain.

15. The system of claim 14, wherein the computer system is further programmed to: initialize an input FST based on the first token and the semantic structure; and compose the input FST based on the first FST path and the second FST path, wherein the first FST path and the second FST path are integrated with the input FST, and wherein the first FST is selected from the input FST.

16. The system of claim 15, wherein to select the first FST path, the computer system is further programmed to: select a shortest path in the input FST.

17. The system of claim 15, wherein to compare the plurality of domain tokens with the first input token, the computer system is further programmed to: perform fuzzy or exact matching between the plurality of domain tokens from the information domain and the first token, wherein the plurality of domain tokens comprises fuzzy or exact matches to the first token.

18. The system of claim 13, wherein the computer system is further programmed to: identify a second token that is relevant to the first token and the information domain, wherein the second token is not initially included in the natural language input; and add the second token to the meaning representation.

19. The system of claim 13, wherein the natural language input comprises at least a second token, and wherein the computer system is further programmed to: determine that the second token is not relevant to the information domain; and omit the second token from the meaning representation responsive to the determination that the second token is not relevant.

20. The system of claim 13, wherein the computer system is further programmed to: obtain a phoneme confusion matrix comprising at least two similar sounding words that are disambiguated based on previous training from one or more user utterances; and disambiguate the first token based on the phoneme confusion matrix.

21. The system of claim 13, wherein the computer system is further programmed to: obtain one or more dynamic data tokens from a dynamic data source; and integrate the one or more dynamic data tokens with the plurality of tokens from the information domain, wherein the meaning representation is determined based on the integrated dynamic data tokens.

22. The system of claim 21, wherein the plurality of domain tokens are structured into a domain information FST parser that includes at least a first FST path comprising a first set of domain tokens and a second FST path comprising a second set of domain tokens, and wherein to integrate the one or more dynamic data tokens, the computer system is further programmed to: generate a dynamic FST based on the one or more dynamic data tokens; and insert the dynamic FST into a slot of the domain information FST parser reserved for dynamic data.

23. The system of claim 13, wherein the computer executable action comprises an execution of: a natural language-based search request or a natural language-based command.

24. The system of claim 13, wherein the information domain comprises a plurality of entries of searchable information, and wherein to retrieve the plurality of domain tokens that match the first token, the computer system is further programmed to: determine at least one entry, which includes the plurality of domain tokens, that is likely being searched for based on the first token.

25. The method of claim 1, wherein the first FST path is associated with a first action to be performed and the second FST path is associated with a second action to be performed, the method further comprising: recognizing an action to be performed based on one or more tokens of the one or more words, wherein the first FST path is selected based further on the recognized action to be performed and the first action to be performed.
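The claims on fuzzy or exact matching, omission of irrelevant tokens, and shortest-path selection can be pictured with a rough sketch. It rests on several assumptions the source does not state: the input is modelled as a simple lattice (state `i` means "`i` words consumed") rather than a true FST composition, Python's standard `difflib.SequenceMatcher` stands in for whatever fuzzy matcher a real system would use, an arc's cost is taken to be `1 - similarity`, and skipping a word costs a fixed `SKIP_COST`. The names `DOMAIN_TOKENS`, `MIN_SIMILARITY`, `candidate_arcs`, `build_lattice`, `shortest_path`, and `match_tokens` are hypothetical.

```python
import heapq
import itertools
from difflib import SequenceMatcher

DOMAIN_TOKENS = ["best", "buy", "western", "burger", "king"]  # hypothetical domain
SKIP_COST = 1.0        # assumed cost of omitting a word with no domain match
MIN_SIMILARITY = 0.7   # assumed threshold for accepting a fuzzy match


def candidate_arcs(word):
    """Return (label, cost) arcs for one input word.

    A label of None means the word is omitted from the output.
    Exact matches cost 0.0; fuzzy matches cost 1 - similarity.
    """
    arcs = [(None, SKIP_COST)]
    for token in DOMAIN_TOKENS:
        similarity = SequenceMatcher(None, word, token).ratio()
        if similarity >= MIN_SIMILARITY:
            arcs.append((token, 1.0 - similarity))
    return arcs


def build_lattice(words):
    """Build arcs from state i to state i + 1 for every input word."""
    return {
        i: [(i + 1, label, cost) for label, cost in candidate_arcs(word)]
        for i, word in enumerate(words)
    }


def shortest_path(lattice, final_state):
    """Dijkstra search from state 0; returns (total_cost, output_tokens)."""
    tie_breaker = itertools.count()  # keeps heap entries comparable on ties
    queue = [(0.0, next(tie_breaker), 0, ())]
    settled = set()
    while queue:
        cost, _, state, labels = heapq.heappop(queue)
        if state == final_state:
            return cost, [label for label in labels if label is not None]
        if state in settled:
            continue
        settled.add(state)
        for next_state, label, arc_cost in lattice.get(state, []):
            heapq.heappush(
                queue,
                (cost + arc_cost, next(tie_breaker), next_state, labels + (label,)),
            )
    return None  # final state unreachable


def match_tokens(utterance):
    """Map an utterance to domain tokens along the cheapest lattice path."""
    words = utterance.lower().split()
    return shortest_path(build_lattice(words), len(words))
```

For example, `match_tokens("find bset buy")` skips "find", which matches nothing in the toy domain, accepts the misspelled "bset" as a fuzzy match for "best", and matches "buy" exactly. A production system would more likely build and compose real weighted transducers with a dedicated FST toolkit; this lattice only shows how lower-cost arcs steer the selected path.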
Copyright KISTI. All Rights Reserved.