| Field | Value |
| --- | --- |
| Country / type | United States (US) patent, granted |
| International Patent Classification (IPC, 7th ed.) | |
| Application number | US-0692451 (2012-12-03) |
| Registration number | US-8738380 (2014-05-27) |
| Inventors / address | |
| Applicant / address | |
| Agent / address | |
| Citation information | Cited by 28 patents; cites 429 patents |
A system and method for processing multi-modal device interactions in a natural language voice services environment may be provided. In particular, one or more multi-modal device interactions may be received in a natural language voice services environment that includes one or more electronic devices. The multi-modal device interactions may include a non-voice interaction with at least one of the electronic devices or an application associated therewith, and may further include a natural language utterance relating to the non-voice interaction. Context relating to the non-voice interaction and the natural language utterance may be extracted and combined to determine an intent of the multi-modal device interaction, and a request may then be routed to one or more of the electronic devices based on the determined intent of the multi-modal device interaction.
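The abstract does not disclose a concrete implementation, but the flow it describes — extract context from a non-voice interaction and a related utterance, combine the two to determine an intent, then route a request to a device — can be illustrated with a minimal Python sketch. Every name here (`Interaction`, `extract_context`, `route_request`, the keyword table, the capability names) is hypothetical, and the keyword matching is a toy stand-in for a real natural language parser:

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    """A multi-modal device interaction: a non-voice event plus a related utterance."""
    device_id: str
    non_voice: dict   # e.g. {"action": "select", "item": "song_42"}
    utterance: str    # e.g. "play this in the living room"

def extract_context(interaction):
    """Combine context from the non-voice interaction and the utterance."""
    context = dict(interaction.non_voice)   # context carried by the non-voice input
    text = interaction.utterance.lower()
    # Toy keyword spotting stands in for a real natural language parser.
    if "play" in text:
        context["intent"] = "play_media"
    elif "call" in text:
        context["intent"] = "place_call"
    else:
        context["intent"] = "unknown"
    return context

def route_request(interaction, devices):
    """Route the request to the first device whose capabilities match the intent."""
    context = extract_context(interaction)
    needed = {"play_media": "audio", "place_call": "telephony"}.get(context["intent"])
    target = next((name for name, caps in devices.items() if needed in caps), None)
    return target, context
```

For example, selecting a song on a tablet (`non_voice`) while saying "play this in the living room" would combine the selected item with the spoken intent and route the request to a device with the `audio` capability.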
1. A computer-implemented method of facilitating natural language utterance processing via multiple input modes, the method being implemented on a computer that includes one or more physical processors executing one or more computer program instructions which, when executed, perform the method, the method comprising: receiving, from a user by the one or more physical processors via a first input mode, a first input; receiving, from the user by the one or more physical processors via a second input mode that is different from the first input mode, a second input that relates to the first input; determining, by the one or more physical processors, a request type from a plurality of request types based on the first input or the second input; determining, by the one or more physical processors, a request associated with the request type; determining, by the one or more physical processors based on the first input, first context information for the request; determining, by the one or more physical processors based on the second input, second context information for the request; and processing, by the one or more physical processors, the request based on the first context information and the second context information.

2. The method of claim 1 wherein determining the request comprises determining an action, a query, a command, or a task.

3. The method of claim 1, wherein receiving the first input comprises receiving a natural language utterance via a voice input mode, and wherein receiving the second input comprises receiving a non-voice input via a non-voice input mode.

4. The method of claim 1, wherein receiving the first input comprises receiving a non-voice input via a non-voice input mode, and wherein receiving the second input comprises receiving a natural language utterance via a voice input mode.

5. The method of claim 4, wherein receiving the non-voice input comprises receiving the non-voice input via a first device having the non-voice input mode, and wherein receiving the natural language utterance comprises receiving the natural language utterance via a second device having the voice input mode.

6. The method of claim 4, wherein receiving the non-voice input comprises receiving the non-voice input via a first device having the non-voice input mode and the voice input mode, and wherein receiving the natural language utterance comprises receiving the natural language utterance via the first device.

7. The method of claim 4, wherein the non-voice input indicates an item, selected segment, operation, or point of focus on a first device, and wherein determining the first context information comprises determining the first context information based on the item, selected segment, operation, or point of focus on the first device.

8. The method of claim 4, wherein the non-voice input indicates a point of focus on a touch screen of a first device, and wherein determining the first context information comprises determining a location associated with the point of focus on the touch screen.

9. The method of claim 8, wherein processing the request comprises determining a product or service based on the determined location and the second context information, the method further comprising: generating, by the one or more physical processors, a response to the request based on the determined product or service.

10. The method of claim 8, wherein the point of focus includes a point or area on a display rendered on the touch screen, and wherein determining the location comprises determining a physical location based on the point or area on the display.

11. The method of claim 10, wherein the voice input indicates a product or service type, and wherein processing the request comprises determining a product or service based on the determined physical location and the product or service type, the method further comprising: generating, by the one or more physical processors, a response to the request based on the determined product or service.

12. The method of claim 4, wherein the natural language utterance indicates a domain associated with the request, and wherein processing the request comprises processing the request based on the domain, the first context information, and the second context information.

13. The method of claim 4, further comprising: determining, by the one or more physical processors, a domain associated with the request based on the natural language utterance, wherein processing the request comprises processing the request at a domain agent associated with the determined domain responsive to the determination of the domain.

14. The method of claim 4, further comprising: determining, by the one or more physical processors, prior context information for a prior request, wherein the prior context information relates to one or more prior natural language utterances or prior non-voice inputs associated with the user, wherein determining the first context information or the second context information comprises determining the first context information or the second context information further based on the prior context information.

15. The method of claim 1, further comprising: determining, based on a context stack having a plurality of entries that individually are indicative of context, an entry in the context stack that corresponds to the first input or the second input; and determining a domain agent associated with the entry in the context stack, wherein processing the request comprises providing the request to the determined domain agent to process the request, wherein the determined domain agent is configured to update the context stack responsive to the processing of the request by the determined domain agent.

16. The method of claim 1, further comprising: determining, by the one or more physical processors, promotional content for the user based on the first context information and the second context information; presenting, by the one or more physical processors, the promotional content to the user; receiving, from the user by the one or more physical processors, a third input relating to the promotional content, wherein the third input comprises a natural language utterance relating to the promotional content; and determining, by the one or more physical processors, a second request relating to the promotional content based on the third input.

17. A system for facilitating natural language utterance processing via multiple input modes, the system comprising: one or more physical processors programmed to execute one or more computer program instructions which, when executed, cause the system to: receive, from a user via a first input mode, a first input; receive, from the user via a second input mode that is different from the first input mode, a second input that relates to the first input; determine a request type from a plurality of request types based on the first input or the second input; determine a request associated with the request type; determine, based on the first input, first context information for the request; determine, based on the second input, second context information for the request; and process the request based on the first context information and the second context information.

18. The system of claim 17, wherein determining the request comprises determining an action, a query, a command, or a task.

19. The system of claim 17, wherein receiving the first input comprises receiving a natural language utterance via a voice input mode, and wherein receiving the second input comprises receiving a non-voice input via a non-voice input mode.

20. The system of claim 17, wherein receiving the first input comprises receiving a non-voice input via a non-voice input mode, and wherein receiving the second input comprises receiving a natural language utterance via a voice input mode.

21. The system of claim 20, wherein receiving the non-voice input comprises receiving the non-voice input via a first device having the non-voice input mode, and wherein receiving the natural language utterance comprises receiving the natural language utterance via a second device having the voice input mode.

22. The system of claim 20, wherein receiving the non-voice input comprises receiving the non-voice input via a first device having the non-voice input mode and the voice input mode, and wherein receiving the natural language utterance comprises receiving the natural language utterance via the first device.

23. The system of claim 20, wherein the non-voice input indicates an item, selected segment, operation, or point of focus on a first device, and wherein determining the first context information comprises determining the first context information based on the item, selected segment, operation, or point of focus on the first device.

24. The system of claim 20, wherein the non-voice input indicates a point of focus on a touch screen of a first device, and wherein determining the first context information comprises determining a location associated with the point of focus on the touch screen.

25. The system of claim 24, wherein processing the request comprises determining a product or service based on the determined location and the second context information, and wherein the system is caused to: generate a response to the request based on the determined product or service.

26. The system of claim 24, wherein the point of focus includes a point or area on a display rendered on the touch screen, and wherein determining the location comprises determining a physical location based on the point or area on the display.

27. The system of claim 26, wherein the voice input indicates a product or service type, and wherein processing the request comprises determining a product or service based on the determined physical location and the product or service type, and wherein the system is caused to: generate a response to the request based on the determined product or service.

28. The system of claim 20, wherein the natural language utterance indicates a domain associated with the request, and wherein processing the request comprises processing the request based on the domain, the first context information, and the second context information.

29. The system of claim 20, wherein the system is caused to: determine a domain associated with the request based on the natural language utterance, and wherein processing the request comprises processing the request at a domain agent associated with the determined domain responsive to the determination of the domain.

30. The system of claim 20, wherein the system is caused to: determine prior context information for a prior request, wherein the prior context information relates to one or more prior natural language utterances or prior non-voice inputs associated with the user, and wherein determining the first context information or the second context information comprises determining the first context information or the second context information further based on the prior context information.

31. The system of claim 17, wherein the system is caused to: determine, based on a context stack having a plurality of entries that individually are indicative of context, an entry in the context stack that corresponds to the first input or the second input; and determine a domain agent associated with the entry in the context stack, wherein processing the request comprises providing the request to the determined domain agent to process the request, wherein the determined domain agent is configured to update the context stack responsive to the processing of the request by the determined domain agent.

32. The system of claim 17, wherein the system is caused to: determine promotional content for the user based on the first context information and the second context information; present the promotional content to the user; receive, from the user, a third input relating to the promotional content, wherein the third input comprises a natural language utterance relating to the promotional content; and determine a second request relating to the promotional content based on the third input.
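Claims 15 and 31 describe a context stack whose entries are matched against new inputs to select a domain agent, with the selected agent updating the stack after processing. The claims leave the data structure unspecified; the following is a minimal Python sketch under that reading, with all names (`DomainAgent`, `resolve_agent`, the word-matching rule) hypothetical and the matching logic deliberately simplistic:

```python
class DomainAgent:
    """A per-domain handler that updates the shared context stack after processing."""

    def __init__(self, domain):
        self.domain = domain

    def handle(self, request, context_stack):
        # Per claims 15/31: after processing, the agent updates the context
        # stack so later inputs resolve against the most recent context.
        context_stack.append({"domain": self.domain, "last_request": request})
        return f"[{self.domain}] processed: {request}"

def resolve_agent(request, context_stack, agents):
    """Walk the context stack top-down for an entry matching the request's words."""
    words = set(request.lower().split())
    for entry in reversed(context_stack):   # most recent context first
        if entry["domain"] in words:
            return agents[entry["domain"]]
    return agents["default"]                # fall back when no entry matches
```

Walking the stack from the most recent entry downward is one plausible way to honor the "entry in the context stack that corresponds to the first input or the second input" language, since it favors the context of the latest interaction.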
Copyright KISTI. All Rights Reserved.