IPC classification information
Country / Type |
United States(US) Patent
Granted
|
International Patent Classification (IPC 7th edition) |
|
Application number |
UP-0170999
(2005-06-29)
|
Registration number |
US-7734471
(2010-06-29)
|
Inventor
/ Address |
- Paek, Timothy S.
- Chickering, David M.
- Horvitz, Eric J.
|
Applicant / Address |
|
Agent / Address |
|
Citation information |
Times cited: 11
Cited patents: 67 |
Abstract
An online dialog system and method are provided. The dialog system receives speech input and outputs an action according to its models. After executing the action, the system receives feedback from the environment or user. The system immediately utilizes the feedback to update its models in an online fashion.
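The loop described in the abstract (speech input, action, feedback, immediate model update) can be sketched as follows. This is a minimal illustration, not the patented method: the action names, the `OnlineDialogModel` class, the numeric reward encoding, and the `get_feedback` callback are all assumptions made for the example, and a running average stands in for the system's models.

```python
from collections import defaultdict

# Hypothetical action set, chosen for illustration only.
ACTIONS = ("ignore", "execute", "repeat", "confirm")


class OnlineDialogModel:
    """Running-average reward per (speech event, action) pair."""

    def __init__(self):
        self.totals = defaultdict(float)  # (event, action) -> summed reward
        self.counts = defaultdict(int)    # (event, action) -> times tried

    def estimate(self, event, action):
        n = self.counts[(event, action)]
        return self.totals[(event, action)] / n if n else 0.0

    def select_action(self, event):
        # Greedy choice; max() keeps the first action on ties.
        return max(ACTIONS, key=lambda a: self.estimate(event, a))

    def update(self, event, action, reward):
        # Online update: applied immediately after each piece of feedback.
        self.totals[(event, action)] += reward
        self.counts[(event, action)] += 1


def run_turn(model, event, get_feedback):
    """One dialog turn: act, observe feedback, update the model at once."""
    action = model.select_action(event)
    reward = get_feedback(event, action)  # hypothetical numeric feedback
    model.update(event, action, reward)
    return action
```

A purely greedy rule like this one stops exploring as soon as one action looks good; the claims name the Thompson strategy for handling that uncertainty.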
Representative claim
What is claimed is:
1. An online learning dialog system comprising: one or more processing units; memory communicatively coupled to the one or more processing units, the memory having stored instructions that, when executed by the one or more processing units, configure the online learning dialog system to implement: a speech model that receives a speech input and provides speech events; a decision engine model that receives the speech events from the speech model and selects an action based, at least in part, upon a probability distribution, the probability distribution being associated with uncertainty regarding a plurality of parameters of the decision engine model applied to the speech input, wherein the probability distribution is: defined by an influence diagram that is configured to maximize long term expected utility and apply the Thompson strategy; and expressed as: $p(U, V \mid D, \Theta) = \prod_{X \in U \cup V} p(X \mid \mathrm{Pa}(X), \Theta_X)$ where $U$ denotes chance variables, $D$ denotes decision variables, and $V$ denotes value variables; where $\mathrm{Pa}(X)$ denotes a set of parents for node $X$; and where $\Theta_X$ denotes a subset of parameters related to the applied speech input in $\Theta$ that define local distribution of $X$; and, a learning component that in an online manner modifies at least one of the parameters of the decision engine model based upon feedback associated with the selected action, wherein the feedback comprises a lack of verbal input from a user of the system or an environment within a predefined period of time.
2. An online learning dialog method implemented at a computing device, the method comprising: receiving, at the computing device, voice input from a user; determining, at the computing device, whether the voice input from the user is accepted as understood and initiate corresponding actions or the voice input is ambiguous and is in need of exploration based at least on a probability distribution associated with uncertainty regarding parameters of a decision engine model applied to the voice input, wherein the probability distribution is defined by an influence diagram that is configured to apply the Thompson strategy; selecting an action based, at least in part, upon the probability distribution; receiving, at the computing device, feedback associated with the selected action; and updating at least one of the parameters of the decision engine model based, at least in part, upon the feedback associated with the selected action such that the decision engine model of the computing device is configured to maximize long term expected utility via the updating at least the one of the parameters of the decision engine model, wherein the feedback comprises a lack of verbal response to the selected action in a threshold period of time.
3. A voice-controlled mobile device that comprises the system of claim 1.
4. A speech application embedded on a non-transitory computer storage medium to implement the method as recited in claim 2.
5. The system of claim 1, wherein the instructions that, when executed by the one or more processing units, configure the online learning dialog system to further implement a repair dialog on a display of the system.
6. The system of claim 5, wherein the repair dialog includes a request to repeat and/or a request for confirmation.
7. The system of claim 1, wherein the speech model is configured to: ignore the speech input, execute corresponding to a most likely command associated with the speech input, request to repeat the speech input, and provide information associated with a plurality of likely commands along with a request to confirm the speech input.
8. The system of claim 1, wherein the feedback further comprises a negative input or a positive input utterance from the user of the system or the environment.
9. The system of claim 1, wherein the plurality of parameters of the decision engine model are updated based on the feedback associated with the selected action.
10. The system of claim 1, wherein the learning component employs retrospective analysis to modify at least one of the plurality of parameters of the decision engine model.
11. The system of claim 1, wherein the feedback comprises a lack of an input from a user of the system within a threshold period of time.
12. The system of claim 1, wherein the decision engine model comprises a Markov decision process.
13. The system of claim 1, wherein: Dirichlet priors are used in the plurality of parameters for conditional distributions of discrete variables of the decision engine model, and Normal-Wishart priors are used in the plurality of parameters for distributions of continuous variables of the decision engine model.
14. An online learning dialog system comprising: means for receiving voice input; means for modeling the voice input based on a probability distribution associated with uncertainty regarding a plurality of parameters of the means for modeling the voice input, wherein the probability distribution is defined by an influence diagram that is configured to apply the Thompson strategy; means for selecting an action based, at least in part, upon the probability distribution received from the means for modeling the voice input; and means for modifying the plurality of parameters of the means for modeling the voice input based upon feedback associated with the selected action, wherein the feedback comprises a lack of verbal response from a user in a threshold period of time.
15. The system of claim 14, wherein the means for selecting an action employs a heuristic technique to maximize long term expected utility.
16. A voice-controlled web browser embedded on a non-transitory computer storage medium to implement the method as recited in claim 2.
17. The method of claim 2, wherein the feedback further comprises a verbal response to the selected action in a threshold period of time.
18. A computer readable medium having stored thereon computer executable instructions for carrying out the method of claim 2.
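The claims combine the Thompson strategy with Dirichlet priors over discrete variables and with feedback that may be positive, negative, or a lack of verbal response. A minimal sketch of how those pieces can fit together is given below. It is a heavy simplification and not the patented implementation: the influence diagram is reduced to one independent outcome distribution per action, and the `ThompsonChooser` class, the outcome labels, and the `UTILITY` values are hypothetical choices made for illustration.

```python
import random

# Hypothetical outcome utilities; "silence" stands for no verbal response
# within the threshold period. The values are illustrative assumptions.
UTILITY = {"positive": 1.0, "negative": -1.0, "silence": 0.5}


class ThompsonChooser:
    """Thompson strategy over per-action Dirichlet posteriors (a sketch)."""

    def __init__(self, actions, seed=0):
        self.rng = random.Random(seed)
        # Dirichlet(1, 1, 1) prior over outcomes for every action.
        self.alpha = {a: {o: 1.0 for o in UTILITY} for a in actions}

    def sample_theta(self, action):
        # A Dirichlet draw is a vector of Gamma(alpha_i, 1) draws, normalized.
        draws = {o: self.rng.gammavariate(a, 1.0)
                 for o, a in self.alpha[action].items()}
        total = sum(draws.values())
        return {o: g / total for o, g in draws.items()}

    def select(self):
        # Sample parameters from the posterior, then pick the action with
        # the highest expected utility under that single sample.
        def sampled_utility(action):
            theta = self.sample_theta(action)
            return sum(theta[o] * UTILITY[o] for o in UTILITY)
        return max(self.alpha, key=sampled_utility)

    def update(self, action, outcome):
        # Conjugate online update: add one to the observed outcome's count.
        self.alpha[action][outcome] += 1.0
```

Calling `select()` and then `update(action, outcome)` after each turn gives the online behaviour: actions whose sampled parameters look good are tried more often, while uncertain actions still get occasional trials because their samples vary widely.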