IPC classification information
Country / Type | United States (US) Patent, Granted
International Patent Classification (IPC, 7th edition) |
Application number | US-0088258 (2013-11-22)
Registration number | US-9248569 (2016-02-02)
Inventor / Address |
- Laurent, Patryk
- Passot, Jean-Baptiste
- Ponulak, Filip
- Izhikevich, Eugene
Applicant / Address |
Agent / Address | Gazdzinski & Associates, PC
Citation information | Times cited: 1; Cited patents: 61
Abstract
A robotic device may comprise an adaptive controller configured to learn to predict consequences of the robotic device's actions. During training, the controller may receive a copy of the planned and/or executed motor command and sensory information obtained based on the robot's response to the command. The controller may predict a sensory outcome based on the command and one or more prior sensory inputs. The predicted sensory outcome may be compared to the actual outcome. Based on a determination that the prediction matches the actual outcome, the training may stop. Upon detecting a discrepancy between the prediction and the actual outcome, the controller may provide a continuation signal configured to indicate that additional training may be utilized. In some classification implementations, the discrepancy signal may be used to indicate occurrence of novel (not yet learned) objects in the sensory input and/or indicate continuation of training to recognize said objects.
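The training loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the linear `ForwardModel`, the delta-rule update, the `toy_robot_response` stand-in for the robot, and the `tolerance` used to decide that a prediction "matches" are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class ForwardModel:
    """Hypothetical linear predictor: outcome = prior_sensory + weight * command."""
    weight: float = 0.0
    learning_rate: float = 0.5

    def predict(self, prior_sensory: float, command: float) -> float:
        return prior_sensory + self.weight * command

    def update(self, discrepancy: float, command: float) -> None:
        # Delta-rule step (an assumption; the abstract names no learning rule).
        self.weight += self.learning_rate * discrepancy * command


def toy_robot_response(prior_sensory: float, command: float) -> float:
    """Stand-in for the real robot; the true gain 0.5 is unknown to the model."""
    return prior_sensory + 0.5 * command


def train(model, command=1.0, tolerance=1e-3, max_trials=100):
    """Run trials until the predicted outcome matches the actual outcome.

    Returns a list of (discrepancy, continue_signal) pairs, one per trial.
    """
    history = []
    sensory = 0.0
    for _ in range(max_trials):
        predicted = model.predict(sensory, command)   # uses a copy of the command
        actual = toy_robot_response(sensory, command)  # sensed after execution
        discrepancy = actual - predicted
        continue_signal = abs(discrepancy) > tolerance
        history.append((discrepancy, continue_signal))
        if not continue_signal:
            break  # prediction matches the actual outcome: training may stop
        model.update(discrepancy, command)
        sensory = actual  # becomes the prior sensory input of the next trial
    return history
```

Calling `train(ForwardModel())` returns a history whose final entry carries a false continuation signal once the learned weight is close to the toy robot's true gain.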
Representative claims
1. A robotic apparatus, comprising: a platform comprising a controllable actuator; a sensor module configured to provide environmental information associated with an environment of the platform; and a controller configured to: provide a control instruction for the controllable actuator, the control instruction configured to cause the platform to execute an action to accomplish a target task in accordance with the environmental information; determine a predicted outcome of the action; determine a discrepancy signal based on an actual outcome of the action and the predicted outcome; and determine a repeat indication responsive to the discrepancy being within a range of a target value associated with the target task; wherein the repeat indication is configured to cause the robot to execute a second action to achieve the target task.
2. The apparatus of claim 1, wherein: the target task is associated with an object within the environment; and the environmental information comprises sensory input characterizing one or more of a size, position, shape, or color of the object.
3. The apparatus of claim 1, wherein: the predicted outcome and the actual outcome comprise a characteristic of at least one of the platform and the environment; and the actual outcome is determined based on an output of the sensor module obtained subsequent to the execution of the action.
4. The apparatus of claim 3, wherein the characteristic is selected from a group consisting of a position of the platform, a position of an object within the environment, and a distance measure between the object and the platform.
5. The apparatus of claim 3, wherein the characteristic comprises a parameter associated with the controllable actuator, the parameter being selected from a group consisting of an actuator displacement, a torque, a force, a rotation rate, and a current draw.
6. A method of training an adaptive robotic apparatus, the method comprising: for a given training trial: causing the apparatus to execute an action based on a context; determining a current discrepancy between a target outcome of the action and a predicted outcome of the action; comparing the current discrepancy to a prior discrepancy, the prior discrepancy being determined based on a prior observed outcome of the action and a prior predicted outcome of the action determined at a prior trial; and providing an indication responsive to the current discrepancy being smaller than the prior discrepancy, the indication being configured to cause the apparatus to execute the action based on the context during a trial subsequent to the given trial.
7. The method of claim 6, wherein: the discrepancy is configured based on a difference between the actual outcome and the predicted outcome; a repeat indication is determined based on the discrepancy being greater than zero.
8. The method of claim 7, wherein the controller is further configured to determine a stop indication based on the discrepancy being no greater than zero, the stop indication being configured to cause the adaptive robotic apparatus to execute another task.
9. The method of claim 6, wherein: the determination of the current discrepancy is effectuated by a supervised learning process based on a teaching input; and the teaching input comprises the target outcome.
10. The method of claim 6, wherein: the context is determined at a first time instance associated with the given trial; and the predicted outcome of the action is determined based on a delayed context obtained during another trial at a second time instance prior to the first time instance.
11. A non-transitory computer-readable storage medium having instructions embodied thereon, the instructions being executable by one or more processors to perform a method of adapting training of a learning apparatus, the method comprising: determining a discrepancy between a predicted outcome and an observed outcome of an action of the learning apparatus; determining an expected error associated with the determination of the discrepancy; comparing the expected error to a target error associated with the determination of the discrepancy; and providing a continue-training indication based on the expected error being smaller than the target error.
12. The storage medium of claim 11, wherein: the observed outcome is associated with execution of the action during a trial at a first time instance; and the continue-training indication is configured to cause execution of the action at another trial at a second time instance subsequent to the first time instance.
13. The storage medium of claim 12, wherein: the determination of the discrepancy is effectuated based on a first supervised learning process configured based on a first teaching input; and the first teaching input is configured to convey information related to the observed outcome of the action.
14. The storage medium of claim 13, wherein: the determination of the expected error is effectuated based on a second supervised learning process configured based on a second teaching input; and the second teaching input is configured to convey information related to the target error.
15. The storage medium of claim 14, wherein: the target error is determined based on one or more trials preceding the trial; and the method further comprises providing a cease-training indication based on the expected error being greater than or equal to the target error.
16. The storage medium of claim 15, wherein: the execution of the action during the another trial is characterized by another expected error determination; and the method further comprises: adjusting the target error based on the comparison of the expected error to the target error, the adjusted target error being configured to be compared against the another expected error during the another trial.
17. The storage medium of claim 15, wherein: the execution of the action at the first time instance is configured based on an output of a random number generator; and the method further comprises: determining one or more target error components associated with the one or more trials preceding the trial; and determining the target error based on a weighted sum of the one or more target error components.
18. The storage medium of claim 13, wherein: the first supervised learning process is further configured based on a neuron network comprising a plurality of neurons communicating via a plurality of connections; individual connections provide an input into a given neuron, the plurality of neurons being characterized by a connection efficacy configured to affect operation of the given neuron; and the determination of the discrepancy comprises adjusting the efficacy of one or more connections based on the first teaching signal.
19. The storage medium of claim 13, wherein: the action is configured based on a sensory context; the first supervised learning process is configured based on a look-up table, the look-up table comprising one or more entries, individual entries thereof corresponding to an occurrence of the sensory context, the action, and the predicted outcome; and an association development comprises adjusting at least one of the one or more entries based on the first teaching signal.
20. The storage medium of claim 11, wherein: the action is configured based on a sensory context; the storage medium is embodied in a controller apparatus of a robot; responsive to the sensory context comprising a representation of an obstacle, the action comprises an avoidance maneuver executed by the robot; and responsive to the sensory context comprising a representation of a target, the action comprises an approach maneuver executed by the robot.
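Claims 11, 15, and 17 together describe a decision rule: a target error is formed as a weighted sum of components from preceding trials, and training continues while the expected error is smaller than that target and ceases otherwise. A minimal sketch of that comparison is given below; the function names, the string return values, and the example numbers are hypothetical, and the claims do not say how the weights or the expected error are obtained.

```python
from typing import Sequence


def weighted_target_error(components: Sequence[float],
                          weights: Sequence[float]) -> float:
    """Target error as a weighted sum of components from preceding trials (claim 17)."""
    if len(components) != len(weights):
        raise ValueError("components and weights must have the same length")
    return sum(w * c for w, c in zip(weights, components))


def training_indication(expected_error: float, target_error: float) -> str:
    """Comparison of claims 11 and 15.

    Returns "continue-training" when the expected error is smaller than the
    target error, and "cease-training" when it is greater than or equal to it.
    """
    if expected_error < target_error:
        return "continue-training"
    return "cease-training"


# Illustrative values only (not taken from the patent).
target = weighted_target_error(components=[0.5, 0.25], weights=[0.5, 0.5])
indication = training_indication(expected_error=0.25, target_error=target)
```

With these illustrative values the target error is 0.375, so an expected error of 0.25 yields a continue-training indication, while an expected error equal to the target yields a cease-training indication.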