Apparatus and methods for online training of robots
IPC Classification Information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
B25J-009/16
G06N-003/00
G06N-003/04
G05D-001/00
G05D-001/02
G06N-099/00
Application number
US-0070114
(2013-11-01)
Registration number
US-9463571
(2016-10-11)
Inventors / Address
Sinyavskiy, Oleg
Passot, Jean-Baptiste
Izhikevich, Eugene
Applicant / Address
Brain Corporation
Agent / Address
Gazdzinski & Associates, PC
Citation information
Times cited: 0
Patents cited: 93
Abstract
Robotic devices may be trained by a user guiding the robot along a target trajectory using a correction signal. A robotic device may comprise an adaptive controller configured to generate control commands based on one or more of the trainer input, sensory input, and/or performance measure. Training may comprise a plurality of trials. During an initial portion of a trial, the trainer may observe robot's operation and refrain from providing the training input to the robot. Upon observing a discrepancy between the target behavior and the actual behavior during the initial trial portion, the trainer may provide a teaching input (e.g., a correction signal) configured to affect robot's trajectory during subsequent trials. Upon completing a sufficient number of trials, the robot may be capable of navigating the trajectory in absence of the training input.
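The trial structure described in the abstract can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not the patented controller: the scalar command, the `train` function, the `rate` parameter, and the trainer that corrects the full observed discrepancy are all hypothetical simplifications.

```python
def train(target, trials=20, rate=0.5):
    """Toy trial loop: the learned command moves toward the trainer's correction.

    Hypothetical model: the robot's behavior is a single number, and the
    trainer's correction equals the discrepancy observed during the trial.
    """
    learned = 0.0   # controller output before any training
    history = []    # absolute discrepancy observed at each trial
    for _ in range(trials):
        error = target - learned       # discrepancy the trainer observes
        correction = error             # teaching input (assumed ideal trainer)
        learned += rate * correction   # learning absorbs part of the correction
        history.append(abs(error))
    return learned, history


if __name__ == "__main__":
    learned, history = train(target=1.0)
    print(f"final command: {learned:.6f}")
    print(f"first / last discrepancy: {history[0]:.3f} / {history[-1]:.6f}")
```

In this sketch the discrepancy shrinks at every trial, so the correction the trainer has to supply also shrinks, which mirrors the abstract's statement that the robot may eventually navigate the trajectory without the training input.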
Representative claims
1. A robotic apparatus, comprising: a controllable actuator; a sensor module configured to provide information related to an environment surrounding the robotic apparatus; and an adaptive controller configured to produce a control instruction for the controllable actuator in accordance with the information provided by the sensor module, the control instruction being configured to cause the robotic apparatus to execute a target task; wherein: execution of the target task is characterized by the robotic apparatus traversing a trajectory of a first trajectory and a second trajectory; the first trajectory and the second trajectory each having at least one different parameter associated with the environment; the adaptive controller is operable in accordance with a supervised learning process configured based on a training signal and a plurality of trials; at a given trial of the plurality of trials, the control instruction is configured to cause the robot to traverse one of the first trajectory and the second trajectory; the training signal is generated based on the control instruction; the training signal is configured to strengthen a trajectory selection by the controller with an effectiveness value such that, based on one of the first and second trajectory being selected for a first trial, the selected one of the first and second trajectory is more likely to be selected during one or more trials subsequent to the first trial; and the effectiveness value of the training signal on the training process is reduced after a threshold number of trials of the plurality of trials.
2. An adaptive controller apparatus, comprising: one or more processors configured to execute computer program instructions that, when executed, cause a robot to: at a first time instance, execute a first action in accordance with a sensory context and a random choice; at a second time instance subsequent to the first time instance, determine whether to execute the first action based on the sensory context and a teaching input received during the first time instance, the teaching input being received based on the first action in accordance with the sensory context and the random choice; and execute the first action in accordance with the determination; wherein: a target task comprises at least the first action; and the teaching input is configured to increase or decrease a probability of execution of the first action, the teaching input having an effectiveness value determined from the execution of the first action at one or more time instances, where the effectiveness value is reduced after a threshold number of the one or more time instances.
3. The adaptive controller apparatus of claim 2, further comprising computer program instructions that, when executed, cause the robot to: at a given time instance, determine whether to execute one of the first action or a second action; wherein the execution of the first action at the given time instance is configured to increase the probability of execution of the first action at a subsequent time instance.
4. The adaptive controller apparatus of claim 3, wherein the probability of execution of the first action is increased relative to a probability of execution of the second action at the second time instance.
5. The adaptive controller apparatus of claim 3, wherein the teaching input is configured to reduce a probability of the robot executing a composite action at the second time instance, the composite action comprising the first action and the second action.
6. The adaptive controller apparatus of claim 2, further comprising a computer-readable medium comprising a plurality of instructions that, when executed, cause the robot to: receive a first control signal and a second control signal via a supervised learning process; wherein the first action execution at the first time instance and the second time instance is based on the first control signal and the second control signal, respectively, received by the supervised learning process; and responsive to the receipt of the teaching input, associate a sensory context to the first action.
7. The adaptive controller apparatus of claim 6, wherein: the supervised learning process is configured based on a neuron network comprising a plurality of neurons communicating via a plurality of connections; one or more individual connections of the plurality of connections provide an input into a given one of the plurality of neurons that is characterized by a connection efficacy configured to affect operation of the given one of the plurality of neurons; and the association of the sensory context to the first action comprises an adjustment of the connection efficacy based on the teaching input and the first control signal.
8. The adaptive controller apparatus of claim 2, wherein: the first action and a second action are characterized by a different value of a state parameter associated with an environment; and the state parameter is selected from a group consisting of a spatial coordinate, a robot's velocity, a robot's orientation, and a robot's position.
9. The adaptive controller apparatus of claim 2, wherein: the adaptive controller apparatus is embodied in the robot; and responsive to the sensory context comprising a representation of an obstacle, the target task comprises an avoidance maneuver executed by the robot; and responsive to the sensory context comprising a representation of a target, the target task comprises an approach maneuver executed by the robot.
10. The adaptive controller apparatus of claim 2, wherein: the execution of the first action is configured based on a control signal, the control signal being updated at time intervals shorter than one second; and the first time instance and the second time instance are separated by an interval that is no shorter than one second.
11. The adaptive controller apparatus of claim 2, wherein the teaching input is provided by a computerized entity via a wireless interface.
12. The adaptive controller apparatus of claim 2, wherein: the robot comprises an autonomous platform; the controller apparatus is embodied on the autonomous platform; and the teaching input is provided by a computerized module comprising a proximity indicator configured to generate a proximity indicator signal based on an object being within a given range from the platform.
13. The adaptive controller apparatus of claim 2, wherein: the adaptive controller apparatus is operable in accordance with a supervised learning process configured based on the teaching signal; the sensory context comprises information indicative of an object within an environment of the robot; the execution of the first action is based on a first predicted control output of the supervised learning process configured in accordance with the sensory context; and execution of a second action is based on a second predicted control output of the supervised learning process configured in accordance with the sensory context and the teaching input.
14. The adaptive controller apparatus of claim 13, wherein: the first and the second predicted control output are determined based on an output of an adaptive predictor module operable in accordance with the supervised learning process configured in accordance with the teaching input; the supervised learning process is configured to combine the teaching signal with the first predicted control output at the first time instance to produce a combined signal; and the teaching input at the second time instance is configured based on the combined signal.
15. The adaptive controller apparatus of claim 14, wherein: the supervised learning process is configured based on a backward propagation of an error; and the combined signal is determined based on a transform function configured based on a union operation.
16. The adaptive controller apparatus of claim 14, wherein: the combined signal is determined based on a transform function configured based on one or more operations including an additive operation characterized by a first weight and a second weight; the first weight is configured to be applied to a predictor output; and the second weight is configured to be applied to the teaching input.
17. The adaptive controller apparatus of claim 16, wherein: a value of the first weight at the first time instance is greater than the value of the first weight at the second time instance; and a value of the second weight at the first time instance is lower than the value of the second weight at the second time instance.
18. The adaptive controller apparatus of claim 2, wherein: the robot comprises a mobile platform; the adaptive controller apparatus is configured to be embodied on the mobile platform; and the sensory context is based on a visual input provided by a camera disposed on the mobile platform.
19. A method of increasing a probability of action execution by a robotic apparatus, comprising: receiving a sensory context from a sensor; at a first time instance, executing a first action with the robotic apparatus in accordance with the sensory context; at a second time instance subsequent to the first time instance, determining with an adaptive controller whether to execute the first action based on the sensory context received from the sensor and a teaching input received from a user interface during the first time instance; and executing the first action with the robotic apparatus in accordance with the determination of the adaptive controller; wherein: a target task comprises at least the first action; and increasing or decreasing a probability of execution of the first action is based on the teaching input, the teaching input having an effectiveness value determined by the adaptive controller from the execution of the first action at one or more time instances, where the effectiveness value is reduced after a threshold number of the one or more time instances.
20. The method of claim 19, wherein the determining whether to execute the first action further comprises determining whether to execute a second action by the adaptive controller; and the method further comprises executing the second action with the robotic apparatus in accordance with the determination of whether to execute the second action.
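Two mechanisms recited in the claims lend themselves to a short illustration: a teaching input whose effectiveness is reduced after a threshold number of time instances (claims 1, 2 and 19), and an additive combination of a predictor output with the teaching input using two weights (claim 16). The following is a hedged sketch only. The function names, the geometric decay, the default constants, and the trainer that reinforces every execution of the first action are assumptions made for illustration; the claims do not specify any of them.

```python
import random


def effectiveness(step, threshold=10, base=0.2, decay=0.5):
    """Teaching effectiveness: constant before the threshold, reduced after it.

    The geometric decay is an assumed schedule; the claims only require a
    reduction after a threshold number of trials or time instances.
    """
    if step < threshold:
        return base
    return base * decay ** (step - threshold + 1)


def update_probability(p, teaching, eff):
    """Raise p for a positive teaching input, lower it for a negative one."""
    if teaching == 0:
        return p  # no teaching input, probability unchanged
    target = 1.0 if teaching > 0 else 0.0
    return p + eff * (target - p)


def combine(predicted, teaching, w_pred, w_teach):
    """Additive combiner: one weight on the predictor output, one on the teaching input."""
    return w_pred * predicted + w_teach * teaching


def run(steps=30, seed=0):
    """Simulate repeated random action choices with a reinforcing trainer."""
    rng = random.Random(seed)
    p = 0.5  # initial probability of executing the first action
    trace = [p]
    for step in range(steps):
        executed = rng.random() < p      # random choice at this time instance
        teaching = 1 if executed else 0  # hypothetical trainer: reinforce when executed
        p = update_probability(p, teaching, effectiveness(step))
        trace.append(p)
    return trace


if __name__ == "__main__":
    trace = run()
    print(f"probability of first action: {trace[0]:.2f} -> {trace[-1]:.3f}")
```

Because the effectiveness shrinks after the threshold, later teaching inputs move the probability less than early ones, so the selection learned during the early trials is largely preserved.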
Patents cited by this patent (93)
Werbos Paul J., 3-brain architecture for an intelligent decision and control system.
Ito, Masato; Minamino, Katsuki; Yoshiike, Yukiko; Suzuki, Hirotaka; Kawamoto, Kenta, Apparatus and method for embedding recurrent neural networks into the nodes of a self-organizing map.
DeYong Mark R. (Las Cruces NM) Findley Randall L. (Austin TX) Eskridge Thomas C. (Las Cruces NM) Fields Christopher A. (Rockville MD), Asynchronous temporal neural processing element.
Kerr Randal H. (Richford NY) Mesnard Robert M. (Endicott NY), Automatic generation of executable computer code which commands another program to perform a task and operator modificat.
Frank D. Francone ; Peter Nordin SE; Wolfgang Banzhaf DE, Computer implemented machine learning method and system including specifically defined introns.
Spoerre Julie K. (Tallahassee FL) Lin Chang-Ching (Tallahassee FL) Wang Hsu-Pin (Tallahassee FL), Machine performance monitoring and fault classification using an exponentially weighted moving average scheme.
Grossberg Stephen (Newton Highlands MA) Kuperstein Michael (Brookline MA), Massively parellel real-time network architectures for robots capable of self-calibrating their operating parameters thr.
Sakaue Shiyuki (Yokohama JPX) Sugimoto Koichi (Hiratsuka JPX) Arai Shinichi (Yokohama JPX), Method and apparatus for controlling a robot hand along a predetermined path.
Peltola Tero (Helsinki FIX) Matakselka Jorma (Vantaa FIX) Harju Esa (Espoo FIX) Salovuori Heikki (Helsinki FIX) Keskinen Jukka (Vantaa FIX) Makinen Kari (Helsinki FIX) Roikonen Olli (Espoo FIX), Method for congestion management in a frame relay network and a node in a frame relay network.
Wilson Charles L. (Darnestown MD) Garris Michael D. (Gaithersburg MD) Wilkinson ; Jr. Robert A. (Hyattstown MD), Object/anti-object neural network segmentation.
Yokono, Jun; Sabe, Kohtaro; Costa, Gabriel; Ohashi, Takeshi, Operational control method, program, and recording media for robot device, and robot device.
Eguchi, Toru; Yamada, Akihiro; Kusumi, Naohiro; Sekiai, Takaaki; Fukai, Masayuki; Shimizu, Satoru, Plant control system and thermal power generation plant control system.
Coenen, Olivier, Proportional-integral-derivative controller effecting expansion kernels comprising a plurality of spiking neurons associated with a plurality of receptive fields.
Hickman, Ryan; Kuffner, Jr., James J.; Bruce, James R.; Gharpure, Chaitanya; Kohler, Damon; Poursohi, Arshan; Francis, Jr., Anthony G.; Lewis, Thor, Shared robot knowledge base for use with cloud computing system.
Blumberg, Bruce; Brooks, Rodney; Buehler, Christopher J.; Deegan, Patrick A.; DiCicco, Matthew; Dye, Noelle; Ens, Gerry; Linder, Natan; Siracusa, Michael; Sussman, Michael; Williamson, Matthew M., Training and operating industrial robots.
Mochizuki, Yoshiyuki; Naka, Toshiya; Asahara, Shigeo, Virtual space control data receiving apparatus, virtual space control data transmission and reception system, virtual space control data receiving method, and virtual space control data receiving prog.