Apparatus and methods for haptic training of robots
IPC classification information
Country / Type
United States(US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G05B-019/18
B25J-009/16
G05D-001/00
G05D-001/02
G06N-003/00
G06N-003/04
G06N-099/00
Application number
US-0102410
(2013-12-10)
Registration number
US-9597797
(2017-03-21)
Inventor / Address
Ponulak, Filip
Kazemi, Moslem
Laurent, Patryk
Sinyavskiy, Oleg
Izhikevich, Eugene
Applicant / Address
Brain Corporation
Agent / Address
Gazdzinski & Associates, PC
Citation information
Times cited: 1
Patents cited: 77
Abstract
Robotic devices may be trained by a trainer guiding the robot along a target trajectory using physical contact with the robot. The robot may comprise an adaptive controller configured to generate control commands based on one or more of the trainer input, sensory input, and/or performance measure. The trainer may observe task execution by the robot. Responsive to observing a discrepancy between the target behavior and the actual behavior, the trainer may provide a teaching input via a haptic action. The robot may execute the action based on a combination of the internal control signal produced by a learning process of the robot and the training input. The robot may infer the teaching input based on a comparison of a predicted state and actual state of the robot. The robot's learning process may be adjusted in accordance with the teaching input so as to reduce the discrepancy during a subsequent trial.
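The loop described in the abstract (predict the state, compare it with the state reached after the trainer's physical correction, treat the difference as a teaching input, and adjust the learner) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the patent: the one-dimensional state, the additive forward model, the hypothetical `LinearController` class, and the normalized delta-rule update.

```python
def infer_teaching_input(predicted_state, actual_state):
    """Infer the trainer's correction as the gap between actual and predicted state."""
    return actual_state - predicted_state


class LinearController:
    """Hypothetical learner: control = dot(weights, context)."""

    def __init__(self, n_context, learning_rate=0.5):
        self.weights = [0.0] * n_context
        self.learning_rate = learning_rate

    def control(self, context):
        return sum(w * x for w, x in zip(self.weights, context))

    def update(self, context, target_control):
        # Normalized delta rule: move the output part of the way to the target.
        error = target_control - self.control(context)
        norm = sum(x * x for x in context)
        self.weights = [
            w + self.learning_rate * error * x / norm
            for w, x in zip(self.weights, context)
        ]


def run_trials(n_trials=5):
    """Repeat the same task; return the size of the trainer's correction per trial."""
    context = [1.0, 0.5]   # assumed sensory context, identical on every trial
    target_step = 0.2      # displacement the trainer wants (assumed)
    controller = LinearController(n_context=2)
    corrections = []
    for _ in range(n_trials):
        start = 0.0
        u = controller.control(context)
        predicted = start + u        # toy forward model: state moves by u
        actual = start + target_step  # trainer physically moves the robot onto target
        teaching = infer_teaching_input(predicted, actual)
        # Supervisory target combines the internal command and the inferred correction.
        controller.update(context, u + teaching)
        corrections.append(abs(teaching))
    return corrections
```

In this sketch the correction the trainer has to supply shrinks from trial to trial, which is the qualitative behavior the abstract describes; the specific rate of decrease is a property of the assumed update rule only.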
Representative claims
1. A processor-implemented method of operating a robot, the method being performed by one or more processors configured to execute computer program instructions, the method comprising: during a first trial, operating, using one or more processors, the robot to perform a task characterized by a target trajectory; and responsive to observing a discrepancy between an actual trajectory and the target trajectory, adjusting the actual trajectory with a robotic trainer with a physical contact with the robot; wherein: the performance of the task is configured based on a learning process configured to determine a first control signal at the first trial; the adjusting of the actual trajectory comprises modifying the learning process so as to determine a second control signal; and during a second trial subsequent to the first trial, the first and the second control signals cooperate to transition the actual trajectory towards the target trajectory.
2. The method of claim 1, wherein: the determination of the first control signal is configured based on a context conveying information about an environment of the robot; the determination of the second control signal is configured based on the context.
3. The method of claim 1, wherein: the learning process is configured based on a teaching signal; and the modifying of the learning process is configured based on the teaching signal being determined based on an evaluation of the adjusting of the actual trajectory.
4. The method of claim 3, wherein: the learning process comprises a supervised learning process characterized by an output; the determination of the first control signal is configured based on a context conveying information about an environment of the robot; and the teaching signal comprises a supervisory input into the learning process configured to convey a target output associated with the context.
5. The method of claim 4, wherein: the teaching signal is configured based on a combination of the first control signal and the second control signal.
6. A non-transitory computer readable medium comprising a plurality of instruction which, when executed by one or more processors, effectuate control of a robotic apparatus by: based on a context, determine a first control signal configured to transition the robotic apparatus to a first state; determine a discrepancy between a current trajectory associated with a current state, and a first trajectory associated with the first state, where the discrepancy between the trajectories comprises a measurable difference; and determine a second control signal based on the discrepancy, the second control signal configured to transition the robotic apparatus to the current state; wherein: the current state is configured based on the first control signal and a state modification, wherein the state modification is applied with a physical contact to the robotic apparatus.
7. The non-transitory computer readable medium of claim 6, wherein: the determination of the first control signal and the determination of the second control signal are configured in accordance with an online learning process; and the online learning process is configured to be updated at a plurality of first time intervals based on the context and a teaching signal.
8. The non-transitory computer readable medium of claim 7, wherein: for a given first interval of the plurality of first time intervals, a change in the context is configured to cause an adaptation of the learning process, the adaptation being configured to produce another version of a control signal; and the context is configured to convey information related to one or more of a sensory input, a robot state, and the teaching signal.
9. The non-transitory computer readable medium of claim 8, wherein: the determination of the first control signal is characterized by an update rate having a plurality of second intervals associated therewith; and a given second interval is configured to match the given first interval.
10. The non-transitory computer readable medium of claim 8, wherein: the context comprises a time history of one or more of the sensory input, the robot state, and the teaching signal determined over one or more of the plurality of first time intervals; the determination of the first control signal is characterized by an update rate having a plurality of second intervals associated therewith; and individual ones of the plurality of the second intervals comprise one or more of the plurality of first time intervals.
11. The non-transitory computer readable medium of claim 8, wherein: individual ones of the current state and the first state are characterized by a state parameter; and the determination of the discrepancy is configured based on an evaluation of a distance measure between the state parameter of the current state and the state parameter of the first state.
12. The non-transitory computer readable medium of claim 11, wherein: the state parameter comprises a vector comprising two or more components configured to characterize one or more of a position, a motion, an orientation, an energy use, and an available energy of the robotic apparatus.
13. The non-transitory computer readable medium of claim 11, wherein: the robotic apparatus comprises a manipulator comprising first and second actuators; the first control signal is configured to actuate at least one of the first and second actuators; the state parameter comprises a vector comprising two or more components configured to characterize a configuration of the manipulator, the configuration being characterized by one or more of an orientation, an actuator torque, a position, and a motion.
14. The non-transitory computer readable medium of claim 13, wherein: the discrepancy is configured based on an intervention from a user, the intervention configured to alter state parameters of the first and the second actuators substantially contemporaneously with one another.
15. The non-transitory computer readable medium of claim 8, wherein: the discrepancy is configured based on the physical contact; individual ones of the current state and the first state are characterized by a state parameter; and the determination of the discrepancy is configured based on a comparison of the first state and the current state.
16. The non-transitory computer readable medium of claim 15, wherein: the determination of the discrepancy is configured based on the measurable difference, comprising a difference measure between the first state and the current state; the first state comprises a predicted state determined in accordance with a forward model of the robotic apparatus, the forward model configured to predict the first state of the robotic apparatus based on the context.
17. The non-transitory computer readable medium of claim 16, wherein: the forward model is characterized by a model parameter; and the determination of the discrepancy at a given time is configured to modify the model parameter so as to enable a determination of a third control signal at a subsequent time, the third control signal being capable of a transition of the robotic apparatus to the current state responsive to an occurrence of the context at the subsequent time.
18. The non-transitory computer readable medium of claim 8, wherein: the determination of the first control signal and the determination of the second control signal are configured in accordance with an online learning process characterized by a learning parameter configured to be updated at a plurality of time intervals; the determination of the discrepancy is configured to be effectuated at a given interval of the plurality of time intervals; and for a subsequent interval of the plurality of time intervals, the learning parameter is configured based on the discrepancy determined for the given interval.
19. The non-transitory computer readable medium of claim 18, wherein: the first control signal is determined during the given interval having the context associated therewith; and an update of the learning parameter based on the discrepancy during the given interval is configured to give raise to the second control signal responsive to an occurrence of the context during the subsequent interval.
20. The non-transitory computer readable medium of claim 18, wherein: the determination of the discrepancy is configured to be effectuated based on an indication provided during the physical contact, the indication being configured to convey information related to an occurrence of the state modification.
21. The non-transitory computer readable medium of claim 20, wherein: the indication comprises one or more visual signal comprising information related to the physical contact.
22. The non-transitory computer readable medium of claim 20, wherein: the robotic apparatus comprises an actuator configured to be controlled using one or more of the first and the second control signals; and the indication comprises a current torque value of the actuator associated with the current state.
23. The non-transitory computer readable medium of claim 20, wherein: the robotic apparatus comprises an actuator configured to be controlled using one or more of the first and the second control signals; and the indication comprises a current position value of the actuator associated with the current state.
24. An adaptive robot apparatus, comprising: a manipulator comprising first and second joints characterized by first and second joint angles, respectively; a sensor module configured to convey information related to one or more of an environment of the adaptive robot apparatus and the manipulator; and an adaptive controller operable in accordance with a learning process configured to: guide the manipulator to a target state in accordance with the information; determine a discrepancy between a target trajectory that corresponds to the target state and a current trajectory that corresponds to a current state; and update the learning process based on the discrepancy; wherein: the discrepancy is configured based on an intervention by a user, the intervention by the user comprising modification of the first and the second joint angles with a physical contact with the manipulator; and the updated learning process comprises determination of a correction signal, the correction signal configured to guide the manipulator to the current state based on an occurrence of the information.
25. The apparatus of claim 24, wherein: the learning process is configured in accordance with a teaching signal; the guiding of the manipulator to the target state is configured based on a control signal determined by the learning process in accordance with the information; and the teaching signal is configured based on the correction signal.
26. The apparatus of claim 25, wherein: the learning process is configured to be updated at one or more time intervals; the information comprises a time history of one or more of a sensor module output, a configuration of the manipulator, the control signal, and the teaching signal determined over one or more time intervals.
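Claims 11 and 12 describe the discrepancy as a distance measure between state-parameter vectors. A minimal sketch of one such measure follows; the choice of Euclidean distance, the function names, and the `threshold` used to flag a physical intervention are all assumptions for illustration, since the claims do not name a particular metric or any threshold.

```python
import math


def state_discrepancy(first_state, current_state):
    """Euclidean distance between two state-parameter vectors of equal length."""
    if len(first_state) != len(current_state):
        raise ValueError("state vectors must have the same number of components")
    return math.sqrt(
        sum((a - b) ** 2 for a, b in zip(first_state, current_state))
    )


def contact_detected(first_state, current_state, threshold=0.05):
    """Flag a likely physical intervention when the discrepancy exceeds a threshold.

    The threshold is a hypothetical tuning value, not taken from the patent.
    """
    return state_discrepancy(first_state, current_state) > threshold
```

For example, with a predicted state `[0.0, 0.0]` and a measured state `[3.0, 4.0]` the discrepancy is `5.0`, while identical vectors give `0.0` and no intervention is flagged.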
Patents cited by this patent (77)
Ito, Masato; Minamino, Katsuki; Yoshiike, Yukiko; Suzuki, Hirotaka; Kawamoto, Kenta, Apparatus and method for embedding recurrent neural networks into the nodes of a self-organizing map.
DeYong Mark R. (Las Cruces NM) Findley Randall L. (Austin TX) Eskridge Thomas C. (Las Cruces NM) Fields Christopher A. (Rockville MD), Asynchronous temporal neural processing element.
Frank D. Francone ; Peter Nordin SE; Wolfgang Banzhaf DE, Computer implemented machine learning method and system including specifically defined introns.
Spoerre Julie K. (Tallahassee FL) Lin Chang-Ching (Tallahassee FL) Wang Hsu-Pin (Tallahassee FL), Machine performance monitoring and fault classification using an exponentially weighted moving average scheme.
Grossberg Stephen (Newton Highlands MA) Kuperstein Michael (Brookline MA), Massively parellel real-time network architectures for robots capable of self-calibrating their operating parameters thr.
Sakaue Shiyuki (Yokohama JPX) Sugimoto Koichi (Hiratsuka JPX) Arai Shinichi (Yokohama JPX), Method and apparatus for controlling a robot hand along a predetermined path.
Peltola Tero (Helsinki FIX) Matakselka Jorma (Vantaa FIX) Harju Esa (Espoo FIX) Salovuori Heikki (Helsinki FIX) Keskinen Jukka (Vantaa FIX) Makinen Kari (Helsinki FIX) Roikonen Olli (Espoo FIX), Method for congestion management in a frame relay network and a node in a frame relay network.
Wilson Charles L. (Darnestown MD) Garris Michael D. (Gaithersburg MD) Wilkinson ; Jr. Robert A. (Hyattstown MD), Object/anti-object neural network segmentation.
Yokono, Jun; Sabe, Kohtaro; Costa, Gabriel; Ohashi, Takeshi, Operational control method, program, and recording media for robot device, and robot device.
Eguchi, Toru; Yamada, Akihiro; Kusumi, Naohiro; Sekiai, Takaaki; Fukai, Masayuki; Shimizu, Satoru, Plant control system and thermal power generation plant control system.
Hickman, Ryan; Kuffner, Jr., James J.; Bruce, James R.; Gharpure, Chaitanya; Kohler, Damon; Poursohi, Arshan; Francis, Jr., Anthony G.; Lewis, Thor, Shared robot knowledge base for use with cloud computing system.
Blumberg, Bruce; Brooks, Rodney; Buehler, Christopher J.; Deegan, Patrick A.; DiCicco, Matthew; Dye, Noelle; Ens, Gerry; Linder, Natan; Siracusa, Michael; Sussman, Michael; Williamson, Matthew M., Training and operating industrial robots.
Mochizuki, Yoshiyuki; Naka, Toshiya; Asahara, Shigeo, Virtual space control data receiving apparatus, virtual space control data transmission and reception system, virtual space control data receiving method, and virtual space control data receiving prog.