Apparatus and methods for operating robotic devices using selective state space training
IPC Classification
Country/Type
United States (US) Patent
Status: Granted
International Patent Classification (IPC, 7th edition)
G05B-019/18
B25J-009/16
G06N-099/00
G06N-003/00
G06N-003/04
Application Number
US-0070269 (2013-11-01)
Registration Number
US-9566710 (2017-02-14)
Inventors / Address
Passot, Jean-Baptiste
Sinyavskiy, Oleg
Izhikevich, Eugene
Applicant / Address
BRAIN CORPORATION
Agent / Address
Gazdzinski & Associates, PC
Citation Information
Cited by: 0
Cited patents: 98
Abstract
Apparatus and methods for training and controlling of e.g., robotic devices. In one implementation, a robot may be utilized to perform a target task characterized by a target trajectory. The robot may be trained by a user using supervised learning. The user may interface to the robot, such as via a control apparatus configured to provide a teaching signal to the robot. The robot may comprise an adaptive controller comprising a neuron network, which may be configured to generate actuator control commands based on the user input and output of the learning process. During one or more learning trials, the controller may be trained to navigate a portion of the target trajectory. Individual trajectory portions may be trained during separate training trials. Some portions may be associated with robot executing complex actions and may require additional training trials and/or more dense training input compared to simpler trajectory actions.
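The training scheme summarized in the abstract — a predictor whose output is combined with a user's teaching signal, with performance improving over repeated trials — can be illustrated with a short Python sketch. All names here (`AdaptiveController`, `train_trial`, the linear predictor, the learning rate) are illustrative assumptions for exposition, not terms or an implementation taken from the patent.

```python
class AdaptiveController:
    """Toy linear predictor standing in for the patent's adaptive controller."""

    def __init__(self, lr=0.1):
        self.w = 0.0        # single learning parameter
        self.lr = lr
        self.commands = []  # actuator commands issued so far

    def predict(self, sensory):
        """Predicted signal determined from the sensory input."""
        return self.w * sensory

    def adapt(self, sensory, error):
        """Supervised update: move the prediction toward the teaching input."""
        self.w += self.lr * error * sensory

    def actuate(self, command):
        self.commands.append(command)


def combined_signal(predicted, teaching, gate=True):
    """Transform function: additive combination in the transform state,
    pure teaching input (bypass state) when the gating signal is off."""
    return predicted + teaching if gate else teaching


def train_trial(controller, trajectory, teacher=None):
    """One trial over a portion of the target trajectory.

    With a teacher present, the combined signal drives the actuators and
    the controller adapts; without one, the predicted signal acts alone.
    Returns the accumulated prediction error for the trial.
    """
    total_error = 0.0
    for sensory in trajectory:
        predicted = controller.predict(sensory)
        if teacher is not None:
            teaching = teacher(sensory)
            controller.adapt(sensory, teaching - predicted)
            command = combined_signal(predicted, teaching)
            total_error += abs(teaching - predicted)
        else:
            command = predicted
        controller.actuate(command)
    return total_error
```

Run over successive trials with the same teacher, the per-trial error shrinks, mirroring the abstract's claim that later trials perform closer to the target trajectory; the bypass state simply forwards the teaching input unchanged.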
Representative Claims
1. An adaptive controller apparatus comprising a plurality of computer readable instructions configured to, when executed, cause a performance of a target task by a robot, the computer readable instructions configured to cause the adaptive controller apparatus to: during a first training trial comprising at least one action and performed without at least one second action, determine a predicted signal configured in accordance with a sensory input, the predicted signal being configured to cause execution of an action associated with the target task, the execution of the action associated with the target task being characterized by a first performance; during a second training trial, based on a teaching input and the predicted signal, determine a combined signal of the at least one action and the at least one second action, the combined signal configured to cause the execution of the action associated with the target task, the execution of the action associated with the target task during the second training trial being characterized by a second performance; and adjust a learning parameter of the adaptive controller apparatus based on the first performance and the second performance, the adjustment of the learning parameter comprising one or more iterative adjustments of at least the first performance of the first training trial and the second performance of the second training trial in alternation until the learning parameter reaches a target threshold; wherein the performance of the target task comprises the execution of the action associated with the target task and the at least one second action contemporaneously.
2. The apparatus of claim 1, wherein: the adjustment of the learning parameter is configured to enable the adaptive controller apparatus to determine, during a third training trial, another predicted signal configured in accordance with the sensory input; and the execution, based on the another predicted signal, of the action associated with the target task during the third training trial is characterized by a third performance that is closer to the target task compared to the first performance.
3. The apparatus of claim 2, wherein: the performance of the target task is characterized by a target trajectory in a state space; the execution of the action associated with the target task is characterized by a portion of the target trajectory having a state space extent associated therewith; and the state space extent occupies a minority fraction of the state space.
4. The apparatus of claim 2, wherein: the second training trial is configured to occur subsequent to the first training trial and prior to the third training trial; and the combined signal is effectuated based at least on a transform function comprising one or more operations including an additive operation.
5. The apparatus of claim 2, wherein: the combined signal is effectuated based at least on a transform function comprising one or more operations including a union operation; and the transform function is configured based at least on a gating signal configured to toggle a state of the transform function between: (i) a transform state configured to produce the combined signal; and (ii) a bypass state configured to produce a transform function output comprising the teaching input and independent of the predicted signal.
6. The apparatus of claim 5, wherein: the transform function bypass state is effectuated responsive to one or more of (a) a zero weight being assigned to the predicted signal, or (b) a zero signal being assigned to the predicted signal, the zero signal comprising a pre-defined value.
7. The apparatus of claim 2, wherein: the predicted signal is generated based at least on a learning process configured to be adapted at time intervals in accordance with the sensory input and a feedback; and the adaptation is based at least on an error measure between (i) the predicted signal generated at a given time interval and (ii) the feedback determined at another time interval prior to the given time interval.
8. A robotic apparatus comprising: a platform characterized by first and second degrees of freedom of motion; a sensor module configured to provide information related to an environment of the platform; and an adaptive controller apparatus configured to determine first and second control signals to facilitate operation of the first and the second degrees of freedom of motion of the robotic apparatus, respectively; wherein: the first and the second control signals are configured to cause the platform to perform a target action; the first control signal is determined in accordance with the information related to the environment of the platform and a teaching input; the second control signal is determined in an absence of the teaching input and in accordance with the information related to the environment of the platform and a configuration of the adaptive controller apparatus; and the configuration is determined based at least on an outcome of training of the adaptive controller apparatus to operate the second degree of freedom of motion of the robotic apparatus, the training to operate the second degree of freedom of motion being configured to occur with the first degree of freedom of motion of the robotic apparatus held static entirely during the training.
9. The apparatus of claim 8, wherein: the determination of the first control signal is effectuated based at least on a supervised learning process characterized by multiple iterations; and the performance of the target action in accordance with the first control signal at a given iteration is characterized by a first performance.
10. The apparatus of claim 9, wherein: the adaptive controller apparatus is configured to modify the configuration based at least on the teaching input, thereby enabling the adaptive controller apparatus to produce another version of the first control signal at another iteration subsequent to the given iteration and in the absence of the teaching input; and the performance of the target action in accordance with the another version of the first control signal is characterized by a second performance that is closer, relative to the first performance, to a target performance associated with the target action.
11. The apparatus of claim 10, wherein: the teaching input is associated with the operation of the first degree of freedom of motion; and a third performance associated with the performance of the target action at the given iteration absent the teaching input is lower compared to the first performance.
12. The apparatus of claim 9, wherein: the target action is characterized by a trajectory having a duration associated therewith; provision of the teaching input is characterized by a time interval configured to be shorter as compared to the duration; the information related to the environment of the platform comprises a characteristic of an object within the environment; and the target action is configured based on the characteristic of the object.
13. A method of operating a robotic controller apparatus, the robotic controller apparatus configured to cause a robot to perform a target task, the method comprising: during a first training trial comprising at least one action and performed without at least one second action: determining a predicted signal configured in accordance with a sensory input; and executing an action associated with the target task via the predicted signal, the execution of the action associated with the target task being characterized by a first performance; during a second training trial, based on a teaching input and the predicted signal: determining a combined signal of the at least one action and the at least one second action; and executing the action associated with the target task via the combined signal, the execution of the action associated with the target task during the second training trial being characterized by a second performance; and adjusting a learning parameter of the robotic controller apparatus based on the first performance and the second performance, the adjustment of the learning parameter comprising iteratively adjusting at least the first performance of the first training trial and the second performance of the second training trial in alternation until the learning parameter reaches a target threshold; wherein the performance of the target task comprises the execution of the action associated with the target task and the at least one second action contemporaneously.
14. The method of claim 13, wherein: the adjustment of the learning parameter comprises enabling the robotic controller apparatus to determine, during a third training trial, another predicted signal configured in accordance with the sensory input; and executing, based on the other predicted signal, the action associated with the target task during the third training trial, the third training trial characterized by a third performance that is closer to the target task as compared to the first performance.
15. The method of claim 14, wherein: the performance of the target task is characterized by a target trajectory in a state space; the execution of the action associated with the target task is characterized by a portion of the target trajectory having a state space extent associated therewith; and the state space extent is configured to occupy a minority fraction of the state space.
16. The method of claim 14, further comprising effectuating the combined signal based at least on a transform function comprising one or more operations including an additive operation; wherein the second training trial is configured to occur subsequent to the first training trial and prior to the third training trial.
17. The method of claim 14, further comprising effectuating the combined signal based at least on a transform function comprising one or more operations including a union operation; wherein the transform function is configured based at least on a gating signal configured to toggle a state of the transform function between: (i) a transform state configured to produce the combined signal; and (ii) a bypass state configured to produce a transform function output comprising the teaching input and independently of the predicted signal.
18. The method of claim 17, further comprising effectuating the bypass state in response to one or more of (i) a zero weight being assigned to the predicted signal and (ii) a zero signal being assigned to the predicted signal, the zero signal comprising a pre-defined value.
19. The method of claim 14, further comprising: generating the predicted signal based at least on a learning process configured to be adapted at time intervals in accordance with the sensory input and a feedback; and adapting the learning process based at least on an error measure between (i) the predicted signal generated at a given time interval and (ii) the feedback determined at another time interval prior to the given time interval.
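Claims 7 and 19 describe adapting the learning process from an error measure between the prediction at a given time interval and feedback determined at an earlier interval. A minimal Python sketch of such a delayed-feedback error measure follows; the function name, the buffer mechanism, and the one-step default delay are illustrative assumptions, not elements of the patent.

```python
from collections import deque


def delayed_error_adaptation(predictions, feedbacks, delay=1):
    """Compute per-interval errors between the prediction at interval t
    and the feedback determined `delay` intervals earlier (t - delay),
    as in the error measure of claims 7 and 19."""
    buffer = deque(maxlen=delay)  # holds the most recent `delay` feedbacks
    errors = []
    for pred, fb in zip(predictions, feedbacks):
        if len(buffer) == delay:
            # buffer[0] is the feedback from `delay` intervals ago
            errors.append(pred - buffer[0])
        buffer.append(fb)
    return errors
```

No error is emitted for the first `delay` intervals, since no prior feedback exists yet; each returned value could then drive a weight update of the learning process.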
Patents cited by this patent (98)
Werbos Paul J., 3-brain architecture for an intelligent decision and control system.
Ito, Masato; Minamino, Katsuki; Yoshiike, Yukiko; Suzuki, Hirotaka; Kawamoto, Kenta, Apparatus and method for embedding recurrent neural networks into the nodes of a self-organizing map.
DeYong Mark R. (Las Cruces NM) Findley Randall L. (Austin TX) Eskridge Thomas C. (Las Cruces NM) Fields Christopher A. (Rockville MD), Asynchronous temporal neural processing element.
Kerr Randal H. (Richford NY) Mesnard Robert M. (Endicott NY), Automatic generation of executable computer code which commands another program to perform a task and operator modificat.
Frank D. Francone ; Peter Nordin SE; Wolfgang Banzhaf DE, Computer implemented machine learning method and system including specifically defined introns.
Spoerre Julie K. (Tallahassee FL) Lin Chang-Ching (Tallahassee FL) Wang Hsu-Pin (Tallahassee FL), Machine performance monitoring and fault classification using an exponentially weighted moving average scheme.
Grossberg Stephen (Newton Highlands MA) Kuperstein Michael (Brookline MA), Massively parallel real-time network architectures for robots capable of self-calibrating their operating parameters thr.
Abdallah, Muhammad E; Platt, Robert; Wampler, II, Charles W.; Reiland, Matthew J; Sanders, Adam M, Method and apparatus for automatic control of a humanoid robot.
Sakaue Shiyuki (Yokohama JPX) Sugimoto Koichi (Hiratsuka JPX) Arai Shinichi (Yokohama JPX), Method and apparatus for controlling a robot hand along a predetermined path.
Peltola Tero (Helsinki FIX) Matakselka Jorma (Vantaa FIX) Harju Esa (Espoo FIX) Salovuori Heikki (Helsinki FIX) Keskinen Jukka (Vantaa FIX) Makinen Kari (Helsinki FIX) Roikonen Olli (Espoo FIX), Method for congestion management in a frame relay network and a node in a frame relay network.
Wilson Charles L. (Darnestown MD) Garris Michael D. (Gaithersburg MD) Wilkinson ; Jr. Robert A. (Hyattstown MD), Object/anti-object neural network segmentation.
Yokono, Jun; Sabe, Kohtaro; Costa, Gabriel; Ohashi, Takeshi, Operational control method, program, and recording media for robot device, and robot device.
Eguchi, Toru; Yamada, Akihiro; Kusumi, Naohiro; Sekiai, Takaaki; Fukai, Masayuki; Shimizu, Satoru, Plant control system and thermal power generation plant control system.
Coenen, Olivier, Proportional-integral-derivative controller effecting expansion kernels comprising a plurality of spiking neurons associated with a plurality of receptive fields.
Hickman, Ryan; Kuffner, Jr., James J.; Bruce, James R.; Gharpure, Chaitanya; Kohler, Damon; Poursohi, Arshan; Francis, Jr., Anthony G.; Lewis, Thor, Shared robot knowledge base for use with cloud computing system.
Shaffer Gary K. (Butler PA) Whittaker William L. (Pittsburgh PA) West Jay H. (Pittsburgh PA) Clow Richard G. (Phoenix AZ) Singh Sanjiv J. (Pittsburgh PA) Lay Norman K. (Peoria IL) Devier Lonnie J. (P, System and method for detecting obstacles in the path of a vehicle.
Blumberg, Bruce; Brooks, Rodney; Buehler, Christopher J.; Deegan, Patrick A.; DiCicco, Matthew; Dye, Noelle; Ens, Gerry; Linder, Natan; Siracusa, Michael; Sussman, Michael; Williamson, Matthew M., Training and operating industrial robots.
Mochizuki, Yoshiyuki; Naka, Toshiya; Asahara, Shigeo, Virtual space control data receiving apparatus,virtual space control data transmission and reception system, virtual space control data receiving method, and virtual space control data receiving prog.