Method for the computer-assisted control and/or regulation of a technical system where the dynamic behavior of the technical system is modeled using a recurrent neural network
IPC Classification Information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G06E-001/00
G06E-003/00
G06F-015/18
G06G-007/00
Application Number
US-0522040
(2007-12-19)
Registration Number
US-8554707
(2013-10-08)
Priority Information
DE-10 2007 001 025 (2007-01-02)
International Application Number
PCT/EP2007/064265
(2007-12-19)
§371/§102 date
2009-12-22
International Publication Number
WO2008/080864
(2008-07-10)
Inventors / Address
Schäfer, Anton Maximilian
Udluft, Steffen
Zimmermann, Hans-Georg
Applicant / Address
Siemens Aktiengesellschaft
Citation Information
Cited by: 5
Patents cited: 1
Abstract
A method for the computer-assisted control and/or regulation of a technical system is provided. The method includes two steps: modeling the dynamic behavior of the technical system with a recurrent neural network using training data, wherein the recurrent neural network includes states and actions determined using a simulation model at different times; and learning an action selection rule by coupling the recurrent neural network to a further neural network. The method can be used with any technical system in order to control the system in an optimum computer-assisted manner. For example, the method can be used in the control of a gas turbine.
Representative Claims
1. A method for computer-aided control and/or regulation of a technical system, comprising:
modeling a dynamic behavior of the technical system, wherein the technical system is described by a state (xt) of the technical system and an action (at) performed on the technical system for a plurality of points in time (t), with a respective action (at) leading at a respective point in time (t) into a new state (xt+1) of the technical system at a next point in time (t+1); wherein the modeling of the dynamic behavior of the technical system uses a recurrent neural network using training data, the recurrent neural network includes a plurality of states (xt) and a plurality of actions (at) determined using a simulation model at a plurality of different times (t), wherein the recurrent neural network is formed by a first input layer (I) including the plurality of states (xt) and the plurality of actions (at) performed on the technical system for the plurality of different times (t), a hidden recurrent layer (H) including a first plurality of hidden states (st, pt), and an output layer (O) including the plurality of states (xt) for the plurality of different times (t);
learning an action selection rule by the recurrent neural network by coupling the recurrent neural network to a further neural network for a current time and future times, wherein the further neural network comprises a feed-forward network that includes a second input layer, a second hidden layer (R) including a second plurality of hidden states (rt), and a second output layer (O′), wherein the further neural network uses as its second input layer a respective part of the first plurality of hidden states (pt) of the hidden recurrent layer (H) at a respective point in time (t), which is coupled to the second hidden layer (R), which is coupled to the second output layer (O′) that comprises a predicted action (at) to be performed on the technical system at the respective point in time, the predicted action (at) being fed forward to the hidden state (st) of the hidden recurrent layer (H) such that the predicted actions learned by the further neural network, rather than externally input future actions, are used for learning the action selection rule; and
determining the state of the technical system and the action to be performed on the technical system by the recurrent neural network coupled to the further neural network using a plurality of learned action selection rules.
2. The method as claimed in claim 1, wherein the action selection rule is learned using an evaluation function which takes into account a criterion in relation to the plurality of states and/or the plurality of actions performed on the technical system.
3. The method as claimed in claim 2, wherein the evaluation function is selected based on an optimum dynamic behavior of the technical system.
4. The method as claimed in claim 3, wherein the evaluation function is represented by a cost function to be optimized.
5. The method as claimed in claim 1, wherein the state of the technical system includes an environment variable and/or the action to be performed on the technical system includes an action variable and/or a hidden state of the recurrent neural network and/or of the further neural network includes a hidden variable.
6. The method as claimed in claim 5, wherein a number of the hidden variables of a hidden state of the recurrent neural network and/or of the further neural network is less than a number of the environment variables of the state of the technical system.
7. The method as claimed in claim 1, wherein during the modeling of the recurrent neural network an error between the plurality of states and a plurality of states of the training data is minimized.
8. The method as claimed in claim 1, wherein a non-linear dynamic behavior of the technical system is modeled during the modeling and/or a non-linear action selection rule is learned during the learning.
9.
The method as claimed in claim 1, wherein during the modeling and/or during the learning a back propagation method is used.
10. The method as claimed in claim 1, wherein the recurrent neural network is a network including dynamically consistent overshooting which takes into account a plurality of future states and a plurality of future actions.
11. The method as claimed in claim 1, wherein the modeling of the dynamic behavior of the technical system using the recurrent neural network is represented by the following equations:
s_τ = tanh(I p_τ + D a_τ + θ)
x_(τ+1) = C s_τ
with p_τ = A s_(τ−1) + B x_τ
Σ_t Σ_τ (x_τ − x_τ^d)² → min over A, B, C, D, θ
wherein τ represents a range of values which includes a predetermined number m of time steps before the time t and a predetermined number n of time steps after the time t,
wherein t ∈ {m, …, T−n}, with T being the number of times for which training data is present,
wherein x_τ represents the state of the technical system determined by the recurrent neural network at the point in time τ,
wherein x_τ^d represents the state of the technical system at time τ in accordance with the training data,
wherein a_τ represents the action at time τ,
wherein s_τ and p_τ represent the hidden states at time τ of the hidden layer of the recurrent neural network, and
wherein I is a unity matrix, A, B, C, D are matrices to be determined, and θ is a bias to be determined.
12. The method as claimed in claim 11, wherein the learning of the action selection rule is represented by the following equations:
s_τ = tanh(I p_τ + D a_τ + θ)
R_(τ+1) = G h(C s_τ) for all τ > t
with p_τ = A s_(τ−1) + B x_τ
and a_τ = f(F tanh(E p_τ + b)) for all τ > t
Σ_t Σ_(τ>t) c(R_τ) → min over E, F, b
wherein G is a matrix and h is an activation function which maps a state x_(τ+1) of the technical system onto a state R_(τ+1) relevant for a cost function c(·),
wherein f is a given activation function, and
wherein E and F are matrices to be determined and b is a bias to be determined.
13. The method as claimed in claim 1, wherein the technical system is a turbine.
14.
The method as claimed in claim 13, wherein the technical system is a gas turbine.
15. The method as claimed in claim 1, wherein at the start of a control function the modeling, the learning, and the determining are performed and the recurrent neural network coupled to the further neural network thus produced is used along with the learned action selection rule to determine the plurality of actions.
16. The method as claimed in claim 1, wherein during the control function the modeling, the learning, and the determining are executed at a regular interval, whereby during the execution a plurality of new states and a plurality of new actions produced are taken into account as new training data and/or additional training data, and whereby after the execution the recurrent neural network produced and coupled to the further neural network is used with the learned action selection rule to select a plurality of further actions.
17. A computer program product with program code stored on a non-transitory machine-readable medium, the program, when executed on a processor of a computer, performing a method comprising:
modeling a dynamic behavior of a technical system, wherein the technical system is described by a state (xt) of the technical system and an action (at) performed on the technical system for a plurality of points in time (t), with a respective action (at) leading at a respective point in time (t) into a new state (xt+1) of the technical system at a next point in time (t+1); wherein the modeling of the dynamic behavior of the technical system uses a recurrent neural network using training data, the recurrent neural network includes a plurality of states (xt) and a plurality of actions (at) determined using a simulation model at a plurality of different times (t), wherein the recurrent neural network is formed by a first input layer (I) including the plurality of states (xt) and the plurality of actions (at) performed on the technical system for the plurality of different times (t), a hidden recurrent layer (H) including a plurality of hidden states (st, pt), and an output layer (O) including the plurality of states (xt) for the plurality of different times (t);
learning an action selection rule by the recurrent neural network by coupling the recurrent neural network to a further neural network for a current time and future times, wherein the further neural network comprises a feed-forward network that includes a second input layer, a second hidden layer (R) including a second plurality of hidden states (rt), and a second output layer (O′), wherein the further neural network uses as its second input layer a respective part of the first plurality of hidden states (pt) of the hidden recurrent layer (H) at a respective point in time (t), which is coupled to the second hidden layer (R), which is coupled to the second output layer (O′) that comprises a predicted action (at) to be performed on the technical system at the respective point in time, the predicted action (at) being fed forward to the hidden state (st) of the hidden recurrent layer (H) such that the predicted actions learned by the further neural network, rather than externally input future actions, are used for learning the action selection rule; and
determining the state of the technical system and the action to be performed on the technical system by the recurrent neural network coupled to the further neural network using a plurality of learned action selection rules.
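To make the recurrent dynamics of claim 11 concrete, the following is a minimal NumPy sketch of the unfolded network: p_τ = A s_(τ−1) + B x_τ, s_τ = tanh(I p_τ + D a_τ + θ), x_(τ+1) = C s_τ, with the quadratic error Σ (x_τ − x_τ^d)² to be minimized over A, B, C, D, θ. The dimensions, random initialization, and toy data are hypothetical illustrations, not taken from the patent, and the optimization step itself (e.g. the back propagation named in claim 9) is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: state x_t in R^3, action a_t in R^2,
# hidden states s_t/p_t in R^4, T = 10 training time steps.
nx, na, nh, T = 3, 2, 4, 10

# Matrices A, B, C, D and bias theta "to be determined" (claim 11);
# here they are only randomly initialized.
A = rng.normal(scale=0.1, size=(nh, nh))
B = rng.normal(scale=0.1, size=(nh, nx))
C = rng.normal(scale=0.1, size=(nx, nh))
D = rng.normal(scale=0.1, size=(nh, na))
theta = np.zeros(nh)

def forward(x_seq, a_seq):
    """Unfold the recurrent network of claim 11 over a sequence:
       p_t = A s_{t-1} + B x_t
       s_t = tanh(p_t + D a_t + theta)   # I is the unity matrix, so I p_t = p_t
       x_{t+1} = C s_t
    Returns the predicted next states x_{t+1} for each input step."""
    s = np.zeros(nh)
    preds = []
    for x, a in zip(x_seq, a_seq):
        p = A @ s + B @ x
        s = np.tanh(p + D @ a + theta)
        preds.append(C @ s)
    return np.array(preds)

# Toy stand-in for training data produced by a simulation model.
x_data = rng.normal(size=(T, nx))
a_data = rng.normal(size=(T, na))

# Predict x_{t+1} from (x_t, a_t) and accumulate the modeling error
# that the training procedure would minimize over A, B, C, D, theta.
x_pred = forward(x_data[:-1], a_data[:-1])
err = np.sum((x_pred - x_data[1:]) ** 2)
```

In the full method of claim 12, a second feed-forward network would additionally read p_τ and produce the future actions a_τ instead of taking them from data; the sketch above covers only the system-identification step.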
Patents cited by this patent (1)
Rajamani, Ravi; Chbat, Nicolas Wadih; Ashley, Todd Alan, Controller with neural network for estimating gas turbine internal cycle parameters.
Brummel, Hans-Gerd; Düll, Siegmund; Singh, Jatinder P.; Sterzing, Volkmar; Udluft, Steffen, Method for the computerized control and/or regulation of a technical system.