Method for computer-aided control or regulation of a technical system
IPC classification information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G06E-001/00
G06E-003/00
G06F-015/18
G06G-007/00
G06N-003/00
Application number
US-0386639
(2009-04-21)
Registration number
US-8160978
(2012-04-17)
Priority information
DE-10 2008 020 379 (2008-04-23)
Inventor / Address
Schäfer, Anton Maximilian
Sterzing, Volkmar
Udluft, Steffen
Applicant / Address
Siemens Aktiengesellschaft
Citation information
Times cited: 3
Cited patents: 1
Abstract
A method for computer-aided control of any technical system is provided. The method includes two steps, the learning of the dynamic with historical data based on a recurrent neural network and a subsequent learning of an optimal regulation by coupling the recurrent neural network to a further neural network. The recurrent neural network has a hidden layer comprising a first and a second hidden state at a respective time point. The first hidden state is coupled to the second hidden state using a matrix to be learned. This allows a bottleneck structure to be created, in that the dimension of the first hidden state is smaller than the dimension of the second hidden state or vice versa. The autonomous dynamic is taken into account during the learning of the network, thereby improving the approximation capacity of the network. The technical system includes a gas turbine.
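The two-part hidden layer described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: it follows the update equations recited in claim 12, the matrix names (`A`, `A_hat`, `B`, `C`, `D`, `theta`) mirror the symbols there, and the dimensions, random weights, and helper names (`step`, `rollout`, `squared_error`) are hypothetical. The first hidden state has 2 variables and the second has 5, which gives the bottleneck; the learned matrix `A_hat` couples them. No training is performed here; the patent learns the weights by backpropagation.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def vadd(*vecs):
    """Element-wise sum of equally long vectors."""
    return [sum(vals) for vals in zip(*vecs)]


def rand_matrix(rows, cols, rng, scale=0.5):
    """Random placeholder weights (assumption: real weights would be learned)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def step(p, s_bar_prev, x, a):
    """One time step: returns (first hidden state, second hidden state, next state)."""
    # First hidden state: fed by the previous second hidden state and the state x.
    s_hat = vadd(matvec(p["A"], s_bar_prev), matvec(p["B"], x))
    # Second hidden state: fed by the first hidden state (via A_hat) and the action a.
    pre = vadd(matvec(p["A_hat"], s_hat), matvec(p["D"], a), p["theta"])
    s_bar = [math.tanh(v) for v in pre]
    # Output layer: predicted state at the next time point.
    x_next = matvec(p["C"], s_bar)
    return s_hat, s_bar, x_next


def rollout(p, s_bar0, known_states, actions):
    """Unroll over len(actions) steps: known states first, then own predictions."""
    s_bar, x_pred, preds = s_bar0, None, []
    for tau, a in enumerate(actions):
        x = known_states[tau] if tau < len(known_states) else x_pred
        _, s_bar, x_pred = step(p, s_bar, x, a)
        preds.append(x_pred)
    return preds


def squared_error(preds, targets):
    """Sum of squared differences between predicted and known states."""
    return sum((xp - xd) ** 2
               for pv, tv in zip(preds, targets)
               for xp, xd in zip(pv, tv))


# Hypothetical sizes: 3 state variables, 2 action variables,
# 2 first-hidden variables (bottleneck), 5 second-hidden variables.
N_X, N_A, N_HAT, N_BAR = 3, 2, 2, 5
rng = random.Random(0)
params = {
    "A": rand_matrix(N_HAT, N_BAR, rng),
    "B": rand_matrix(N_HAT, N_X, rng),
    "A_hat": rand_matrix(N_BAR, N_HAT, rng),
    "D": rand_matrix(N_BAR, N_A, rng),
    "C": rand_matrix(N_X, N_BAR, rng),
    "theta": [0.0] * N_BAR,
}
known_states = [[0.1, 0.2, 0.3], [0.2, 0.1, 0.0]]
actions = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.0, 0.0]]
preds = rollout(params, [0.0] * N_BAR, known_states, actions)
```

Swapping the values of `N_HAT` and `N_BAR` gives the "vice versa" variant mentioned in the abstract, where the second hidden state is the smaller one.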
Representative claims
1. A method for computer-aided control of a technical system, comprising:
characterizing a dynamic behavior of the technical system by a number of states and actions at a number of time points, a respective action at a respective time point resulting in a new state at a next time point;
modeling the dynamic behavior with a recurrent neural network by a training data comprising known states and known actions at the number of time points, wherein the recurrent neural network comprises:
an input layer comprising the states and the actions at the number of time points,
a hidden recurrent layer comprising a number of hidden states at the number of time points, and
an output layer comprising the states at the number of time points,
wherein a respective hidden state at the respective time point comprises a first hidden state and a second hidden state at the respective time point,
wherein a respective state in the input layer at the respective time point is associated with the first hidden state and the respective action in the input layer at the respective time point is associated with the second hidden state, and
wherein the first hidden state is coupled to the second hidden by a matrix which is learned during the modeling;
learning an action selection rule by coupling the recurrent neural network to a further neural network, wherein the further neural network comprises:
a further input layer comprising the hidden states of the recurrent neural network,
a further hidden layer comprising further hidden states, and
a further output layer comprising the actions and changes of the actions compared with temporally preceding actions; and
defining the states and the actions by coupling the recurrent neural network to the further neural network with the learned action selection rule.

2. The method as claimed in claim 1,
wherein the first hidden state in the hidden recurrent layer comprises a first number of variables,
wherein the second hidden state in the hidden recurrent layer comprises a second number of variables being different from the first number of variable, and
wherein the first number of variables are smaller than the second number of variables or vice versa.

3. The method as claimed in claim 1, wherein:
the state comprises one or more ambient variables,
the action comprises one or more action variables,
the hidden state comprises one or more hidden variables,
the further hidden state comprises one or more further hidden variables, and
a number of the hidden variables or a number of the further hidden variables is smaller than a number of the ambient variables.

4. The method as claimed in claim 1, wherein:
the actions comprise changeable manipulated variables of the technical system,
the changes in the further output layer are changes in the manipulated variables,
the changes are coupled to the actions by coupling matrixes,
the actions are coupled to the temporally preceding actions by unit matrixes, and
the coupling matrixes restrict or scale the changes.

5. The method as claimed in claim 1, wherein a number of discrete actions are predetermined and the further output layer at least partially comprises the discrete actions.

6. The method as claimed in claim 1, wherein the further input layer comprises the first hidden state at the respective time point.

7. The method as claimed in claim 1, wherein:
the selection rule is learned according to an evaluation function with a criteria relating to the states or the actions or the modeling,
the evaluation function parameterizes an optimal dynamic behavior of the technical system, and
the evaluation function comprises a cost function to be optimized.

8. The method as claimed in claim 1, wherein the modeling of the dynamic behavior minimizes an error between the states in the recurrent neural network and the known states in the training data.

9. The method as claimed in claim 1, wherein a nonlinear dynamic behavior of the technical system is modeled or a nonlinear action selection rule is learned.

10. The method as claimed in claim 1, wherein the model of the dynamic behavior or the learning of the action selection rule is performed by a backpropagation method.

11. The method as claimed in claim 1, wherein the recurrent neural network is a network with dynamically consistent temporal deconvolution with future states and actions.

12. The method as claimed in claim 1, wherein the modeling of the dynamic behavior is represented by following equations:

$$\bar{s}_\tau = \tanh(\hat{A}\hat{s}_\tau + D a_\tau^d + \theta)$$

$$x_{\tau+1} = C\bar{s}_\tau$$

with

$$\hat{s}_\tau = \begin{cases} A\bar{s}_{\tau-1} + B x_\tau^d & \forall\, \tau \le t \\ A\bar{s}_{\tau-1} + B x_\tau & \forall\, \tau > t \end{cases}$$

$$\sum_t \sum_\tau (x_\tau - x_\tau^d)^2 \to \min_{A, \hat{A}, B, C, D, \theta}$$

where a value range of $\tau$ comprises a predetermined number $m$ of time steps before a time point $t$ and a predetermined number $n$ of time steps after the time point $t$;
where $t \in \{m, \ldots, T-n\}$, where $T$ is a number of time points for which the training data is present;
where $x_\tau$ represents a state at the time point $\tau$ in the recurrent neural network;
where $x_\tau^d$ represents a known state at the time point $\tau$ in the training data;
where $a_\tau$ represents an action at the time point $\tau$ in the recurrent neural network;
where $a_\tau^d$ represents a known action at the time point $\tau$ in the training data;
where $\hat{s}_\tau$ represents the first hidden state and $\bar{s}_\tau$ represents the second hidden state at the time point $\tau$ in the hidden layer; and
where $I$ is an unit matrix and $\hat{A}$, $A$, $B$, $C$, $D$ are matrices to be defined and $\theta$ is a bias to be defined.

13. The method as claimed in claim 1, wherein the learning of the action selection rule is represented by following equations:

$$\bar{s}_\tau = \begin{cases} \tanh(\hat{A}\hat{s}_\tau + D a_\tau^d + \theta) & \forall\, \tau < t \\ \tanh(\hat{A}\hat{s}_\tau + D a_\tau + \theta) & \forall\, \tau \ge t \end{cases}$$

$$R_{\tau+1} = G\,h(C\bar{s}_\tau) \quad \text{for all } \tau \ge t$$

with

$$\hat{s}_\tau = \begin{cases} A\bar{s}_{\tau-1} + B x_\tau^d & \forall\, \tau \le t \\ A\bar{s}_{\tau-1} + B x_\tau & \forall\, \tau > t \end{cases}$$

and

$$a_\tau = a_{\tau-1} + H f(F \tanh(E\hat{s}_\tau + b)) \quad \text{for all } \tau \ge t$$

$$\sum_t \sum_{\tau > t} c(R_\tau) \to \min_{E, F, b}$$

where $G$ is a matrix and $h$ is an activation function mapping a state $x_{\tau+1}$ onto a further hidden state $R_{\tau+1}$ of relevance to a cost function $c(\cdot)$;
where $f$ is an other activation function;
where $E$ and $F$ are matrices to be defined and $b$ is a bias to be defined;
where $H$ is a matrix for adapting the changes in the actions.

14. The method as claimed in claim 1, wherein:
the technical system comprises a turbine, and
the turbine is a gas turbine.

15. The method as claimed in claim 1, wherein:
a resulting recurrent neural network is generated by coupling the learned action selection rule to the further neural network, and
the actions are defined by the resulting recurrent neural network.

16. The method as claimed in claim 1, wherein:
the technical system is computer-aidedly controlled at regular intervals,
a new training data is generated by new states and actions resulting during the control, and
a resulting recurrent neural network is generated by coupling the learned action selection rule to the further neural network, and
further actions are selected by the resulting recurrent neural network.

17. A method for computer-aided simulation of a technical system, comprising:
characterizing a dynamic behavior of the technical system by a number of states and actions at a number of time points, a respective action at a respective time point resulting in a new state at a next time point;
modeling the dynamic behavior with a recurrent neural network by a training data comprising known states and known actions at the number of time points, wherein the recurrent neural network comprises:
an input layer comprising the states and the actions at the number of time points,
a hidden recurrent layer comprising a number of hidden states at the number of time points, and
an output layer comprising the states at the number of time points,
wherein a respective hidden state at the respective time point comprises a first hidden state and a second hidden state at the respective time point,
wherein a respective state in the input layer at the respective time point is associated with the first hidden state and the respective action in the input layer at the respective time point is associated with the second hidden state, and
wherein the first hidden state is coupled to the second hidden by a matrix which is learned during the modeling; and
simulating the dynamic behavior by defining the new state at the next time based on the modeling.

18. A computer program product executable on a computer for computer-aided control of a technical system, comprising:
a computer program code that performs steps of:
characterizing a dynamic behavior of the technical system by a number of states and actions at a number of time points, a respective action at a respective time point resulting in a new state at a next time point;
modeling the dynamic behavior with a recurrent neural network by a training data comprising known states and known actions at the number of time points, wherein the recurrent neural network comprises:
an input layer comprising the states and the actions at the number of time points,
a hidden recurrent layer comprising a number of hidden states at the number of time points, and
an output layer comprising the states at the number of time points,
wherein a respective hidden state at the respective time point comprises a first hidden state and a second hidden state at the respective time point,
wherein a respective state in the input layer at the respective time point is associated with the first hidden state and the respective action in the input layer at the respective time point is associated with the second hidden state, and
wherein the first hidden state is coupled to the second hidden by a matrix which is learned during the modeling;
learning an action selection rule by coupling the recurrent neural network to a further neural network, wherein the further neural network comprises:
a further input layer comprising the hidden states of the recurrent neural network,
a further hidden layer comprising further hidden states, and
a further output layer comprising the actions and changes of the actions compared with temporally preceding actions; and
defining the states and the actions by coupling the recurrent neural network to the further neural network with the learned action selection rule.
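Claims 4 and 13 describe the further output layer as producing a change in the action that is added to the preceding action, with a coupling matrix restricting or scaling that change. A minimal sketch of one such update, assuming the second activation function `f` is `tanh` and the coupling matrix `H` is diagonal (both are assumptions; the claims only say "an other activation function" and "a matrix for adapting the changes"), could look as follows. All weight values and the function name `next_action` are hypothetical.

```python
import math


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def next_action(a_prev, s_hat, E, F, H, b, f=math.tanh):
    """Compute a_tau = a_prev + H f(F tanh(E s_hat + b)).

    a_prev: preceding action; s_hat: first hidden state of the recurrent network.
    """
    # Further hidden layer of the control network.
    hidden = [math.tanh(v + bias) for v, bias in zip(matvec(E, s_hat), b)]
    # Raw change of each manipulated variable, squashed into (-1, 1) when f is tanh.
    raw_change = [f(v) for v in matvec(F, hidden)]
    # Coupling matrix H scales (and thereby limits) the change per time step.
    change = matvec(H, raw_change)
    # Identity coupling to the preceding action.
    return [ap + c for ap, c in zip(a_prev, change)]


# Hypothetical weights: 2 first-hidden variables, 3 further hidden variables, 2 actions.
E = [[0.4, -0.3], [0.1, 0.8], [-0.6, 0.2]]
F = [[1.5, -2.0, 0.7], [0.3, 0.9, -1.1]]
H = [[0.1, 0.0], [0.0, 0.05]]  # at most 0.1 and 0.05 change per step
b = [0.0, 0.1, -0.1]

a_prev = [0.5, 0.5]
a_new = next_action(a_prev, [0.9, -0.4], E, F, H, b)
```

With a bounded `f` and a diagonal `H`, the magnitude of each manipulated variable's change per time step cannot exceed the corresponding diagonal entry of `H`, which is one way to read "restrict or scale the changes" in claim 4.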
Patents cited by this patent (1)
Rajamani Ravi ; Chbat Nicolas Wadih ; Ashley Todd Alan, Controller with neural network for estimating gas turbine internal cycle parameters.