Method for the computer-aided regulation and/or control of a technical system, especially a gas turbine
IPC classification
Country/Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G05B-013/02
G06G-007/70
G06G-007/00
G06F-017/50
G06F-015/18
Application number
US-0521144
(2007-12-19)
Registration number
US-8099181
(2012-01-17)
Priority information
DE-10 2007 001 024 (2007-01-02)
International application number
PCT/EP2007/064262
(2007-12-19)
§371/§102 date
2009-06-25
International publication number
WO2008/080862
(2008-07-10)
Inventors / Address
Sterzing, Volkmar
Udluft, Steffen
Applicant / Address
Siemens Aktiengesellschaft
Citation information
Cited by: 11
Cited patents: 6
Abstract
A method for the computer-aided regulation and/or control of a technical system is provided. In the method, first a simulation model of the technical system is created, to which subsequently a plurality of learning and/or optimization methods are applied. Based on the results of these methods, the method best suited for the technical system is selected. The selected learning and/or optimization method is then used to regulate the technical system. Based on the simulation model, the method can thus be used to train an initial controller, which can be used as an intelligent controller, and is not modified during further regulation of the technical system.
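The workflow described in the abstract — run several candidate learning/optimization methods against a simulation model, score each, and keep the best-suited one to regulate the system — can be sketched as follows. This is a minimal illustrative sketch only: the one-dimensional toy "plant", the scoring rule, and all function names are assumptions for illustration, not taken from the patent.

```python
# Illustrative sketch (assumed names, toy plant) of: try candidate methods
# on a simulation model, score each, select the best one for regulation.

def simulation_model(state, action):
    """Toy surrogate of the technical system: the state drifts toward the action."""
    return state + 0.5 * (action - state)

def evaluate(policy, setpoint=1.0, steps=50):
    """Score a candidate policy on the simulation model (higher is better)."""
    state, score = 0.0, 0.0
    for _ in range(steps):
        state = simulation_model(state, policy(state))
        score -= abs(state - setpoint)  # penalize deviation from the setpoint
    return score

def select_best_method(candidates):
    """Apply every candidate to the model and keep the best-scoring one."""
    return max(candidates, key=evaluate)

# Two toy "learned" controllers standing in for trained methods.
def direct_controller(state):
    return 1.0                           # command the setpoint directly

def cautious_controller(state):
    return state + 0.1 * (1.0 - state)   # creep toward the setpoint

best = select_best_method([direct_controller, cautious_controller])
```

The selected controller would then be used unchanged on the real system, matching the abstract's notion of a fixed initial controller trained purely on the simulation model.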
Representative claims
1. A method for a computer-aided regulation and/or a control of a technical system, comprising: creating a simulation model of the technical system using a first plurality of states of the technical system, each consecutive state occurring later than a previous state; applying a plurality of learning and/or optimization methods to the simulation model, the plurality of learning and/or optimization methods delivering a learned parameter and a sequence of states as a result in each case, the first plurality of states delivering a first plurality of actions, and wherein an action is assigned to a state leading to a new state in the sequence; selecting a learning and/or optimization method from the plurality of learning and/or optimization methods for the regulation of the technical system using the results of the plurality of learning and/or optimization methods in accordance with a predetermined criteria, the selection of the learning and/or optimization method is a function of an evaluation of each learning and/or optimization method, and the evaluation is output by the simulation model and/or is determined using the result of the respective learning and/or optimization method; and regulating the technical system with the selected learning and/or optimization method, wherein the regulation specifies a subsequent action to be performed on the technical system as a function of the state of the technical system.

2. The method as claimed in claim 1, wherein the regulating uses the selected learning and/or optimization method on the basis of the learned parameter, and wherein the learned parameter is not changed during the regulating of the technical system.

3. The method as claimed in claim 1, wherein the learned parameter is used at a beginning of the regulating, and wherein during the regulating the learned parameter is recalculated using the new state and the action produced during the regulating.

4. The method as claimed in claim 1, wherein the learned parameter is reset to a predetermined value and then recalculated during the regulating.

5. The method as claimed in claim 1, wherein the simulation model is created using a recurrent neural network.

6. The method as claimed in claim 1, wherein the evaluation is a measure of a quality of the learning and/or optimization method in relation to a second evaluation or a reward function.

7. The method as claimed in claim 1, wherein the plurality of learning and/or optimization methods applied to the simulation model, comprising: a reinforcement learning method, comprising: modeling a dynamic behavior of the technical system with the recurrent neural network using training data including the first plurality of states and the first plurality of actions determined by the simulation model at a plurality of different times; learning an action selection rule by the recurrent neural network for a current time and a future time and coupling the recurrent neural network to a second neural network; and determining the first plurality of states and the first plurality of actions by the recurrent neural network and coupling the recurrent neural network to the second neural network using a plurality of learned action selection rules, wherein the recurrent neural network is formed by a first input layer including the first plurality of states and the first plurality of actions performed on the technical system for the plurality of times, a first hidden recurrent layer including a first plurality of hidden states, and a first output layer including the first plurality of states for the plurality of different times, wherein the second neural network includes a second input layer, a second hidden layer including a second plurality of hidden states, and a second output layer, and wherein the second input layer at a point in time including a part of the first plurality of hidden states at the point in time and the second output layer including the action performed on the technical system at the point in time.

8. The method as claimed in claim 7, wherein the reinforcement learning method is a table-based reinforcement learning method.

9. The method as claimed in claim 1, wherein the plurality of the learning and/or optimization methods applied to the simulation model includes an adaptive heuristic criticism algorithm and/or a Q learning algorithm and/or a prioritized sweeping algorithm.

10. The method as claimed in claim 1, wherein the state of the technical system includes a plurality of state variables in a first state space with a first dimension and/or the action assigned to the state includes a plurality of action variables.

11. The method as claimed in claim 10, wherein a minimization of the first state space is done before the applying for a part of each learning and/or optimization method, wherein the minimization of the first state space includes modeling the first plurality of states using the recurrent neural network with an aid of training data, wherein the recurrent neural network includes a first input layer, a first recurrent hidden layer and a first output layer, wherein the first input layer and the first output layer are formed by the first plurality of states in the first state space for a plurality of points in time, wherein the first recurrent hidden layer is formed by a first plurality of hidden states, with a plurality of hidden state variables in a second state space with a second dimension, with a second dimension being lower than a first dimension, and wherein after the minimization the respective learning and/or optimization method is executed in a reduced second state space of the plurality of the hidden states.

12. The method as claimed in claim 1, wherein a change to a manipulated variable of the technical system causes a change to the action assigned to the state.

13. The method as claimed in claim 1, wherein the applying further comprises discretizing the first plurality of states and/or the first plurality of actions as a function of a prespecified criteria.

14. The method as claimed in claim 1, wherein during the applying a range of values is defined for the first plurality of states and/or the corresponding first plurality of actions.

15. The method as claimed in claim 14, wherein during the applying the range of values are realized by a penalty signal in an application of the respective learning and/or optimization method to the simulation model, wherein a strength of the penalty signal corresponds with an increase in a deviation of the first plurality of states and/or a first plurality of actions, defined from the learning and/or optimization method, to a plurality of measured or allowed states and/or plurality of measured or allowed actions.

16. The method as claimed in claim 1, wherein a gas turbine is regulated using the method, and wherein the first plurality of states and/or the first plurality of actions assigned to the states comprise at least one variable selected from the group consisting of overall power of the turbine, a pressure in the gas turbine, the pressure in a vicinity of the gas turbine, a temperature in the gas turbine, the temperature in the vicinity of the gas turbine, a combustion chamber acceleration in the gas turbine, a setting parameter on the gas turbine, and any combination thereof.

17. The method as claimed in claim 16, wherein the plurality of learning and/or optimization methods applied to the simulation model include a low combustion chamber acceleration as a learning target and/or as an optimization target.

18. The method as claimed in claim 1, wherein the technical system is a gas turbine.

19. A computer program product with program code stored on a machine-readable medium, when the program executes on a processor of a computer, the program comprising: creating a simulation model of a technical system using a first plurality of states of the technical system, each consecutive state occurring later than a previous state; applying a plurality of learning and/or optimization methods to the simulation model, the plurality of learning and/or optimization methods delivering a learned parameter and a sequence of states as a result in each case, the first plurality of states delivering a first plurality of actions, and wherein an action is assigned to a state leading to a new state in the sequence; selecting a learning and/or optimization method from the plurality of learning and/or optimization methods for the regulation of the technical system using the results of the plurality of learning and/or optimization methods in accordance with a predetermined criteria, the selection of the learning and/or optimization method is a function of an evaluation of each learning and/or optimization method, and the evaluation is output by the simulation model and/or is determined using the result of the respective learning and/or optimization method; and regulating the technical system with the selected learning and/or optimization method, wherein the regulation specifies a subsequent action to be performed on the technical system as a function of the state of the technical system.
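Claim 9 names table-based methods such as Q learning among the candidates applied to the simulation model. As a hedged illustration of what such a candidate looks like, the sketch below runs tabular Q learning against a toy two-state simulated plant; the MDP, the constants, and all names are assumptions for illustration and do not come from the patent.

```python
import random

# Minimal tabular Q-learning sketch on a toy two-state, two-action MDP
# standing in for the simulation model (all values illustrative).
random.seed(0)
N_STATES, N_ACTIONS = 2, 2
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def step(state, action):
    """Toy simulation model: action 1 moves the plant to the 'good' state 1."""
    next_state = 1 if action == 1 else 0
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, reward

alpha, gamma, eps = 0.1, 0.9, 0.1   # learning rate, discount, exploration
state = 0
for _ in range(2000):
    if random.random() < eps:        # epsilon-greedy exploration
        action = random.randrange(N_ACTIONS)
    else:
        action = max(range(N_ACTIONS), key=lambda a: Q[state][a])
    next_state, reward = step(state, action)
    # Standard Q-learning update toward the bootstrapped target.
    Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
    state = next_state

# The learned parameter here is the Q table; the greedy policy read off it
# should choose action 1 in every state.
policy = [max(range(N_ACTIONS), key=lambda a: Q[s][a]) for s in range(N_STATES)]
```

In the scheme of claim 1, the Q table would be the "learned parameter" delivered by this candidate, and its score on the simulation model would feed the selection step.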
Patents cited by this patent (6)
Kojima Yasuhiro (Hyogo JPX) Izui Yoshio (Hyogo JPX) Goda Tadahiro (Hyogo JPX) Kyomoto Sumie (Hyogo JPX), Control method using neural networks and a voltage/reactive-power controller for a power system using the control method.
Neubauer Werner (Munchen DEX) Bocionek Siegfried (Munchen DEX) Moller Marcus (Munchen DEX) Joppich Martin (Unterhaching DEX), Process for optimizing control parameters for a system having an actual behavior depending on the control parameters.
Brummel, Hans-Gerd; Düll, Siegmund; Singh, Jatinder P.; Sterzing, Volkmar; Udluft, Steffen, Method for the computerized control and/or regulation of a technical system.
Chandler, Christopher, Optimization of gas turbine combustion systems low load performance on simple cycle and heat recovery steam generator applications.