IPC Classification Information
Country / Type | United States (US) Patent, Registered
International Patent Classification (IPC 7th edition) |
Application number | US-0675555 (2008-08-26)
Registration number | US-8447706 (2013-05-21)
Priority information | DE-10 2007 042 440 (2007-09-06)
International application number | PCT/EP2008/061115 (2008-08-26)
§371/§102 date | 20100226 (20100226)
International publication number | WO2009/033944 (2009-03-19)
Inventors / Address |
- Schneegaß, Daniel
- Udluft, Steffen
Applicant / Address |
- Siemens Aktiengesellschaft
Citation information | Times cited: 8; Patents cited: 4
Abstract
A method for a computer-aided control of a technical system is provided. The method involves use of a cooperative learning method and artificial neural networks. In this context, feed-forward networks are linked to one another such that the architecture as a whole meets an optimality criterion. The network approximates the rewards observed to the expected rewards as an appraiser. In this way, exclusively observations which have actually been made are used in optimum fashion to determine a quality function. In the network, the optimum action in respect of the quality function is modeled by a neural network, the neural network supplying the optimum action selection rule for the given control problem. The method is specifically used to control a gas turbine.
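In standard reinforcement-learning notation, the "appraiser" idea in the abstract can be read roughly as the relation below. This is an interpretive sketch, not a formula taken from the record. All symbols are assumptions: s_i, a_i, r_i and s_{i+1} stand for the state, action, observed reward and sequential state of observation i; Q is the quality function; \pi is the action selection rule supplied by the second network; and \gamma in [0, 1) is a discount factor, which the abstract does not mention.

```latex
r_i \;\approx\; Q(s_i, a_i) \;-\; \gamma \, Q\bigl(s_{i+1}, \pi(s_{i+1})\bigr)
```

Under this reading, the linked networks are fitted so that the right-hand side reproduces the rewards that were actually observed, which is why only recorded observations are needed to determine Q.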
Representative claim
1. A method for computer-aided control of a technical system, comprising: characterizing a dynamic behavior of the technical system for a plurality of time points by a state of the technical system and an action carried out on the technical system for each time point, wherein an action at a time point results in a sequential state of the technical system at a next time point; learning an action selection rule with a plurality of data records, wherein each data record comprises the state of the technical system at the time point, the action carried out in the state and the sequential state, and wherein an evaluation is assigned to each data record, the learning of the action selection rule comprising: modeling of a quality function by a first neural network comprising the states and actions of the technical system as parameters; learning the first neural network based on an optimality criterion, which is a function of the evaluations of the data records and the quality function, an optimum action in respect of the quality function being modeled by a second neural network, which is learned based on the quality function; and controlling the technical system such that the actions to be carried out on the technical system are selected using the action selection rule based upon the second neural network.
2. The method as claimed in claim 1, wherein the quality function is modeled by the first neural network such that an evaluation function is tailored to the evaluations of the data records.
3. The method as claimed in claim 1, wherein the optimum action in respect of the quality function is the action which maximizes the quality function.
4. The method as claimed in claim 1, wherein the first neural network forms a feed-forward network with an input layer comprising a respective state of the technical system and the action to be carried out in the respective state, one or more hidden layers and an output layer comprising the quality function.
5. The method as claimed in claim 1, wherein the second neural network forms a feed-forward network with an input layer comprising a respective sequential state of the technical system, one or more hidden layers and an output layer comprising the optimum action in the sequential state in respect of the quality function.
6. The method as claimed in claim 1, wherein a backpropagation method is used to learn the first neural network and the second neural network.
7. The method as claimed in claim 1, wherein the optimality criterion is selected such that an optimum dynamic behavior of the technical system is parameterized.
8. The method as claimed in claim 1, wherein the optimality criterion is the minimization of the Bellman residual.
9. The method as claimed in claim 1, wherein the optimality criterion is the reaching of the fixed point of the Bellman iteration.
10. The method as claimed in claim 1, wherein the optimality criterion is the minimization of a modified Bellman residual, the modified Bellman residual comprising an auxiliary function, which is a function of the state of the technical system and the action to be carried out in the respective state.
11. The method as claimed in claim 10, wherein the auxiliary function is modeled by a third neural network, which is learned based upon the optimality criterion, the third neural network forming a feed-forward network with an input layer comprising a respective state of the technical system and the action to be carried out in the respective state, one or more hidden layers and an output layer comprising the auxiliary function.
12. The method as claimed in claim 1, wherein the optimality criterion comprises an adjustable parameter, and wherein the optimality criterion is adapted based upon a change of the adjustable parameter.
13. The method as claimed in claim 1, wherein a state of the technical system comprises one or more variables, in particular observed state variables of the technical system.
14. The method as claimed in claim 1, wherein an action to be carried out on the technical system comprises one or more action variables.
15. The method as claimed in claim 1, wherein the states are states of the technical system hidden in the data records, which are generated by a recurrent neural network with the aid of source data records, the source data records respectively comprising an observed state of the technical system, an action carried out in the observed state and the resulting sequential state.
16. The method as claimed in claim 15, wherein the dynamic behavior of the technical system is modeled by the recurrent neural network, the recurrent neural network being formed by at least one input layer comprising the observed states of the technical system and the actions carried out on the technical system, at least one hidden recurrent layer comprising the hidden states and at least one output layer comprising the observed states of the technical system.
17. The method as claimed in claim 16, wherein the recurrent neural network is learned using a backpropagation method.
18. The method as claimed in claim 1, wherein the technical system is a gas turbine.
19. The method as claimed in claim 18, wherein the method is used to control a gas turbine, the states of the technical system and/or the actions to be carried out in the respective states comprising one or more of the following variables: gross output of the gas turbine; one or more pressures and/or temperatures in the gas turbine or in the area around the gas turbine; combustion chamber accelerations in the gas turbine; and one or more adjustment parameters at the gas turbine, in particular valve settings and/or fuel ratios and/or preliminary vane positions.
20. A non-transitory computer readable medium storing a program code for implementing a method for computer-aided control of a technical system when the program is running on a computer, the method comprising: characterizing a dynamic behavior of the technical system for a plurality of time points by a state of the technical system and an action carried out on the technical system for each time point, wherein an action at a time point results in a sequential state of the technical system at a next time point; learning an action selection rule with a plurality of data records, wherein each data record comprises the state of the technical system at the time point, the action carried out in the state and the sequential state, and wherein an evaluation is assigned to each data record, the learning of the action selection rule comprising: modeling of a quality function by a first neural network comprising the states and actions of the technical system as parameters; learning the first neural network based on an optimality criterion, which is a function of the evaluations of the data records and the quality function, an optimum action in respect of the quality function being modeled by a second neural network, which is learned based on the quality function; and controlling the technical system such that the actions to be carried out on the technical system are selected using the action selection rule based upon the second neural network.
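The two-network arrangement in claims 1, 4, 5 and 8 can be illustrated with a minimal NumPy sketch. It is not the patented implementation. Every name here (`init_mlp`, `mlp`, `q_value`, `policy`, `bellman_residual`, `policy_objective`) is hypothetical, and the discount factor `gamma`, the tanh hidden units, the layer sizes and the random toy data are all assumptions that the claims do not specify. The sketch only evaluates the Bellman residual and the quantity the second network would maximize; the backpropagation training step of claim 6 is left out.

```python
import numpy as np


def init_mlp(rng, sizes):
    """Small random weights for a feed-forward network with the given layer sizes."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def mlp(params, x):
    """Forward pass: tanh hidden layers (an assumption), linear output layer."""
    for w, b in params[:-1]:
        x = np.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b


def q_value(q_params, states, actions):
    """First network: (state, action) pairs -> one quality value per record."""
    return mlp(q_params, np.concatenate([states, actions], axis=1))[:, 0]


def policy(pi_params, states):
    """Second network: state -> action proposed as optimal for that state."""
    return mlp(pi_params, states)


def bellman_residual(q_params, pi_params, s, a, r, s_next, gamma=0.9):
    """Mean squared Bellman residual over a batch of data records.

    gamma is an assumed discount factor; r holds the evaluations (rewards).
    """
    target = r + gamma * q_value(q_params, s_next, policy(pi_params, s_next))
    return np.mean((q_value(q_params, s, a) - target) ** 2)


def policy_objective(q_params, pi_params, s_next):
    """Mean quality of the proposed actions; the second network would maximize this."""
    return np.mean(q_value(q_params, s_next, policy(pi_params, s_next)))


# Toy data records (state, action, evaluation, sequential state); purely synthetic.
rng = np.random.default_rng(0)
state_dim, action_dim, n = 3, 1, 5
q_params = init_mlp(rng, [state_dim + action_dim, 8, 1])
pi_params = init_mlp(rng, [state_dim, 8, action_dim])
s = rng.normal(size=(n, state_dim))
a = rng.normal(size=(n, action_dim))
r = rng.normal(size=n)
s_next = rng.normal(size=(n, state_dim))

loss = bellman_residual(q_params, pi_params, s, a, r, s_next)
gain = policy_objective(q_params, pi_params, s_next)
```

In a full training loop, `loss` would be minimized with respect to the first network's weights and `gain` maximized with respect to the second network's weights, for instance with an automatic-differentiation library. That choice is left open here because the record names only backpropagation.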