Reinforcement learning learns a policy for achieving a goal through online interaction between an agent and its environment. Q-learning, a basic reinforcement learning algorithm, suffers from the curse of dimensionality in large state spaces and learns slowly in the early stage of learning. Speeding it up requires a function approximation method that copes with large state spaces and suits the characteristics of reinforcement learning. To address these problems, this paper proposes Fuzzy Q-Map, a function approximation method based on online fuzzy clustering. Fuzzy Q-Map supports online learning and can represent the uncertainty of the environment. We applied Fuzzy Q-Map to the mountain car problem, and the results show that learning is accelerated in the early stage of learning.
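The abstract does not spell out the Fuzzy Q-Map update rules, so the following is only a minimal sketch of the general idea it describes: Q-values approximated over cluster prototypes that are themselves adapted online. Everything in it is an assumption made for illustration, not the paper's algorithm: the names `memberships` and `FuzzyQSketch`, the fuzzy c-means-style membership formula, the prototype update, and the parameters `alpha`, `beta`, `gamma`, and `m`.

```python
def memberships(state, centers, m=2.0, eps=1e-12):
    """Fuzzy c-means-style membership of `state` in each center (assumed form)."""
    d2 = [sum((s - c) ** 2 for s, c in zip(state, center)) for center in centers]
    # A state that coincides with a center belongs to that center only.
    for i, d in enumerate(d2):
        if d < eps:
            return [1.0 if j == i else 0.0 for j in range(len(centers))]
    inv = [(1.0 / d) ** (1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


class FuzzyQSketch:
    """Hypothetical Q-function approximator over fuzzy cluster prototypes."""

    def __init__(self, centers, n_actions, alpha=0.5, beta=0.05, gamma=0.99, m=2.0):
        self.centers = [list(c) for c in centers]        # cluster prototypes
        self.q = [[0.0] * n_actions for _ in centers]    # one Q-row per prototype
        self.n_actions = n_actions
        self.alpha, self.beta, self.gamma, self.m = alpha, beta, gamma, m

    def q_values(self, state):
        """Q(s, a) as the membership-weighted sum of prototype Q-values."""
        u = memberships(state, self.centers, self.m)
        return [sum(u_i * q_i[a] for u_i, q_i in zip(u, self.q))
                for a in range(self.n_actions)]

    def update(self, state, action, reward, next_state, done=False):
        """One Q-learning step plus one online prototype adjustment."""
        u = memberships(state, self.centers, self.m)
        q_sa = sum(u_i * q_i[action] for u_i, q_i in zip(u, self.q))
        target = reward
        if not done:
            target += self.gamma * max(self.q_values(next_state))
        delta = target - q_sa  # temporal-difference error
        for i, u_i in enumerate(u):
            # Share the TD error among prototypes according to membership.
            self.q[i][action] += self.alpha * u_i * delta
            # Pull each prototype toward the visited state (online clustering step).
            for k in range(len(state)):
                self.centers[i][k] += self.beta * (u_i ** self.m) * (state[k] - self.centers[i][k])
        return delta
```

For the mountain car problem, a state would be a (position, velocity) pair and there would be three actions. Each observed transition is passed to `update`, and the greedy action is the index of the largest entry of `q_values(state)`.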