[Paper] Function Approximation for Accelerating Learning Speed in Reinforcement Learning

이영아; 정태충

doi:10.5391/jkiis.2003.13.6.635

퍼지 및 지능시스템학회 논문지 = Journal of Fuzzy Logic and Intelligent Systems, v.13 no.6, 2003, pp. 635-642

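Abstract (translated from the Korean)

Reinforcement learning has produced successful results in many application areas, including control and scheduling. Various function approximation methods have been studied to improve the learning speed and reduce the memory requirements of basic reinforcement learning algorithms such as Q-Learning, TD(λ), and SARSA. Most of these methods rely on assumptions that remove some properties of reinforcement learning, and they require prior knowledge and preprocessing. For example, Fuzzy Q-Learning needs preprocessing to define its fuzzy variables, and locally weighted regression (LWR) uses a set of training examples. This paper proposes Fuzzy Q-Map, a function approximation method based on on-line fuzzy clustering. In an environment where only minimal prior knowledge is given, Fuzzy Q-Map classifies states that arrive on-line by their membership degree, which is computed from distance, and predicts an action. Fuzzy Q-Map, CMAC, and LWR were applied to the mountain car problem. In these experiments, Fuzzy Q-Map reached its highest prediction rate faster than CMAC, which does not use training examples, and showed a lower prediction rate than LWR, which does.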

Abstract (English)

Reinforcement learning has achieved successful results in many applications, such as control and scheduling. Various function approximation methods have been studied to improve the learning speed and to reduce the storage requirements of standard reinforcement learning algorithms such as Q-Learning. Most function approximation methods remove some characteristic properties of reinforcement learning and require prior knowledge and preprocessing. Fuzzy Q-Learning needs preprocessing to define fuzzy variables, and Locally Weighted Regression (LWR) uses training examples. In this paper, we propose Fuzzy Q-Map, a function approximation method based on on-line fuzzy clustering. Fuzzy Q-Map classifies a query state and predicts a suitable action according to the membership degree. We applied Fuzzy Q-Map, CMAC, and LWR to the mountain car problem. Fuzzy Q-Map reached its highest prediction rate faster than CMAC, but its prediction rate was lower than that of LWR, which uses training examples.
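The abstract does not give the Fuzzy Q-Map update rules, so the following is only an illustrative sketch of the general idea it describes: a state is assigned distance-based membership degrees over cluster centers, and those degrees are used to estimate action values. Everything here is an assumption made for illustration and is not the authors' algorithm: the fuzzy c-means style membership formula, the fuzzifier `m`, the fixed (not on-line adapted) `centers`, and the helper names `memberships`, `q_estimate`, `greedy_action`, and `q_update`.

```python
import math


def memberships(state, centers, m=2.0, eps=1e-12):
    """Fuzzy c-means style membership of `state` in each cluster center.

    Assumed formula (not taken from the paper):
    u_i = d_i^(-2/(m-1)) / sum_j d_j^(-2/(m-1)), with d_i the Euclidean distance.
    """
    dists = [math.dist(state, c) for c in centers]
    # A state that coincides with a center belongs fully to that center.
    for i, d in enumerate(dists):
        if d < eps:
            return [1.0 if j == i else 0.0 for j in range(len(centers))]
    power = 2.0 / (m - 1.0)
    inv = [d ** (-power) for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


def q_estimate(state, centers, q_table):
    """Membership-weighted Q-value for every action (one q_table row per center)."""
    u = memberships(state, centers)
    n_actions = len(q_table[0])
    return [
        sum(u[i] * q_table[i][a] for i in range(len(centers)))
        for a in range(n_actions)
    ]


def greedy_action(state, centers, q_table):
    """Index of the action with the largest estimated Q-value."""
    q = q_estimate(state, centers, q_table)
    return max(range(len(q)), key=q.__getitem__)


def q_update(state, action, reward, next_state, centers, q_table,
             alpha=0.1, gamma=0.99):
    """One Q-learning step; the TD error is shared among clusters by membership."""
    u = memberships(state, centers)
    target = reward + gamma * max(q_estimate(next_state, centers, q_table))
    td_error = target - q_estimate(state, centers, q_table)[action]
    for i, u_i in enumerate(u):
        q_table[i][action] += alpha * u_i * td_error


if __name__ == "__main__":
    centers = [(0.0, 0.0), (1.0, 0.0)]   # hypothetical cluster centers
    q_table = [[0.0, 1.0], [2.0, 0.0]]   # hypothetical per-cluster Q-values
    print(memberships((0.25, 0.0), centers))
    print(greedy_action((0.25, 0.0), centers, q_table))
```

A state close to one center gets a membership near 1 for that center, so its estimated values are dominated by that cluster's Q-values. A full on-line clustering method would also create or move centers as new states arrive, which this sketch leaves out.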


