[Paper] Enhanced strategic Monte-Carlo Tree Search algorithm to play the game of Tic-Tac-Toe

이병두

doi:10.7583/jkgs.2016.16.4.79

Abstract

Monte-Carlo Tree Search (MCTS) is a best-first tree search algorithm that has been successfully applied to many games, most notably Go. We evaluate the performance of MCTS by letting MCTS players play against each other in the game of Tic-Tac-Toe. The results show that the first player always has an overwhelming advantage over the second player, and we investigate why the first player is superior even though the best game result should be a draw. Because MCTS is a statistical algorithm based on repeated random sampling, it cannot adequately handle an urgent situation that calls for a strategy, especially as the second player. To address this, we propose a strategic MCTS (S-MCTS) and show that the S-MCTS player never loses a Tic-Tac-Toe game.

AI summary of the main text

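Problem definition

To promote the active use of MCTS, which has made such great history in the field of artificial intelligence, the author uses Tic-Tac-Toe, a small-scale game, to present the problems of pure MCTS and ways to address them.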

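Proposed method

MCTS relies entirely on statistical probability values generated by repeated random exploration, which causes two problems: (1) it does not adequately handle cases in which the mathematical probability values are equal, and (2) the second player in particular, as the minimizing player, chooses the action with the minimum average win rate over all generated games, so in some situations it cannot make the best response to the first player, the maximizing player. To solve these problems, this paper proposes S-MCTS, a strategic Monte-Carlo Tree Search that can apply strategy, in place of pure MCTS. Experiments showed that S-MCTS, which applies strategy, always takes the best action and never loses a game, regardless of whether the first player is MCTS or S-MCTS.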
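A strategy layer of the kind described above can be sketched as follows. This summary does not list the paper's actual strategic rules, so the two rules used here (take an immediate win, otherwise block the opponent's immediate win) are an assumption chosen for illustration. The names `strategic_move`, `completing_move`, and `random_fallback` are hypothetical, and `random_fallback` merely stands in for the statistical MCTS choice.

```python
import random

# The eight winning lines on a 3x3 board whose cells are indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if some line is complete, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def completing_move(board, mark):
    """Return an empty cell that gives `mark` three in a row, or None."""
    for cell in range(9):
        if board[cell] == ' ':
            trial = board[:cell] + mark + board[cell + 1:]
            if winner(trial) == mark:
                return cell
    return None


def strategic_move(board, mark, fallback):
    """Assumed rule order: win now, else block, else defer to `fallback`."""
    opponent = 'O' if mark == 'X' else 'X'
    win = completing_move(board, mark)
    if win is not None:
        return win
    block = completing_move(board, opponent)
    if block is not None:
        return block
    return fallback(board, mark)


def random_fallback(board, mark):
    """Placeholder for the sampling-based choice (not the paper's method)."""
    return random.choice([i for i in range(9) if board[i] == ' '])


# X threatens the top row, so O (the second player) must answer at cell 2.
print(strategic_move("XX  O    ", 'O', random_fallback))  # prints 2
```

The point of the design is that urgent positions are resolved by rule before any sampling happens, so a noisy average win rate cannot talk the second player out of a forced block.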
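To first measure the performance of MCTS, 10,000 Tic-Tac-Toe games were played between MCTS players, with 100,000 MC simulations run to determine one best move in a game. The result of a game was scored as +1 point for a win, and for a draw +0.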
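The sampling step behind such an experiment can be sketched as a flat Monte Carlo evaluation: every legal move is followed by uniformly random playouts and the results are averaged. This is a minimal sketch, not the paper's implementation. The draw score is cut off in this summary, so it is left as a caller-supplied parameter (`draw_score`), and the 0.5 used in the example is only a placeholder assumption. Whether the 100,000 simulations are spent per candidate move or per decision is also not stated, so `n_sims` here is per candidate move by assumption, and the example uses a much smaller number.

```python
import random

# The eight winning lines on a 3x3 board whose cells are indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if some line is complete, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def playout(board, to_move, rng):
    """Play uniformly random moves to the end; return 'X', 'O', or None (draw)."""
    cells = list(board)
    while True:
        w = winner(cells)
        if w is not None:
            return w
        empty = [i for i in range(9) if cells[i] == ' ']
        if not empty:
            return None
        cells[rng.choice(empty)] = to_move
        to_move = 'O' if to_move == 'X' else 'X'


def evaluate_moves(board, mark, n_sims, draw_score, rng):
    """Map each legal move to its average playout score from `mark`'s view."""
    opponent = 'O' if mark == 'X' else 'X'
    scores = {}
    for cell in (i for i in range(9) if board[i] == ' '):
        child = board[:cell] + mark + board[cell + 1:]
        total = 0.0
        for _ in range(n_sims):
            result = playout(child, opponent, rng)
            if result == mark:
                total += 1.0          # a win scores +1
            elif result is None:
                total += draw_score   # assumed value, see the lead-in
        scores[cell] = total / n_sims
    return scores


rng = random.Random(0)
scores = evaluate_moves(' ' * 9, 'X', n_sims=200, draw_score=0.5, rng=rng)
best_move = max(scores, key=scores.get)
print(best_move, round(scores[best_move], 3))
```

Because the scores are averages over random continuations, two moves with nearly equal averages are hard to tell apart, which is the weakness the section above attributes to pure MCTS.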

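Performance/Effects

Ultimately, the main contribution of this paper is the finding that game programs built on MCTS in the future should use strategy-based S-MCTS rather than pure MCTS.
In addition, as [Table 2] shows, when S-MCTS plays White, the second player, in 10,000 Tic-Tac-Toe games, it achieves an average win rate of 50.0±1.0% regardless of whether Black, the first player, is MCTS or S-MCTS, which indicates that it never loses a game. For reference, [Fig.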

Future work

Comparing the performance of strategy-based S-MCTS with Upper Confidence Bounds for Trees (UCT), which performs somewhat better than pure MCTS, also looks like a good topic for future research.
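For readers unfamiliar with UCT, its selection step is commonly based on the UCB1 rule, which adds an exploration bonus to each child's average reward. The sketch below shows that rule only, not a full tree search. The exploration constant `c` is a tunable assumption (the value sqrt(2) gives the textbook UCB1 bonus), and the `stats` layout and function names are hypothetical.

```python
import math


def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of a child: mean reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are tried first
    mean = total_reward / visits
    return mean + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(stats, c=math.sqrt(2)):
    """`stats` maps move -> (total_reward, visits); return the best move by UCB1."""
    parent_visits = sum(visits for _, visits in stats.values())
    return max(stats, key=lambda m: ucb1(stats[m][0], stats[m][1], parent_visits, c))


# Move 2 has never been visited, so it is selected before the others.
print(select_child({0: (5.0, 10), 1: (1.0, 1), 2: (0.0, 0)}))  # prints 2
```

Unlike the plain averaging used by pure MCTS, this rule keeps spending simulations on moves whose estimates are still uncertain.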

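Q&A

Keyword	Question	Answer extracted from the paper
	What must be considered when finding the optimal solution in game tree search?	Most game programs use game tree search with an evaluation function [1,2]. The factors to consider when finding the optimal solution in game tree search are the branching factor and the depth of the game tree. When the branching factor and depth are relatively small, the optimal solution can be found with minimax search or alpha-beta pruning; when they are very large, as in Go, an exhaustive search with these methods is not feasible, so MCTS must be used [1,2,13].
	What is the game of Tic-Tac-Toe?	Tic-Tac-Toe is one of the best-known games in the world: two players take turns writing ☓ and ◯ in pencil on a sheet of paper ruled into a 3⨯3 grid, and a player wins by forming three of the same mark in a row horizontally, vertically, or diagonally [12]. For reference, [Fig.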
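Tic-Tac-Toe is small enough that the exhaustive minimax search mentioned in the first answer is practical. The sketch below is an illustration, not code from the paper: it encodes the rules from the second answer and confirms that the game is a draw under perfect play. The string board encoding and the memoization through `functools.lru_cache` are implementation choices made here.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board whose cells are indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if some line is complete, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board, to_move):
    """Game value for X under perfect play: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w == 'X':
        return 1
    if w == 'O':
        return -1
    if ' ' not in board:
        return 0
    nxt = 'O' if to_move == 'X' else 'X'
    values = [minimax(board[:i] + to_move + board[i + 1:], nxt)
              for i in range(9) if board[i] == ' ']
    # X is the maximizing player, O the minimizing player.
    return max(values) if to_move == 'X' else min(values)


print(minimax(' ' * 9, 'X'))  # prints 0: perfect play ends in a draw
```

The same approach does not scale to Go, where the branching factor and depth make the full tree far too large to enumerate.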

References (18)

[1] B.D. Lee, D.S. Park and Y.W. Choi, "The UCT algorithm applied to find the best first move in the game of Tic-Tac-Toe", Journal of Korea Game Society, Vol. 15, No. 5, pp. 109-118, 2015.
[2] B.D. Lee, "Competition between MCTS and UCT in the game of Tic-Tac-Toe", Journal of Korean Society for Computer Game, Vol. 29, No. 1, pp. 1-6, 2016.
[3] B.D. Lee, "Analysis of Tic-Tac-Toe Game Strategies using Genetic Algorithm", Journal of Korea Game Society, Vol. 14, No. 6, pp. 39-48, 2014.
[4] B.D. Lee, "Monte-Carlo Tree Search Applied to the Game of Tic-Tac-Toe", Journal of Korea Game Society, Vol. 14, No. 3, pp. 47-54, 2014.
[5] D. Silver et al., "Mastering the game of Go with deep neural networks and tree search", Nature, Vol. 529, Issue 7587, pp. 484-489, 2016.
[6] P.S. Jang, "Overseas Innovation Trend", from http://www.stepi.re.kr, 2016.
[7] B.D. Lee and Y.W. Choi, "The best move sequence in playing Tic-Tac-Toe game", Journal of The Korean Society for Computer Game, Vol. 27, No. 3, pp. 11-16, 2014.
[8] B.D. Lee, "Evolutionary neural network model for recognizing strategic fitness of a finished Tic-Tac-Toe game", Journal of Korean Society for Computer Game, Vol. 28, No. 2, pp. 95-101, 2015.
[9] S. Gelly, M. Schoenauer, M. Sebag, O. Teytaud, L. Kocsis, D. Silver and C. Szepesvari, "The Grand Challenge of Computer Go: Monte Carlo Tree Search and Extensions", Communications of the ACM, Vol. 55, No. 3, pp. 106-113, 2012.
[10] G. Chaslot, "Monte-Carlo Tree Search", Ph.D. dissertation, University of Maastricht, 2010.
[11] Wikipedia, "Computer Go", from http://en.wikipedia.org/wiki/Computer_Go, 2016.
[12] Wikipedia, "Tic-Tac-Toe", from http://en.wikipedia.org/wiki/Tic-Tac-Toe, 2016.
[13] A.A.J. van der Kleij, "Monte Carlo Tree Search and Opponent Modeling through Player Clustering in no-limit Texas Hold'em Poker", Master thesis, University of Groningen, 2010.
[14] N. Sephton, P.I. Cowling, E. Powley and N.H. Slaven, "Heuristic Move Pruning in Monte Carlo Tree Search for the Strategic Card Game Lords of War", In Computational Intelligence and Games (CIG) of IEEE, pp. 1-7, 2014.
[15] T. Pepels, "Novel Selection Methods for Monte-Carlo Tree Search", Master thesis, University of Maastricht, 2014.
[16] D. Brand and S. Kroon, "Sample Evaluation for Action Selection in Monte Carlo Tree Search", from http://dl.acm.org/citation.cfm?doid2664591.2664612, 2016.
[17] Y. Wang and S. Gelly, "Modification of UCT and sequence-like simulations for Monte-Carlo Go", from http://dept.stat.lsa.umich.edu/-yizwang/publications/wang07modifications.pdf, 2016.
[18] B.D. Lee, "Implementation of robust Tic-Tac-Toe game player, using enhanced Monte-Carlo algorithm", Journal of Korean Society for Computer Game, Vol. 28, No. 3, pp. 135-141, 2015.
