[Paper] Collision Avoidance Path Control of Multi-AGV Using Multi-Agent Reinforcement Learning

최호빈; 김주봉; 한연희; 오세원; 김귀훈

doi:10.3745/ktccs.2022.11.9.281

KIPS Transactions on Computer and Communication Systems, v.11 no.9, 2022, pp.281-288

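최호빈 (Korea University of Technology and Education, Department of Computer Engineering, Future Convergence Engineering major), 김주봉 (Korea University of Technology and Education, Department of Computer Engineering, Future Convergence Engineering major), 한연희 (Korea University of Technology and Education, Department of Computer Engineering, Future Convergence Engineering major), 오세원 (Electronics and Telecommunications Research Institute), 김귀훈 (Korea National University of Education, AI Convergence Education major)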

Abstract

AGVs (automated guided vehicles) are often used in industrial applications to transport heavy materials around large industrial facilities such as factories and warehouses. Their usefulness is greatest in fulfillment centers, where transport can be automated. To increase productivity in warehouses such as fulfillment centers, sophisticated control of the AGVs' transport paths is required. We propose a scheme that can be applied to QMIX, a popular cooperative multi-agent reinforcement learning (MARL) algorithm. Performance was measured with three metrics on two fulfillment center layouts, and the results are compared with those of the original QMIX. Additionally, to allow a visual analysis of the AGVs' behavior patterns, we provide heat maps that visualize the transport paths of the trained AGVs.
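For readers unfamiliar with the baseline, the following is a minimal NumPy sketch of the monotonic mixing idea behind QMIX. It is not the scheme proposed in this paper. The function name `mix_q_values`, the layer sizes, and the random weights are illustrative assumptions; in QMIX itself the mixing parameters are generated by hypernetworks from the global state rather than sampled at random.

```python
import numpy as np

def elu(x):
    # ELU nonlinearity; monotonically increasing in x.
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)

def mix_q_values(agent_qs, w1, b1, w2, b2):
    """Combine per-agent Q-values into one joint value Q_tot.

    Taking abs() of the mixing weights keeps Q_tot non-decreasing
    in every individual agent's Q-value.
    """
    hidden = elu(agent_qs @ np.abs(w1) + b1)  # shape: (embed_dim,)
    return float(hidden @ np.abs(w2) + b2)

rng = np.random.default_rng(0)
n_agents, embed_dim = 4, 8  # hypothetical sizes, chosen for illustration

# Random stand-ins for the parameters a hypernetwork would produce.
w1 = rng.normal(size=(n_agents, embed_dim))
b1 = rng.normal(size=embed_dim)
w2 = rng.normal(size=embed_dim)
b2 = float(rng.normal())

qs = rng.normal(size=n_agents)
q_tot = mix_q_values(qs, w1, b1, w2, b2)

# Raising one agent's Q-value must never lower the joint value.
raised = qs.copy()
raised[0] += 1.0
q_tot_raised = mix_q_values(raised, w1, b1, w2, b2)
print(q_tot_raised >= q_tot)
```

Because the joint value never decreases when any single agent's value rises, each agent can pick its own greedy action and the result is still greedy with respect to the joint value, which is what makes decentralized execution possible.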

Tables/Figures (6)

Fig. 1. Description of the Proposed Multi-agent Reinforcement Learning Environment Modeling AGV Warehouse, Where the Target Position of Agent a_i is t_i for Each Agent i∈{1,...,4}.
Fig. 2. Example of Sequential Action Masking
Fig. 3. Overall Network Architecture
Fig. 4. Training Results on S Layout
Fig. 5. Training Results on L Layout
Fig. 6. Heat Maps on L Layout

