[Paper] Reward Function Design for Latent SAC Reinforcement Learning to Improve Longitudinal Driving Performance

조성빈; 정한유

doi:10.7471/ikeee.2021.25.4.728

On the Reward Function of Latent SAC Reinforcement Learning to Improve Longitudinal Driving Performance (Korean title: 종방향 주행성능향상을 위한 Latent SAC 강화학습 보상함수 설계)

Journal of IKEEE (전기전자학회논문지), vol. 25, no. 4, pp. 728-734, 2021

조성빈 (Dept. of Electrical Engineering, Pusan National University), 정한유 (Dept. of Electrical Engineering, Pusan National University)

Abstract

Interest in end-to-end autonomous driving based on deep reinforcement learning has grown considerably in recent years. In this paper, we present a reward function for latent SAC (soft actor-critic) deep reinforcement learning that improves the longitudinal driving performance of an agent vehicle. Whereas the existing reward function significantly degrades driving safety and efficiency, the proposed reward function is shown to maintain an appropriate headway distance while avoiding collisions with the front vehicle.


Tables/Figures (8)

Fig. 1. End-to-end LSAC architecture [3].
Fig. 2. Input sensor data.
Fig. 3. Linear and differential speed reward vs. speed.
Fig. 4. Safety distance reward vs. headway distance.
Fig. 5. Longitudinal driving environment.
Table 1. CARLA simulation parameters.
Fig. 6. Longitudinal speed profile.
Fig. 7. Safety distance margin.

References (13)

Z. Zhu and H. Zhao, "A survey of deep RL and IL for autonomous driving policy learning," arXiv preprint arXiv:2101.01993, 2021.
H. Abdou et al., "End-to-end deep conditional imitation learning for autonomous driving," Proc. of IEEE ICM'19, pp.346-334, 2019.
M. Bansal, A. Krizhevsky, and A. Ogale, "ChauffeurNet: Learning to drive by imitating the best and synthesizing the worst," arXiv preprint arXiv:1812.03079, 2018.
W. Zeng et al., "End-to-end interpretable neural motion planner," Proc. of the IEEE CVPR'19, 2019.
J. Chen, S. E. Li, and M. Tomizuka, "Interpretable end-to-end urban autonomous driving with latent deep reinforcement learning," IEEE Trans. on Intell. Transp. Syst., 2021.
A. Dosovitskiy et al., "CARLA: An open urban driving simulator," Conf. on Robot Learning, 2017.
V. Mnih et al., "Human-level control through deep reinforcement learning," Nature, vol.518, no.7540, pp.529-533, 2015.
R. J. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning," Machine Learning, vol.8, no.3, pp.229-256, 1992. DOI: 10.1007/BF00992696
T. P. Lillicrap et al., "Continuous control with deep reinforcement learning," arXiv preprint arXiv:1509.02971, 2015.
T. Haarnoja et al., "Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor," Intern. Conf. on Machine Learning, 2018.
D. P. Kingma and M. Welling, "Auto-encoding variational Bayes," arXiv preprint arXiv:1312.6114, 2013.
D. Zhao, Z. Xia, and Q. Zhang, "Model-free optimal control based intelligent cruise control with hardware-in-the-loop demonstration," IEEE Comput. Intell. Mag., vol.12, no.2, pp.56-69, 2017.
C. Desjardins and B. Chaib-Draa, "Cooperative adaptive cruise control: A reinforcement learning approach," IEEE Trans. on Intell. Transp. Syst., vol.12, no.4, pp.1248-1260, 2011.
