[Paper] Deep Learning-based Gaze Direction Vector Estimation Network Integrated with Eye Landmark Localization

주희영; 고민수; 송혁

doi:10.5909/jbe.2021.26.6.748


Journal of Broadcast Engineering (방송공학회논문지), v.26 no.6, 2021, pp.748-757

주희영 (Korea Electronics Technology Institute), 고민수 (Korea Electronics Technology Institute), 송혁 (Korea Electronics Technology Institute)

Abstract

This paper proposes a gaze estimation network that integrates eye landmark localization and gaze direction vector estimation into a single deep learning network. The proposed network uses the Stacked Hourglass Network as its backbone and consists of three parts: a landmark detector, a feature map extractor, and a gaze direction estimator. The landmark detector estimates the coordinates of 50 eye landmarks, and the feature map extractor generates a feature map of the eye image for gaze direction estimation. The gaze direction estimator then combines these outputs to estimate the final gaze direction vector. The network was trained on synthetic eye images and landmark coordinates generated with UnityEyes, and it was evaluated on the MPIIGaze dataset, which consists of real human eye images. In the experiments, the gaze estimation error was 3.9°, and the network ran at 42 FPS (frames per second).
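The abstract reports the gaze estimation error in degrees but does not spell out the formula. A common convention in MPIIGaze evaluations is the angle between the predicted and ground-truth gaze direction vectors, averaged over samples. The sketch below assumes that convention; the function names `angular_error_deg` and `mean_angular_error_deg` are illustrative and do not come from the paper.

```python
import math


def angular_error_deg(pred, true):
    """Angle in degrees between two 3D gaze direction vectors.

    Assumption: the reported error is the angle between the predicted
    and ground-truth vectors (the paper's exact formula is not given here).
    """
    dot = sum(p * t for p, t in zip(pred, true))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    norm_true = math.sqrt(sum(t * t for t in true))
    if norm_pred == 0.0 or norm_true == 0.0:
        raise ValueError("gaze vectors must be non-zero")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_pred * norm_true)))
    return math.degrees(math.acos(cos_angle))


def mean_angular_error_deg(preds, trues):
    """Mean angular error in degrees over paired lists of gaze vectors."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, trues)]
    return sum(errors) / len(errors)
```

Because both vectors are normalized inside the function, only their directions matter. A prediction that points the right way but has a different length contributes zero error.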


Tables/Figures (15)

Fig. 1. Examples of wearable devices that take the user's gaze information as input: (a) Google Glass, (b) HoloLens
Fig. 2. Hyundai Mobis's development work on driver eye tracking
Fig. 3. An example visualization of gaze estimation results
Fig. 4. Modeling of the human eyeball and pupil movement
Fig. 5. Architecture of the proposed network
Table 1. Details of the Convs layers in the proposed network
Fig. 6. Examples of virtual eye images generated with UnityEyes
Fig. 7. Examples from the MPIIGaze dataset
Table 2. Hyper-parameters used to train the proposed network
Table 3. Specifications of the PC used in the experiments
Fig. 8. Examples of gaze estimation results from the proposed method
Fig. 9. Examples of MPIIGaze estimation results with an estimation error below 3°
Fig. 10. Examples of MPIIGaze estimation results with an estimation error of 10° or more
Table 4. Performance evaluation results of the proposed network
Table 5. Ablation study results for the proposed loss function

References (25)

J. Carmigniani and B. Furht, "Augmented Reality: An Overview," Springer, New York, pp.3-46, 2011.
R. Sherman, and Alan B. Craig, "Understanding Virtual Reality: Interface, Application, and Design, Second Edition," Morgan Kaufmann Series in Computer Graphics, Massachusetts, pp.3-58, 2018.
The Market prediction for the virtual, augmented, and mixed reality technology by Statista https://www.statista.com/statistics/591181/global-augmented-virtual-reality-market-size (accessed Sep. 3, 2021).
Oliver J. Muensterer, Martin Lacher, Christoph Zoeller, Matthew Bronstein, and Joachim Kubler, "Google Glass in pediatric surgery: An exploratory study," International Journal of Surgery, Vol.12, No.4, pp.281-289, 2014.

M. Tepper, L. Rudy, A. Lefkowitz Aaron, A. Weimer, M. Marks, S. Stern, and S. Garfein, "Mixed Reality with HoloLens: Where Virtual Reality Meets Augmented Reality in the Operating Room," Plastic and Reconstructive Surgery, Vol.140, No.5, pp.1066-1070, 2017.

Noel Gorelick, Matt Hancher, Mike Dixon, Simon Ilyushchenko, David Thau, and Rebecca Moore, "Google Earth Engine: Planetary-scale geospatial analysis for everyone," Remote Sensing of Environment, Vol.202, pp.18-27, 2017.

Hyundai Mobis' research development about driver's eye tracking, http://www.epnc.co.kr/news/articleView.html?idxno=91211 (accessed Sep. 3, 2021).
Anuradha Kar, and Peter Corcoran, "A review and Analysis of Eye-Gaze Estimation Systems, Algorithms and Performance Evaluation Methods in Consumer Platforms," IEEE Access, Vol.5, pp.16495-16519, 2017.

An example of the visualization of the gaze estimation, https://www.hankyung.com/it/article/201701051859v (accessed Sep. 3, 2021).
Sunghyun Cho, "Introduction to eye-tracking technology," The Magazine of the IEEK, Vol.45, pp.23-32, 2018.
Laura Sesma, Arantxa Villanueva, and Rafael Cabeza, "Evaluation of pupil center-eye corner vector for gaze estimation using a web cam," Proceedings of the Symposium on Eye Tracking Research and Applications, pp.217-220, 2012.
A. Tsukada, M. Shino, M. Devyver, and T. Kanade, "Illumination-free gaze estimation method for first-person vision wearable device," 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pp.2084-2091, 2011.
Christian Nitschke, Atsushi Nakazawa, and Haruo Takemura, "Display-camera calibration using eye reflections and geometry constraints," Computer Vision and Image Understanding, Vol.115, No.6, pp.835-853, 2011.

Seonwook Park, Xucong Zhang, Andreas Bulling, and Otmar Hilliges, "Learning to find eye region landmarks for remote gaze estimation in unconstrained settings," Proceedings of the 2018 ACM Symposium on Eye Tracking Research & Applications, pp.1-10, 2018.
Mariette Awad, Rahul Khanna, "Support vector regression," Efficient learning machines, pp.67-80, 2015.
Alejandro Newell, Kaiyu Yang, Jia Deng, "Stacked hourglass networks for human pose estimation," European conference on computer vision, pp.483-499, 2016.
Erroll Wood, Tadas Baltrusaitis, Louis-Philippe Morency, Peter Robinson, Andreas Bulling, "Learning an appearance-based gaze estimator from one million synthesized images," In Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications, pp. 131-138. 2016.
Xucong Zhang, Yusuke Sugano, Mario Fritz, Andreas Bulling, "MPIIGaze: Real-world dataset and deep appearance-based gaze estimation," IEEE transactions on pattern analysis and machine intelligence Vol.41, pp.162-175, 2017.

Examples of UnityEyes dataset, https://www.cl.cam.ac.uk/research/rainbow/projects/unityeyes/tutorial.html (accessed Sep. 3, 2021).
Examples of MPIIGaze dataset, https://www.mpi-inf.mpg.de/departments/computer-vision-and-machine-learning/research/gaze-based-human-computer-interaction/appearance-based-gaze-estimation-in-the-wild (accessed Sep. 3, 2021).
Diederik P. Kingma, Jimmy Ba, "Adam: A method for stochastic optimization," arXiv preprint, 2014.
Park, Seonwook, Adrian Spurr, and Otmar Hilliges, "Deep pictorial gaze estimation," Proceedings of the European Conference on Computer Vision (ECCV), pp. 721-738, 2018.
Gao Huang, Zhuang Liu, Kilian Q. Weinberger, and Laurens van der Maaten, "Densely Connected Convolutional Networks," arXiv preprint arXiv:1608.06993, 2016.
Simonyan, Karen, and Andrew Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
