[Thesis] Deepfake Video Detection Using ResI3D Based on Spatio-Temporal Self Attention

이재규

Spatio-Temporal Self Attention ResI3D for Deepfake Detection

이재규 (Yonsei University Graduate School, Department of Industrial Engineering; master's thesis, domestic)

Abstract (Korean version)

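A deepfake is a video in which a specific person's face has been synthesized onto footage of another person using deep learning. As deep learning advances rapidly, digital crimes that exploit high-quality deepfakes are rising sharply. As concern over deepfake abuse grows, research on deep-learning-based detection methods is needed. Prior work has extracted a subset of frames from a video, compressed each frame with a CNN, and fed the resulting features to an RNN to classify whether the video was manipulated. However, this approach loses the temporal information between pixels in the video and therefore cannot fully exploit the characteristics of video data. In addition, the complex computation of the RNN makes the model difficult to train. This study addresses deepfake detection with ResI3D, a 3D version of ResNet50 that takes the video itself as input. On top of it, we propose a non-local ResI3D architecture that uses the NL (Non-Local) Block to gain a better understanding of the input video. The greatest advantage of the NL Block is that it lets a CNN learn while considering the whole of the data rather than only a local region. It does so by performing self attention, which compares each pixel with every other pixel, and thereby resolves the long-term dependency problem that arises during training. The improvement is larger for video than for images, because video has a time axis. The proposed model takes a video as input and performs self attention over both the spatial and temporal axes of its pixels. By learning from global information that includes both the visual and the temporal content of the video, it outperforms existing deepfake video detection models. In particular, when trained on the same dataset as the current SOTA model developed using only FaceForensics++, it achieves a higher AUC while using fewer frames.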
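The abstract describes ResI3D only as a 3D version of ResNet50 and does not say how the 2D network is converted. One common way to do this, used here purely as an assumption, is I3D-style kernel inflation: each 2D convolution kernel is repeated along a new temporal axis and divided by the temporal depth. The function name `inflate_conv2d_kernel`, the kernel shapes, and the temporal depth of 5 are illustrative choices, not values taken from the thesis.

```python
import numpy as np


def inflate_conv2d_kernel(w2d: np.ndarray, time_depth: int) -> np.ndarray:
    """Inflate a 2D conv kernel (out, in, kH, kW) to 3D (out, in, kT, kH, kW).

    Hypothetical helper: the thesis abstract does not specify this procedure.
    """
    if w2d.ndim != 4:
        raise ValueError("expected a kernel of shape (out, in, kH, kW)")
    if time_depth < 1:
        raise ValueError("time_depth must be at least 1")
    # Copy the 2D kernel time_depth times along a new temporal axis.
    w3d = np.repeat(w2d[:, :, np.newaxis, :, :], time_depth, axis=2)
    # Rescale so that summing over time recovers the original 2D kernel.
    return w3d / time_depth


rng = np.random.default_rng(0)
w2d = rng.standard_normal((64, 3, 7, 7))   # e.g. a 7x7 first-layer kernel
w3d = inflate_conv2d_kernel(w2d, time_depth=5)
print(w3d.shape)  # (64, 3, 5, 7, 7)
```

Because the temporal slices sum back to the 2D kernel, a video made of one repeated frame produces the same response as the 2D network does on that frame, which is what makes 2D pretrained weights a reasonable starting point for a 3D model.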

Abstract (English version)

A deepfake is a video in which a specific person's face has been
synthesized onto footage of another person using deep learning. With
the rapid development of deep learning, digital crimes that abuse
high-quality deepfakes are increasing significantly. As concerns about
deepfake abuse grow, research on deep-learning-based detection
methods is needed. Previous methods extract some of the frames in a
video, embed each of them with a CNN, and use the embeddings as the
input of an RNN that classifies whether the video has been
manipulated. However, because this loses the temporal information
between pixels in the video, the characteristics of video data cannot
be fully utilized. In addition, the complex computation of the RNN
makes the model difficult to train. This study aims to solve the
deepfake detection problem with the ResI3D model, a 3D version of
ResNet50 that takes the video itself as input. On top of it, a
non-local ResI3D architecture with a better understanding of the input
video is proposed using the Non-Local Block. The biggest advantage of
the Non-Local Block is that it lets a CNN learn by considering the
entire data rather than only a local region. This is because it
performs self-attention, which compares each pixel with all other
pixels, and thereby solves the long-term dependency problem that
occurs during training. The improvement is greater for video than for
images, because video data has a time axis. The model proposed in this
study takes a video as input and performs self-attention over the
spatial and temporal axes of its pixels. By learning from global
information that includes both the visual and the temporal content of
the video, it shows improved performance compared with existing
deepfake video detection models. In particular, when trained on the
same dataset as the current SOTA model developed using only
FaceForensics++, it achieves a higher AUC even though it uses fewer
frames.
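The self-attention over spatial and temporal axes described above can be sketched as follows. This is a minimal NumPy illustration of a non-local block in its commonly used embedded-Gaussian form, not the thesis's implementation: the weight names (`w_theta`, `w_phi`, `w_g`, `w_out`), the channel sizes, and the clip size are all assumptions. The 1x1x1 convolutions of a real block are written as plain matrix products over the channel axis, which is equivalent for a single clip.

```python
import numpy as np


def softmax(a: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along one axis."""
    shifted = a - a.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def non_local_block(x, w_theta, w_phi, w_g, w_out):
    """Spatio-temporal self-attention with a residual connection.

    x: feature map of shape (C, T, H, W) for one clip.
    w_theta, w_phi, w_g: (C_inner, C) projections (stand-ins for 1x1x1 convs).
    w_out: (C, C_inner) projection back to the input channel count.
    """
    c, t, h, w = x.shape
    # Flatten time and space so every (t, h, w) position is one token.
    x_flat = x.reshape(c, t * h * w)             # (C, N)
    theta = w_theta @ x_flat                     # queries, (C_inner, N)
    phi = w_phi @ x_flat                         # keys,    (C_inner, N)
    g = w_g @ x_flat                             # values,  (C_inner, N)
    # attn[i, j]: how strongly position i attends to position j,
    # across all frames and all spatial locations.
    attn = softmax(theta.T @ phi, axis=-1)       # (N, N)
    y = g @ attn.T                               # (C_inner, N)
    # Residual connection keeps the original features.
    z = w_out @ y + x_flat                       # (C, N)
    return z.reshape(c, t, h, w), attn


rng = np.random.default_rng(0)
channels, inner = 8, 4                           # illustrative sizes
x = rng.standard_normal((channels, 4, 6, 6))     # 4 frames of 6x6 features
w_theta, w_phi, w_g = (
    0.1 * rng.standard_normal((inner, channels)) for _ in range(3)
)
w_out = 0.1 * rng.standard_normal((channels, inner))

out, attn = non_local_block(x, w_theta, w_phi, w_g, w_out)
print(out.shape)   # (8, 4, 6, 6)
print(attn.shape)  # (144, 144)
```

Each row of `attn` is a distribution over all 4 x 6 x 6 = 144 positions, so a pixel in one frame can draw on pixels in any other frame in a single step. The attention matrix grows quadratically with the number of positions, which is one reason such blocks are usually applied to downsampled feature maps rather than raw frames.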

Thesis information

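Author	이재규
Degree-granting institution	Yonsei University Graduate School
Degree	Master's (domestic)
Department	Industrial Engineering
Advisor	김우주
Year of publication	2022
Total pages	vi, 38 p.
Keywords	Computer vision, deep learning, deepfake, Non Local Block, ResI3D
Language	kor
Full-text URL	http://www.riss.kr/link?id=T16070901&outLink=K
Source	Korea Education and Research Information Service (KERIS)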
