[Thesis] Deep Learning Approach for Intelligent Acoustic Signal Processing in Low SNR Environments
(Korean title: 강한 잡음환경에서의 지능형 음향신호처리를 위한 딥러닝 기법)

김영진 (Korea University of Technology and Education, Graduate School, Department of Computer Engineering, Computer Science major; domestic doctorate)

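Abstract (translated from the Korean original)

Acoustic signal processing covers sound-based signal processing
techniques such as improving sound quality, emphasizing and detecting
regions of interest, and estimating the location and direction of sound
sources. It is used in many fields, including acoustic communication,
broadcasting, search and rescue, and surveillance. Accordingly, a great
deal of research has been carried out over the past decades to improve
its performance. In particular, data-driven deep learning methods
greatly outperform conventional acoustic signal processing algorithms
based on statistical characteristics or models, and they give robust
results even under strong noise. However, although deep learning has
been applied successfully in several areas of acoustic signal
processing, most existing deep learning-based techniques fail to give
robust, optimized results in very strong noise environments, because
the phase of the acoustic signal is severely corrupted. Moreover,
although the acoustic characteristics handled by techniques such as
sound enhancement, sound source direction estimation, and
region-of-interest detection are closely related, many existing deep
learning methods do not consider these related characteristics jointly.
This study proposes deep learning-based acoustic signal processing
methods that enhance sound quality, detect voice activity, and estimate
the direction of the sound source for audio collected by a multirotor
UAV acoustic system, one example of a very strong noise environment.
Unlike existing deep learning-based sound enhancement methods, the
proposed intelligent sound enhancement method uses a deep learning
model designed around the relationship between the real and imaginary
parts of the complex spectrogram and the characteristics of
multichannel signals, which makes effective enhancement possible even
in very strong noise. In addition, experiments were run with several
objective functions defined from the characteristics of the complex
spectrogram; adding the mean squared error of the phase to the
objective function during training gave a further performance
improvement. In the end, the proposed sound enhancement method
achieved an SDR of 4.79 and a STOI of 0.64 in a very strong noise
environment of -35.69 dB. For voice activity detection and sound source
direction estimation, a multi-task learning-based acoustic signal
processing framework was implemented so that the relationship with the
sound enhancement method could be taken into account. The proposed
voice activity detection method showed about 80% accuracy, and the
sound source direction estimation method showed 39% accuracy at the
frame level (16 ms) and 83% at the utterance level (3-8 s).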
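
As a rough illustration of an objective function of the kind described in the abstract, the following Python sketch adds a phase mean squared error to the mean squared errors of the real and imaginary parts of a complex spectrogram. The function names, the `phase_weight` parameter, and the way the terms are combined are assumptions made for illustration; the abstract does not give the actual formulation used in the thesis.

```python
import cmath
import math


def wrapped_phase_diff(a, b):
    """Signed difference between two angles, wrapped into [-pi, pi]."""
    d = a - b
    return math.atan2(math.sin(d), math.cos(d))


def complex_spectrogram_loss(estimate, target, phase_weight=0.1):
    """MSE on real and imaginary parts plus a weighted phase MSE.

    estimate, target: equal-length, non-empty sequences of complex
    time-frequency bins (a flattened complex spectrogram).
    phase_weight: hypothetical weighting, not taken from the thesis.
    """
    if len(estimate) != len(target) or len(estimate) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(estimate)
    pairs = list(zip(estimate, target))
    real_mse = sum((e.real - t.real) ** 2 for e, t in pairs) / n
    imag_mse = sum((e.imag - t.imag) ** 2 for e, t in pairs) / n
    # Wrap the phase error so that angles near +pi and -pi count as close.
    phase_mse = sum(
        wrapped_phase_diff(cmath.phase(e), cmath.phase(t)) ** 2
        for e, t in pairs
    ) / n
    return real_mse + imag_mse + phase_weight * phase_mse


# Example call on two toy bins (values chosen only for illustration).
clean = [1 + 1j, 0.5 - 0.5j]
noisy = [1 - 1j, 0.5 + 0.5j]
loss = complex_spectrogram_loss(noisy, clean, phase_weight=0.1)
```

Wrapping the phase difference is a design choice of this sketch: without it, two nearly identical phases on either side of the ±π boundary would be penalized as if they were far apart.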

Abstract (English original)

Acoustic signal processing refers to sound-based signal processing
techniques such as improving sound quality, highlighting or detecting a
region of interest, and estimating the location or direction of a sound
source. It is used in various fields such as acoustic communication,
broadcasting, search and rescue, and surveillance. Over the past
decades, many approaches have been proposed to improve the
performance of acoustic signal processing. In particular, data-driven
approaches such as deep learning outperform existing algorithms based
on statistical characteristics or stochastic models, even in low SNR
environments. Although deep learning has been successfully applied to
acoustic signal processing, most existing approaches do not give
robust results in extremely strong noise conditions, where the phase
spectrum is severely corrupted. In addition, the acoustic
characteristics handled by the individual acoustic signal processing
techniques are closely related to each other, but many existing deep
learning models do not consider them jointly.
In this thesis, we propose deep learning-based acoustic signal
processing methods for extremely strong noise environments, specifically
a multi-rotor UAV acoustic system. In the proposed intelligent sound
enhancement method, we design a deep learning model that considers the
relationship between the real and imaginary parts of the complex
spectrogram and the characteristics of the multichannel signal, so that
sound can be enhanced effectively even in strong noise environments.
In addition, we performed experiments with various objective functions
defined from the characteristics of complex spectrograms. The
experimental results show that performance improves further when the
mean squared error of the phase is added to the objective function.
Overall, the intelligent sound enhancement technique achieves an SDR of
4.79 and a STOI of 0.64 in an extremely strong noise environment of
-35.69 dB. For voice activity detection and direction of arrival
estimation, we implemented a multi-task learning-based acoustic signal
processing framework that takes into account the correlation with the
sound enhancement technique. Voice activity detection showed about 80%
accuracy, and direction of arrival estimation showed 39% accuracy in
frame-level (16 ms) evaluation and 83% in utterance-level (3-8 s)
evaluation.
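
The gap between frame-level and utterance-level accuracy can be illustrated with a small Python sketch. It assumes that an utterance-level direction is obtained by majority vote over the frame-level predictions; the abstract does not state how the thesis aggregates frames, so the function names, the voting rule, and the toy direction labels below are all hypothetical.

```python
from collections import Counter


def accuracy(predictions, labels):
    """Fraction of predictions that match their labels."""
    if len(predictions) != len(labels) or len(labels) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, y in zip(predictions, labels) if p == y)
    return hits / len(labels)


def utterance_direction(frame_predictions):
    """Most frequent frame-level direction label within one utterance.

    Majority voting is an assumption of this sketch, not a rule
    documented in the abstract.
    """
    if len(frame_predictions) == 0:
        raise ValueError("an utterance needs at least one frame")
    return Counter(frame_predictions).most_common(1)[0][0]


# Toy utterance: the true direction is 0 degrees for every frame,
# and two of the five frame-level predictions are wrong.
frame_preds = [0, 0, 90, 0, 180]
frame_labels = [0, 0, 0, 0, 0]

frame_acc = accuracy(frame_preds, frame_labels)
utt_pred = utterance_direction(frame_preds)
print(frame_acc)  # 0.6
print(utt_pred)   # 0
```

Under this assumption, scattered frame errors are outvoted by the frames that agree on the true direction, which is one plausible reason why utterance-level accuracy can be much higher than frame-level accuracy.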

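Thesis information

Author	김영진
Degree-granting institution	Korea University of Technology and Education, Graduate School
Degree	Doctorate (domestic)
Department	Department of Computer Engineering, Computer Science major
Advisor	김은경
Year of publication	2020
Total pages	184
Keywords	deep learning; multichannel signal processing; sound quality enhancement; sound source direction estimation; voice activity detection
Language	kor
Full-text URL	http://www.riss.kr/link?id=T15618304&outLink=K
Source	Korea Education and Research Information Service (KERIS)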
