한국음향학회지 (The Journal of the Acoustical Society of Korea), vol. 41, no. 1, 2022, pp. 30-37
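황서림 (Intelligent Signal Processing Laboratory, Yonsei University), 박성욱 (Department of Electronic Engineering, Gangneung-Wonju National University), 박영철 (Intelligent Signal Processing Laboratory, Yonsei University)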
This paper compares and evaluates Deep Neural Network (DNN)-based speech enhancement models trained in the frequency domain from two perspectives: the learning target and the network structure. In this case, spectrum mapping and Time-Frequency (T-F) masking techniques ...
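The two learning targets named in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the abstract is truncated here, so the mask definition below is one commonly used form of the ideal ratio mask, and the function names (`ideal_ratio_mask`, `apply_mask`) and the toy magnitude values are hypothetical. In spectrum mapping the network is trained to output the clean magnitude directly; in T-F masking it is trained to output a gain per time-frequency bin that is then multiplied with the noisy magnitude.

```python
import math


def ideal_ratio_mask(clean_mag, noise_mag, eps=1e-12):
    """One common IRM form (an assumption): sqrt(S^2 / (S^2 + N^2)) per T-F bin."""
    return [
        [math.sqrt(s * s / (s * s + n * n + eps)) for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(clean_mag, noise_mag)
    ]


def apply_mask(mask, noisy_mag):
    """Element-wise product of a mask with a noisy magnitude spectrogram."""
    return [
        [m * y for m, y in zip(m_row, y_row)]
        for m_row, y_row in zip(mask, noisy_mag)
    ]


# Toy magnitude spectrograms (frames x frequency bins); values are made up.
clean = [[3.0, 0.0], [1.0, 4.0]]
noise = [[4.0, 2.0], [0.0, 3.0]]

# Assumption: speech and noise are uncorrelated, so |Y|^2 = |S|^2 + |N|^2.
noisy = [
    [math.sqrt(s * s + n * n) for s, n in zip(s_row, n_row)]
    for s_row, n_row in zip(clean, noise)
]

# Spectrum mapping: the training target is the clean magnitude itself.
mapping_target = clean

# T-F masking: the training target is a gain in [0, 1] for every bin.
masking_target = ideal_ratio_mask(clean, noise)

# Applying the ideal mask to the noisy magnitude recovers the clean magnitude
# under the uncorrelated-noise assumption above.
enhanced = apply_mask(masking_target, noisy)
```

Under these assumptions both targets describe the same clean signal; the difference is whether the network regresses an unbounded magnitude or a bounded gain.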
Published in an open-access journal.