[International paper] A Unified Deep Learning Framework for Short-Duration Speaker Verification in Adverse Environments

IEEE access : practical research, open solutions, vol. 8, 2020, pp. 175448-175466

Jung, Youngmoon (Korea Advanced Institute of Science and Technology, School of Electrical Engineering, Daejeon, South Korea); Choi, Yeunju (Korea Advanced Institute of Science and Technology, School of Electrical Engineering, Daejeon, South Korea); Lim, Hyungjun (Korea Advanced Institute of Science and Technology, School of Electrical Engineering, Daejeon, South Korea); Kim, Hoirin (Korea Advanced Institute of Science and Technology, School of Electrical Engineering, Daejeon, South Korea)

Abstract

Speaker verification (SV) has recently attracted considerable research interest due to the growing popularity of virtual assistants. At the same time, there is an increasing requirement for an SV system: it should be robust to short speech segments, especially in noisy and reverberant environments. ...

Open access (OA) type: GOLD (paper published in an open-access journal)
