Journal of the Convergence on Culture Technology (JCCT), v.7 no.1, 2021, pp. 640-645
The deep-learning-based end-to-end TTS system consists of a Text2Mel module, which generates a spectrogram from text, and a vocoder module, which synthesizes speech signals from the spectrogram. Recently, by applying deep learning technology to the TTS system, the intelligibility and naturalness of the synthesized...
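The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustrative skeleton, not the authors' implementation: the class names `Text2Mel` and `Vocoder` and their `synthesize` methods are assumptions standing in for real acoustic models (e.g., Tacotron or FastSpeech) and real vocoders (e.g., WaveNet or MelGAN), and the bodies produce placeholder silence rather than learned predictions.

```python
# Sketch of a two-stage end-to-end TTS pipeline: text -> mel spectrogram -> waveform.
# All names and numbers here are illustrative assumptions, not a real library API.

class Text2Mel:
    """Stage 1: map input text to a mel spectrogram (frames x mel bins)."""
    def __init__(self, n_mels: int = 80):
        self.n_mels = n_mels

    def synthesize(self, text: str) -> list:
        # Toy stand-in: one all-zero spectrogram frame per input character.
        # A real model predicts acoustic features with a neural network.
        return [[0.0] * self.n_mels for _ in text]


class Vocoder:
    """Stage 2: convert the mel spectrogram to a raw audio waveform."""
    def __init__(self, hop_length: int = 256):
        self.hop_length = hop_length

    def synthesize(self, mel: list) -> list:
        # Toy stand-in: hop_length silent samples per spectrogram frame.
        # A real vocoder generates waveform samples conditioned on the mel input.
        return [0.0] * (len(mel) * self.hop_length)


def tts(text: str) -> list:
    mel = Text2Mel().synthesize(text)   # text -> spectrogram
    return Vocoder().synthesize(mel)    # spectrogram -> speech samples


wav = tts("hello")
print(len(wav))  # 5 frames x 256 samples per frame = 1280 samples
```

The key design point the abstract highlights is this separation of concerns: the Text2Mel stage handles linguistic-to-acoustic mapping, while the vocoder handles acoustic-to-waveform generation, so the two stages can be trained and swapped independently.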