[Domestic article] A Deep Learning-based Depression Trend Analysis of Korean on Social Media
(딥러닝 기반 소셜미디어 한글 텍스트 우울 경향 분석)

Journal of the Korean Society for Information Management (정보관리학회지), v.39 no.1, 2022, pp.91-117

박서정 (Department of Library and Information Science, Yonsei University), 이수빈 (Department of Library and Information Science, Yonsei University), 김우정 (Department of Psychiatry, Yongin Severance Hospital, Yonsei University College of Medicine), 송민 (Department of Library and Information Science, Yonsei University)

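Abstract (translated from the Korean)

The number of patients with depression is increasing every year, both in Korea and worldwide. However, most people with mental illness do not recognize that they are ill and therefore do not receive appropriate treatment. Because untreated depressive symptoms can develop into suicide, anxiety, and other psychological problems, early detection and treatment of depression are very important for improving mental health. To address this problem, this study proposes a deep learning-based depression tendency model that uses Korean social media text. Data were collected from Naver Knowledge iN, Naver Blog, Hidoc, and Twitter, and were annotated with classes according to the number of depressive symptoms, based on the DSM-5 diagnostic criteria for major depressive disorder. TF-IDF analysis and co-occurrence word analysis were then performed to examine the characteristics of each class in the constructed corpus. In addition, word embedding, dictionary-based sentiment analysis, and LDA topic modeling were carried out to build a depression tendency classification model that uses a variety of text features. From these, the embedded text, sentiment score, and topic number of each document were derived and used as text features. As a result, the highest accuracy, 83.28%, was achieved when depression tendency was classified with the KorBERT algorithm using the embedded text combined with both the document's sentiment score and topic. By building a Korean depression tendency classification model with improved performance through the use of diverse text features, this study is significant in that it lays a foundation for early detection of potential depression patients among users of Korean online communities, enabling prompt treatment and prevention and thereby contributing to improved mental health in Korean society.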
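
The abstract describes deriving three features per document: the embedded text, a sentiment score, and a topic number. The sketch below shows one way these could be combined. It is a minimal illustration under stated assumptions, not the authors' implementation. The toy lexicon, the function names, the averaging rule for the sentiment score, and the one-hot encoding of the topic number are all hypothetical. The embedding and topic number are taken as given inputs rather than computed with KorBERT or LDA.

```python
from typing import Dict, List, Sequence

# Hypothetical toy polarity lexicon. The study used a sentiment dictionary,
# but these entries and weights are invented purely for illustration.
TOY_LEXICON: Dict[str, int] = {
    "sad": -2,
    "hopeless": -2,
    "tired": -1,
    "fine": 1,
    "happy": 2,
}


def sentiment_score(tokens: Sequence[str], lexicon: Dict[str, int]) -> float:
    """Average polarity of the tokens found in the lexicon (0.0 if none match).

    Averaging is an assumption; the abstract does not say how scores were aggregated.
    """
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


def one_hot(index: int, size: int) -> List[float]:
    """Encode a topic number as a one-hot vector (an assumed encoding)."""
    if not 0 <= index < size:
        raise ValueError("topic index out of range")
    return [1.0 if i == index else 0.0 for i in range(size)]


def combine_features(
    embedding: Sequence[float],
    tokens: Sequence[str],
    topic_id: int,
    num_topics: int,
    lexicon: Dict[str, int] = TOY_LEXICON,
) -> List[float]:
    """Concatenate [embedding | sentiment score | one-hot topic] into one vector."""
    return (
        list(embedding)
        + [sentiment_score(tokens, lexicon)]
        + one_hot(topic_id, num_topics)
    )


if __name__ == "__main__":
    # Stand-in values: a 2-dimensional "embedding" and topic 1 out of 3 topics.
    features = combine_features([0.1, 0.2], ["sad", "and", "tired"], 1, 3)
    print(features)
```

In this sketch the combined vector would be passed to a downstream classifier. How the study actually fed the sentiment and topic features into the KorBERT-based classifier is not stated in the abstract.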

Abstract

The number of depressed patients in Korea and around the world is rapidly increasing every year. However, most of the mentally ill patients are not aware that they are suffering from the disease, so adequate treatment is not being performed. If depressive symptoms are neglected, it can lead to suici...
