[Paper] Investigation of Topic Trends in Computer and Information Science by Text Mining Techniques: From the Perspective of Conferences in DBLP

Su Yeon Kim (김수연); Sung Jeon Song (송성전); Min Song (송민)

doi:10.3743/kosim.2015.32.1.135

Journal of the Korean Society for Information Management (정보관리학회지), v.32 no.1 (no.95), 2015, pp.135-152

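Su Yeon Kim (Yonsei University), Sung Jeon Song (Graduate School, Department of Library and Information Science, Yonsei University), Min Song (Department of Library and Information Science, Yonsei University)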

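Abstract (translated from the Korean version)

The purpose of this paper is to analyze research trends in Computer and Information Science. To this end, we applied text mining techniques to conference data from DBLP (Digital Bibliography & Library Project). Whereas most research-trend studies rely on bibliometric methods, this paper uses multinomial topic modeling based on LDA (Latent Dirichlet Allocation). To cover Computer and Information Science as broadly as possible, we targeted 353 related conferences in DBLP and collected 236,170 documents published between 2000 and 2011. By examining the topic modeling results together with the number of documents and the number of conferences per topic, we present the top authors and top conferences for each topic from 2000 to 2011. The topic trend analysis shows that network-related topics followed a growing pattern; artificial intelligence and data mining topics followed a declining pattern; web, text mining, information retrieval, and database topics followed a steady pattern; and HCI, information system, and multimedia system topics followed a fluctuating pattern of repeated rises and falls.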

Abstract (English version)

The goal of this paper is to explore the field of Computer and Information Science using text mining techniques applied to related conference data available in DBLP (Digital Bibliography & Library Project). Although bibliometric analysis is the most prevalent approach to investigating the dynamics of a research field, we attempt to understand the dynamics of the field with Latent Dirichlet Allocation (LDA)-based multinomial topic modeling. To cover the field as broadly as possible, we collect 236,170 documents from 353 conferences related to Computer and Information Science in DBLP. We analyze the topic modeling results along with the datasets collected over the period 2000 to 2011, including top authors per topic and top conferences per topic. We identify four patterns in topic trends during this period: growing (network-related topics), shrinking (AI and data mining topics), continuing (web, text mining, information retrieval, and database topics), and fluctuating (HCI, information system, and multimedia system topics).
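The four trend patterns named in the abstract can be illustrated with a minimal sketch that works from per-year topic counts. The abstract does not state the authors' decision rule, so the function names, the `tol` threshold, and the toy data below are assumptions for illustration only, not the paper's method. The sketch also assumes each document has already been assigned one dominant topic by a topic model.

```python
from collections import Counter, defaultdict


def yearly_topic_share(docs):
    """Compute each topic's share of documents per year.

    docs: list of (year, topic) pairs, one pair per document
          (hypothetical input; the topic is assumed to be the
          document's dominant topic from a topic model).
    Returns {topic: {year: share}} with every year present for every topic.
    """
    per_year = Counter(year for year, _ in docs)
    per_pair = Counter(docs)
    shares = defaultdict(dict)
    for topic in {t for _, t in docs}:
        for year in sorted(per_year):
            shares[topic][year] = per_pair[(year, topic)] / per_year[year]
    return dict(shares)


def classify_trend(series, tol=0.02):
    """Label a year-ordered list of shares with one of four patterns.

    tol is an assumed threshold: year-to-year changes smaller than tol
    are treated as noise. This heuristic is not taken from the paper.
    """
    diffs = [b - a for a, b in zip(series, series[1:])]
    ups = sum(d > tol for d in diffs)
    downs = sum(d < -tol for d in diffs)
    net = series[-1] - series[0]
    if ups and downs:
        return "fluctuating"  # clear rises and clear falls both occur
    if net > tol:
        return "growing"
    if net < -tol:
        return "shrinking"
    return "continuing"


if __name__ == "__main__":
    # Toy data, not drawn from DBLP.
    toy_docs = [
        (2000, "network"), (2000, "ai"), (2000, "ai"),
        (2001, "network"), (2001, "network"), (2001, "ai"),
    ]
    for topic, by_year in sorted(yearly_topic_share(toy_docs).items()):
        ordered = [by_year[y] for y in sorted(by_year)]
        print(topic, classify_trend(ordered))
```

Using shares rather than raw counts keeps a topic from looking like it is growing merely because the total number of collected documents rises over the years.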
