
The Construction Status and Characteristics of Korean Specialized Corpora: Focusing on the Achievements of the 21st Century Sejong Project

The Status and Characteristics of Korean Specialized Corpus

한국사전학 = Journal of Korealex, no. 12, 2008, pp. 41-60

This paper examines the current status of Korean specialized corpora, focusing on the Special Data Division of the 21st Century Sejong Project. It discusses the objectives of the specialized corpora, their fields of application, and details such as their size, construction, and other specific features. The Korean specialized corpora were planned to extend the boundaries of existing corpora, which consisted of "contemporary Korean written language," and to build a database covering Korean language resources as a whole by including variation across time and space as well as both spoken and written language. In addition to expanding the written-language corpora, which include data from the 15th century to the present and data from South and North Korea, China, and the former Soviet republics, the project developed a large-scale spoken-language corpus of natural contemporary Korean utterances. This corpus made possible not only research on colloquial language but also comparative study of spoken and written language. In developing the multilingual parallel corpora, the diversity of the contents and the quality of the translations were given considerable weight, so that a high-quality parallel corpus would be available. Additionally, the paper regards it as desirable to take steps to obtain authors' permission for the use of the data in research and technology development.

