This paper examines the current status of Korean specialized corpora, focusing on the Special Data Division of the 〈21st Century Sejong Project〉. It discusses the objectives of the specialized corpora, their fields of application, and details such as their size, construction, and other specific features. The Korean specialized corpora were planned to expand the boundaries of existing corpora, which were composed of "contemporary Korean written language", and to build a database that covers Korean language resources as a whole by encompassing variation across time and space as well as spoken and written language. In addition to expanding the written-language corpora, which include data from the 15th century to the present and data from South and North Korea, China, and the former Soviet republics, the project developed a large-scale spoken-language corpus consisting of natural locutionary acts in contemporary Korean, which made possible not only research on colloquial language but also comparative study of spoken and written language. In developing the multilingual parallel corpora, great weight was given to the diversity of the content and the quality of translation, so that a high-quality parallel corpus would be available. Additionally, it is considered desirable to take steps to obtain the authors' permission for the use of the data in research and technology development.