The Journal of Society for e-Business Studies (한국전자거래학회지), Vol. 25, No. 3, pp. 77-93, 2020
장수정 (Graduate School (Big Data Analytics), Ewha Womans University), 민대기 (School of Business, Ewha Womans University)
The literature has reported that hierarchical classification methods generally outperform the flat classification methods for a multi-class document classification problem. Unlike the literature that has constructed a class hierarchy, this paper evaluates the performance of hierarchical and flat cla...
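Question | Answer extracted from the paper |
---|---|
What does unstructured text document classification mean? | Unstructured text document classification is the task of determining which class a text document belongs to. With the recent society-wide growth of large-scale data, analysis of such data is being used in many areas [4]. |
Which classifier does this paper use for text document classification? | In the data preprocessing step, a frequent-term-based method such as TF-IDF (Term Frequency-Inverse Document Frequency) is used to extract terms from the document collection and build the feature set for classification [18]. For the classifier, the paper uses the SVM, which is widely used in prior work on text document classification. |
Why is the SVM widely used for multi-class classification problems? | The reasons can be summarized in three points. First, the SVM rests on a solid theoretical foundation, which makes its results easy to interpret [5]. Second, results obtained with an SVM are comparable to, or better than, those obtained with artificial neural networks. Finally, it can produce classification results quickly from a small amount of training data, and it performs well on imbalanced data sets [21]. |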
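The TF-IDF preprocessing step mentioned in the table can be illustrated with a minimal sketch. It assumes whitespace tokenization, relative term frequency, and the unsmoothed weight `tf * log(N / df)`; the record does not state which TF-IDF variant the paper uses, and the function name `tfidf_features` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_features(docs):
    """Return (vocabulary, rows); rows[i][j] is the TF-IDF weight of term j in document i.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    tokenized = [doc.lower().split() for doc in docs]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        row = []
        for term in vocabulary:
            tf = counts[term] / total
            idf = math.log(n_docs / doc_freq[term])
            row.append(tf * idf)
        rows.append(row)
    return vocabulary, rows


# Hypothetical sample documents, not taken from the paper's data set.
docs = [
    "solar power plant efficiency",
    "wind power turbine design",
    "flood warning system design",
]
vocabulary, rows = tfidf_features(docs)
```

Each row of `rows` is a feature vector that could be passed to a classifier such as an SVM. Terms that occur in fewer documents, such as "solar" here, receive a higher weight than terms shared across documents, such as "power".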
Agnihotri, D., Verma, K., and Tripathi, P., "Variable global feature selection scheme for automatic classification of text documents," Expert Systems with Applications, Vol. 81, pp. 268-281, 2017.
Bertule, M., Appelquist, L. R., Spensley, J., Traerup, S. L. M., and Naswa, P., "Climate change adaptation technologies for water: A practitioner's guide to adaptation technologies for increased water sector resilience," CTCN publications, Copenhagen, Denmark, 2018.
Beyan, C. and Fisher, R., "Classifying imbalanced data sets using similarity based hierarchical decomposition," Pattern Recognition, Vol. 48, pp. 1653-1672, 2015.
Byun, J. H., "Current Status and Perspectives of Fintech Innovation," Journal of New Industry and Business, Vol. 26, No. 2, pp. 35-48, 2018.
Chen, Y., Crawford, M. M., and Ghosh, J., "Integrating support vector machines in a hierarchical output space decomposition framework," IEEE International Geoscience and Remote Sensing Symposium, Vol. 2, pp. 949-952, 2004.
Cristianini, N. and Shawe-Taylor, J., "An introduction to support vector machines and other kernel-based learning methods," Cambridge University Press, MA, 2000.
Du, Y., Liu, J., Ke, W., and Gong, X., "Hierarchy construction and text classification based on the relaxation strategy and least information model," Expert Systems with Applications, Vol. 100, pp. 157-164, 2018.
Duan, K. B. and Keerthi, S. S., "Which is the best multiclass SVM method? An empirical study," International Workshop on Multiple Classifier Systems, Vol. 3531, pp. 278-285, 2005.
Gargiulo, F., Silvestri, S., Ciampi, M., and De Pietro, G., "Deep neural network for hierarchical extreme multi-label text classification," Applied Soft Computing, Vol. 79, pp. 125-138, 2019.
Kang, S., Cho, S., and Kang, P., "Constructing a multi-class classifier using one-against-one approach with different binary classifiers," Neurocomputing, Vol. 149, pp. 677-682, 2015.
Kim, P. J. and Lee, J. Y., "An experimental study on the performance improvement of automatic classification for the articles of Korean journals based on controlled keywords in international database," Journal of the Korean Society for Library and Information Science, Vol. 48, No. 3, pp. 491-510, 2014.
Kim, P. J., "An analytical study on automatic classification of domestic journal articles based on machine learning," Journal of the Korean Society for Information Management, Vol. 35, No. 2, pp. 37-62, 2018.
Kim, Y. S. and Lee, B. Y., "Multi-class support vector machines model based clustering for hierarchical document categorization in big data environment," The Journal of the Korea Contents Association, Vol. 17, pp. 600-608, 2017.
Kowsari, K., Jafari Meimandi, K., Heidarysafa, M., Mendu, S., Barnes, L., and Brown, D., "Text classification algorithms: A survey," Information, Vol. 10, No. 4, 2019.
Lee, J. H., Yi, J. S., and Son, J. W., "Unstructured construction data analytics using R programming: Focused on overseas construction adjudication cases," Journal of the Architectural Institute of Korea Structure & Construction, Vol. 32, No. 5, pp. 37-44, 2016.
Lee, J. S. and Kwon, J. G., "A hybrid SVM classifier for imbalanced data sets," Journal of Intelligence and Information Systems, Vol. 19, pp. 125-140, 2013.
Lee, S. J. and Kim, H. J., "Keyword extraction from news corpus using modified TF-IDF," The Journal of Society for e-Business Studies, Vol. 14, No. 4, pp. 59-73, 2009.
Lorena, A. C., De Carvalho, A. C., and Gama, J. M. P., "A review on the combination of binary classifiers in multiclass problems," Artificial Intelligence Review, Vol. 30, No. 19, 2008.
Madzarov, G., Gjorgjevikj, D., and Chorbev, I., "A multi-class SVM classifier utilizing binary decision tree," Informatica, Vol. 33, 2009.
Min, J. H. and Lee, Y. C., "Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters," Expert Systems with Applications, Vol. 28, pp. 603-614, 2005.
Naik, A. and Rangwala, H., "Improving large-scale hierarchical classification by rewiring: A data-driven filter based approach," Journal of Intelligent Information Systems, Vol. 52, pp. 141-164, 2019.
Park, J. H. and Kim, J. S., "A text classification system for hierarchical categories," The Korean Institute of Information Scientists and Engineers, Vol. 27, No. 2, pp. 128-130, 2000.
Silla, C. N. and Freitas, A. A., "A survey of hierarchical classification across different application domains," Data Mining and Knowledge Discovery, Vol. 22, pp. 31-72, 2011.
Silva-Palacios, D., Ferri, C., and Ramirez-Quintana, M. J., "Probabilistic class hierarchies for multiclass classification," Journal of Computational Science, Vol. 26, pp. 254-263, 2018.
Sun, A., Lim, E. P., Ng, W. K., and Srivastava, J., "Blocking reduction strategies in hierarchical text classification," IEEE Transactions on Knowledge and Data Engineering, Vol. 16, pp. 1305-1308, 2004.
Tegegnie, A. K., Tarekegn, A. N., and Alemu, T. A., "A comparative study of flat and hierarchical classification for Amharic news text using SVM," Information Engineering and Electronic Business, Vol. 3, pp. 36-42, 2017.
UNEP, "Technologies for climate change mitigation," UNEP, 2011.
Vapnik, V., "Estimation of Dependences Based on Empirical Data," Nauka, Moscow, 1979.
Vapnik, V., "The nature of statistical learning theory," Chapter 5, Springer-Verlag, New York, 1995.
Williams, T. P. and Gong, J., "Predicting construction cost overruns using text mining, numerical data and ensemble classifiers," Automation in Construction, Vol. 43, pp. 23-29, 2014.
Yoon, Y. W., Lee, C. K., and Lee, G. B., "Hierarchical text categorization using support vector machine," Annual Conference on Human and Language Technology, pp. 7-13, 2013.
Zhang, L., Shah, S. K., and Kakadiaris, I. A., "Hierarchical multi-label classification using fully associative ensemble learning," Pattern Recognition, Vol. 70, pp. 89-103, 2017.
Zhao, Z., Wang, X., and Wang, T., "A novel measurement data classification algorithm based on SVM for tracking closely spaced targets," IEEE Transactions on Instrumentation and Measurement, Vol. 68, No. 4, pp. 1089-1100, 2019.
Zheng, J., Guo, Y., Feng, C., and Chen, H., "A hierarchical neural network based document representation approach for text classification," Mathematical Problems in Engineering, Vol. 2018, 2018.
Paper published in an open-access journal.