지능정보연구 = Journal of Intelligence and Information Systems, v.15 no.3, 2009, pp.1-15
김명종 (Division of Business Administration, Dongseo University)
In a classification problem, data imbalance occurs when the number of instances in one class greatly outnumbers the number of instances in the other class. Such data sets often cause a default classifier to be built due to skewed boundary and thus the reduction in the classification accuracy of such...
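Question | Answer extracted from the paper |
---|---|
When does the data imbalance problem arise? | The data imbalance problem arises in classification and prediction problems when the number of samples in one class is markedly smaller than the number of samples in the other classes. As the imbalance worsens, the decision boundary between the classes becomes distorted and, as a result, the learning performance of the classifier degrades. |
How is geometric mean accuracy calculated? | Geometric mean accuracy and ROC analysis were proposed as remedies for this problem. Geometric mean accuracy is a performance measure that takes into account the accuracy on both the majority class and the minority class, and is computed as (sensitivity × specificity)^(1/2) (Kubat et al., 1997). |
Owing to its advantages, in which fields is SVM being actively applied? | SVM has four advantages: first, it rests on a clear theoretical foundation, so its results are easy to interpret; second, it performs well in practical applications; third, training depends on the number of samples rather than on the dimensionality of the input variables, so it learns quickly; and fourth, it is based on the structural risk minimization principle and is therefore robust to overfitting. Because of these advantages it has been applied in natural-science fields such as character recognition, image recognition, and microarray analysis, and more recently it has been actively applied in business fields such as time series forecasting and classification (Cao and Tay, 2001; Kim, 2004; Tay and Cao, 2002), bond credit rating (Huang et al., 2004), and corporate failure prediction (Shin et al., 2005; Min et al., 2006). |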
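
The geometric mean accuracy described in the table can be illustrated with a minimal sketch. It assumes a binary problem in which the minority class (for example, failed firms) is labelled 1 and treated as the positive class; the function names and the toy labels below are invented for illustration and do not come from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, tn, fp) for a binary problem."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def accuracy(y_true, y_pred):
    """Plain accuracy: share of correctly classified samples."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean accuracy: sqrt(sensitivity * specificity)."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # minority-class accuracy
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # majority-class accuracy
    return math.sqrt(sensitivity * specificity)


# Hypothetical toy data: 95 majority samples (0) and 5 minority samples (1).
y_true = [0] * 95 + [1] * 5

# A default classifier that always predicts the majority class.
always_majority = [0] * 100

# A classifier that finds 4 of the 5 minority samples at the cost of 10 false alarms.
balanced = [0] * 85 + [1] * 10 + [1] * 4 + [0] * 1

print(accuracy(y_true, always_majority), g_mean(y_true, always_majority))  # 0.95 0.0
print(accuracy(y_true, balanced), round(g_mean(y_true, balanced), 3))      # 0.89 0.846
```

On this toy data the default classifier has the higher plain accuracy but a geometric mean accuracy of zero, because it never identifies a minority sample. This is why a measure that weighs both classes is preferred when the data are imbalanced.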
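강필성, 조성준, "Under-sampling based ensemble SVMs for resolving data imbalance", 대한산업공학회/한국경영과학회 (Korean Institute of Industrial Engineers / Korean Operations Research and Management Science Society) 2006 Spring Joint Conference, (2006).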
Altman, E. I., "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy", The Journal of Finance, Vol.23, No.4(1968), 589-609.
Altman, E. I., R. Haldeman, and P. Narayanan, "A new model to identify bankruptcy risk of corporations", Journal of Banking and Finance, Vol.1(1977), 29-54.
Beaver, W., "Financial ratios as predictors of failure", Empirical Research in Accounting: Selected Studies, Journal of Accounting Research, Vol.4, No.3(1966), 71-111.
Bruzzone, L. and S. B. Serpico, "Classification of imbalanced remote-sensing data by neural networks", Pattern Recognition Letters, Vol.18, No.11-13(1997), 1323-1328.
Bryant, S. M., "A case-based reasoning approach to bankruptcy prediction modeling", International Journal of Intelligent Systems in Accounting, Finance and Management, Vol.6, No.3(1997), 195-214.
Buta, P., "Mining for financial knowledge with CBR", AI Expert, Vol.9, No.10(1994), 34-41.
Cao, L. and F. E. H. Tay, "Financial forecasting using support vector machines", Neural Computing and Applications, Vol.10(2001), 184-192.
Chawla, N., K. Bowyer, L. Hall, and W. Kegelmeyer, "SMOTE: synthetic minority over-sampling technique", Journal of Artificial Intelligence Research, Vol.16(2002), 321-357.
Chawla, N., A. Lazarevic, L. Hall, and K. Bowyer, "SMOTEBoost: improving prediction of the minority class in boosting", 7th European Conference on Principles and Practice of Knowledge Discovery in Databases, Cavtat-Dubrovnik, Croatia, (2003), 107-119.
Cover, T. M. and J. A. Thomas, Elements of Information Theory, John Wiley and Sons, (1991).
Darbellay, G. A., "An estimator of the mutual information based on a criterion for independence", Computational Statistics and Data Analysis, Vol.32(1999), 1-17.
Dimitras, A. I., S. H. Zanakis, and C. Zopounidis, "A survey of business failure with an emphasis on prediction methods and industrial applications", European Journal of Operational Research, Vol.90, No.3(1996), 487-513.
Elkan, C., "The foundations of cost-sensitive learning", In Proceedings of the 17th International Joint Conference on Artificial Intelligence, Seattle, WA, (2001), 973-978.
Fawcett, T., "An introduction to ROC analysis", Pattern Recognition Letters, Vol.27(2006), 861-874.
Fawcett, T. and F. Provost, "Adaptive fraud detection", Data Mining and Knowledge Discovery, Vol.1, No.3(1997), 291-316.
Freund, Y. and R. E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting", Journal of Computer and System Sciences, Vol.55, No.1(1997), 119-139.
Han, I., J. S. Chandler, and T. P. Liang, "The impact of measurement scale and correlation structure on classification performance of inductive learning and statistical methods", Expert Systems with Applications, Vol.10, No.2(1996), 209-221.
Hong, X., "A kernel-based two-class classifier for imbalanced data sets", IEEE Transactions on Neural Networks, Vol.18, No.1(2007), 28-40.
Huang, Z., H. Chen, C.-J. Hsu, W.-H. Chen, and S. Wu, "Credit rating analysis with support vector machines and neural networks: a market comparative study", Decision Support Systems, Vol.37(2004), 543-558.
Japkowicz, N. and S. Stephen, "The class imbalance problem: a systematic study", Intelligent Data Analysis, Vol.6, No.5(2002), 429-449.
Kim, K., "Financial time series forecasting using support vector machines", Neurocomputing, Vol.55(2004), 307-319.
Kotsiantis, S., D. Tzelepis, E. Koumanakos, and V. Tampakas, "Selective costing voting for bankruptcy prediction", International Journal of Knowledge-based and Intelligent Engineering Systems, Vol.11(2007), 115-127.
Kubat, M., R. Holte, and S. Matwin, "Learning when negative examples abound", Proceedings of the 9th European Conference on Machine Learning, ECML'97, (1997).
Kubat, M. and S. Matwin, "Addressing the curse of imbalanced training sets: one-sided selection", In Proceedings of the Fourteenth International Conference on Machine Learning, (1997), 179-186.
Laitinen, T. and M. Kankaanpaa, "Comparative analysis of failure prediction methods: the Finnish case", European Accounting Review, Vol.8, No.1(1999), 67-92.
Laurikkala, J., "Instance-based data reduction for improved identification of difficult small classes", Intelligent Data Analysis, Vol.6, No.4(2002), 311-322.
Maia, T. T., A. P. Braga, and A. F. Carvalho, "Hybrid classification algorithms based on boosting and support vector machines", Kybernetes, Vol.37, No.9(2008), 1469-1491.
Meyer, P. A. and H. Pifer, "Prediction of bank failures", The Journal of Finance, Vol.25(1970), 853-868.
Min, S. H., J. M. Lee, and I. G. Han, "Hybrid genetic algorithms and support vector machines for bankruptcy prediction", Expert Systems with Applications, Vol.31(2006), 652-660.
Odom, M. and R. Sharda, "A neural network for bankruptcy prediction", Proceedings of the International Joint Conference on Neural Networks, IEEE Press, San Diego, CA. (1990).
Ohlson, J., "Financial ratios and the probabilistic prediction of bankruptcy", Journal of Accounting Research, Vol.18, No.1(1980), 109-131.
Opitz, D. and R. Maclin, "Popular ensemble methods: an empirical study", Journal of Artificial Intelligence Research, Vol.11(1999), 169-198.
Pantalone, C. and M. B. Platt, "Predicting commercial bank failure since deregulation", New England Economic Review, (1987), 37-47.
Platt, J., "Fast training of support vector machines using sequential minimal optimization", In B. Schoelkopf, C. Burges, and A. Smola (Eds.), Advances in Kernel Methods-Support Vector Learning, MIT Press, (1998).
Ravi, P. and K. V. Ravi, "Bankruptcy prediction in banks and firms via statistical and intelligent techniques-a review", European Journal of Operational Research, Vol.180(2007), 1-28.
Seiffert, C., T. M. Khoshgoftaar, J. Van Hulse, and A. Napolitano, "RUSBoost: Improving classification performance when training data is skewed", 19th International Conference on Pattern Recognition, (2008), 1-4.
Shaw, M. and J. Gentry, "Using an expert system with inductive learning to evaluate business loans", Financial Management, Vol.17, No.3(1988), 45-56.
Shin, H. J. and S. Z. Cho, "Response modeling with support vector machines", Expert Systems with Applications, Vol.30, No.4(2006), 746-760.
Shin, K., T. Lee, and H. Kim, "An application of support vector machines in bankruptcy prediction", Expert Systems with Applications, Vol.28(2005), 127-135.
Tay, F. E. H. and L. J. Cao, "Modified support vector machine in financial time series forecasting", Neurocomputing, Vol.48(2002), 847-861.
Vapnik, V. N., The Nature of Statistical Learning Theory, Springer, New York, (1995).
Wang, B. X. and N. Japkowicz, "Boosting support vector machines for imbalanced data sets", Knowledge and Information Systems, forthcoming, (2009).
Weiss, G. M., "Mining with rarity: a unifying framework", SIGKDD Explorations, Vol.6, No.1(2004), 7-19.
Wu, G. and E. Chang, "Adaptive feature-space conformal transformation for imbalanced data learning", In Proceedings of the 20th International Conference on Machine Learning, (2003).
Wu, G. and E. Chang, "KBA: Kernel boundary alignment considering imbalanced data distribution", IEEE Transactions on Knowledge and Data Engineering, Vol.17, No.6(2005), 786-795.
Wu, G., Y. Wu, L. Jiao, Y. F. Wang, and E. Chang, "Multi-camera spatio-temporal fusion and biased sequence-data learning for security surveillance", Proceedings of 20th International Conference on Multimedia, (2003).
Yan, R., Y. Liu, and R. Hauptman, "On predicting rare classes with SVM ensembles in scene classification", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'03), (2003).
Zmijewski, M. E., "Methodological issues related to the estimation of financial distress prediction models", Journal of Accounting Research, Vol.22, No.1(1984), 59-82.