[Paper] Preprocessing of Sparse Data Using Support Vector Regression

전성해; 박정은; 오경환

doi:10.5391/jkiis.2004.14.6.789

A Sparse Data Preprocessing Using Support Vector Regression

Journal of Fuzzy Logic and Intelligent Systems (퍼지 및 지능시스템학회 논문지), v.14 no.6, 2004, pp.789-792

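전성해 (Department of Statistics, Cheongju University), 박정은 (Department of Computer Science, Sogang University), 오경환 (Department of Computer Science, Sogang University)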

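Abstract (translated from the Korean)

In many fields, including web mining, bioinformatics, and statistical data analysis, missing values arise in widely varying forms and make the training data sparse. Missing values are usually replaced during preprocessing by estimates obtained from imputation techniques, ranging from the most basic mean and mode to the conditional mean, tree models, and Markov chain Monte Carlo methods. However, as the proportion of missing values in the data grows, the predictive accuracy of existing imputation methods tends to fall. In addition, the higher the missing ratio, the fewer imputation methods remain usable. To address these problems, this paper adapts Vapnik's Support Vector Regression, a method from statistical learning theory, to the data preprocessing step. The proposed method makes it possible to preprocess even sparse data with a large proportion of missing values. Its performance was verified on data obtained from the UCI machine learning repository.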
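As a point of reference for the baseline techniques the abstract mentions, the following is a minimal sketch of mean and mode imputation together with a missing-ratio measure. It is an illustration only, not code from the paper: the function names `missing_ratio` and `impute_mean_mode`, the use of `None` to mark a missing cell, and the toy table are all assumptions made for this example.

```python
from statistics import mean, mode

def missing_ratio(rows):
    """Fraction of cells that are missing (None marks a missing cell here)."""
    cells = [v for row in rows for v in row]
    return sum(v is None for v in cells) / len(cells)

def impute_mean_mode(rows):
    """Replace None by the column mean (numeric columns) or the column mode
    (all other columns). Assumes every column has at least one observed value."""
    fills = []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        numeric = all(isinstance(v, (int, float)) for v in observed)
        fills.append(mean(observed) if numeric else mode(observed))
    return [[fills[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]

# Hypothetical toy table: one numeric and one categorical attribute.
rows = [
    [1.0, "a"],
    [None, "b"],
    [3.0, None],
    [None, "b"],
]
ratio = missing_ratio(rows)
filled = impute_mean_mode(rows)
print(ratio)
print(filled)
```

Every missing cell in a column receives the same fill value, which is one reason such baselines lose accuracy as the missing ratio grows.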

Abstract

In various fields such as web mining, bioinformatics, and statistical data analysis, missing values occur in very diverse forms. These values make the training data sparse. For the most part, missing values are replaced by values predicted using the mean and mode. More advanced missing value imputation methods, such as the conditional mean, tree methods, and the Markov chain Monte Carlo algorithm, can also be used. However, general imputation models share the property that their predictive accuracy decreases as the ratio of missing values in the training data increases. Moreover, the number of available imputation methods becomes more limited as the missing ratio increases. To address this problem, we propose using statistical learning theory to preprocess missing values, specifically the support vector regression of Vapnik. The proposed method can be applied to sparse training data. We verified the performance of our model using data sets from the UCI machine learning repository.
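The abstract does not describe how the authors modified support vector regression, so the sketch below only illustrates the general idea of regression-based imputation: fit a regressor on the rows where an attribute is observed, then predict it for the rows where it is missing. As a stand-in it uses a plain linear regressor with Vapnik's epsilon-insensitive loss, trained by subgradient descent. The function names, hyperparameter values, and synthetic data are assumptions for illustration and are not taken from the paper or from the UCI repository.

```python
from statistics import mean

def fit_linear_svr(X, y, epsilon=0.1, C=100.0, lr=0.001, epochs=5000):
    """Fit f(x) = w.x + b by subgradient descent on
    0.5*||w||^2 + C * mean(max(0, |y_i - f(x_i)| - epsilon))."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = list(w)   # gradient of the 0.5*||w||^2 regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            residual = yi - (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if abs(residual) > epsilon:   # only points outside the tube count
                sign = 1.0 if residual > 0 else -1.0
                for j in range(d):
                    grad_w[j] -= C * sign * xi[j] / n
                grad_b -= C * sign / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def impute_with_svr(rows, target):
    """Fill None entries in column `target`, using the other columns as predictors."""
    observed = [r for r in rows if r[target] is not None]
    X = [[v for j, v in enumerate(r) if j != target] for r in observed]
    y = [r[target] for r in observed]
    w, b = fit_linear_svr(X, y)
    filled = []
    for r in rows:
        r = list(r)
        if r[target] is None:
            x = [v for j, v in enumerate(r) if j != target]
            r[target] = sum(wj * xj for wj, xj in zip(w, x)) + b
        filled.append(r)
    return filled

# Synthetic toy data (an assumption for this example): y = 2*x1 - x2 + 1 on a 5x5 grid.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
complete = [[x1, x2, 2 * x1 - x2 + 1] for x1 in grid for x2 in grid]
missing_idx = {1, 4, 7, 12, 18, 20, 23}   # rows whose third attribute is hidden
rows = [[r[0], r[1], None] if i in missing_idx else list(r)
        for i, r in enumerate(complete)]

svr_filled = impute_with_svr(rows, target=2)
observed_mean = mean(r[2] for r in rows if r[2] is not None)
svr_mae = mean(abs(svr_filled[i][2] - complete[i][2]) for i in missing_idx)
mean_mae = mean(abs(observed_mean - complete[i][2]) for i in missing_idx)
print(f"SVR MAE: {svr_mae:.3f}, mean-imputation MAE: {mean_mae:.3f}")
```

On this noise-free linear toy problem the regression-based fill tracks each row's other attributes, whereas mean imputation assigns the same value to every missing cell. A kernel SVR from an established library would be the more usual choice in practice; the hand-written linear version is used here only to keep the example self-contained.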

References (11)

G. Casella, R. L. Berger, “Statistical Inference”, Duxbury Press, 1990.
C. Cortes, V. Vapnik, “Support Vector Networks”, Machine Learning, vol. 20, 273-297, 1995.
J. Han, M. Kamber, “Data Mining: Concepts and Techniques”, Morgan Kaufmann Publishers, 2000.
D. C. Hoaglin, F. Mosteller, J. W. Tukey, “Understanding Robust and Exploratory Data Analysis”, John Wiley & Sons Inc., 2000.
P. W. Lavori, R. Dawson, D. Shera, “A Multiple Imputation Strategy for Clinical Trials with Truncation of Patient Data”, Statistics in Medicine, vol. 14, 1913-1925, 1995.
R. J. A. Little, D. B. Rubin, “Statistical Analysis with Missing Data”, Wiley Interscience, 2002.
D. B. Rubin, “Multiple Imputation for Nonresponse in Surveys”, John Wiley & Sons, 1987.
J. L. Schafer, “Analysis of Incomplete Multivariate Data”, Chapman and Hall, 1997.
V. N. Vapnik, “The Nature of Statistical Learning Theory”, Springer, 1995.
V. N. Vapnik, “Statistical Learning Theory”, John Wiley & Sons, 1998.
UCI Machine Learning Repository, www.ics.uci.edu/mlearn
