
PCA document reconstruction for email classification

Computational Statistics & Data Analysis, vol. 56, no. 3, 2012, pp. 741-751

Gomez, J.C. (KULEUVEN, Computer Science Department, Celestijnenlaan 200A, B-3001 Heverlee, Belgium); Moens, M.F.

Abstract

This paper presents a document classifier based on text content features and its application to email classification. We test the validity of a classifier which uses Principal Component Analysis Document Reconstruction (PCADR), where the idea is that principal component analysis (PCA) ca...
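The abstract is cut off before it describes the method, so the following is only a minimal sketch of one common way a PCA reconstruction classifier can work, not the authors' implementation. It assumes one PCA model is fitted per class, a new document is projected onto each class's components and reconstructed, and the class with the smallest reconstruction error wins. The function names, the toy term-count vectors, the labels "spam" and "ham", and the use of power iteration to find components are all illustrative assumptions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def leading_direction(rows, iters=100):
    """Approximate the leading principal direction of already-centered rows
    by power iteration on X^T X, without forming the covariance matrix."""
    d = len(rows[0])
    # Deterministic, non-uniform start vector. A start exactly orthogonal to
    # the leading direction would fail; this sketch does not guard against it.
    v = [1.0 / (j + 1) for j in range(d)]
    norm = math.sqrt(dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iters):
        scores = [dot(row, v) for row in rows]
        w = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(d)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # no variance left to capture
            break
        v = [x / norm for x in w]
    return v


def fit_class(rows, k=1):
    """Fit a per-class model: the class mean and its top-k principal directions."""
    n = len(rows)
    mean = [sum(col) / n for col in zip(*rows)]
    residual = [[x - m for x, m in zip(row, mean)] for row in rows]
    components = []
    for _ in range(k):
        v = leading_direction(residual)
        components.append(v)
        # Deflation: remove the part of each row explained by v.
        deflated = []
        for row in residual:
            s = dot(row, v)
            deflated.append([x - s * vj for x, vj in zip(row, v)])
        residual = deflated
    return mean, components


def reconstruction_error(x, model):
    """Euclidean distance between x and its reconstruction from the model."""
    mean, components = model
    centered = [xi - m for xi, m in zip(x, mean)]
    recon = [0.0] * len(x)
    for v in components:
        s = dot(centered, v)
        recon = [r + s * vj for r, vj in zip(recon, v)]
    return math.sqrt(sum((c - r) ** 2 for c, r in zip(centered, recon)))


def classify(x, models):
    """Assign x to the class whose components reconstruct it best."""
    return min(models, key=lambda label: reconstruction_error(x, models[label]))


# Hypothetical toy data: 4-term count vectors for two made-up classes.
training = {
    "spam": [[1, 1, 0, 0], [2, 2, 0, 0], [3, 3, 0, 0], [4, 4, 0, 0]],
    "ham": [[0, 0, 1, 1], [0, 0, 2, 2], [0, 0, 3, 3], [0, 0, 4, 4]],
}
models = {label: fit_class(rows, k=1) for label, rows in training.items()}
print(classify([5, 5, 0, 0], models))  # prints "spam"
```

In practice a library routine for PCA (for example an SVD-based one) would replace the hand-written power iteration; it is used here only to keep the sketch free of dependencies.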




Open access (OA) type: GREEN

The author has self-archived the published version, post-print, or pre-print in a public repository, so the paper is freely available.
