Machine learning on big data: Opportunities and challenges

Neurocomputing, v.237, 2017, pp. 350-361

Zhou, Lina (Information Systems Department, UMBC, Baltimore, MD 21250, United States); Pan, Shimei (Information Systems Department, UMBC, Baltimore, MD 21250, United States); Wang, Jianwu (Information Systems Department, UMBC, Baltimore, MD 21250, United States); Vasilakos, Athanasios V. (Department of Computer Science, Electrical and Space Engineering, Luleå)

Abstract

Machine learning (ML) is continuously unleashing its power in a wide range of applications. It has been pushed to the forefront in recent years partly owing to the advent of big data. ML algorithms have never been better promised while challenged by big data. Big data enables ML algorithms...




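Open access (OA) type

BRONZE

A paper available on the site of the publisher or academic society, which permits access either temporarily through a special promotion or after a set period has elapsed.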
