Systematic Research on Privacy-Preserving Distributed Machine Learning
(Korean title: 프라이버시를 보호하는 분산 기계 학습 연구 동향)

The Transactions of the Korea Information Processing Society, v.13 no.2, 2024, pp.76-90

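이민섭 (Graduate School of Information Security, Korea University), 신영아 (Graduate School of Information Security, Korea University), 천지영 (Department of Big Data and Information Security, Seoul Cyber University)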

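Abstract (translated from Korean)

Artificial intelligence is regarded as highly promising in fields such as smart cities, autonomous driving, and healthcare, but the use of models is limited by the risk of exposing data subjects' personal and sensitive information. In response, the concept of distributed machine learning emerged: instead of collecting data on a central server for training, each participant first trains on its own dataset, and a global model is then trained from the results. However, distributed machine learning still faces data privacy threats while participants train collaboratively. This study organizes research on privacy protection in distributed machine learning according to the classification criteria proposed to date, such as the presence or absence of a server, the distribution of the training datasets, and performance differences among participants, and uses that analysis to identify recent research trends. In particular, it focuses on three representative distributed machine learning techniques (horizontal federated learning, vertical federated learning, and swarm learning), examines the privacy-preserving techniques they employ, and then explores directions for future research.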

Abstract

Although artificial intelligence (AI) can be utilized in various domains such as smart city, healthcare, it is limited due to concerns about the exposure of personal and sensitive information. In response, the concept of distributed machine learning has emerged, wherein learning occurs locally befor...
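The abstract describes participants training locally before a global model is formed. As a rough illustration only, the sketch below uses weighted averaging of local weights in the style of federated averaging; this choice is an assumption, since the abstract does not name a specific aggregation rule. The function names (`local_update`, `federated_average`, `run_rounds`, `make_data`), the one-parameter model, and the synthetic data are all hypothetical, and no privacy-preserving mechanism is included.

```python
import random

def local_update(w, data, lr=0.1, epochs=20):
    """Run a few gradient-descent steps on one participant's private data.

    Hypothetical model: y ~ w * x with a single scalar weight and
    mean squared-error loss. The raw data never leaves this function.
    """
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_average(local_weights, sizes):
    """Average local weights, weighting each by its dataset size."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(local_weights, sizes)) / total

def run_rounds(datasets, rounds=10, w0=0.0):
    """Alternate local training and aggregation for a fixed number of rounds."""
    w = w0
    for _ in range(rounds):
        # Each participant starts from the current global weight.
        local = [local_update(w, d) for d in datasets]
        # Only the trained weights are shared and combined.
        w = federated_average(local, [len(d) for d in datasets])
    return w

def make_data(n, true_w, rng):
    """Synthetic (x, y) pairs with small Gaussian noise (illustration only)."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, true_w * x + rng.gauss(0, 0.01)) for x in xs]

rng = random.Random(0)
# Three participants with different amounts of local data.
datasets = [make_data(n, 3.0, rng) for n in (20, 50, 30)]
global_w = run_rounds(datasets)
```

In this sketch only the scalar weights cross the participant boundary, which mirrors the motivation in the abstract; the abstract also notes that privacy threats remain during collaborative training, so sharing weights alone should not be read as sufficient protection.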

