그래디언트 기반 재복원공격을 활용한 배치상황에서의 연합학습 프라이버시 침해연구
Federated Learning Privacy Invasion Study in Batch Situation Using Gradient-Based Restoration Attack

情報保護學會論文誌 = Journal of the Korea Institute of Information Security and Cryptology, vol. 31, no. 5, pp. 987-999, 2021

장진혁 (Soongsil University), 류권상 (Soongsil University), 최대선 (Soongsil University)



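Federated learning has recently drawn attention because of privacy violations caused by data collection. Because federated learning does not require clients to hand over their training data, it has been regarded as safe from privacy violations. Accordingly, research is under way on applications that use distributed devices and data efficiently. However, as research has progressed on reconstruction attacks that recover training data from the gradients transmitted during federated learning, federated learning can no longer be considered safe. This paper examines, both numerically and visually, how well data reconstruction attacks work under various data conditions. The conditions range from the case where only a single data sample exists to the case where several samples are distributed within a class, and reconstruction quality is measured on MNIST data using MSE, LOSS, PSNR, and SSIM as evaluation metrics. The results show that as the number of classes and samples grows, MSE and LOSS increase while PSNR and SSIM decrease, so reconstruction performance degrades; even so, a few reconstructed images are enough to cause a privacy violation.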
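The claim that gradients can reveal training data can be illustrated with a very small example. The sketch below is not the attack evaluated in the paper. It is a simpler analytic illustration, assuming a single training sample and a hypothetical one-layer softmax classifier with a bias term, where the input can be read directly off the transmitted gradients. All names and values (`client_gradients`, `reconstruct_input`, the toy weights) are assumptions made for illustration.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def client_gradients(W, b, x, label):
    """Gradients a client would send for one sample (hypothetical toy model).

    Model: logits z = W x + b, loss = softmax cross-entropy.
    """
    z = [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]
    p = softmax(z)
    # dL/dz_i = p_i - 1[i == label]
    delta = [pi - (1.0 if i == label else 0.0) for i, pi in enumerate(p)]
    grad_W = [[d * xj for xj in x] for d in delta]  # dL/dW = delta * x^T
    grad_b = delta                                   # dL/db = delta
    return grad_W, grad_b


def reconstruct_input(grad_W, grad_b):
    """Recover x from the gradients alone: x_j = grad_W[i][j] / grad_b[i]."""
    # Use the row with the largest |grad_b| to avoid dividing by a tiny value.
    i = max(range(len(grad_b)), key=lambda k: abs(grad_b[k]))
    return [g / grad_b[i] for g in grad_W[i]]


# Toy "image" with three pixels and a two-class model (all values made up).
W = [[0.1, -0.2, 0.3], [0.0, 0.4, -0.1]]
b = [0.05, -0.05]
x_private = [0.2, 0.7, 0.0]

grad_W, grad_b = client_gradients(W, b, x_private, label=1)
x_recovered = reconstruct_input(grad_W, grad_b)
print(x_recovered)
```

In this toy setting the server never sees `x_private`, yet the ratio of weight gradient to bias gradient reproduces it up to floating-point error. When gradients are averaged over a batch, the same ratio yields only a weighted combination of the batch inputs, which is consistent with the abstract's observation that reconstruction becomes harder as the amount of data grows.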



Recently, federated learning has become an issue because of privacy invasions caused by data collection. Federated learning is considered safe from privacy violations because training data does not need to be collected on a server. As a result, studies on application methods for utilizing distributed...
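Two of the evaluation metrics named in the abstract, MSE and PSNR, have standard definitions that are easy to state in code. The following is a minimal sketch, assuming images are flattened into lists of pixel values scaled to [0, 1]; the function names and the `max_value` parameter are illustrative choices, not taken from the paper.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two images given as flat pixel lists."""
    if not original or len(original) != len(reconstructed):
        raise ValueError("images must be non-empty and the same size")
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def psnr(original, reconstructed, max_value=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    err = mse(original, reconstructed)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


# Made-up four-pixel example: every pixel is off by 0.1.
original = [0.0, 0.5, 1.0, 0.5]
reconstructed = [0.1, 0.4, 0.9, 0.6]
print(mse(original, reconstructed), psnr(original, reconstructed))
```

Lower MSE and higher PSNR both indicate a reconstruction closer to the original, which matches the direction of the trends reported in the abstract. Here a uniform error of 0.1 per pixel gives an MSE of about 0.01 and a PSNR of about 20 dB.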



