[Paper] A Study on the System for AI Service Production

Yong-Geun Hong

doi:10.3745/ktccs.2022.11.10.323


KIPS Transactions on Computer and Communication Systems, Vol.11, No.10, pp.323-332, 2022

Yong-Geun Hong (Department of AI Convergence, Daejeon University)

Abstract

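As a wide range of services built on AI technology are developed, operating AI services in production is drawing considerable attention. Recently, AI technology has come to be regarded as one kind of ICT service, and much research targets general-purpose AI service operation. This paper focuses on the final stage of the typical machine learning development procedure, the deployment and operation of machine learning models, and reports research results from a system perspective for AI service operation. Three different Ubuntu systems were built, and on them an object detection service was run through Tensorflow serving using data from the 2017 validation COCO dataset, with combinations of different AI models (RFCN, SSD-Mobilenet) and different communication methods (gRPC, REST). The experiments were split into a part that requests the AI service and a part that performs it. The experiments showed that the type of AI model affects AI service inference time more than the communication method of the AI machine does, and that for an object detection AI service, inference time depends more on the number and complexity of the objects in an image than on the file size of the image. They also confirmed that when the AI service is performed remotely rather than locally, inference takes longer than local execution even on a high-performance machine. These results are expected to support system design suited to service goals, AI model development, and efficient AI service operation.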

Abstract (English)

As various services using AI technology are developed, much attention is being paid to AI service production. Recently, AI technology has come to be regarded as one of the ICT services, and a lot of research is being conducted on general-purpose AI service production. In this paper, I describe research results from a system perspective for AI service production, focusing on the deployment and operation of machine learning models, which is the final step of the general machine learning development procedure. Three different Ubuntu systems were built, and experiments were conducted on them using data from the 2017 validation COCO dataset, with combinations of different AI models (RFCN, SSD-Mobilenet) and different communication methods (gRPC, REST), to request and perform AI services through Tensorflow serving. The experiments showed that the type of AI model has a greater influence on AI service inference time than the communication method of the AI machine, and that, for an object detection AI service, inference time is affected more by the number and complexity of objects in the image than by the file size of the image. In addition, it was confirmed that when the AI service is performed remotely rather than locally, inference takes longer than when it is performed locally, even on a high-performance machine. These results are expected to enable system design suited to service goals, AI model development, and efficient AI service production.
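The request side of such an experiment can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes a Tensorflow serving instance exposing its REST API, and the host, port, model name, and input layout are hypothetical placeholders that depend on how the model was exported and served. The `:predict` URL shape and the `{"instances": [...]}` row format follow the Tensorflow serving REST API; everything else (function names, the choice of wall-clock timing) is an assumption made for illustration.

```python
import json
import statistics
import time
import urllib.request
from typing import Callable, Dict, List

# Assumption: 8501 is the REST port conventionally exposed by the
# TensorFlow Serving Docker image; the paper does not state its port.
REST_PORT = 8501


def predict_url(host: str, model_name: str, port: int = REST_PORT) -> str:
    """Build the TensorFlow Serving REST predict endpoint for a model."""
    return f"http://{host}:{port}/v1/models/{model_name}:predict"


def predict_body(image_pixels: List) -> bytes:
    """Wrap one decoded image (nested lists of pixel values) in a
    row-format request body. The expected input layout depends on the
    exported model signature, so this shape is an assumption."""
    return json.dumps({"instances": [image_pixels]}).encode("utf-8")


def rest_sender(url: str, body: bytes) -> Callable[[], bytes]:
    """Return a zero-argument callable that POSTs the body to the URL."""
    def send() -> bytes:
        request = urllib.request.Request(
            url, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request) as response:
            return response.read()
    return send


def measure_latency(send: Callable[[], object], repeats: int = 10) -> Dict[str, float]:
    """Call `send` repeatedly and summarize wall-clock latency in ms.
    Measured on the requesting side, so it includes network time."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        send()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
        "n": float(len(samples)),
    }
```

With a server running, a call such as `measure_latency(rest_sender(predict_url("localhost", "rfcn"), predict_body(pixels)))` would give one latency summary per model and machine; because `measure_latency` accepts any callable, a gRPC client call could be timed the same way for comparison.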

Tables/Figures (18)

Fig. 1. General Procedure of AI Service Development [6]
Fig. 2. Architecture of Tensorflow Serving [2]
Fig. 3. System for AI Service Production
Fig. 4. Code for Execution of Tensorflow Serving
Fig. 5. Original Images for AI Service Request
Fig. 6. Result of Object Detection with RFCN
Fig. 7. Result of Object Detection with SSD-Mobilenet
Fig. 8. Latency of Object Detection in Client Machine
Fig. 9. Latency of Object Detection in Cloud Server
Fig. 10. Latency of Object Detection in Edge Device
Table 1. Average Latency Time of Each Machine
Fig. 11. AI Service Execution in Client Machine
Fig. 12. AI Service Execution in Cloud Server
Fig. 13. AI Service Execution in Edge Device
Fig. 14. Latency of Object Detection in Client Machine, Cloud Server, Edge Device of Image-2
Fig. 15. Latency of Object Detection in Client Machine, Cloud Server, Edge Device of Image-7
Table 2. Average Latency Time of Each Machine of Image-2 in Remote
Table 3. Average Latency Time of Each Machine of Image-7 in Remote

References (35)

T. Brown et al., "Language models are few-shot learners," Advances in Neural Information Processing Systems, Vol.33, pp.1877-1901, 2020.
Tensorflow serving [Internet], https://www.tensorflow.org/tfx/guide/serving.
TorchServe [Internet], https://pytorch.org/serve/.
Nvidia Triton Server [Internet], https://developer.nvidia.com/nvidia-triton-inference-server.
Intel OpenVINO [Internet], https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/overview.html.
ITU-T Y.3531, "Cloud computing - Functional requirements for machine learning as a service," 2020.
Sungpil Shin, "MLaaS (Machine Learning as a Service) Market Trend and Standards for functional requirement," TTA ICT Standard Weekly 1065, 2022.
Flask [Internet], https://flask.palletsprojects.com/en/2.0.x.
Django [Internet], https://www.djangoproject.com/.
FastAPI [Internet], https://fastapi.tiangolo.com/.
H. M. Park and T. H. Hwang, "Changes and trends of Edge computing technology," KICS Information and Communication Magazine, Vol.36, No.2, pp.41-47, 2019.
W. Yu, F. Liang, X. He, W. Grant Hatcher, C. Lu, J. Lin and X. Yang, "A survey on the edge computing for the Internet of Things," IEEE Access, Vol.6, pp.6900-6919, 2017.
S. Maheshwari, D. Raychaudhuri, I. Seskar, and F. Bronzino, "Scalability and performance evaluation of edge cloud systems for latency constrained applications," In 2018 IEEE/ACM Symposium on Edge Computing (SEC), pp. 286-299. IEEE, 2018.
K. H. Kim, Y. G. Hong, and C. S. Pyo, "Standard technology and Trend of Edge computing for IoT and AI," KICS Information and Communication Magazine.
E. H. Kim, K. Ha Lee, and W. Kyung Sung, "Technology trends of deep-learning model lightweight," Communication of KIISE, Vol.38, No.8, pp.18-29, 2020.
F. Wang, M. Zhang, X. Wang, X. Ma, and J. Liu, "Deep learning for edge computing applications: A state-of-the-art survey," IEEE Access, Vol.8, pp.58322-58336, 2020.
Y. Jun Choi and H. S. Eom, "Deep learning model compression for embedded system," KIISE KCC 2019, pp.1044-1046, 2019.
A. G. Howard et al., "Mobilenets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.
M. Algabri, H. Mathkour, M. Abdelkader Bencherif, M. Alsulaiman, and M. Amine Mekhtiche, "Towards deep object detection techniques for phoneme recognition," IEEE Access, Vol.8, pp.54663-54680, 2020.
R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.580-587, 2014.
R. Girshick, "Fast r-cnn," In Proceedings of the IEEE International Conference on Computer Vision, pp.1440-1448. 2015.
S. Ren, K. He, R. Girshick, and J. Sun, "Faster r-cnn: Towards real-time object detection with region proposal networks," Advances in Neural Information Processing Systems, Vol.28, 2015.
J. Dai, Y. Li, K. He, and J. Sun, "R-fcn: Object detection via region-based fully convolutional networks," Advances in Neural Information Processing Systems, Vol.29, 2016.
S. H. Park, H. S. Yoon, and K. R. Park, "Faster R-CNN and geometric transformation-based detection of driver's eyes using multiple near-infrared camera sensors," Sensors, Vol.19, No.1, pp.197, 2019.
K. Surya Vara Prasad, K. B. D'souza, and V. K. Bhargava, "A downscaled faster-RCNN framework for signal detection and time-frequency localization in wideband RF systems," IEEE Transactions on Wireless Communications, Vol.19, No.7, pp.4847-4862, 2020.
J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.779-788, 2016.
J. Redmon and A. Farhadi, "YOLO9000: Better, faster, stronger," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.7263-7271, 2017.
J. Redmon and A. Farhadi, "Yolov3: An incremental improvement," arXiv preprint arXiv:1804.02767, 2018.
W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C-Y Fu, and A. C. Berg, "Ssd: Single shot multibox detector," In European Conference on Computer Vision, pp.21-37. Springer, Cham, 2016.
T-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollar, "Focal loss for dense object detection," In Proceedings of the IEEE International Conference on Computer Vision, pp.2980-2988, 2017.
S. Zhang, L. Wen, X. Bian, Z. Lei, and S. Z. Li, "Single-shot refinement neural network for object detection," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.4203-4212, 2018.
L. Zhou, W. Min, D. Lin, Q. Han, and R. Liu, "Detecting motion blurred vehicle logo in IoV using filter-DeblurGAN and VL-YOLO," IEEE Transactions on Vehicular Technology, Vol.69, No.4, pp.3604-3614, 2020.
H. Zhang, L. Qin, J. Li, Y. Guo, Y. Zhou, J. Zhang, and Z. Xu, "Real-time detection method for small traffic signs based on Yolov3," IEEE Access, Vol.8, pp.64145-64156, 2020.
A. G. Howard et al., "Mobilenets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.
Intel AI Object Detection [Internet], https://github.com/IntelAI/models/blob/master/docs/object_detection/tensorflow_serving/Tutorial.md.
