Trends in Lightweight Neural Network Algorithms and Hardware Acceleration Technologies for Transformer-based Deep Neural Networks

김혜지; 여준기

doi:10.22648/etri.2023.j.380502

전자통신동향분석 (Electronics and Telecommunications Trends), vol. 38, no. 5, 2023, pp. 12-22

김혜지 (Hyperscale AI Semiconductor Research Laboratory), 여준기 (Hyperscale AI Semiconductor Research Laboratory)

Abstract

Neural network development is moving toward transformer architectures built around attention modules. Accordingly, research on lightweight neural network algorithms and hardware acceleration is actively being extended from conventional convolutional neural networks to transformer-based networks. We survey state-of-the-art research on lightweight neural network algorithms and hardware architectures that reduce memory usage and accelerate both inference and training. To describe these trends, we review recent studies on token pruning, quantization, and architecture tuning for the vision transformer. In addition, we present a hardware architecture that incorporates lightweight algorithms into artificial intelligence processors to accelerate processing.
