As the increasing expectations of a practical AI (Artificial Intelligence) service makes AI algorithms more complicated, an efficient processor to process AI algorithms is required. To meet this requirement, processors optimized for parallel processing, such as GPUs (Graphics Processing Units), have been widely employed. However, the GPU has a generalized structure for various applications, so it is not optimized for the AI algorithm. Therefore, research on the development of AI processors optimized for AI algorithm processing has been actively conducted. This paper briefly introduces an AI processor especially for inference acceleration, developed by the Electronics and Telecommunications Research Institute, South Korea., and other global vendors for mobile and server platforms. However, the GPU has a generalized structure for various applications, so it is not optimized for the AI algorithm. Therefore, research on the development of AI processors optimized for AI algorithm processing has been actively conducted.
S. Windsor, "Snapdragon 865 vs Kirin 990 5G vs Exynos 990 (Exynos 9830) vs MediaTek Dimensity 1000 (MT6889): which one is the best 5G processor?" www.gearbest.com, Dec. 10, 2019.
H. Liao et al., "DaVinci: A Scalable Architecture for Neural Network Computing," in Proc. Hot Chips 31 Symp., Cupertino, CA, USA, Aug. 18-20, 2019, doi: 10.1109/HOTCHIPS.2019.8875654.
J. Song et al., "7.1 An 11.5 TOPS/W 1024-MAC butterfly structure dual-core sparsity-aware neural processing unit in 8nm flagship mobile SoC," in Proc. IEEE Int. Solid-State Circuits Conf.-(ISSCC), San Francisco, CA, USA, Feb. 17-21, 2019, doi: 10.1109/ISSCC.2019.8662476.
H. Orr Danon, "Introducing Hailo-8: The Most Efficient Deep Learning Processor for Edge Devices," 2019 Embedded Vision Summit, May 2019.
E. Lindholm et al., "NVIDIA Tesla: A Unified Graphics and Computing Architecture," IEEE Micro, vol. 28, no. 2, 2008, pp. 39-55.
NVIDIA, "Nvidia Tesla V100 GPU Architecture," WP-08608-001_v1.1, 2017, https://images.nvidia.com/content/voltaarchitecture/pdf/volta-architecture-whitepaper.pdf.
D. Patterson, "Domain Specific Architectures for Deep Neural Networks: Three Generations of Tensor Processing Units (TPUs)," Allen School Distinguished Lecture: David Patterson (UC Berkeley/Google)
N. P. Jouppi et al., "In-Datacenter Performance Analysis of a Tensor Processing Unit," in Proc. Annu. Int. Symp. Comput. Architect., Toronto, Canada, June 2017 doi: 10.1145/3079856.3080246.
Y. Kwon et al., "Function-Safe Vehicular AI Processor with Nano Core-In-Memory Architecture," in Proc. Annu. Int. Conf. Artif. Intell. Circuits Syst., Hsinchu, Taiwan, Mar. 2019, doi: 10.1109/AICAS.2019.8771603 .