[Paper] 밝기 변화에 강인한 적대적 음영 생성 및 훈련 글자 인식 알고리즘

서민석; 김대한; 최동걸

doi:10.7746/jkros.2021.16.3.276

Adversarial Shade Generation and Training Text Recognition Algorithm that is Robust to Text in Brightness

Journal of Korea Robotics Society (로봇학회논문지), vol. 16, no. 3, pp. 276-282, 2021

서민석 (Department of Information and Communication Engineering, Hanbat National University), 김대한 (Department of Information and Communication Engineering, Hanbat National University), 최동걸 (Department of Information and Communication Engineering, Hanbat National University)

Abstract

Systems that recognize text in natural scenes are used in a variety of industries. However, brightness changes that occur in natural settings, such as light reflection and shadow, significantly degrade text recognition performance. To address this problem, we propose an adversarial shade generation and training algorithm that is robust to such changes. The algorithm divides the entire image into a grid of 9 cells and adjusts the brightness of each cell with 4 trainable parameters. Training is then conducted adversarially between the text recognition model and the shaded image generator. As training progresses, the generator produces increasingly difficult combinations of shaded cells. Training in this curriculum-learning manner improved performance by more than 3% on the ICDAR2015 public benchmark dataset, and we confirmed that performance also improved on our own Android application text recognition dataset.
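The grid-based shading described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. The abstract does not say what the 4 parameters per cell represent, so the code assumes they are brightness gains at the four corners of each cell, blended bilinearly. The names `apply_shade` and `adversarial_step`, the gain range [0.2, 1.8], and the random-search update (a stand-in for the gradient-based generator training the paper presumably uses) are all hypothetical.

```python
import random

GRID = 3  # 3 x 3 = 9 cells, matching the 9 grids in the abstract


def apply_shade(image, gains):
    """Return a shaded copy of a grayscale image (rows of floats in [0, 1]).

    gains[r][c] holds 4 values for cell (r, c). ASSUMPTION: they are the
    gains at the cell's top-left, top-right, bottom-left and bottom-right
    corners; the abstract does not define the 4 parameters.
    """
    height, width = len(image), len(image[0])
    shaded = []
    for y, row in enumerate(image):
        r = y * GRID // height            # grid row containing this pixel
        v = y * GRID / height - r         # vertical position inside the cell
        new_row = []
        for x, pixel in enumerate(row):
            c = x * GRID // width         # grid column containing this pixel
            u = x * GRID / width - c      # horizontal position inside the cell
            tl, tr, bl, br = gains[r][c]
            # Bilinear blend of the four corner gains.
            gain = (tl * (1 - u) * (1 - v) + tr * u * (1 - v)
                    + bl * (1 - u) * v + br * u * v)
            new_row.append(min(1.0, max(0.0, pixel * gain)))
        shaded.append(new_row)
    return shaded


def adversarial_step(image, gains, loss_fn, trials=20, scale=0.2, rng=None):
    """Search for gains that increase the recognizer's loss.

    loss_fn is a hypothetical callback mapping a shaded image to the
    recognizer's loss. Random search replaces gradient ascent here only
    to keep the sketch dependency-free.
    """
    rng = rng or random.Random()
    best, best_loss = gains, loss_fn(apply_shade(image, gains))
    for _ in range(trials):
        candidate = [
            [tuple(min(1.8, max(0.2, g + rng.uniform(-scale, scale)))
                   for g in cell)
             for cell in row]
            for row in best
        ]
        candidate_loss = loss_fn(apply_shade(image, candidate))
        if candidate_loss > best_loss:    # keep only harder shade patterns
            best, best_loss = candidate, candidate_loss
    return best, best_loss
```

In a full training loop under these assumptions, the recognizer would then be updated to lower its loss on the image shaded with the returned gains, so the two sides alternate and the shade patterns grow harder over time.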

