[Paper] Text Region Segmentation from Web Images using Variance Maps (분산맵을 이용한 웹 이미지 텍스트 영역 추출)

I. S. Jung (정인숙); I. S. Oh (오일석)

doi:10.5392/jkca.2009.9.9.068

The Journal of the Korea Contents Association (한국콘텐츠학회논문지), Vol.9, No.9, pp.68-79, 2009

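I. S. Jung and I. S. Oh: Division of Electronics and Information Engineering (Computer Engineering), Chonbuk National University; Research Center for Advanced Image and Information Technology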

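Abstract (translated from the Korean)

A variance map exploits the fact that text regions show strong color or intensity variation relative to their surroundings. It is particularly suited to extracting text regions from Web images, whose resolution is often low or inconsistent because of frequent format conversion. However, earlier variance-map methods use a global variance map that applies a fixed mask only once over the whole input image, so extraction is unstable when the text is very small or very large, when the stroke color has a gradation effect, or when the angle, position, and color are complex. This paper proposes a method that robustly extracts text regions from Web images using two-level variance maps. A global and a local variance map are applied at the respective levels and are related hierarchically. To raise the recall of text region extraction, the first level applies global vertical and horizontal color variance maps, with a fixed mask size able to capture both sufficiently large and small characters, and extracts candidate text regions. The second level applies an intensity variance map with a new mask size to each connected-component region from the first level and extracts the final text regions. The second-level variance map removes much of the background that remains in the candidate regions found at the first level, which raises extraction precision. Applied to 400 Web images, the proposed method extracted text regions relatively stably even against complex backgrounds.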
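The first level described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the summing of variance over the three color channels, the mask size, and the threshold are all assumptions made for illustration, and plain Python lists stand in for an image library.

```python
from statistics import pvariance


def directional_variance(channel, mask, horizontal):
    """Variance inside a 1-D window of length `mask` centred on each pixel.

    `channel` is a 2-D list holding one colour channel. The window runs along
    the row when `horizontal` is True and along the column otherwise, and it
    is clipped at the image border.
    """
    h, w = len(channel), len(channel[0])
    half = mask // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if horizontal:
                window = channel[y][max(0, x - half):x + half + 1]
            else:
                window = [channel[r][x]
                          for r in range(max(0, y - half), min(h, y + half + 1))]
            out[y][x] = pvariance(window)
    return out


def color_variance_map(rgb, mask, horizontal):
    """Sum the directional variance over R, G and B (an assumed combination)."""
    h, w = len(rgb), len(rgb[0])
    total = [[0.0] * w for _ in range(h)]
    for c in range(3):
        channel = [[rgb[y][x][c] for x in range(w)] for y in range(h)]
        vmap = directional_variance(channel, mask, horizontal)
        for y in range(h):
            for x in range(w):
                total[y][x] += vmap[y][x]
    return total


def candidate_text_mask(rgb, mask=5, threshold=100.0):
    """Mark pixels whose horizontal or vertical colour variance is high.

    `mask` and `threshold` are hypothetical values, not taken from the paper.
    """
    hv = color_variance_map(rgb, mask, horizontal=True)
    vv = color_variance_map(rgb, mask, horizontal=False)
    return [[hv[y][x] > threshold or vv[y][x] > threshold
             for x in range(len(rgb[0]))] for y in range(len(rgb))]
```

Taking the union of the horizontal and vertical maps is one way to favor recall at this stage, since a stroke edge only needs to stand out in one direction to be kept as a candidate.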

Abstract

A variance map can be used to detect text in images and distinguish it from the background. However, previous variance maps work at a single level and are limited in handling variations in text size, slant, orientation, translation, and color. We present a method for robustly segmenting text regions in complex color Web images using two-level variance maps. The two-level variance maps work hierarchically. The first level finds the approximate locations of text regions using global horizontal and vertical color variances with specific mask sizes. The second level then segments each text region using intensity variance with a local mask size, which is determined adaptively. In the second pass, the background tends to disappear from each region, so the segmentation becomes more accurate. Highly promising experimental results have established the effectiveness of our approach.
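The second level can be sketched in the same spirit. The abstract does not state how the local mask size is chosen, so the rule below (an odd size of roughly a quarter of the region height, at least 3) is purely hypothetical, as are the function names and the threshold. The input `candidate` is assumed to be a boolean mask produced by some first-level step.

```python
from collections import deque
from statistics import pvariance


def connected_components(mask):
    """Inclusive bounding boxes (top, left, bottom, right) of 4-connected True regions."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            queue = deque([(sy, sx)])
            top, left, bottom, right = sy, sx, sy, sx
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def refine_regions(gray, candidate, threshold=100.0):
    """Keep candidate pixels whose local intensity variance stays high."""
    h, w = len(gray), len(gray[0])
    final = [[False] * w for _ in range(h)]
    for top, left, bottom, right in connected_components(candidate):
        height = bottom - top + 1
        # Hypothetical adaptive rule: odd mask, about a quarter of the region height.
        size = max(3, height // 4) | 1
        half = size // 2
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                if not candidate[y][x]:
                    continue
                # The window is clipped to the region's bounding box.
                window = [gray[r][c]
                          for r in range(max(top, y - half), min(bottom, y + half) + 1)
                          for c in range(max(left, x - half), min(right, x + half) + 1)]
                if pvariance(window) > threshold:
                    final[y][x] = True
    return final
```

Clipping each window to the bounding box of its region keeps the variance local, so flat background that was swept into a candidate region at the first level drops below the threshold and is discarded.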

AI summary of the main text

* The sentences below were identified automatically by AI and may include unsuitable ones; use them with care.

Problem definition

The performance at that stage crucially affects the degree of success of subsequent recognition. This paper presents a new approach to the segmentation, especially in complex Web images (e.g., those in [Figure 1]).

Proposed method

In [12], Murguia showed that a document image can be segmented into regions of text and regions of graphics and/or pictures using the gray-level spatial variance of low-resolution images. The method was designed to work with free-format documents, text on backgrounds other than white, and skew greater than 10 degrees. Moreover, it requires less computation than segmentation methods that use the other textures described in other papers.
... images. The proposed method is less sensitive to user parameters and can deal with segmentations where shadows, nonuniform character sizes, low resolution, and skewing occur. After the local approach, our method demonstrates superior performance on Web images as judged by visual criteria.

Future work

Our method has the additional advantage that it can be applied directly to the line segment. Further research will focus on developing the text or non-text classifier and character segmentation.

References (18)

[1] H. K. Kim, "Efficient Automatic Text Location Method and Content-Based Indexing and Structuring of Video Database," Journal of Visual Communication and Image Representation, Vol.7, No.4, pp.336-344, 1996.
[2] R. Lienhart and F. Stuber, "Automatic Text Recognition in Digital Videos," Proceedings of SPIE, Vol.2666, pp.180-188, 1996.
[3] Y. Zhong, K. Karu, and A. K. Jain, "Locating Text in Complex Color Images," Pattern Recognition, Vol.28, No.10, pp.1523-1535, 1995.
[4] R. Lienhart and W. Effelsberg, "Automatic Text Segmentation and Text Recognition for Video Indexing," Technical Report TR-98-009, Praktische Informatik IV, University of Mannheim, 2000.
[5] K. C. Jung and J. H. Han, "Hybrid Approach to Efficient Text Extraction in Complex Color Images," Pattern Recognition Letters, Vol.25, pp.679-699, 2004.
[6] K. C. Jung, K. I. Kim, and A. K. Jain, "Text Information Extraction in Images and Video: A Survey," Pattern Recognition, Vol.37, No.5, pp.977-997, 2004.
[7] J. Zhou and A. D. Lopresti, "Extracting Text from WWW Images," Proceedings of the 4th International Conference on Document Analysis and Recognition (ICDAR'97), Vol.1, pp.248-252, 1997.
[8] A. Antonacopoulos and F. Delporte, "Automated Interpretation of Visual Representations: Extracting Textual Information from WWW Images," Visual Representations and Interpretations, R. Paton and I. Neilson (eds.), Springer, London, 1999.
[9] J. Zhou, A. D. Lopresti, and T. Tasdizen, "Finding Text in Color Images," Proceedings of the IS&T/SPIE Symposium on Electronic Imaging, San Jose, California, Vol.3305, pp.130-140, 1998.
[10] A. D. Lopresti and J. Zhou, "Locating and Recognizing Text in WWW Images," Information Retrieval, Vol.2, pp.177-206, 2000.
[11] A. K. Jain and Y. Bin, "Automatic Text Location in Images and Video Frames," Pattern Recognition, Vol.2, pp.1497-1499, 1998.
[12] M. I. C. Murguia, "Document Segmentation Using Texture Variance and Low Resolution Images," Image Analysis and Interpretation, 1998 IEEE Southwest Symposium on, pp.164-167, 1998.
[13] D. Karatzas and A. Antonacopoulos, "Colour Text Segmentation in Web Images Based on Human Perception," Image and Vision Computing, Elsevier, Vol.25, pp.564-577, 2007.
[14] Y. J. Song, K. C. Kim, Y. W. Choi, H. R. Byun, S. H. Kim, S. Y. Chi, D. K. Jang, and Y. K. Chung, "Text Region Extraction and Text Segmentation on Camera Captured Document Style Images," Proceedings of the Eighth International Conference on Document Analysis and Recognition (ICDAR'05), Vol.1, pp.172-176, 2005.
[15] I. S. Jung, D. S. Ham, and I. S. Oh, "Empirical Evaluation of Color Variance Method for Text Retrieval from Web Images," Proceedings of the 19th Workshop on Image Processing and Image Understanding (IPIU'08), pp.36-41, 2008.
[16] Y. Li, Y. Zheng, D. Doermann, and S. Jaeger, "Script-Independent Text Line Segmentation in Freestyle Handwritten Documents," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol.30, No.8, pp.1313-1329, 2008.
[17] M. Makridis, N. Nikolaou, and B. Gatos, "An Efficient Word Segmentation Technique for Historical and Degraded Machine-Printed Documents," Proceedings of the Ninth International Conference on Document Analysis and Recognition (ICDAR'07), Vol.1, pp.178-182, 2007.
[18] I. Nwogu and G. H. Kim, "Word Separation of Unconstrained Handwritten Text Lines in PCR Forms," Proceedings of the Eighth International Conference on Document Analysis and Recognition (ICDAR'05), Vol.2, pp.715-719, 2005.
