Global geographic information retrieval, validation, and normalization
IPC classification information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC 7th edition)
G06K-009/68
G06K-009/00
G06F-017/30
G06K-009/18
G06K-009/03
G06K-009/48
G06Q-010/10
H04N-001/40
Application number
US-0146848 (2016-05-04)
Registration number
US-9767354 (2017-09-19)
Inventor / Address
Thompson, Stephen Michael
Amtrup, Jan W.
Macciola, Anthony
Applicant / Address
KOFAX, INC.
Agent / Address
Zilka-Kotab, P.C.
Citation information
Times cited: 7
Patents cited: 341
Abstract
According to one embodiment, a computer-implemented method includes: capturing an image of a document using a camera of a mobile device; performing optical character recognition (OCR) on the image of the document; extracting an identifier of the document from the image based at least in part on the OCR; comparing the identifier with content from one or more reference data sources, wherein the content from the one or more reference data sources comprises global address information; and determining whether the identifier is valid based at least in part on the comparison. The method may optionally include normalizing the extracted identifier, retrieving additional geographic information, correcting OCR errors, etc. based on comparing extracted information with reference content. Corresponding systems and computer program products are also disclosed.
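The normalize-then-compare idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abbreviation table, the function names `normalize` and `is_valid`, and the sample addresses are all hypothetical, and a real system would draw its rules and reference content from locality-specific sources.

```python
import re

# Hypothetical abbreviation table (assumption); real rules would be
# selected per locality, and "st" can also mean "saint".
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road", "apt": "apartment"}


def normalize(text):
    """Lowercase, drop punctuation, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


# Made-up reference content, stored in the same single convention.
REFERENCE = {normalize(a) for a in ["12 Main St., Springfield", "7 Oak Ave, Riverton"]}


def is_valid(identifier, reference=REFERENCE):
    """Return True if the normalized identifier appears in the reference set."""
    return normalize(identifier) in reference
```

Because both the extracted identifier and the reference content pass through the same `normalize` function, differences in case, punctuation, and abbreviation do not cause a mismatch.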
Representative claims
1. A computer-implemented method, comprising: capturing an image of a document using a camera of a mobile device; performing optical character recognition (OCR) on the image of the document; extracting an identifier of the document from the image based at least in part on the OCR; comparing the identifier with content from one or more reference data sources, wherein the content from the one or more reference data sources comprises global address information; and wherein the content from the one or more reference data sources is derived from geographic information organized in one or more of a proprietary address database and an open source address database; and wherein deriving the content from the geographic information comprises: obtaining the geographic information from one or more of the proprietary address database and an open source address database; and parsing the geographic information according to a set of predefined heuristic rules, wherein the set of predefined heuristic rules are configured to normalize the global address information obtained from the one or more sources according to a single convention for representing address information; and determining whether the identifier is valid based at least in part on the comparison.
2. The method as recited in claim 1, wherein the identifier consists of characters selected from a predefined alphabet, wherein the predefined alphabet consists of one or more of numerals, alphabetic characters, and symbols.
3. The method as recited in claim 1, wherein the identifier comprises a partial or complete address.
4. The method as recited in claim 1, wherein the identifier comprises one or more of: a street name, a street number, a block number, a unit number, a city name, a county name, a municipality name, a state name, a state abbreviation, a country name, a country abbreviation, and a ZIP code.
5. The method as recited in claim 1, the comparing comprising fuzzy matching the identifier with the content from the one or more data sources.
6. The method as recited in claim 1, wherein the identifier is validated based at least in part on determining a fuzzy match exists between the identifier and at least a portion of the global address information, wherein the fuzzy match is characterized by no more than two character mismatches between the identifier and at least the portion of the global address information.
7. The method as recited in claim 1, comprising locating the identifier within the image based on a connected components analysis.
8. The method as recited in claim 1, wherein the OCR is performed only on a portion of the image determined to depict the identifier.
9. The method as recited in claim 1, comprising determining a locality associated with the identifier; and wherein the set of predefined heuristic rules are selected based on the locality determined to be associated with the extracted identifier.
10. The method as recited in claim 1, wherein deriving the content from the geographic information comprises populating the one or more data sources with the content, wherein the content consists of geographic information parsed using the set of predefined heuristic rules.
11. The method as recited in claim 1, wherein deriving the content from the geographic information comprises normalizing the geographic information to expand one or more abbreviations present in the geographic information; and wherein the content excludes abbreviated geographic information.
12. The method as recited in claim 1, comprising normalizing the extracted identifier prior to comparing the identifier with content from one or more reference data sources, wherein the normalizing is performed according to one or more predefined business rules corresponding to a particular locality.
13. The method as recited in claim 1, comprising determining a locality corresponding to the extracted identifier, and retrieving additional geographic information associated with a location corresponding to the identifier based at least in part on the locality.
14. The method as recited in claim 13, wherein determining the locality is based at least in part on one or more of a content and a format of the identifier.
15. The method as recited in claim 13, wherein retrieving the additional geographic information is based at least in part on latitude and longitude coordinates corresponding to the identifier.
16. The method as recited in claim 1, comprising at least one of: detecting one or more OCR errors based at least in part on textual information from a complementary document; detecting one or more OCR errors using one or more predefined business rules; detecting one or more OCR errors based at least in part on textual information from the complementary document and one or more of the predefined business rules; correcting at least one detected OCR error using one or more of the predefined business rules; correcting at least one detected OCR error using textual information from the complementary document; correcting at least one detected OCR error using textual information from the complementary document and one or more of the predefined business rules; normalizing data from a complementary document using at least one of the predefined business rules; normalizing data from the document using at least one of the predefined business rules; and normalizing data from the document using textual information from the complementary document and at least one of the predefined business rules.
17. A computer program product, comprising a non-transitory computer readable storage medium having stored/encoded thereon computer readable program instructions configured to cause a processor, upon execution thereof, to: receive an image of a document; perform optical character recognition (OCR) on the image of the document; extract an identifier of the document from the image based at least in part on the OCR; compare the identifier with content from one or more reference data sources, wherein the content from the one or more reference data sources comprises global address information; and wherein the content from the one or more reference data sources is derived from geographic information organized in one or more of a proprietary address database and an open source address database; and wherein deriving the content from the geographic information comprises: obtaining the geographic information from one or more of the proprietary address database and an open source address database; and parsing the geographic information according to a set of predefined heuristic rules, wherein the set of predefined heuristic rules are configured to normalize the global address information obtained from the one or more sources according to a single convention for representing address information; and determine whether the identifier is valid based at least in part on the comparison.
18. A computer-implemented method, comprising: capturing an image using a camera of a mobile device; classifying the image as an image of a document, wherein the classifying comprises: generating a first feature vector representative of the document, based on analyzing the image; and comparing the first feature vector to a plurality of reference feature matrices; performing optical character recognition (OCR) on the image of the document; extracting an identifier of the document from the image based at least in part on the OCR; comparing the identifier with content from one or more reference data sources, wherein the content from the one or more reference data sources comprises global address information; and wherein the content from the one or more reference data sources is derived from geographic information organized in one or more of a proprietary address database and an open source address database; and wherein deriving the content from the geographic information comprises: obtaining the geographic information from one or more of the proprietary address database and an open source address database; and parsing the geographic information according to a set of predefined heuristic rules, wherein the set of predefined heuristic rules are configured to normalize the global address information obtained from the one or more sources according to a single convention for representing address information; determining whether the identifier is valid based at least in part on the comparison; associating the image of the document with metadata descriptive of one or more of the document and information relating to the document; and storing the image of the document and the associated metadata to a memory of the mobile device.
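The fuzzy-match criterion of claim 6 (no more than two character mismatches) can be sketched as follows. The claims do not say how mismatches are counted, so this example assumes Levenshtein edit distance as one plausible reading. The function names `edit_distance` and `fuzzy_validate` and the sample strings are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def fuzzy_validate(identifier, candidates, max_mismatches=2):
    """Return the closest candidate within the mismatch budget, else None."""
    best = min(candidates, key=lambda c: edit_distance(identifier, c), default=None)
    if best is not None and edit_distance(identifier, best) <= max_mismatches:
        return best
    return None
```

For instance, an OCR result such as `"12 ma1n streef"` differs from `"12 main street"` by two substitutions, so under this assumption it would be validated and could be corrected to the reference form.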
Patents cited by this patent (341)
Kawasaki, Somei; Goden, Tatsuhito, Active matrix type display apparatus and driving method thereof.
Gaborski Roger Stephen (Pittsford NY) Pawlicki Thaddeus Francis (Rochester NY), Apparatus and method for identifying specific bone regions in digital X-ray images.
Nakatsuka Kimihiro,JPX, Apparatus for determining image processing parameter, method of the same, and computer program product for realizing the method.
Barrett Terence W. (Vienna VA), Automata networks and methods for obtaining optimized dynamically reconfigurable computational architectures and control.
Block, James; Graef, H. Thomas; Magee, Paul D.; Nelson, Donald S.; Meek, James; McIntyre, Daniel S.; DiPietro, Mark; Ramachandran, Natarajan, Automated banking machine with remote user assistance.
Sang ; Jr. Henry W. (Cupertio CA) Tahn Whei-Tsu H. (Sunnyvale CA) Zhang Xiao B. (Foster City CA), Automated method for creating templates in a forms recognition and processing system.
Iwai, Yoshiaki; Yoshigahara, Takayuki, Camera calibration apparatus and method, image processing apparatus and method, program providing medium, and camera.
McElroy, John F.; Chorvat, Robert J., Cannabinoid receptor antagonists/inverse agonists useful for treating metabolic disorders, including obesity and diabetes.
Nishimura Kazuyuki (Ichikawa JPX) Sato Shinichi (Yokohama JPX), Color picture processing apparatus for reproducing a color picture having a smoothly changed gradation.
Suzuki,Masahiro; Tamune,Michihiro; Chen,Zhe Hong; Juen,Masahiro, Digital camera, storage medium for image signal processing, carrier wave and electronic camera.
Rowe Edward R. ; Priyadarshan Eswar ; Anderson Kenneth S. ; Al-Shamma Nabeel A. ; Taft Edward A. ; McQuarrie Elizabeth M. ; Cohn Richard, Displaying electronic documents with substitute fonts.
Nagatsuka,Tetsuro; Miyachi,Tatsuo; Shimada,Atsuo; Takeya,Kazutoshi; Kemmochi,Eiji; Nakajima,Akiko; Yamasaki,Makoto; Fujita,Katsuhiko, Document classification system and method for classifying a document according to contents of the document.
Borrey Roland G. (19251 Canyon Dr. Villa Park CA 92667) Borrey Daniel G. (19251 Canyon Dr. Villa Park CA 92667), Document identification by characteristics matching.
Clark ; Jr. Louis George (St. Charles MO) Gummow ; Jr. Donald Romaine (O'Fallon MO) Vanacht Marc (St. Louis MO), Hand-held GUI PDA with GPS/DGPS receiver for collecting agronomic and GPS position data.
LeBrun Thomas Q. (Dallas TX) Cage Kerry (Carrollton TX) Arnold Dennis D. (Carrollton TX), Image based document processing and information management system and apparatus.
Mino, Kazuhiro; Yoda, Akira; Ohtsuka, Shuichi; Ono, Shuji; Ito, Wataru; Yamada, Masahiko, Image displaying system and apparatus for displaying images by changing the displayed images based on direction or direction changes of a displaying unit.
Naofumi Yamamoto JP; Haruko Kawakami JP; Gururaj Rao JP, Image processing apparatus for discriminating image field of original document plural times and method therefor.
Appelt, Douglas E.; Arnold, James Frederick; Bear, John S.; Hobbs, Jerry Robert; Israel, David J.; Kameyama, Megumi; Martin, David L.; Myers, Karen Louise; Ravichandran, Gopalan; Stickel, Mark Edward, Information retrieval by natural language querying.
David L. Patton ; John R. Fredlund ; John D. Buhr, Method and apparatus for modifying a portion of an image in accordance with colorimetric parameters.
Walnut David Francis ; Berenstein Carlos Alberto ; Liu K. J. Ray ; Rashid-Farrokhi Farrokh, Method and apparatus for processing data from a tomographic imaging system.
Withers,William Douglas, Method and apparatus for recognizing a digitized form, extracting information from a filled-in form, and generating a corrected filled-in form.
Guberman Shelja A. (Moscow RUX) Lossev Ilia (Moscow RUX) Pashintsev Alexander V. (Moscow RUX), Method and apparatus for recognizing cursive writing from sequential input information.
Guberman Shelja A. (Moscow RUX) Lossev Ilia (Moscow RUX) Pashintsev Alexander V. (Moscow RUX), Method and apparatus for recognizing cursive writing from sequential input information.
Polyakov Vladislav G. (Moscow RUX) Ryleev Mikhail A. (Moscow RUX), Method and apparatus for representing image data using polynomial approximation method and iterative transformation-repa.
Green, Stephen J.; Lamere, Paul B.; Alexander, Jeffrey L.; Haberl, Karl R., Method and apparatus for searching and resource discovery in a distributed enterprise system.
Winkelman Kurt-Helfried (Kiel DEX), Method and apparatus for the automatic analysis of density range, color cast, and gradation of image originals on the Ba.
Berman, Arie; Vlahos, Paul; Dadourian, Arpag, Method and apparatus for the automatic generation of subject to background transition area boundary lines and subject shadow retention.
Verstraelen,Boudewijn Joseph Angelus; Verstraelen,Sebastiaan Paul, Method and apparatus for visualization of biological structures with use of 3D position information from segmentation results.
Ejiri Koichi,JPX ; Guan Haike,JPX ; Aoki Shin,JPX, Method and system for generating a composite image from partially overlapping adjacent images taken along a plurality of axes.
Tischler, Karl M., Method arrangement and computer software for the printing of a separator sheet by means of an electrophotographic printer or copier.
Raskar, Ramesh; Willwacher, Thomas H.; van Baar, Jeroen, Method for determining a largest inscribed rectangular image within a union of projected quadrilateral images.
Kanda Shinji (Kawasaki JPX) Wakitani Jun (Kawasaki JPX) Maruyama Tsugito (Kawasaki JPX) Morita Toshihiko (Kawasaki JPX), Method for determining orientation of contour line segment in local area and for determining straight line and corner.
Kurosu Yasuo (Yokosuka JPX) Yokoyama Yoshihiro (Yokohama JPX) Nishikawa Kenichi (Yokohama JPX) Masuzaki Hidefumi (Hadano JPX) Fujinawa Masaaki (Tokyo JPX), Method for determining the amount of skew of image, method for correcting the same, and image data processing system.
Henderson Todd R. ; Spaulding Kevin E. ; Couwenhoven Douglas W., Method for segmenting a digital image into a foreground region and a key color region.
Kohchi Tsukasa JP, Method of and system for extracting predetermined elements from input document based upon model which is adaptively modified according to variable amount in the input document.
Beaulieu Dennis N. (Churchville NY) Compton John T. (LeRoy NY) Wojtanik Eugene R. (Plano TX), Method of calibration of image scanner signal processing circuits.
Dumais Susan T. ; Heckerman David ; Horvitz Eric ; Platt John Carlton ; Sahami Mehran, Methods and apparatus for classifying text and for building a text classifier.
Cheong, Cheol Ho; Han, Tack Don; Kim, Jong Young; Kim, Eui Jae; Jeong, Seong Hun; Kim, Jae Yun; Choi, Han Yeong, Mixed code, and method and apparatus for generating the same.
Fast Bruce B. (2600 Prindle Rd. Belmont CA 94402) Allen Dana R. (1745 Hunt Dr. Burlingame CA 94010), OCR image preprocessing method for image enhancement of scanned documents.
Michimoto Yasuyuki,JPX ; Onda Katsumasa,JPX ; Nishizawa Masato,JPX, Object detecting apparatus in which the position of a planar object is estimated by using hough transform.
Wong, Patrick, System and a method for web-based editing of documents online with an editing interface and concurrent display to webpages and print documents.
Ellis, Stephen M.; Kennedy, Michael J.; Kurani, Ashish Bhoopen; Lowry, Melissa; Meyyappan, Uma; Sahni, Bipin; Stroke, Nikolai, System and method for a mobile wallet.
Woolf,Susan D.; Baird,Andrew; Jiang,Sheng; Beezer,John L.; Rubin,Darryl E., System and method for annotating an electronic document independently of its content.
Pizano Arturo (Milpitas CA) Tan May-Inn (Saratoga CA) Gambo Naoto (Tanashi JPX), System and method for automatically classifying heterogeneous business forms.
Vazquez, Nicolas; Kodosky, Jeffrey L.; Kudukoli, Ram; Schultz, Kevin L.; Nair, Dinesh; Caltagirone, Christophe, System and method for automatically generating a graphical program to perform an image processing algorithm.
Oppenlander, Timothy J.; Underhill, James; Jackson, Elizabeth; Cook, Rebecca Ann; Dimel, Gary R.; Ortize, Carlos, System and method for electronic document generation and delivery.
Emerson,Geoffrey A.; Moon,Rodney G.; Rector,Gerald C.; Stokes,Raymond F.; Sutton,Andrew H., System and method of sorting document images based on image quality.
Heidenreich,James R.; Higgins,Linda S., System and method to customize the facilitation of development of user thinking about and documenting of an arbitrary problem.
Sampath, Meera; Nichols, Stephen J.; Richenderfer, Elizabeth A., Systems and methods for automated image quality based diagnostics and remediation of document processing systems.
Amtrup, Jan W.; Macciola, Anthony; Thompson, Stephen Michael; Ma, Jiyong, Systems and methods for classifying objects in digital images captured using mobile devices.
Amtrup, Jan Willers; Macciola, Anthony; Thompson, Steve; Ma, Jiyong; Shustorovich, Alexander; Thrasher, Christopher W., Systems and methods for classifying objects in digital images captured using mobile devices.
Amtrup, Jan W.; Ma, Jiyong; Kilby, Steven; Macciola, Anthony, Systems and methods for identification document processing and business workflow integration.
Amtrup, Jan W.; Thompson, Stephen Michael; Kilby, Steven; Macciola, Anthony, Systems and methods for identification document processing and business workflow integration.
Ferlitsch,Andrew Rodney; DeVore,Darwin Alan, Systems and methods for manipulating electronic information using a three-dimensional iconic representation.
Amtrup, Jan Willers; Macciola, Anthony; Shustorovich, Alexander; Thrasher, Christopher W., Systems and methods for mobile image capture and processing.
Macciola, Anthony; Amtrup, Jan Willers; Shustorovich, Alexander; Thrasher, Christopher W., Systems and methods for mobile image capture and processing.
Roach, John J.; Nepomniachtchi, Grisha; Couch, Robert; Avergun, Mikhail, Systems and methods for obtaining financial offers using mobile image capture.
Macciola, Anthony; Amtrup, Jan W.; Ma, Jiyong; Borrey, Roland G.; Schmidtler, Mauritius A. R.; Asuri, Hari S.; Fechter, Joel S.; Taylor, Robert A., Systems and methods for processing video data.
Gorski, Nikolai D.; Semenov, Andrey V.; Anisimov, Valery; Maksimov, Sergey K.; Sashov, Sergey N., Systems and methods for recognizing information in objects using a mobile device.
Macciola, Anthony; Ma, Jiyong; Shustorovich, Alexander; Thrasher, Christopher W.; Amtrup, Jan, Systems and methods for three dimensional geometric reconstruction of captured image data.
Borrey, Roland G.; Schmidtler, Mauritius A. R.; Taylor, Robert A.; Fechter, Joel S.; Asuri, Hari S., Systems and methods of accessing random access cache for rescanning.
Schmidtler, Mauritius A. R.; Borrey, Roland G.; Amtrup, Jan W.; Thompson, Stephen Michael, Systems, methods and computer program products for determining document validity.
Schmidtler, Mauritius A. R.; Borrey, Roland G.; Amtrup, Jan W.; Thompson, Stephen Michael, Systems, methods and computer program products for determining document validity.
Schmidtler, Mauritius A. R.; Borrey, Roland G.; Amtrup, Jan W.; Thompson, Stephen Michael, Systems, methods, and computer program products for determining document validity.
Macciola, Anthony; Ma, Jiyong; Shustorovich, Alexander; Thrasher, Christopher; Amtrup, Jan W., Determining distance between an object and a capture device based on captured image data.
Thrasher, Christopher W.; Shustorovich, Alexander; Thompson, Stephen Michael; Amtrup, Jan W.; Macciola, Anthony, Iterative recognition-guided thresholding and data extraction.
Amtrup, Jan W.; Macciola, Anthony; Thompson, Steve; Ma, Jiyong; Shustorovich, Alexander; Thrasher, Christopher W., Systems and methods for classifying objects in digital images captured using mobile devices.