Systems and methods for generating composite images of long documents using mobile video data
IPC classification information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G06K-009/00
H04N-005/265
H04N-005/14
Application number
US-0191442
(2016-06-23)
Patent number
US-9747504
(2017-08-29)
Inventors / Address
Ma, Jiyong
Macciola, Anthony
Amtrup, Jan W.
Applicant / Address
KOFAX, INC.
Agent / Address
Zilka-Kotab, P.C.
Citation information
Times cited: 7
Cited patents: 336
Abstract
Techniques for capturing long document images and generating composite images therefrom include: detecting a document depicted in image data; tracking a position of the detected document within the image data; selecting a plurality of images, wherein the selection is based at least in part on the tracked position of the detected document; and generating a composite image based on at least one of the selected plurality of images. The tracking and selection are optionally but preferably based in whole or in part on motion vectors estimated at least partially based on analyzing image data such as test and reference frames within the captured video data/images. Corresponding systems and computer program products are also disclosed.
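As an illustration only, the motion-vector estimation between a reference frame and a test frame mentioned in the abstract could be sketched as follows. This is a minimal sketch under stated assumptions, not the patented method: frames are represented as plain lists of grayscale rows, motion is purely vertical, and the names `Frame`, `mean_abs_diff`, and `estimate_vertical_shift` are hypothetical.

```python
from typing import List

# Hypothetical in-memory representation: a frame is a list of grayscale rows.
Frame = List[List[int]]


def mean_abs_diff(ref_rows: Frame, test_rows: Frame) -> float:
    """Mean absolute pixel difference over two equally sized row ranges."""
    total = 0
    count = 0
    for ref_row, test_row in zip(ref_rows, test_rows):
        for a, b in zip(ref_row, test_row):
            total += abs(a - b)
            count += 1
    return total / count


def estimate_vertical_shift(ref: Frame, test: Frame, max_shift: int) -> int:
    """Exhaustive search for dy such that test[y] best matches ref[y + dy].

    Assumes both frames have the same height. The search range is clamped so
    that at least one row overlaps for every candidate shift.
    """
    height = len(ref)
    limit = min(max_shift, height - 1)
    best_dy, best_cost = 0, float("inf")
    for dy in range(-limit, limit + 1):
        if dy >= 0:
            ref_rows, test_rows = ref[dy:], test[:height - dy]
        else:
            ref_rows, test_rows = ref[:height + dy], test[-dy:]
        cost = mean_abs_diff(ref_rows, test_rows)
        if cost < best_cost:
            best_dy, best_cost = dy, cost
    return best_dy


# Usage: a synthetic 12-row "document" viewed through two 8-row windows.
document = [[10 * i + j for j in range(4)] for i in range(12)]
reference_frame = document[0:8]
test_frame = document[3:11]
shift = estimate_vertical_shift(reference_frame, test_frame, max_shift=5)
```

A real capture pipeline would more likely match sampled features or text blocks and would have to cope with noise, perspective, and horizontal motion; the exhaustive search is chosen here only because it is easy to follow.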
Representative claims
1. A computer program product comprising a non-transitory computer readable medium having stored thereon instructions executable by a processor of a mobile device, the instructions being configured to cause the processor, upon execution thereof, to generate a composite image of a long document with sufficient resolution for downstream processing by: detecting a long document depicted in image data; tracking a position of the detected long document within the image data; selecting a plurality of images, wherein the selection is based at least in part on the tracked position of the detected long document; and generating a composite image of the long document based on at least two of the selected plurality of images, wherein the composite image of the long document is characterized by a resolution greater than a resolution of any of the selected plurality of images, wherein the resolution of the composite image is at least about 200 dots per inch (DPI) or at least about 200 pixels per inch (PPI).

2. The computer program product as recited in claim 1, further comprising instructions configured to cause the processor to identify at least one edge of the document depicted in the image data.

3. The computer program product as recited in claim 1, wherein each of the selected plurality of images depicts a portion of the document, and wherein the composite image depicts an entirety of the document.

4. The computer program product as recited in claim 1, wherein the tracking comprises generating, using the processor, alignment hypotheses between at least some of the plurality of frames of image data, wherein the alignment hypotheses are generated based on matching sampled features between frames of the image data.

5. The computer program product as recited in claim 1, further comprising instructions configured to cause the processor to: estimate one or more motion vectors corresponding to motion of an image capture component used to capture the image data.

6. The computer program product as recited in claim 5, wherein the selection is further based at least in part on the one or more estimated motion vector.

7. The computer program product as recited in claim 5, wherein the tracking is based exclusively on the estimated motion vector(s).

8. The computer program product as recited in claim 5, further comprising instructions configured to cause the processor to: determine at least one motion displacement based on some or all of the estimated motion vector(s); either terminate or pause a capture operation in response to determining one of the motion displacement(s) is characterized by a value exceeding a predefined motion displacement threshold; and either initiate a new capture operation in response to terminating the capture operation; or resume the capture operation in response to pausing the capture operation.

9. The computer program product as recited in claim 1, further comprising instructions configured to cause the processor to: identify, based on the composite image, one or more portions of the document depicting textual information; classify each identified portion of the document based on the textual information depicted therein; determine whether each classified portion is relevant to a financial transaction or irrelevant to the financial transaction, the determination being based on the portion classification; and remove each portion determined to be irrelevant to the financial transaction from the composite image.

10. The computer program product as recited in claim 1, wherein generating the composite image comprises: estimating a homograph transform matrix or an affine transform matrix, wherein the estimation is based on text block matching between the selected plurality of images; and transforming one of the plurality of images to a coordinate system of another of the plurality of images using the homograph transform matrix or the affine transform matrix.

11. The computer program product as recited in claim 1, the instructions configured to cause the processor to select the plurality of images further comprising instructions configured to cause the processor to define at least one frame pair, wherein each frame pair consists of a reference frame and a test frame, and wherein each reference frame and each test frame are selected from the image data.

12. The computer program product as recited in claim 11, the instructions configured to cause the processor to generate the composite image further comprising instructions configured to cause the processor to: detect a skew angle in one or more of the reference frame and the test frame of at least one of the frame pairs, the skew angle corresponding to the document and having a magnitude of >0.0 degrees; and correct the skew angle in at least one of the reference frame and the test frame, wherein the document depicted in the composite image is characterized by a skew angle of approximately 0.0 degrees.

13. The computer program product as recited in claim 11, the instructions configured to cause the processor to select the plurality of images further comprising instructions configured to cause the processor to: determine an amount of overlap between the reference frame and the test frame of at least one frame pair; and select an image corresponding to at least one frame pair for which the amount of overlap between the reference frame and the test frame is greater than a predetermined overlap threshold.

14. The computer program product as recited in claim 13, wherein the amount of overlap corresponds to the document; and wherein the predetermined overlap threshold is a distance of at least 40% of a length of the reference frame.

15. The computer program product as recited in claim 11, the instructions configured to cause the processor to generate the composite image further comprising instructions configured to cause the processor to: detect textual information in each of the reference frame and the test frame of at least one frame pair, the textual information being depicted in the document.

16. The computer program product as recited in claim 15, the instructions configured to cause the processor to detect textual information in each of the reference frame and the test frame of at least one frame pair further comprising instructions configured to cause the processor to: define, in the reference frame, at least one rectangular portion of the document depicting some or all of the textual information; define, in the test frame, at least one corresponding rectangular portion of the document depicting some or all of the textual information; and align the document depicted in the test frame with the document depicted in the reference frame.

17. The computer program product as recited in claim 16, wherein the textual information comprises at least one feature selected from a group consisting of: an identity of one or more characters represented in the rectangular portion; an identity of one or more characters represented in the corresponding rectangular portion; a sequence of characters represented in the rectangular portion; a sequence of characters represented in the corresponding rectangular portion; a position of one or more characters represented in the rectangular portion; a position of one or more characters represented in the corresponding rectangular portion; an absolute size of one or more characters represented in the rectangular portion; an absolute size of one or more characters represented in the corresponding rectangular portion; a size of one or more characters represented in the rectangular portion relative to a size of one or more characters represented in the corresponding rectangular portion; a size of one or more characters represented in the corresponding rectangular portion relative to a size of one or more characters represented in the rectangular portion; a color of one or more characters represented in the rectangular portion; a color of one or more characters represented in the corresponding rectangular portion; a shape of one or more characters represented in the rectangular portion; and a shape of one or more characters represented in the corresponding rectangular portion.

18. The computer program product as recited in claim 16, the instructions configured to cause the processor to align the document depicted in the test frame with the document depicted in the reference frame further comprising instructions configured to cause the processor to perform optical character recognition (OCR) on at least the rectangular portion and the corresponding rectangular portion.

19. A mobile device having logic embodied therewith, the logic being configured to cause the mobile device, upon execution thereof, to generate an image of a long document sufficient for downstream processing by: detecting, using a processor of the mobile device, a long document depicted in image data; tracking, using the processor of the mobile device, a position of the detected long document within the image data; selecting, using the processor of the mobile device, a plurality of images, wherein the selection is based at least in part on the tracked position of the detected long document; and generating, using the processor of the mobile device, a composite image of the long document based on at least two of the selected plurality of images wherein the composite image of the long document is characterized by a resolution greater than a resolution of any of the selected plurality of images, wherein the resolution of the composite image is at least about 200 dots per inch (DPI) or at least about 200 pixels per inch (PPI).

20. A computer-implemented method for generating a composite image of a long document suitable for downstream processing, the method comprising: tracking, using a processor of a mobile device, a long document within a plurality of frames of image data; selecting, using the processor, a subset of the plurality of frames of the image data based on the tracking; generating, using the processor, alignment hypotheses between at least some of the selected subset of frames of image data, wherein the alignment hypotheses are generated based on matching sampled features of one or more reference frames of the image data with sampled features of one or more test frames of the image data; storing at least some of the selected frames of the image data to a memory of the mobile device; and generating, using the processor, a composite image of the long document by stitching together at least two of the selected subset of frames; wherein the at least two of the selected subset of frames are characterized by an overlap greater than a predefined overlap threshold; and wherein the composite image is characterized by a resolution of at least about 200 dots per inch (DPI) or a resolution of at least about 200 pixels per inch (PPI).
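The overlap-based frame selection recited in claims 13, 14, and 20 can be illustrated with a short sketch. It is a hedged illustration rather than the claimed implementation: it assumes each frame's tracked position is already available as a one-dimensional offset along the document (in pixels), treats the 40% figure from claim 14 as a default threshold, and uses a greedy strategy that the claims do not prescribe. The names `overlap_fraction` and `select_frames` are hypothetical.

```python
from typing import List


def overlap_fraction(ref_offset: float, test_offset: float,
                     frame_length: float) -> float:
    """Fraction of the reference frame's length also covered by the test frame."""
    shift = abs(test_offset - ref_offset)
    return max(0.0, frame_length - shift) / frame_length


def select_frames(offsets: List[float], frame_length: float,
                  min_overlap: float = 0.4) -> List[int]:
    """Greedily pick frame indices so consecutive picks overlap by more than min_overlap.

    For each reference frame, the farthest later frame that still overlaps it
    sufficiently is kept. A ValueError signals that the tracked motion between
    frames was too large to bridge (an assumption of this sketch; a capture
    application might pause or restart instead).
    """
    if not offsets:
        return []
    selected = [0]
    ref = 0
    candidate = None
    for i in range(1, len(offsets)):
        if overlap_fraction(offsets[ref], offsets[i], frame_length) > min_overlap:
            candidate = i
            continue
        if candidate is None:
            raise ValueError(f"frame {i} does not overlap frame {ref} enough")
        # Promote the last good candidate to reference, then re-check frame i.
        selected.append(candidate)
        ref, candidate = candidate, None
        if overlap_fraction(offsets[ref], offsets[i], frame_length) > min_overlap:
            candidate = i
        else:
            raise ValueError(f"frame {i} does not overlap frame {ref} enough")
    if candidate is not None:
        selected.append(candidate)
    return selected


# Usage: frames 100 px long, captured every 20 px of document travel.
tracked_offsets = [0, 20, 40, 60, 80, 100, 120]
chosen = select_frames(tracked_offsets, frame_length=100)
```

Keeping the farthest frame that still satisfies the threshold reduces the number of frames to stitch while leaving enough shared content to align each pair.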
Patents cited by this patent (336)
Kawasaki, Somei; Goden, Tatsuhito, Active matrix type display apparatus and driving method thereof.
Gaborski Roger Stephen (Pittsford NY) Pawlicki Thaddeus Francis (Rochester NY), Apparatus and method for identifying specific bone regions in digital X-ray images.
Nakatsuka Kimihiro,JPX, Apparatus for determining image processing parameter, method of the same, and computer program product for realizing the method.
Barrett Terence W. (Vienna VA), Automata networks and methods for obtaining optimized dynamically reconfigurable computational architectures and control.
Block, James; Graef, H. Thomas; Magee, Paul D.; Nelson, Donald S.; Meek, James; McIntyre, Daniel S.; DiPietro, Mark; Ramachandran, Natarajan, Automated banking machine with remote user assistance.
Sang ; Jr. Henry W. (Cupertio CA) Tahn Whei-Tsu H. (Sunnyvale CA) Zhang Xiao B. (Foster City CA), Automated method for creating templates in a forms recognition and processing system.
Iwai, Yoshiaki; Yoshigahara, Takayuki, Camera calibration apparatus and method, image processing apparatus and method, program providing medium, and camera.
McElroy, John F.; Chorvat, Robert J., Cannabinoid receptor antagonists/inverse agonists useful for treating metabolic disorders, including obesity and diabetes.
Nishimura Kazuyuki (Ichikawa JPX) Sato Shinichi (Yokohama JPX), Color picture processing apparatus for reproducing a color picture having a smoothly changed gradation.
Suzuki,Masahiro; Tamune,Michihiro; Chen,Zhe Hong; Juen,Masahiro, Digital camera, storage medium for image signal processing, carrier wave and electronic camera.
Rowe Edward R. ; Priyadarshan Eswar ; Anderson Kenneth S. ; Al-Shamma Nabeel A. ; Taft Edward A. ; McQuarrie Elizabeth M. ; Cohn Richard, Displaying electronic documents with substitute fonts.
Nagatsuka,Tetsuro; Miyachi,Tatsuo; Shimada,Atsuo; Takeya,Kazutoshi; Kemmochi,Eiji; Nakajima,Akiko; Yamasaki,Makoto; Fujita,Katsuhiko, Document classification system and method for classifying a document according to contents of the document.
Borrey Roland G. (19251 Canyon Dr. Villa Park CA 92667) Borrey Daniel G. (19251 Canyon Dr. Villa Park CA 92667), Document identification by characteristics matching.
Clark ; Jr. Louis George (St. Charles MO) Gummow ; Jr. Donald Romaine (O'Fallon MO) Vanacht Marc (St. Louis MO), Hand-held GUI PDA with GPS/DGPS receiver for collecting agronomic and GPS position data.
LeBrun Thomas Q. (Dallas TX) Cage Kerry (Carrollton TX) Arnold Dennis D. (Carrollton TX), Image based document processing and information management system and apparatus.
Mino, Kazuhiro; Yoda, Akira; Ohtsuka, Shuichi; Ono, Shuji; Ito, Wataru; Yamada, Masahiko, Image displaying system and apparatus for displaying images by changing the displayed images based on direction or direction changes of a displaying unit.
Naofumi Yamamoto JP; Haruko Kawakami JP; Gururaj Rao JP, Image processing apparatus for discriminating image field of original document plural times and method therefor.
Appelt, Douglas E.; Arnold, James Frederick; Bear, John S.; Hobbs, Jerry Robert; Israel, David J.; Kameyama, Megumi; Martin, David L.; Myers, Karen Louise; Ravichandran, Gopalan; Stickel, Mark Edward, Information retrieval by natural language querying.
David L. Patton ; John R. Fredlund ; John D. Buhr, Method and apparatus for modifying a portion of an image in accordance with colorimetric parameters.
Walnut David Francis ; Berenstein Carlos Alberto ; Liu K. J. Ray ; Rashid-Farrokhi Farrokh, Method and apparatus for processing data from a tomographic imaging system.
Withers,William Douglas, Method and apparatus for recognizing a digitized form, extracting information from a filled-in form, and generating a corrected filled-in form.
Guberman Shelja A. (Moscow RUX) Lossev Ilia (Moscow RUX) Pashintsev Alexander V. (Moscow RUX), Method and apparatus for recognizing cursive writing from sequential input information.
Guberman Shelja A. (Moscow RUX) Lossev Ilia (Moscow RUX) Pashintsev Alexander V. (Moscow RUX), Method and apparatus for recognizing cursive writing from sequential input information.
Polyakov Vladislav G. (Moscow RUX) Ryleev Mikhail A. (Moscow RUX), Method and apparatus for representing image data using polynomial approximation method and iterative transformation-repa.
Green, Stephen J.; Lamere, Paul B.; Alexander, Jeffrey L.; Haberl, Karl R., Method and apparatus for searching and resource discovery in a distributed enterprise system.
Winkelman Kurt-Helfried (Kiel DEX), Method and apparatus for the automatic analysis of density range, color cast, and gradation of image originals on the Ba.
Berman, Arie; Vlahos, Paul; Dadourian, Arpag, Method and apparatus for the automatic generation of subject to background transition area boundary lines and subject shadow retention.
Verstraelen,Boudewijn Joseph Angelus; Verstraelen,Sebastiaan Paul, Method and apparatus for visualization of biological structures with use of 3D position information from segmentation results.
Ejiri Koichi,JPX ; Guan Haike,JPX ; Aoki Shin,JPX, Method and system for generating a composite image from partially overlapping adjacent images taken along a plurality of axes.
Tischler, Karl M., Method arrangement and computer software for the printing of a separator sheet by means of an electrophotographic printer or copier.
Raskar, Ramesh; Willwacher, Thomas H.; van Baar, Jeroen, Method for determining a largest inscribed rectangular image within a union of projected quadrilateral images.
Kanda Shinji (Kawasaki JPX) Wakitani Jun (Kawasaki JPX) Maruyama Tsugito (Kawasaki JPX) Morita Toshihiko (Kawasaki JPX), Method for determining orientation of contour line segment in local area and for determining straight line and corner.
Kurosu Yasuo (Yokosuka JPX) Yokoyama Yoshihiro (Yokohama JPX) Nishikawa Kenichi (Yokohama JPX) Masuzaki Hidefumi (Hadano JPX) Fujinawa Masaaki (Tokyo JPX), Method for determining the amount of skew of image, method for correcting the same, and image data processing system.
Henderson Todd R. ; Spaulding Kevin E. ; Couwenhoven Douglas W., Method for segmenting a digital image into a foreground region and a key color region.
Kohchi Tsukasa JP, Method of and system for extracting predetermined elements from input document based upon model which is adaptively modified according to variable amount in the input document.
Beaulieu Dennis N. (Churchville NY) Compton John T. (LeRoy NY) Wojtanik Eugene R. (Plano TX), Method of calibration of image scanner signal processing circuits.
Dumais Susan T. ; Heckerman David ; Horvitz Eric ; Platt John Carlton ; Sahami Mehran, Methods and apparatus for classifying text and for building a text classifier.
Cheong, Cheol Ho; Han, Tack Don; Kim, Jong Young; Kim, Eui Jae; Jeong, Seong Hun; Kim, Jae Yun; Choi, Han Yeong, Mixed code, and method and apparatus for generating the same.
Fast Bruce B. (2600 Prindle Rd. Belmont CA 94402) Allen Dana R. (1745 Hunt Dr. Burlingame CA 94010), OCR image preprocessing method for image enhancement of scanned documents.
Michimoto Yasuyuki,JPX ; Onda Katsumasa,JPX ; Nishizawa Masato,JPX, Object detecting apparatus in which the position of a planar object is estimated by using hough transform.
Wong, Patrick, System and a method for web-based editing of documents online with an editing interface and concurrent display to webpages and print documents.
Ellis, Stephen M.; Kennedy, Michael J.; Kurani, Ashish Bhoopen; Lowry, Melissa; Meyyappan, Uma; Sahni, Bipin; Stroke, Nikolai, System and method for a mobile wallet.
Woolf,Susan D.; Baird,Andrew; Jiang,Sheng; Beezer,John L.; Rubin,Darryl E., System and method for annotating an electronic document independently of its content.
Pizano Arturo (Milpitas CA) Tan May-Inn (Saratoga CA) Gambo Naoto (Tanashi JPX), System and method for automatically classifying heterogeneous business forms.
Vazquez, Nicolas; Kodosky, Jeffrey L.; Kudukoli, Ram; Schultz, Kevin L.; Nair, Dinesh; Caltagirone, Christophe, System and method for automatically generating a graphical program to perform an image processing algorithm.
Oppenlander, Timothy J.; Underhill, James; Jackson, Elizabeth; Cook, Rebecca Ann; Dimel, Gary R.; Ortize, Carlos, System and method for electronic document generation and delivery.
Emerson,Geoffrey A.; Moon,Rodney G.; Rector,Gerald C.; Stokes,Raymond F.; Sutton,Andrew H., System and method of sorting document images based on image quality.
Heidenreich,James R.; Higgins,Linda S., System and method to customize the facilitation of development of user thinking about and documenting of an arbitrary problem.
Sampath, Meera; Nichols, Stephen J.; Richenderfer, Elizabeth A., Systems and methods for automated image quality based diagnostics and remediation of document processing systems.
Amtrup, Jan W.; Macciola, Anthony; Thompson, Stephen Michael; Ma, Jiyong, Systems and methods for classifying objects in digital images captured using mobile devices.
Amtrup, Jan Willers; Macciola, Anthony; Thompson, Steve; Ma, Jiyong; Shustorovich, Alexander; Thrasher, Christopher W., Systems and methods for classifying objects in digital images captured using mobile devices.
Amtrup, Jan W.; Ma, Jiyong; Kilby, Steven; Macciola, Anthony, Systems and methods for identification document processing and business workflow integration.
Amtrup, Jan W.; Thompson, Stephen Michael; Kilby, Steven; Macciola, Anthony, Systems and methods for identification document processing and business workflow integration.
Ferlitsch,Andrew Rodney; DeVore,Darwin Alan, Systems and methods for manipulating electronic information using a three-dimensional iconic representation.
Amtrup, Jan Willers; Macciola, Anthony; Shustorovich, Alexander; Thrasher, Christopher W., Systems and methods for mobile image capture and processing.
Macciola, Anthony; Amtrup, Jan Willers; Shustorovich, Alexander; Thrasher, Christopher W., Systems and methods for mobile image capture and processing.
Roach, John J.; Nepomniachtchi, Grisha; Couch, Robert; Avergun, Mikhail, Systems and methods for obtaining financial offers using mobile image capture.
Macciola, Anthony; Amtrup, Jan W.; Ma, Jiyong; Borrey, Roland G.; Schmidtler, Mauritius A. R.; Asuri, Hari S.; Fechter, Joel S.; Taylor, Robert A., Systems and methods for processing video data.
Gorski, Nikolai D.; Semenov, Andrey V.; Anisimov, Valery; Maksimov, Sergey K.; Sashov, Sergey N., Systems and methods for recognizing information in objects using a mobile device.
Macciola, Anthony; Ma, Jiyong; Shustorovich, Alexander; Thrasher, Christopher W.; Amtrup, Jan, Systems and methods for three dimensional geometric reconstruction of captured image data.
Borrey, Roland G.; Schmidtler, Mauritius A. R.; Taylor, Robert A.; Fechter, Joel S.; Asuri, Hari S., Systems and methods of accessing random access cache for rescanning.
Schmidtler, Mauritius A. R.; Borrey, Roland G.; Amtrup, Jan W.; Thompson, Stephen Michael, Systems, methods and computer program products for determining document validity.
Schmidtler, Mauritius A. R.; Borrey, Roland G.; Amtrup, Jan W.; Thompson, Stephen Michael, Systems, methods and computer program products for determining document validity.
Schmidtler, Mauritius A. R.; Borrey, Roland G.; Amtrup, Jan W.; Thompson, Stephen Michael, Systems, methods, and computer program products for determining document validity.
Macciola, Anthony; Ma, Jiyong; Shustorovich, Alexander; Thrasher, Christopher; Amtrup, Jan W., Determining distance between an object and a capture device based on captured image data.
Thrasher, Christopher W.; Shustorovich, Alexander; Thompson, Stephen Michael; Amtrup, Jan W.; Macciola, Anthony, Iterative recognition-guided thresholding and data extraction.
Amtrup, Jan W.; Macciola, Anthony; Thompson, Steve; Ma, Jiyong; Shustorovich, Alexander; Thrasher, Christopher W., Systems and methods for classifying objects in digital images captured using mobile devices.