Systems and methods for generating composite images of long documents using mobile video data
IPC classification information
Country / Type: United States (US) Patent, Granted
International Patent Classification (IPC, 7th edition): H04N-005/262; H04N-005/265; G06K-009/00; H04N-005/14
Application number: US-0542157 (2014-11-14)
Registration number: US-9386235 (2016-07-05)
Inventors: Ma, Jiyong; Macciola, Anthony; Amtrup, Jan W.
Applicant: Kofax, Inc.
Agent: Zilka-Kotab, PC
Citation information
Times cited: 13
Cited patents: 249
Abstract
Systems, methods, and computer program products are disclosed and include: initiating a capture operation using an image capture component of the mobile device, the capture operation comprising: capturing video data; and estimating a plurality of motion vectors corresponding to motion of the image capture component during the capture operation. The systems, techniques, and computer program products also include detecting a document depicted in the video data; tracking a position of the detected document throughout the video data; selecting a plurality of images using the image capture component of the mobile device, wherein the selection is based at least in part on: the tracked position of the detected document; and the estimated motion vectors; and generating a composite image based on at least some of the selected plurality of images.
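The abstract describes selecting images based on the tracked document position and the estimated motion vectors, but gives no concrete procedure. The following Python sketch is one possible reading, not the patented method. It assumes each frame already has a motion vector `(dx, dy)` in pixels relative to the previous frame, tracks position by summing those vectors, and keeps a frame only when the camera is steady and the position has advanced far enough. The names `track_positions`, `select_frames`, `stride`, and `max_motion` are hypothetical and do not come from the patent.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, float]


def track_positions(motion_vectors: Sequence[Vector]) -> List[Vector]:
    """Accumulate per-frame motion vectors into a running position estimate."""
    x, y = 0.0, 0.0
    positions: List[Vector] = []
    for dx, dy in motion_vectors:
        x += dx
        y += dy
        positions.append((x, y))
    return positions


def select_frames(
    motion_vectors: Sequence[Vector],
    stride: float,      # assumed: minimum travel (pixels) between kept frames
    max_motion: float,  # assumed: largest per-frame motion still treated as steady
) -> List[int]:
    """Return indices of frames that are steady and far enough from the last kept frame."""
    positions = track_positions(motion_vectors)
    selected: List[int] = []
    last_kept = None
    for index, ((dx, dy), position) in enumerate(zip(motion_vectors, positions)):
        steady = math.hypot(dx, dy) <= max_motion
        advanced = last_kept is None or math.dist(position, last_kept) >= stride
        if steady and advanced:
            selected.append(index)
            last_kept = position
    return selected


if __name__ == "__main__":
    vectors = [(0, 0), (0, 10), (0, 10), (0, 30), (0, 10), (0, 10)]
    print(select_frames(vectors, stride=20, max_motion=15))
```

In this sketch the fourth frame is skipped because its motion exceeds `max_motion`, which stands in for rejecting frames captured while the device moved too quickly.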
Representative claims
1. A computer program product comprising a non-transitory computer readable medium having stored thereon instructions executable by a mobile device, the instructions being configured to cause the mobile device, upon execution thereof, to: initiate a capture operation using an image capture component of the mobile device, the capture operation comprising: capturing video data; and estimating a plurality of motion vectors corresponding to motion of the image capture component during the capture operation; detect a document depicted in the video data; track a position of the detected document throughout the video data; select a plurality of images using the image capture component of the mobile device, wherein the selection is based at least in part on: the tracked position of the detected document; and the estimated motion vectors; and generate a composite image based on at least some of the selected plurality of images.
2. The computer program product as recited in claim 1, wherein the document is a long document.
3. The computer program product as recited in claim 1, wherein the tracking is based exclusively on the estimated plurality of motion vectors.
4. The computer program product as recited in claim 1, the instructions configured to cause the mobile device to detect the document further comprising instructions configured to cause the mobile device to identify at least one edge of the document depicted in the captured video data.
5. The computer program product as recited in claim 1, wherein each of the selected plurality of images depicts a portion of the document, and wherein the composite image depicts an entirety of the document.
6. The computer program product as recited in claim 1, wherein the composite image is characterized by at least one of: an image resolution greater than an image resolution of any of the selected plurality of images; and an image size greater than an image size of any of the selected plurality of images.
7. The computer program product as recited in claim 1, further comprising instructions configured to cause the mobile device to downsample the video data, and wherein the instructions configured to cause the mobile device to detect the document, track the position of the document, and select the plurality of images is configured to perform the detecting, the tracking, and the selecting using the downsampled video data.
8. The computer program product as recited in claim 1, further comprising instructions configured to cause the mobile device to: determine at least one motion displacement based on some or all of the estimated plurality of motion vectors, each motion displacement corresponding to the image capture component during the capture operation; and either terminate or pause the capture operation in response to determining one of the motion displacement(s) is characterized by a value exceeding a predefined motion displacement threshold; and either initiate a new capture operation in response to terminating the capture operation; or resume the capture operation in response to pausing the capture operation.
9. The computer program product as recited in claim 8, wherein the predefined motion displacement threshold has a value in a range from about 5 pixels to about 25 pixels.
10. The computer program product as recited in claim 1, wherein each selected image depicts a portion of the document, and wherein the composite image depicts only portion(s) of the document that correspond to a financial transaction memorialized by the document.
11. The computer program product as recited in claim 10, further comprising instructions configured to cause the mobile device to: identify, based on the composite image, one or more portions of the document depicting textual information; classify each identified portion of the document based on the textual information depicted therein; determine whether each classified portion is relevant to the financial transaction or irrelevant to the financial transaction, the determining being based on the portion classification; and remove each portion determined to be irrelevant to the financial transaction from the composite image.
12. The computer program product as recited in claim 11, further comprising instructions configured to cause the mobile device to: align the portions determined to be relevant to the financial transaction; and generate a second composite image, wherein the second composite image is characterized by: approximately a same image size as an image size of the composite image; approximately a same image resolution as an image resolution of the composite image; excluding textual information irrelevant to the financial transaction; and including textual information relevant to the financial transaction, wherein a plurality of characters comprising the textual information relevant to the financial transaction are aligned.
13. The computer program product as recited in claim 1, the instructions configured to cause the mobile device to select the plurality of images further comprising instructions configured to cause the mobile device to define a plurality of frame pairs, wherein each frame pair consists of a reference frame and a test frame, and wherein each reference frame and each test frame is selected from the video data.
14. The computer program product as recited in claim 13, the instructions configured to cause the mobile device to generate the composite image further comprising instructions configured to cause the mobile device to: detect a skew angle in one or more of the reference frame and the test frame of at least one of the frame pairs, the skew angle corresponding to the document and having a magnitude of >0.0 degrees; and correct the skew angle in at least one of the reference frame and the test frame, wherein the document depicted in the composite image is characterized by a skew angle of approximately 0.0 degrees.
15. The computer program product as recited in claim 13, the instructions configured to cause the mobile device to select the plurality of images further comprising instructions configured to cause the mobile device to: determine an amount of overlap between the reference frame and the test frame of each frame pair; and select an image corresponding to each frame pair for which the amount of overlap between the reference frame and the test frame is greater than a predetermined overlap threshold.
16. The computer program product as recited in claim 15, wherein the amount of overlap corresponds to the document, wherein the amount of overlap does not correspond to a background depicted in the reference frame, and wherein the amount of overlap does not correspond to a background depicted in the test frame.
17. The computer program product as recited in claim 15, wherein the predetermined overlap threshold corresponds to a distance of at least 40% of a length of the reference frame.
18. The computer program product as recited in claim 13, the instructions configured to cause the mobile device to generate the composite image further comprising instructions configured to cause the mobile device to: detect textual information in each of the reference frame and the test frame of at least one frame pair, the textual information being depicted in the document.
19. The computer program product as recited in claim 18, the instructions configured to cause the mobile device to detect textual information in each of the reference frame and the test frame of at least one frame pair further comprising instructions configured to cause the mobile device to: define, in the reference frame, at least one rectangular portion of the document depicting some or all of the textual information; define, in the test frame, at least one corresponding rectangular portion of the document depicting some or all of the textual information; and align the document depicted in the test frame with the document depicted in the reference frame, the alignment being based on: the textual information depicted in at least one of the rectangular portion(s); and the textual information depicted in at least one of the corresponding rectangular portion(s).
20. The computer program product as recited in claim 19, wherein the textual information comprises at least one of: an identity of one or more characters represented in the rectangular portion; an identity of one or more characters represented in the corresponding rectangular portion; a sequence of characters represented in the rectangular portion; a sequence of characters represented in the corresponding rectangular portion; a position of one or more characters represented in the rectangular portion; a position of one or more characters represented in the corresponding rectangular portion; an absolute size of one or more characters represented in the rectangular portion; an absolute size of one or more characters represented in the corresponding rectangular portion; a size of one or more characters represented in the rectangular portion relative to a size of one or more characters represented in the corresponding rectangular portion; a size of one or more characters represented in the corresponding rectangular portion relative to a size of one or more characters represented in the rectangular portion; a color of one or more characters represented in the rectangular portion; a color of one or more characters represented in the corresponding rectangular portion; a shape of one or more characters represented in the rectangular portion; and a shape of one or more characters represented in the corresponding rectangular portion.
21. The computer program product as recited in claim 19, the instructions configured to cause the mobile device to align the document depicted in the test frame with the document depicted in the reference frame further comprising instructions configured to cause the mobile device to perform optical character recognition (OCR) on at least the rectangular portion and the corresponding rectangular portion.
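Claims 8, 9, 15, and 17 recite two numeric checks: a motion displacement threshold of about 5 to 25 pixels, and an overlap threshold of at least 40% of the reference frame length. The Python sketch below illustrates how such checks might be expressed. It is an illustration under stated assumptions, not the claimed implementation: frames are assumed to be offset along a single scan axis, displacement is taken as the Euclidean magnitude of each motion vector, and the default of 15 pixels is an arbitrary value inside the claimed range. The function names are hypothetical. The document-versus-background distinction of claim 16 is not modeled.

```python
import math
from typing import Sequence, Tuple

Vector = Tuple[float, float]


def exceeds_displacement(
    motion_vectors: Sequence[Vector],
    threshold_px: float = 15.0,  # assumed value within the claimed 5 to 25 pixel range
) -> bool:
    """True if any per-frame displacement magnitude exceeds the threshold."""
    return any(math.hypot(dx, dy) > threshold_px for dx, dy in motion_vectors)


def overlap_fraction(ref_offset: float, test_offset: float, frame_length: float) -> float:
    """Fraction of the reference frame length shared with the test frame.

    Offsets are positions of each frame along the scan axis, in pixels.
    """
    shift = abs(test_offset - ref_offset)
    return max(0.0, frame_length - shift) / frame_length


def keep_pair(
    ref_offset: float,
    test_offset: float,
    frame_length: float,
    min_overlap: float = 0.40,  # 40% of the reference frame length, per claim 17
) -> bool:
    """True if the frame pair overlaps by more than the threshold."""
    return overlap_fraction(ref_offset, test_offset, frame_length) > min_overlap


if __name__ == "__main__":
    # A capture loop could pause or restart when this check returns True.
    print(exceeds_displacement([(3, 4), (0, 20)]))
    # Two 1000-pixel frames shifted by 300 pixels share 70% of their length.
    print(keep_pair(0, 300, 1000))
```

A caller could pause or terminate capture when `exceeds_displacement` returns True, and keep only frame pairs for which `keep_pair` returns True.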