Systems and methods for measuring depth based upon occlusion patterns in images
IPC Classification Information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G06T-007/00
G02B-027/00
G06T-015/20
H04N-013/02
H04N-013/00
H04N-009/097
Application Number
US-0526407
(2014-10-28)
Patent Number
US-9129377
(2015-09-08)
Inventors / Address
Ciurea, Florian
Venkataraman, Kartik
Molina, Gabriel
Lelescu, Dan
Applicant / Address
Pelican Imaging Corporation
Agent / Address
KPPB LLP
Citation Information
Cited by: 56
Patents cited: 140
Abstract
Systems in accordance with embodiments of the invention can perform parallax detection and correction in images captured using array cameras. Due to the different viewpoints of the cameras, parallax results in variations in the position of objects within the captured images of the scene. Methods in accordance with embodiments of the invention provide an accurate account of the pixel disparity due to parallax between the different cameras in the array, so that appropriate scene-dependent geometric shifts can be applied to the pixels of the captured images when performing super-resolution processing. In a number of embodiments, generating depth estimates considers the similarity of pixels in multiple spectral channels. In certain embodiments, generating depth estimates involves generating a confidence map indicating the reliability of depth estimates.
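For orientation, the pixel disparity due to parallax that the abstract refers to follows, for two pinhole cameras with parallel optical axes, the standard epipolar relationship

$$ d = \frac{f \cdot B}{z} $$

where d is the disparity in pixels, f the focal length expressed in pixels, B the baseline between the two cameras, and z the depth of the scene point. This is textbook two-view geometry included here for context only; it is not a formula quoted from the patent.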
Representative Claims
1. A camera array, comprising: a plurality of cameras configured to capture images of a scene from different viewpoints in complementary occlusion zones around a reference viewpoint; a processor; and memory containing an image processing application; wherein the image processing application stored in memory directs the processor to: separately configure imaging parameters for each of the plurality of cameras; read out image data from the plurality of cameras including a set of images captured from different viewpoints; store the image data in the memory; normalize the set of images to increase the similarity of corresponding pixels within the set of images; determine initial depth estimates for pixel locations in an image from the reference viewpoint based upon the disparity at which corresponding pixels in the set of images have the highest degree of similarity; compare the similarity of the corresponding pixels in the set of images to detect mismatched pixels; when an initial depth estimate does not result in the detection of a mismatch between corresponding pixels in the set of images, selecting the initial depth estimate as the depth estimate for the pixel location in the image from the reference viewpoint; and when an initial depth estimate results in the detection of a mismatch between corresponding pixels in the set of images, updating the depth estimate for the pixel location in the image from the reference viewpoint by: determining a set of candidate depth estimates using a plurality of competing subsets of the set of images based upon disparities at which corresponding pixels in each of the plurality of competing subsets of images have the highest degree of similarity, where the competing subsets correspond to patterns of visibility within the scene; and selecting the candidate depth of the subset having the corresponding pixels with the highest degree of similarity as the updated depth estimate for the pixel location in the image from the reference viewpoint.

2. The camera array of claim 1, wherein the plurality of cameras comprises different types of cameras that capture different wavelengths of light.

3. The camera array of claim 2, wherein at least one camera of each different type is located in each quadrant surrounding the reference viewpoint.

4. The camera array of claim 3, wherein each of the two red color cameras is located at a corner location of the 3×3 array of cameras, and wherein each of the two blue color cameras is located at a corner location of the 3×3 array of cameras.

5. The camera array of claim 3, wherein the image processing application further directs the processor to select the viewpoint of the reference camera as the reference viewpoint.

6. The camera array of claim 2, wherein the plurality of cameras includes at least a 3×3 array of cameras comprising: a reference camera at the center of the 3×3 array of cameras; two red color cameras in complementary occlusion zones located on opposite sides of the 3×3 array of cameras; two blue color cameras located in complementary occlusion zones on opposite sides of the 3×3 array of cameras; and four green color cameras in complementary occlusion zones surrounding the reference camera.

7. The camera array of claim 6, wherein: each of the four green color cameras surrounding the reference camera is disposed at a corner location of the 3×3 array of cameras; and each competing subset of images only includes one of the images captured by the four green color cameras.
8. The camera array of claim 6, wherein the reference camera is a green color camera.

9. The camera array of claim 6, wherein the reference camera is one of: a camera that incorporates a Bayer filter, a camera that is configured to capture infrared light, and a camera that is configured to capture ultraviolet light.

10. The camera array of claim 2, wherein: the plurality of cameras captures images in a plurality of color channels; and the image processing application further directs the processor to determine an initial depth estimate for a given pixel location in an image from the reference viewpoint based upon the disparity at which corresponding pixels in the set of images have the highest degree of similarity by: identifying pixels in at least a subset of the set of images that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths; in each of a plurality of color channels, comparing the similarity of the pixels that are identified as corresponding in the selected color channel at each of the plurality of depths; and selecting the depth from the plurality of depths at which the identified corresponding pixels in each of the plurality of color channels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint.

11. The camera array of claim 1, wherein the same number of images are in each of the competing subsets of images.

12. The camera array of claim 11, wherein: the plurality of cameras captures images in a plurality of color channels; and each competing subset of images includes the same number of images from at least one of the plurality of color channels.

13. The camera array of claim 12, wherein each competing subset of images includes at least one image from each of the plurality of color channels.

14. The camera array of claim 12, wherein each competing subset of images includes at least one image from all but one of the plurality of color channels.

15. The camera array of claim 1, wherein the image processing application stored in memory directs the processor to record in at least one visibility map that the pixel location in the image from the reference viewpoint is visible in each image in the subset of images used to determine the updated depth estimate for the pixel location.

16. The camera array of claim 15, wherein the image processing application stored in memory directs the processor to record in the at least one visibility map the visibility of the pixel location in the image from the reference viewpoint in a given image that is not part of the subset of images used to determine the updated depth estimate for the pixel location based upon the degree of similarity of the corresponding pixel in the given image to the corresponding pixels in the subset of images used to determine the updated depth estimate for the pixel location.

17. The camera array of claim 15, wherein the optics within each camera are configured so that the pixels of the camera sample the same object space with sub-pixel offsets.
18. The camera array of claim 17, wherein the image processing application further directs the processor to: fuse pixels from the set of images using the depth estimates to create a fused image having a resolution that is greater than the resolutions of the images in the set of images by: identifying the pixels from the set of images that are visible in an image from the reference viewpoint using the at least one visibility map; and applying scene dependent geometric shifts to the pixels from the set of images that are visible in an image from the reference viewpoint to shift the pixels into the reference viewpoint, where the scene dependent geometric shifts are determined using the depth estimates; and fusing the shifted pixels from the set of images to create a fused image from the reference viewpoint having a resolution that is greater than the resolutions of the images in the set of images.

19. The camera array of claim 18, wherein the image processing application further directs the processor to synthesize an image from the reference viewpoint by performing a super-resolution process based upon the fused image from the reference viewpoint, the set of images captured from different viewpoints, the depth estimates, and the visibility information.

20. A method of estimating distances to objects within a scene from a set of images captured from different viewpoints in complementary occlusion zones around a reference viewpoint using a processor configured by an image processing application, the method comprising: selecting a reference viewpoint relative to the viewpoints of the set of images captured from different viewpoints; normalizing the set of images to increase the similarity of corresponding pixels within the set of images; determining initial depth estimates for pixel locations in an image from the reference viewpoint based upon the disparity at which corresponding pixels in the set of images have the highest degree of similarity; comparing the similarity of the corresponding pixels in the set of images to detect mismatched pixels; when an initial depth estimate does not result in the detection of a mismatch between corresponding pixels in the set of images, selecting the initial depth estimate as the depth estimate for the pixel location in the image from the reference viewpoint; and when an initial depth estimate results in the detection of a mismatch between corresponding pixels in the set of images, updating the depth estimate for the pixel location in the image from the reference viewpoint by: determining a set of candidate depth estimates using a plurality of competing subsets of the set of images based upon disparities at which corresponding pixels in each of the plurality of competing subsets of images have the highest degree of similarity, where the competing subsets correspond to patterns of visibility within the scene; and selecting the candidate depth of the subset having the corresponding pixels with the highest degree of similarity as the updated depth estimate for the pixel location in the image from the reference viewpoint.
21. The method of claim 20, wherein: the set of images comprises image data in a plurality of color channels; and the image processing application further directs the processor to determine an initial depth estimate for a given pixel location in an image from the reference viewpoint based upon the disparity at which corresponding pixels in the set of images have the highest degree of similarity by: identifying pixels in at least a subset of the set of images that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths; in each of a plurality of color channels, comparing the similarity of the pixels that are identified as corresponding in the selected color channel at each of the plurality of depths; and selecting the depth from the plurality of depths at which the identified corresponding pixels in each of the plurality of color channels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint.

22. The method of claim 20, further comprising recording in at least one visibility map that the pixel location in the image from the reference viewpoint is visible in each image in the subset of images used to determine the updated depth estimate for the pixel location.

23. The method of claim 22, further comprising recording in the at least one visibility map the visibility of the pixel location in the image from the reference viewpoint in a given image that is not part of the subset of images used to determine the updated depth estimate for the pixel location based upon the degree of similarity of the corresponding pixel in the given image to the corresponding pixels in the subset of images used to determine the updated depth estimate for the pixel location.

24. The method of claim 22, further comprising: fusing pixels from the set of images using the depth estimates to create a fused image having a resolution that is greater than the resolutions of the images in the set of images by: identifying the pixels from the set of images that are visible in an image from the reference viewpoint using the at least one visibility map; and applying scene dependent geometric shifts to the pixels from the set of images that are visible in an image from the reference viewpoint to shift the pixels into the reference viewpoint, where the scene dependent geometric shifts are determined using the depth estimates; and fusing the shifted pixels from the set of images to create a fused image from the reference viewpoint having a resolution that is greater than the resolutions of the images in the set of images.

25. The method of claim 24, further comprising synthesizing an image from the reference viewpoint by performing a super-resolution process based upon the fused image from the reference viewpoint, the set of images captured from different viewpoints, the depth estimates, and the visibility information.
26. A camera array, comprising: an array camera module configured to capture images of a scene from different viewpoints in complementary occlusion zones around a reference viewpoint, the array camera module comprising: an imager array including an array of focal planes, where each focal plane includes a plurality of rows of pixels that also forms a plurality of columns of pixels, and each focal plane is contained within a region of the imager that does not contain pixels from another focal plane; and an optic array including an array of lens stacks, where each lens stack creates an optical channel that forms an image of the scene on an array of pixels within a corresponding focal plane; a processor; and memory containing an image processing application; wherein the image processing application stored in memory directs the processor to: independently control the imaging parameters of the focal planes in the array camera module; read out image data from the array camera module forming a set of images captured from different viewpoints; store the image data in the memory; normalize the set of images to increase the similarity of corresponding pixels within the set of images; determine initial depth estimates for pixel locations in an image from the reference viewpoint based upon the disparity at which corresponding pixels in the set of images have the highest degree of similarity; compare the similarity of the corresponding pixels in the set of images to detect mismatched pixels; when an initial depth estimate does not result in the detection of a mismatch between corresponding pixels in the set of images, selecting the initial depth estimate as the depth estimate for the pixel location in the image from the reference viewpoint; and when an initial depth estimate results in the detection of a mismatch between corresponding pixels in the set of images, updating the depth estimate for the pixel location in the image from the reference viewpoint by: determining a set of candidate depth estimates using a plurality of competing subsets of the set of images based upon disparities at which corresponding pixels in each of the plurality of competing subsets of images have the highest degree of similarity, where the competing subsets correspond to patterns of visibility within the scene; and selecting the candidate depth of the subset having the corresponding pixels with the highest degree of similarity as the updated depth estimate for the pixel location in the image from the reference viewpoint; and record in at least one visibility map that the pixel location in the image from the reference viewpoint is visible in each image in the subset of images used to determine the updated depth estimate for the pixel location.
27. The camera array of claim 26, wherein: the array camera module captures images in a plurality of color channels; and the image processing application further directs the processor to determine an initial depth estimate for a given pixel location in an image from the reference viewpoint based upon the disparity at which corresponding pixels in the set of images have the highest degree of similarity by: identifying pixels in at least a subset of the set of images that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths; in each of a plurality of color channels, comparing the similarity of the pixels that are identified as corresponding in the selected color channel at each of the plurality of depths; and selecting the depth from the plurality of depths at which the identified corresponding pixels in each of the plurality of color channels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint.

28. The camera array of claim 26, wherein the image processing application stored in memory directs the processor to record in the at least one visibility map the visibility of the pixel location in the image from the reference viewpoint in a given image that is not part of the subset of images used to determine the updated depth estimate for the pixel location based upon the degree of similarity of the corresponding pixel in the given image to the corresponding pixels in the subset of images used to determine the updated depth estimate for the pixel location.

29. The camera array of claim 26, wherein: the optics within each camera are configured so that the pixels of the camera sample the same object space with sub-pixel offsets; and the image processing application further directs the processor to: fuse pixels from the set of images using the depth estimates to create a fused image having a resolution that is greater than the resolutions of the images in the set of images by: identifying the pixels from the set of images that are visible in an image from the reference viewpoint using the at least one visibility map; applying scene dependent geometric shifts to the pixels from the set of images that are visible in an image from the reference viewpoint to shift the pixels into the reference viewpoint, where the scene dependent geometric shifts are determined using the depth estimates; and fusing the shifted pixels from the set of images to create a fused image from the reference viewpoint having a resolution that is greater than the resolutions of the images in the set of images.

30. The camera array of claim 29, wherein the image processing application further directs the processor to synthesize an image from the reference viewpoint by performing a super-resolution process based upon the fused image from the reference viewpoint, the set of images captured from different viewpoints, the depth estimates, and the visibility information.
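Claims 1, 20, and 26 describe the same occlusion-aware estimation loop: compute an initial depth from all cameras, detect a mismatch, and if one is found re-estimate over competing camera subsets that correspond to plausible visibility patterns. The sketch below illustrates that control flow only; the function names, the cost callback, and the mismatch threshold are illustrative assumptions, not APIs or parameters defined by the patent.

```python
import numpy as np

def best_depth(images, pixel, cameras, depths, cost_at_depth):
    """Return the depth at which the corresponding pixels in `cameras`
    are most similar for `pixel`, together with that cost."""
    costs = [cost_at_depth(images, pixel, cameras, d) for d in depths]
    i = int(np.argmin(costs))          # lower cost = higher similarity
    return depths[i], costs[i]

def estimate_depth(images, pixel, all_cameras, subsets, depths,
                   cost_at_depth, mismatch_threshold):
    """Occlusion-aware depth estimate for one reference-image pixel.

    `subsets` are competing groups of cameras chosen so that each group
    corresponds to a plausible visibility pattern around the reference
    viewpoint (e.g. the cameras above, below, left of, or right of it).
    `cost_at_depth` measures how dissimilar the corresponding pixels are
    at a candidate depth.  All names here are illustrative.
    """
    depth, cost = best_depth(images, pixel, all_cameras, depths, cost_at_depth)
    if cost <= mismatch_threshold:
        return depth                   # no mismatch: keep the initial estimate
    # Mismatch detected: re-estimate with each competing subset and keep
    # the candidate whose corresponding pixels agree best.
    candidates = [best_depth(images, pixel, subset, depths, cost_at_depth)
                  for subset in subsets]
    return min(candidates, key=lambda dc: dc[1])[0]
```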
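Claims 10, 21, and 27 add that the initial depth estimate compares pixel similarity separately in each color channel and picks the depth at which all channels agree best. A minimal sketch of that per-channel comparison follows, using within-channel variance as the dissimilarity measure, which is an illustrative choice rather than a metric prescribed by the patent.

```python
import numpy as np

def multichannel_cost(pixels_by_channel):
    """Combine per-channel dissimilarity into one matching cost.

    `pixels_by_channel` maps a color channel name to the corresponding
    pixel values gathered from the images of that channel at one
    candidate depth.
    """
    return sum(float(np.var(values)) for values in pixels_by_channel.values())

def best_depth_multichannel(gather_fn, depths):
    """Pick the depth at which the identified corresponding pixels agree
    best across all color channels.

    `gather_fn(depth)` is a hypothetical helper that returns the
    corresponding pixel values per channel at that depth.
    """
    costs = [multichannel_cost(gather_fn(d)) for d in depths]
    return depths[int(np.argmin(costs))]
```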
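Claims 15, 16, 22, 23, and 28 describe recording visibility: the reference pixel is marked visible in every image of the winning subset, and its visibility in the remaining images is decided by how similar their corresponding pixels are to those of the winning subset. Below is a sketch of one way such a map could be maintained, with a hypothetical similarity threshold and mean-difference test.

```python
import numpy as np

def update_visibility(visibility, pixel, winning_subset, all_cameras,
                      pixel_values, similarity_threshold):
    """Record visibility of one reference-image pixel in each camera.

    visibility:     dict mapping camera id -> HxW boolean array
    pixel:          (row, col) location in the reference image
    winning_subset: cameras whose candidate depth was selected
    pixel_values:   dict mapping camera id -> value of the pixel that
                    corresponds to `pixel` at the selected depth
    The threshold test here is illustrative only.
    """
    reference_value = float(np.mean([pixel_values[c] for c in winning_subset]))
    for cam in all_cameras:
        if cam in winning_subset:
            visibility[cam][pixel] = True
        else:
            visibility[cam][pixel] = bool(
                abs(pixel_values[cam] - reference_value) <= similarity_threshold
            )
    return visibility
```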
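Claims 18, 24, and 29 fuse the visible pixels onto a grid denser than the input images by applying scene-dependent geometric shifts derived from the depth estimates. The sketch below assumes a simple translational baseline per camera, uses the reference depth map as a proxy for each camera's depth, and splats with nearest-neighbour accumulation; none of these choices is specified by the claims, and the sign of the disparity depends on the array geometry.

```python
import numpy as np

def fuse_to_reference(images, depth_map, baselines, focal_px, scale, visibility):
    """Shift visible pixels into the reference viewpoint and fuse them
    onto a grid `scale` times denser than the input images.

    images:     list of HxW arrays, one per camera
    depth_map:  HxW depth estimates from the reference viewpoint
    baselines:  per-camera (bx, by) offset from the reference camera
    focal_px:   focal length in pixels
    visibility: list of HxW boolean maps (True where the reference pixel
                is visible in that camera)
    """
    h, w = depth_map.shape
    acc = np.zeros((h * scale, w * scale))
    weight = np.zeros_like(acc)
    for img, (bx, by), vis in zip(images, baselines, visibility):
        for v in range(h):
            for u in range(w):
                if not vis[v, u]:
                    continue
                z = depth_map[v, u]
                # Scene-dependent geometric shift: disparity = f * B / z.
                x_ref = u - focal_px * bx / z
                y_ref = v - focal_px * by / z
                sx, sy = int(round(x_ref * scale)), int(round(y_ref * scale))
                if 0 <= sy < h * scale and 0 <= sx < w * scale:
                    acc[sy, sx] += img[v, u]
                    weight[sy, sx] += 1.0
    return acc / np.maximum(weight, 1.0)
```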
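Finally, the abstract mentions generating a confidence map indicating the reliability of the depth estimates. One common way to express such a confidence, shown here purely as an illustration rather than as the patent's method, compares the best matching cost against the second-best cost over the sampled depths.

```python
import numpy as np

def confidence_from_costs(costs):
    """Confidence in [0, 1] for one pixel, from its per-depth match costs.

    The estimate is treated as reliable when the best (lowest) cost is
    much lower than the second-best cost over the sampled depths.  This
    ratio test is a stand-in, not the measure prescribed by the patent.
    """
    c = np.sort(np.asarray(costs, dtype=float))
    second_best = max(float(c[1]), 1e-12)   # guard against division by zero
    return float(np.clip(1.0 - c[0] / second_best, 0.0, 1.0))
```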
Patents Cited by This Patent (140)
Wilburn, Bennett; Joshi, Neel; Levoy, Marc C.; Horowitz, Mark, Apparatus and method for capturing a scene using staggered triggering of dense camera arrays.
Iwase Toshihiro (Nara JPX) Kanekura Hiroshi (Yamatokouriyama JPX), Apparatus for and method of converting a sampling frequency according to a data driven type processing.
Boisvert, David Michael; McMahon, Andrew Kenneth John, CCD output processing stage that amplifies signals from colored pixels based on the conversion efficiency of the colored pixels.
Venkataraman, Kartik; Jabbi, Amandeep S.; Mullis, Robert H., Capturing and processing of images using monolithic camera array with heterogeneous imagers.
Venkataraman, Kartik; Jabbi, Amandeep S.; Mullis, Robert H.; Duparre, Jacques; Hu, Shane Ching-Feng, Capturing and processing of images using monolithic camera array with heterogeneous imagers.
Yamashita, Syugo; Murata, Haruhiko; Iinuma, Toshiya; Nakashima, Mitsuo; Mori, Takayuki, Device and method for converting two-dimensional video to three-dimensional video.
Ward, Gregory John; Seetzen, Helge; Heidrich, Wolfgang, Electronic camera having multiple sensors for capturing high dynamic range images and related methods.
Abell Gurdon R. (West Woodstock CT) Cook Francis J. (Topsfield MA) Howes Peter D. (Sudbury MA), Method and apparatus for arraying image sensor modules.
Sawhney, Harpreet Singh; Tao, Hai; Kumar, Rakesh; Hanna, Keith, Method and apparatus for synthesizing new video and/or still imagery from a collection of real video and/or still imagery.
Han, Hee-chul; Choi, Yang-lim; Cho, Seung-ki, Method of generating image data by an image device including a plurality of lenses and apparatus for generating image data.
Alexander David H. (Santa Monica CA) Hershman George H. (Carlsbad CA) Jack Michael D. (Carlsbad CA) Koda N. John (Vista CA) Lloyd Randahl B. (San Marcos CA), Monolithic imager for near-IR.
Hornbaker ; III Cecil V. (New Carrolton MD) Driggers Thomas C. (Falls Church VA) Bindon Edward W. (Fairfax VA), Scanning apparatus using multiple CCD arrays and related method.
Ciurea, Florian; Venkataraman, Kartik; Molina, Gabriel; Lelescu, Dan, Systems and methods for parallax detection and correction in images captured using array cameras that contain occlusions using subsets of images to perform depth estimation.
Venkataraman, Kartik; Jabbi, Amandeep S.; Mullis, Robert H., Systems and methods for parallax measurement using camera arrays incorporating 3 x 3 camera configurations.
Ciurea, Florian; Venkataraman, Kartik; Molina, Gabriel; Lelescu, Dan, Systems and methods for performing depth estimation using image data from multiple spectral channels.
Ludwig, Lester F., Vignetted optoelectronic array for use in synthetic image formation via signal processing, lensless cameras, and integrated camera-displays.
Rieger Albert, DEX; Barclay David; Chapman Steven; Kellner Heinz-Andreas, DEX; Reibl Michael, DEX; Rydelek James G.; Schweizer Andreas, DEX, Watertight body for accommodating a photographic camera.
Venkataraman, Kartik; Gallagher, Paul; Jain, Ankit K.; Nisenzon, Semyon; Lelescu, Dan; Ciurea, Florian; Molina, Gabriel, Autofocus system for a conventional camera that uses depth information from an array camera.
Venkataraman, Kartik; Jabbi, Amandeep S.; Mullis, Robert H.; Duparre, Jacques; Hu, Shane Ching-Feng, Capturing and processing of images including occlusions focused on an image sensor by a lens stack array.
Venkataraman, Kartik; Jabbi, Amandeep S.; Mullis, Robert H.; Duparre, Jacques; Hu, Shane Ching-Feng, Capturing and processing of images using camera array incorperating Bayer cameras having different fields of view.
Srikanth, Manohar; Ramamoorthi, Ravi; Venkataraman, Kartik; Chatterjee, Priyam, System and methods for depth regularization and semiautomatic interactive matting using RGB-D images.
Nayar, Shree; Venkataraman, Kartik; Pain, Bedabrata; Lelescu, Dan, Systems and methods for controlling aliasing in images captured by an array camera for use in super resolution processing using pixel apertures.
Lelescu, Dan; Venkataraman, Kartik, Systems and methods for controlling aliasing in images captured by an array camera for use in super-resolution processing.
Duparre, Jacques; McMahon, Andrew Kenneth John; Lelescu, Dan; Venkataraman, Kartik; Molina, Gabriel, Systems and methods for detecting defective camera arrays and optic arrays.
Ciurea, Florian; Venkataraman, Kartik; Molina, Gabriel; Lelescu, Dan, Systems and methods for estimating depth and visibility from a reference viewpoint for pixels in a set of images captured from different viewpoints.
Venkataraman, Kartik; Lelescu, Dan; Molina, Gabriel, Systems and methods for generating compressed light field representation data using captured light fields, array geometry, and parallax information.
Venkataraman, Kartik; Lelescu, Dan; Molina, Gabriel, Systems and methods for generating compressed light field representation data using captured light fields, array geometry, and parallax information.
Venkataraman, Kartik; Jabbi, Amandeep S.; Mullis, Robert H., Systems and methods for generating depth maps using a camera arrays incorporating monochrome and color cameras.
Venkataraman, Kartik; Jabbi, Amandeep S.; Mullis, Robert H., Systems and methods for generating depth maps using a camera arrays incorporating monochrome and color cameras.
Venkataraman, Kartik; Jabbi, Amandeep S.; Mullis, Robert H., Systems and methods for generating depth maps using images captured by camera arrays incorporating cameras having different fields of view.
Duparre, Jacques; McMahon, Andrew Kenneth John; Lelescu, Dan, Systems and methods for manufacturing camera modules using active alignment of lens stack arrays and sensors.
Duparre, Jacques; McMahon, Andrew Kenneth John; Lelescu, Dan, Systems and methods for manufacturing camera modules using active alignment of lens stack arrays and sensors.
Venkataraman, Kartik; Jabbi, Amandeep S.; Mullis, Robert H., Systems and methods for measuring depth using images captured by a camera array including cameras surrounding a central camera.
Venkataraman, Kartik; Huang, Yusong; Jain, Ankit K.; Chatterjee, Priyam, Systems and methods for performing high speed video capture and depth estimation using array cameras.
Lelescu, Dan; Duong, Thang, Systems and methods for synthesizing high resolution images using image deconvolution based on motion and depth information.
Lelescu, Dan; Molina, Gabriel; Venkataraman, Kartik, Systems and methods for synthesizing high resolution images using images captured by an array of independently controllable imagers.
Venkataraman, Kartik; Nisenzon, Semyon; Chatterjee, Priyam; Molina, Gabriel, Systems and methods for synthesizing images from image data captured by an array camera using restricted depth of field depth maps in which depth estimation precision varies.
Venkataraman, Kartik; Nisenzon, Semyon; Chatterjee, Priyam; Molina, Gabriel, Systems and methods for synthesizing images from image data captured by an array camera using restricted depth of field depth maps in which depth estimation precision varies.