Bibliographic / IPC Classification Information

Country/Status | United States (US) Patent, Granted
International Patent Classification (IPC, 7th ed.) | (not listed)
Application No. | UP-0221552 (2008-08-04)
Registration No. | US-7742623 (2010-07-12)
Inventors / Address | Moon, Hankyu; Sharma, Rajeev; Jung, Namsoon
Applicant / Address | (not listed)
Citation Information | Times cited: 47 / Patents cited: 7
Abstract
The present invention is a method and system to estimate the visual target that people are looking at, based on automatic image measurements. The system utilizes image measurements from both face-view cameras and top-down view cameras. The cameras are calibrated with respect to the site and the visual target, so that the gaze target is determined from the estimated position and gaze direction of a person. Face detection and two-dimensional pose estimation locate and normalize the face of the person so that the eyes can be accurately localized and the three-dimensional facial pose can be estimated. The eye gaze is estimated based either on the positions of the localized eyes and irises or on the eye image itself, depending on the quality of the image. The gaze direction is estimated from the eye gaze measurement in the context of the three-dimensional facial pose. From the top-down view, the body of the person is detected and tracked, so that the position of the head is estimated using a body blob model that depends on the body position in the view. The gaze target is determined based on the estimated gaze direction, the estimated head pose, and the camera calibration. The gaze target estimation can provide a gaze trajectory of the person or a collective gaze map from many instances of gaze.
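The abstract describes a concrete geometric chain: a top-down camera fixes the head position on the floor, the face-view camera yields a three-dimensional facial pose plus an eye-in-head gaze offset, and the composed gaze ray is intersected with the calibrated target plane. Below is a minimal sketch of that chain, assuming a planar homography for the top-down calibration and a simple additive yaw/pitch composition of head pose and eye gaze; all function names, variable names, and the plane parameterization are illustrative, not taken from the patent.

```python
import numpy as np

def head_position_on_floor(blob_centroid_px, H_topdown_to_floor):
    """Map a tracked body-blob centroid from top-down image pixels to
    floor coordinates via a planar homography (standing in for the
    'top-down view calibration'). Both argument names are illustrative."""
    u, v = blob_centroid_px
    q = H_topdown_to_floor @ np.array([u, v, 1.0])
    return q[:2] / q[2]

def gaze_direction(head_yaw, head_pitch, eye_yaw, eye_pitch):
    """Compose the 3-D facial pose with the eye-in-head gaze (radians)
    into a unit gaze vector; additive composition of the angles is a
    small-angle assumption, not the patent's stated method."""
    yaw, pitch = head_yaw + eye_yaw, head_pitch + eye_pitch
    return np.array([np.cos(pitch) * np.sin(yaw),
                     np.sin(pitch),
                     np.cos(pitch) * np.cos(yaw)])

def gaze_target(head_pos, gaze_dir, plane_point, plane_normal):
    """Intersect the gaze ray with the visual-target plane (e.g. a shelf
    front); returns None when the person looks away from the plane."""
    denom = float(gaze_dir @ plane_normal)
    if abs(denom) < 1e-9:
        return None
    t = float((plane_point - head_pos) @ plane_normal) / denom
    return head_pos + t * gaze_dir if t > 0 else None
```

In such a sketch, a nominal head height would lift the two-dimensional floor position into the three-dimensional head position, and a full implementation would replace the additive angle composition with rotation matrices and carry the per-estimate confidence levels that the claims describe.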
Representative Claims
What is claimed is:

1. A method for estimating a gaze target within a visual target that a person is looking based on automatic image measurements, comprising the following steps of: a) processing calibrations for at least a first means for capturing images for face-view and at least a second means for capturing images for top-down view, b) determining a target grid of the visual target, c) detecting and tracking a face of the person from first input images captured by the first means for capturing images, d) estimating a two-dimensional pose and a three-dimensional pose of the face, e) localizing facial features to extract an eye image of the face, f) estimating eye gaze of the person and estimating gaze direction of the person based on the estimated eye gaze and the three-dimensional facial pose of the person, g) detecting and tracking the person from second input images captured by the second means for capturing images, h) estimating a head position using the top-down view calibration, and i) estimating the gaze target of the person from the estimated gaze direction and the head position of the person using the face-view calibration.

2. The method according to claim 1, wherein the method further comprises a step of taking geometric measurements of the site and the visual target to come up with specifications and the calibrations for the means for capturing images.

3. The method according to claim 1, wherein the method further comprises steps of: a) estimating a gaze direction estimation error distribution, and b) determining the target grid based on the gaze direction estimation error distribution and average distance between the person and the visual target.

4. The method according to claim 1, wherein the method further comprises a step of determining a mapping from the estimated head position and the estimated gaze direction to the target grid.

5. The method according to claim 1, wherein the method further comprises a step of determining the mapping from the second input image coordinate to the floor coordinate, based on the position and orientation of the first means for capturing images.

6. The method according to claim 1, wherein the method further comprises a step of training a plurality of first machines for estimating the three-dimensional pose of the face.

7. The method according to claim 1, wherein the method further comprises a step of training a plurality of second machines for estimating the two-dimensional pose of the face.

8. The method according to claim 1, wherein the method further comprises a step of training a plurality of third machines for localizing each facial feature of the face.

9. The method according to claim 1, wherein the method further comprises a step of training at least a fourth machine for estimating the eye gaze from the eye image.

10. The method according to claim 9, wherein the method further comprises a step of annotating the eye images with both the eye gaze and a confidence level of the eye gaze annotation.

11. The method according to claim 10, wherein the method further comprises a step of training the fourth machine so that the machine outputs both the eye gaze and the confidence level of the eye gaze estimate.

12. The method according to claim 1, wherein the method further comprises a step of training at least a fifth machine for estimating the gaze direction.
13. The method according to claim 12, wherein the method further comprises a step of training the fifth machine for estimating the gaze direction from the eye gaze and the three-dimensional facial pose.

14. The method according to claim 12, wherein the method further comprises a step of employing the fifth machine for estimating the gaze direction from the eye image and the three-dimensional facial pose.

15. The method according to claim 12, wherein the method further comprises a step of training the fifth machine so that the machine outputs both the gaze direction and the confidence level of the gaze direction estimate.

16. The method according to claim 15, wherein the method further comprises a step of estimating a gaze map by weighting each of the gaze target estimates with the confidence levels corresponding to the gaze direction estimates.

17. The method according to claim 1, wherein the method further comprises a step of selecting a stream of first input images among a plurality of streams of first input images when the person's face appears to more than one stream of first input images, based on the person's distance to each of the plurality of first means for capturing images and the three-dimensional facial poses relative to each of the plurality of first means for capturing images.

18. The method according to claim 1, wherein the method further comprises a step of utilizing a view-based body blob model to estimate the head position of the person.

19. The method according to claim 1, wherein the method further comprises a step of constructing a gaze trajectory and a gaze map based on the estimated gaze target.

20. An apparatus for estimating a gaze target within a visual target that a person is looking based on automatic image measurements, comprising: a) means for processing calibrations for at least a first means for capturing images for face-view and at least a second means for capturing images for top-down view, b) means for determining a target grid of the visual target, c) means for detecting and tracking a face of the person from first input images captured by the first means for capturing images, d) means for estimating a two-dimensional pose and a three-dimensional pose of the face, e) means for localizing facial features to extract an eye image of the face, f) means for estimating eye gaze of the person and estimating gaze direction of the person based on the estimated eye gaze and the three-dimensional facial pose of the person, g) means for detecting and tracking the person from second input images captured by the second means for capturing images, h) means for estimating a head position using the top-down view calibration, and i) means for estimating the gaze target of the person from the estimated gaze direction and the head position of the person using the face-view calibration.

21. The apparatus according to claim 20, wherein the apparatus further comprises means for taking geometric measurements of the site and the visual target to come up with specifications and the calibrations for the means for capturing images.

22. The apparatus according to claim 20, wherein the apparatus further comprises: a) means for estimating a gaze direction estimation error distribution, and b) means for determining the target grid based on the gaze direction estimation error distribution and average distance between the person and the visual target.
23. The apparatus according to claim 20, wherein the apparatus further comprises means for determining a mapping from the estimated head position and the estimated gaze direction to the target grid.

24. The apparatus according to claim 20, wherein the apparatus further comprises means for determining the mapping from the second input image coordinate to the floor coordinate, based on the position and orientation of the first means for capturing images.

25. The apparatus according to claim 20, wherein the apparatus further comprises means for training a plurality of first machines for estimating the three-dimensional pose of the face.

26. The apparatus according to claim 20, wherein the apparatus further comprises means for training a plurality of second machines for estimating the two-dimensional pose of the face.

27. The apparatus according to claim 20, wherein the apparatus further comprises means for training a plurality of third machines for localizing each facial feature of the face.

28. The apparatus according to claim 20, wherein the apparatus further comprises means for training at least a fourth machine for estimating the eye gaze from the eye image.

29. The apparatus according to claim 28, wherein the apparatus further comprises means for annotating the eye images with both the eye gaze and a confidence level of the eye gaze annotation.

30. The apparatus according to claim 29, wherein the apparatus further comprises means for training the fourth machine so that the machine outputs both the eye gaze and the confidence level of the eye gaze estimate.

31. The apparatus according to claim 20, wherein the apparatus further comprises means for training at least a fifth machine for estimating the gaze direction.

32. The apparatus according to claim 31, wherein the apparatus further comprises means for training the fifth machine for estimating the gaze direction from the eye gaze and the three-dimensional facial pose.

33. The apparatus according to claim 31, wherein the apparatus further comprises means for employing the fifth machine for estimating the gaze direction from the eye image and the three-dimensional facial pose.

34. The apparatus according to claim 31, wherein the apparatus further comprises means for training the fifth machine so that the machine outputs both the gaze direction and the confidence level of the gaze direction estimate.

35. The apparatus according to claim 34, wherein the apparatus further comprises means for estimating a gaze map by weighting each of the gaze target estimates with the confidence levels corresponding to the gaze direction estimates.

36. The apparatus according to claim 20, wherein the apparatus further comprises means for selecting a stream of first input images among a plurality of streams of first input images when the person's face appears to more than one stream of first input images, based on the person's distance to each of the plurality of first means for capturing images and the three-dimensional facial poses relative to each of the plurality of first means for capturing images.

37. The apparatus according to claim 20, wherein the apparatus further comprises means for utilizing a view-based body blob model to estimate the head position of the person.

38. The apparatus according to claim 20, wherein the apparatus further comprises means for constructing a gaze trajectory and a gaze map based on the estimated gaze target.
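Claims 3 and 22 above tie the target grid to the gaze-direction error distribution and the average viewing distance. A minimal sketch of one plausible reading, in which a grid cell is sized to the positional spread that the angular error induces at that distance; the scale factor k and both function names are assumptions, not the patent's formula.

```python
import math

def target_grid_cell_size(avg_distance_m, gaze_error_std_rad, k=2.0):
    # Cell edge matched to the positional spread that the angular
    # gaze-direction error induces at the average viewing distance;
    # k (how many error std devs one cell covers) is an assumption.
    return k * avg_distance_m * math.tan(gaze_error_std_rad)

def target_grid_shape(target_width_m, target_height_m, cell_m):
    # Grid resolution (rows, cols) for the visual target of claim 1, step b).
    return (max(1, round(target_height_m / cell_m)),
            max(1, round(target_width_m / cell_m)))
```

For example, a 2.0 m wide, 1.5 m tall shelf viewed from 2.5 m with a 5-degree error standard deviation gives roughly 0.44 m cells, i.e. a 3-by-5 grid; a finer grid would mostly resolve estimation noise rather than gaze.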
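Claims 15-16 and 34-35 have each gaze-target estimate carry a confidence level, with the gaze map built by confidence-weighted accumulation rather than plain hit counts, so low-quality eye images contribute less. A sketch under assumed input shapes:

```python
import numpy as np

def accumulate_gaze_map(grid_shape, cell_estimates, confidences):
    # Confidence-weighted voting: each gaze-target estimate adds its
    # gaze-direction confidence to its grid cell instead of a count of 1.
    # cell_estimates is a list of (row, col) pairs (assumed format).
    gaze_map = np.zeros(grid_shape)
    for (row, col), w in zip(cell_estimates, confidences):
        if 0 <= row < grid_shape[0] and 0 <= col < grid_shape[1]:
            gaze_map[row, col] += w
    return gaze_map

# e.g. accumulate_gaze_map((3, 5), [(1, 2), (1, 2), (0, 4)],
#                          [0.9, 0.7, 0.2]) puts 1.6 in cell (1, 2).
```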
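Claims 17 and 36 select among multiple face-view streams using the person's distance to each camera and the facial pose relative to each camera. One way to realize that selection is a simple score preferring the nearest, most frontal view; the linear form and yaw_weight below are illustrative assumptions, since the patent does not state a scoring rule here:

```python
def select_face_view_stream(candidates, yaw_weight=2.0):
    # candidates: (stream_id, distance_m, relative_yaw_rad) per camera in
    # which the face appears (assumed format). Lower score is better:
    # closer to the camera and more frontal toward it.
    def score(c):
        _, distance_m, relative_yaw_rad = c
        return distance_m + yaw_weight * abs(relative_yaw_rad)
    return min(candidates, key=score)[0]
```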