The present disclosure relates to a method for controlling a digital photography system. The method includes obtaining, by a device, image data and audio data. The method also includes identifying one or more objects in the image data and obtaining a transcription of the audio data. The method also includes controlling a future operation of the device based at least on the one or more objects identified in the image data, and the transcription of the audio data.
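As a rough illustration only, the pipeline the abstract describes (obtain image and audio data, identify objects, transcribe the speech, then control a future operation of the device) might be sketched as follows. Every function and label below is a hypothetical stub for this sketch, not an API or implementation disclosed by the patent:

```python
def identify_objects(image_data: bytes) -> list[str]:
    # Stand-in for an object detector (the claims mention face, gesture,
    # and action detection); returns labels for the recognized objects.
    return ["person:alice", "person:bob"]

def transcribe(audio_data: bytes) -> str:
    # Stand-in for automated speech recognition over the captured audio.
    return "please do not photograph me"

def control_future_operation(objects: list[str], transcript: str) -> dict:
    # Combine the two modalities: if the utterance opts out of capture,
    # record which identified objects the device should stop photographing.
    if "do not photograph" in transcript:
        return {"suppress_capture": objects}
    return {"suppress_capture": []}

objects = identify_objects(b"<raw image bytes>")
transcript = transcribe(b"<raw audio bytes>")
policy = control_future_operation(objects, transcript)
# policy: {"suppress_capture": ["person:alice", "person:bob"]}
```

The point of the sketch is the data flow: the vision and speech results are fused into a policy that governs later captures, rather than each modality acting alone.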
Representative Claims
1. A computer-implemented method comprising: obtaining, by a computing device operable to capture images, (i) image data that describes a first scene that includes one or more objects and (ii) audio data that describes a human speech utterance, wherein the human speech utterance refers to at least a first object of the one or more objects included in the first scene; identifying, by the computing device based at least in part on the audio data and based at least in part on the image data, at least the first object that is included in the first scene and that is referred to by the human speech utterance; defining, by the computing device, a new rule that specifies a behavior of the computing device in response to future instances of identification of the first object in future image data that is different than the current image data; and controlling, by the computing device, a future operation of the computing device to comply with the new rule.

2. The computer-implemented method of claim 1, wherein controlling, by the computing device, the future operation of the computing device comprises determining, by the computing device, whether to store the future image data.

3. The computer-implemented method of claim 1, wherein controlling, by the computing device, the future operation of the computing device comprises determining, by the computing device, whether to automatically upload the future image data to cloud storage.

4. The computer-implemented method of claim 1, wherein identifying, by the computing device, at least the first object comprises at least one of: identifying, by the computing device, a person using face detection, identifying, by the computing device, a gesture performed by a person in the image, or detecting, by the computing device, an action performed by a person in the image.

5. The computer-implemented method of claim 1, further comprising generating, by the computing device, the image data and the audio data.

6. The computer-implemented method of claim 1, wherein: the human speech utterance further describes the behavior of the computing device in response to future instances of identification of the first object; and the method further comprises determining, by the computing device, the requested behavior based at least in part on the audio data.

7. The computer-implemented method of claim 6, wherein: the human speech utterance requests the computing device not capture imagery in response to future instances of identification of the first object in the future image data; and defining, by the computing device, the new rule comprises defining, by the computing device, the new rule that specifies that the computing device does not capture imagery in response to future instances of identification of the first object in the future image data.

8. The computer-implemented method of claim 1, wherein: the human speech utterance self-references a speaker of the human speech utterance; and identifying, by the computing device based at least in part on the audio data and based at least in part on the image data, at least the first object that is referred to by the human speech utterance comprises identifying, by the computing device based at least in part on the image data, the speaker of the human speech utterance.

9. The computer-implemented method of claim 1, wherein the computing device is a camera.

10. The computer-implemented method of claim 1, further comprising obtaining, by the computing device, a transcription of the audio data using automated speech recognition.

11. The computer-implemented method of claim 10, wherein identifying, by the computing device based at least in part on the audio data and based at least in part on the image data, at least the first object comprises identifying, by the computing device based at least in part on the transcription and based at least in part on the image data, at least the first object.

12. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: obtaining, by the one or more computers, (i) image data that describes a first scene that includes one or more objects and (ii) audio data that describes a human speech utterance, wherein the human speech utterance refers to at least a first object of the one or more objects included in the first scene; identifying, based at least in part on the audio data and based at least in part on the image data, at least the first object that is included in the first scene and that is referred to by the human speech utterance; defining a new rule that specifies a behavior of at least one of the one or more computers in response to future instances of identification of the first object in future image data that is different than the current image data; and controlling a future operation of the at least one of the one or more computers to comply with the new rule.

13. The system of claim 12, wherein controlling a future operation of the at least one of the one or more computers comprises determining whether to store the future image data.

14. The system of claim 12, wherein controlling a future operation of the at least one of the one or more computers comprises determining whether to automatically upload future generated image data to cloud storage.

15. The system of claim 12, wherein identifying at least the first object comprises at least one of: identifying a person using face detection, identifying a gesture performed by a person in the first scene, or detecting an action performed by a person in the first scene.

16. The system of claim 12, wherein the operations further comprise generating, by the one or more computers, the image data and the audio data.

17. The system of claim 12, wherein the one or more computers comprise a camera.

18. The system of claim 12, wherein the operations further comprise obtaining a transcription of the audio data using automated speech recognition.

19. The system of claim 18, wherein identifying at least the first object comprises identifying at least the first object based at least in part on the transcription and based at least in part on the image data.

20. A non-transitory, computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising: obtaining (i) image data that describes a first scene that includes one or more objects and (ii) audio data that describes a human speech utterance, wherein the human speech utterance refers to at least a first object of the one or more objects included in the first scene; identifying the one or more objects in the first scene based on the image data; identifying, based at least in part on the audio data and based at least in part on the image data, at least the first object that is referred to by the human speech utterance; defining a new rule that specifies an image capture behavior of the one or more computers in response to future instances of identification of the first object in future image data that is different than the current image data; and controlling a future image capture operation of the one or more computers to comply with the new rule.
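Read as software, the control flow of claims 1, 2, and 7 (define a new rule keyed to an identified object, then make future capture operations comply with it) resembles a small rule engine. The sketch below is a hypothetical illustration under that reading, not the patent's implementation; all class and method names are invented:

```python
from dataclasses import dataclass, field

@dataclass
class Rule:
    object_id: str   # identity of the "first object", e.g. a recognized face
    behavior: str    # requested behavior, e.g. "do_not_capture"

@dataclass
class RuleEngine:
    rules: list = field(default_factory=list)

    def define_rule(self, object_id: str, behavior: str) -> None:
        # Claim 1: define a new rule that specifies a behavior in response
        # to future instances of identification of the object.
        self.rules.append(Rule(object_id, behavior))

    def control_capture(self, detected_object_ids: set) -> bool:
        # Claims 2 and 7: decide whether future image data may be captured
        # or stored, given the objects identified in the current frame.
        for rule in self.rules:
            if rule.behavior == "do_not_capture" and rule.object_id in detected_object_ids:
                return False
        return True

# Example: the utterance "don't photograph me" is resolved (per claim 8)
# to the speaker's face identity, here the invented label "face:alice".
engine = RuleEngine()
engine.define_rule("face:alice", "do_not_capture")
print(engine.control_capture({"face:alice", "face:bob"}))  # → False
print(engine.control_capture({"face:bob"}))                # → True
```

The engine deliberately keys rules to object identities rather than to raw pixels, mirroring the claims' emphasis on *future* image data that differs from the image data in which the object was first identified.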
Patents Cited in This Patent (81)
Wexler, Yonatan; Shashua, Amnon, Apparatus for adjusting image capture settings.
Kuberka, Cheryl J.; Barnum, David C.; Williams, Frances C.; Border, John N.; Johnson, Kenneth A., Camera configurable for autonomous self-learning operation.
Bernardi, Bryan D.; McIntyre, Dale F.; Dunsmore, Clay A.; Wolcott, Dana W., Camera on-board voice recognition.
Kuchta, Daniel W.; Sucy, Peter J., Electronic still camera providing multi-format storage of full and reduced resolution images.
Ejima, Satoshi; Nozaki, Hirotake; Hiraide, Fumio, Image processing apparatus having image selection function, and recording medium having image selection function program.
Strub, Henry B.; Burgess, David A.; Johnson, Kimberly H.; Cohen, Jonathan R.; Reed, David P.; Aiello, G. Roberto, Low attention recording unit for use by vigorously active recorder.
Ostojic, Bojana; Glein, Christopher A.; Gibson, Mark R.; Vong, William H.; Flora, William T.; Alton, Benjamin N.; Newell, Mark S., Media user interface gallery control.
Schaffer, James David; Ali, Walid; Eshelman, Larry J.; Cohen-Bacrie, Claude; Lagrange, Jean-Michel; Levrier, Claire; Villain, Nicholas; Entrekin, Robert R., Method and apparatus for automatically developing a high performance classifier for producing medically meaningful descriptors in medical diagnosis imaging.
Lee, Il Yong; Kim, Sung Hyun; Kim, Lag Young; Hong, Yun Pyo; Byun, Seong Chan, Method for processing image data in portable electronic device, and portable electronic device having camera thereof.
Steinberg, Eran; Prilutsky, Yury; Corcoran, Peter; Bigioi, Petronel, Perfecting of digital image capture parameters within acquisition devices using face detection.
Tedesco, Daniel E.; Jorasch, James A.; Gelman, Geoffrey M.; Walker, Jay S.; Tulley, Stephen C.; O'Neil, Vincent M.; Alderucci, Dean P., System for image analysis in a network that is structured with multiple layers and differentially weighted neurons.
Isogai, Kuniaki; Kawamura, Takashi; Kawabata, Akihiro, Video analysis apparatus and method for calculating interpersonal relationship evaluation value using video analysis.
Lin, Yun Ting; Gutta, Srinivas; Brodsky, Tomas; Philomin, Vasanth, Video monitoring system employing hierarchical hidden Markov model (HMM) event learning and classification.