[US Patent]
Auto-generation of events with annotation and indexing
IPC Classification
Country/Type
United States(US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G06F-007/00
G06F-017/30
G11B-027/034
G11B-027/10
Application Number
US-0141625
(2008-06-18)
Registration Number
US-8892553
(2014-11-18)
Inventors
Norlander, Rebecca
Gupta, Anoop
Johnson, Bruce A.
Hough, Paul J.
Czerwinski, Mary P.
Curtis, Pavel
Ozzie, Raymond E.
Applicant
Microsoft Corporation
Agent
Choi, Dan
Citation Information
Times cited: 3
Cited patents: 9
Abstract
Recording of various events in a video format that facilitates viewing and selective editing is provided. The video can be presented in a wiki-format that allows a multitude of subsequent users to add, modify, and/or delete content in the original recorded event or a revision of that event. As edits and annotations are applied, either automatically or manually, such edits can be indexed based on criteria such as identification of an annotator, a time stamp associated with the edit, a revision number, or combinations thereof. The edits or annotations can be provided in various formats including video, audio, text, and so forth.
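The indexing scheme described in the abstract can be illustrated with a minimal data-structure sketch. This is not code from the patent; all class and field names here are hypothetical, chosen only to show edits indexed by annotator, time stamp, and revision number.

```python
from dataclasses import dataclass, field
import time

@dataclass
class Annotation:
    """One edit/annotation applied to a recorded event (hypothetical shape)."""
    annotator: str            # who made the edit
    content: str              # description or payload of the edit
    kind: str                 # "video", "audio", or "text"
    timestamp: float = field(default_factory=time.time)

class AnnotationIndex:
    """Index edits by revision number, with lookups by annotator or revision."""
    def __init__(self):
        self._by_revision = {}

    def add(self, revision, annotation):
        self._by_revision.setdefault(revision, []).append(annotation)

    def by_annotator(self, name):
        # Gather this annotator's edits across every revision.
        return [a for revs in self._by_revision.values()
                for a in revs if a.annotator == name]

    def by_revision(self, revision):
        return list(self._by_revision.get(revision, ()))
```

A wiki-style revision history maps naturally onto such an index: each saved revision carries its own list of edits, and any combination of annotator, time stamp, and revision number can be used as a retrieval key.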
Representative Claims
1. One or more hardware computer-storage media storing computer-executable instructions that, when executed by one or more computers, cause the one or more computers to perform acts comprising: capturing a video version of a meeting having a plurality of participants including a speaker and other participants, the captured video version of the meeting including an indicator comprising one or more of a gesture by the speaker, a keyword spoken by the speaker, or body language of the speaker; automatically distinguishing among the plurality of participants in the meeting to identify the speaker in the captured video version of the meeting, wherein the automatically distinguishing comprises performing facial recognition to distinguish the speaker from the other participants in the meeting; recognizing the indicator in a portion of the captured video version of the meeting; automatically applying one or more edits to the captured video version of the meeting based on the gesture by the speaker, the keyword spoken by the speaker, or the body language of the speaker, the one or more edits being applied based on a rule associated with the gesture, the keyword, or the body language; and combining the one or more edits and the captured video version of the meeting to rewrite the captured video version of the meeting into a revised video version of the meeting.

2. The one or more hardware computer-storage media according to claim 1, the acts further comprising: performing speech recognition to distinguish the speaker from the other participants in the meeting.

3. The one or more hardware computer-storage media according to claim 1, wherein the indicator comprises the body language of the speaker and the body language comprises the speaker leaving a room where the meeting is taking place.

4. The one or more hardware computer-storage media according to claim 3, wherein the one or more edits indicate that a break in the meeting occurs when the speaker leaves the room where the meeting is taking place.

5. The one or more hardware computer-storage media according to claim 1, the acts further comprising: stopping or pausing the capturing of the video version of the meeting responsive to the gesture, the keyword, or the body language, wherein the gesture, the keyword, or the body language identifies a break in the meeting.

6. The one or more hardware computer-storage media according to claim 5, the acts further comprising: automatically restarting the capturing of the video version of the meeting responsive to determining that the speaker has restarted the meeting.

7. The one or more hardware computer-storage media according to claim 1, wherein the edits are applied while the video version of the meeting is being captured.

8. The one or more hardware computer-storage media according to claim 1, wherein the edits are applied after the meeting has been captured.

9. A system comprising: at least one processing unit; and at least one computer storage media storing computer-executable instructions that, when executed by the processing unit, cause the processing unit to: obtain a captured video version of a meeting having a plurality of participants including a speaker and other participants, the captured video version of the meeting including an indicator comprising one or more of a gesture by the speaker, a keyword spoken by the speaker, or body language of the speaker; identify the speaker in the captured video version of the meeting by automatically distinguishing the speaker from the other participants in the meeting using speech recognition; recognize the indicator in a portion of the captured video version of the meeting; automatically apply one or more edits to the captured video version of the meeting based on the gesture by the speaker, the keyword spoken by the speaker, or the body language of the speaker, the one or more edits being applied based on a rule associated with the gesture, the keyword, or the body language; and combine the one or more edits and the captured video version of the meeting to rewrite the captured video version of the meeting into a revised video version of the meeting.

10. The system of claim 9, wherein the computer-executable instructions further cause the at least one processing unit to: perform facial recognition in addition to the speech recognition to distinguish the speaker from the other participants in the meeting.

11. The system of claim 9, wherein the computer-executable instructions further cause the at least one processing unit to: accept a retraction of an individual edit; and rewrite the revised video version of the meeting without the individual edit.

12. The system of claim 11, wherein the computer-executable instructions further cause the at least one processing unit to: retain a copy of the revised video version of the meeting that includes the individual edit that was retracted.

13. The system of claim 9, wherein the computer-executable instructions further cause the at least one processing unit to: stop or pause capture of the video version of the meeting responsive to recognizing the indicator.

14. The system of claim 9, wherein the computer-executable instructions further cause the at least one processing unit to: stop or pause capturing of the video version of the meeting responsive to another indicator indicating that the speaker or the other participants are leaving a room where the meeting is taking place.

15. The system of claim 14, wherein the computer-executable instructions further cause the at least one processing unit to: restart capturing of the video version of the meeting responsive to recognizing that the speaker or the other participants have returned to the room where the meeting is taking place.

16. A method performed by at least one computer processing unit, the method comprising: obtaining a captured video version of a meeting having a plurality of participants including a speaker and other participants, the captured video version of the meeting including an indicator comprising one or more of a gesture by the speaker, a keyword spoken by the speaker, or body language of the speaker; automatically distinguishing among the plurality of participants in the meeting to identify the speaker in the captured video version of the meeting, wherein the automatically distinguishing comprises performing at least one of speech recognition or facial recognition to distinguish the speaker from the other participants in the meeting; recognizing the indicator in a portion of the captured video version of the meeting; automatically applying one or more edits to the captured video version of the meeting based on the gesture by the speaker, the keyword spoken by the speaker, or the body language of the speaker, the one or more edits being applied based on a rule associated with the gesture, the keyword, or the body language; and combining the one or more edits and the captured video version of the meeting to rewrite the captured video version of the meeting into a revised video version of the meeting.

17. The method of claim 16, wherein obtaining the video version of the meeting comprises capturing the video version of the meeting.

18. The method of claim 16, wherein the automatically distinguishing comprises performing the speech recognition.

19. The method of claim 16, wherein the automatically distinguishing comprises performing the facial recognition.

20. The method of claim 16, wherein the indicator comprises the keyword, the keyword indicates that the meeting has reached a conclusion, and an individual edit identifies the conclusion of the meeting.
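The rule-driven editing step that the claims describe — mapping a recognized indicator (gesture, keyword, or body language) to an edit via an associated rule — can be sketched as a simple lookup. This is an illustrative reading of the claims, not an implementation from the patent; the rule table, indicator tuples, and action names are all hypothetical.

```python
# Hypothetical rule table: (indicator kind, indicator value) -> edit action.
# For example, a spoken break keyword pauses capture, and the speaker
# leaving the room marks a break in the revised video (claims 3-5).
RULES = {
    ("keyword", "let's take a break"): "pause_capture",
    ("body_language", "speaker_leaves_room"): "mark_break",
    ("gesture", "cut_gesture"): "trim_segment",
}

def apply_edits(indicators):
    """Map each recognized (kind, value, time) indicator to an edit.

    Indicators without an associated rule are ignored; the resulting
    edit list would then be combined with the captured video to
    produce the revised version of the meeting.
    """
    edits = []
    for kind, value, t in indicators:
        action = RULES.get((kind, value))
        if action:
            edits.append({"action": action, "time": t})
    return edits
```

Claims 7 and 8 note that such edits may be applied either during capture (streaming the indicators into the rule lookup in real time) or afterwards, over the finished recording.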
Foote, Jonathan T.; Wilcox, Lynn, Methods and apparatuses for segmenting an audio-visual recording using image similarity searching and audio speaker recognition.
Brown, Christopher Robert; Moriarty, John Michael; Smith, Sean Dare; Ackerman, Stuart M.; Laine, Leslie E.; Adams, William H., System and method for multiple screen electronic presentations.
Mauldin, Michael L.; Smith, Michael A.; Stevens, Scott M.; Wactlar, Howard D.; Christel, Michael G.; Reddy, D. Raj, System and method for skimming digital audio/video data.