IPC Classification Information
Country / Type |
United States(US) Patent
Registered
|
International Patent Classification (IPC 7th edition) |
|
Application Number |
US-0915584
(2010-10-29)
|
Registration Number |
US-8687941
(2014-04-01)
|
Inventor / Address |
- Dirik, Ahmet Emir
- Lai, Jennifer
- Topkara, Mercan
|
Applicant / Address |
- International Business Machines Corporation
|
Agent / Address |
|
Citation Information |
Times cited: 4
Cited patents: 17 |
Abstract
Techniques are disclosed for automatic static summarization of videos. For example, a method of creating a static summary of a video comprises the following steps. Shots in the video are detected, wherein the detected shots are frames of the video having a correlation. The detected shots are clustered into clusters based on similarity. The clusters of shots are ranked. At least a portion of the shots are selected based on cluster ranking for inclusion in the static summary. The static summary is generated by combining thumbnail images of the selected shots. Prior to the ranking step, the method may further comprise detecting a presence of slides in any of the shots, and the ranking of a given shot is based in part on whether the shot is a slide. By way of example, such static summaries can be shared in emails and in calendar applications.
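The abstract says only that detected shots are frames "having a correlation"; dependent claim 2 adds that histograms of successive frames are correlated and compared against threshold values. The following is a minimal sketch of that idea, not the patented implementation. It assumes each frame is a flat list of 0..255 intensity values, uses an intensity histogram only (the claim also mentions color), and the names `histogram`, `correlation`, `detect_shot_starts`, the bin count, and the threshold of 0.5 are all illustrative choices.

```python
from math import sqrt


def histogram(frame, bins=8, max_value=256):
    """Intensity histogram of a frame given as a flat list of 0..255 values."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return counts


def correlation(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def detect_shot_starts(frames, threshold=0.5):
    """Return indices of frames that start a new shot.

    A new shot is assumed to start whenever the histogram correlation
    between a frame and its predecessor drops below `threshold`.
    """
    if not frames:
        return []
    starts = [0]
    previous = histogram(frames[0])
    for index in range(1, len(frames)):
        current = histogram(frames[index])
        if correlation(previous, current) < threshold:
            starts.append(index)
        previous = current
    return starts


# Toy input: three dark frames followed by two bright frames.
dark = [10, 20, 30, 40] * 16
bright = [200, 210, 220, 230] * 16
shot_starts = detect_shot_starts([dark, dark, dark, bright, bright])
print(shot_starts)  # [0, 3]
```

In practice the threshold would need tuning per video source, and a real pipeline would decode frames with a video library rather than use hand-built lists.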
Representative Claim
1. A method of creating a static summary of a video, comprising: detecting shots in the video, wherein the detected shots are frames of the video having a correlation; clustering the detected shots into clusters based on similarity; ranking the clusters of shots; selecting at least a portion of the shots based on cluster ranking for inclusion in the static summary; and generating the static summary by combining thumbnail images of the selected shots, wherein prior to the ranking step, the method further comprises detecting a presence of slides in any of the shots, and ranking of a given shot is based in part on whether the shot is a slide, wherein the ranking step further comprises ranking the shots that contain slides among each other, wherein the slides are assigned to higher or lower ranks based on an amount of text in the slides, wherein ranking of shots of the video that contain slides further comprises ranking based on a slide content activity metric to detect visually busy slide frames.
2. The method of claim 1, wherein the detection of shots in the video that do not contain slides further comprises: determining color and intensity histograms for frames in the video; computing a correlation between the color and intensity histograms of successive frames; comparing histogram correlation results with shot detection threshold values; and identifying a frame as a shot based on shot detection threshold value comparison results.
3. The method of claim 1, wherein the detection of slides in the video further comprises: computing frame correlations for successive frames; and tagging successive frames as a slide when the correlation between the successive frames is at least greater than a slide transition threshold value.
4. The method of claim 3, wherein prior to frame correlation computing, frames in the video are filtered using a two dimensional high pass filter.
5. The method of claim 3, wherein, when one frame in a given shot is tagged as a slide, all other frames in the given shot are tagged as slides.
6. The method of claim 1, wherein a frame comprising a screen-share image is detected as a slide.
7. The method of claim 1, wherein clustering the detected shots into clusters based on similarity further comprises: sorting shots based on shot duration; finding shots similar to the longest non-clustered shot, wherein similarity is based on a correlation coefficient function; and assigning the longest non-clustered shot and the shots found to be similar to the longest non-clustered shot to a given cluster.
8. The method of claim 7, further comprising removing shots with unrelated content from the given cluster.
9. The method of claim 8, wherein the removal of unrelated shots from the given cluster further comprises: selecting a given number of random shots from the given cluster; comparing each random shot with other shots; generating alternative clusters based on the random shot comparisons; computing an average similarity for each of the given cluster and the alternative clusters; and selecting the cluster with the maximum average similarity as the cluster used in the ranking step.
10. The method of claim 7, wherein prior to clustering, a representative frame is assigned to each shot.
11. The method of claim 10, wherein the middle of the shot is selected as the representative frame when the shot does not contain any slides.
12. The method of claim 10, wherein the first frame of the shot is selected as the representative frame when the shot contains a slide.
13. The method of claim 1, wherein the ranking further comprises: ranking shots that do not contain slides; and for any shots in which slides were detected, performing optical character recognition on elements in the slides.
14. The method of claim 13, wherein the ranking of shots of the video that do not contain slides further comprises ranking based on duration and diversity.
15. The method of claim 13, wherein ranking of shots of the video that contain slides further comprises ranking based on readability and human comprehension.
16. The method of claim 13, wherein ranking of shots of the video that contain slides further comprises ranking based on an optical character recognition metric.
17. The method of claim 13, wherein ranking of shots of the video that contain slides further comprises ranking based on a slide duration metric.
18. The method of claim 1, further comprising generating textual information corresponding to the thumbnail images of the static summary.
19. The method of claim 18, wherein the textual information is searchable.
20. The method of claim 18, wherein the textual information comprises a uniform resource locator which is selectable to playback some or all of the video corresponding to the thumbnail images of the static summary.
21. The method of claim 1, wherein when it is determined that the video is attached to an email message or the email message contains a link to the video, further comprising the step of including the generated static summary in the email message in place of the video, wherein some or all of the video is selectable for playback using one or more hyperlinks associated with the static summary.
22. The method of claim 1, wherein when the video is a recording of a meeting, further comprising identifying one or more calendar entries corresponding to the meeting in a database of a calendar application program, and including the generated static summary in the one or more calendar entries, wherein some or all of the video is selectable for playback using one or more hyperlinks associated with the summary.
23. An article of manufacture for creating a static summary of a video, the article of manufacture comprising a non-transitory computer readable storage medium having tangibly embodied thereon computer readable program code which, when executed, causes a computer to: detect shots in the video, wherein the detected shots are frames of the video having a correlation; cluster the detected shots into clusters based on similarity; rank the clusters of shots; select at least a portion of the shots based on cluster ranking for inclusion in the static summary; and generate the static summary by combining thumbnail images of the selected shots, wherein prior to ranking the clusters of shots, the computer readable program code, when executed, further causes the computer to detect a presence of slides in any of the shots, and ranking of a given shot is based in part on whether the shot is a slide, wherein the shots that contain slides are ranked among each other, wherein the slides are assigned to higher or lower ranks based on an amount of text in the slides, wherein ranking of shots of the video that contain slides further comprises ranking based on a slide content activity metric to detect visually busy slide frames.
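The clustering steps recited in claim 7 (sort shots by duration, seed a cluster with the longest non-clustered shot, and add shots that are similar under a correlation coefficient) can be read as a greedy loop. The sketch below is one possible reading, not the patented implementation. It assumes each shot is a `(shot_id, duration, feature_vector)` tuple, where the feature vector stands in for the shot's representative frame. The function names and the similarity threshold of 0.9 are hypothetical, and the cluster clean-up of claims 8 and 9 is not modeled.

```python
from math import sqrt


def correlation_coefficient(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def cluster_shots(shots, threshold=0.9):
    """Greedily group shots; returns a list of clusters of shot ids.

    Each cluster starts with the longest shot not yet clustered and
    absorbs every remaining shot whose correlation with that seed is
    at least `threshold` (an assumed, tunable value).
    """
    # Longest shots first, so the seed is always the longest unclustered shot.
    remaining = sorted(shots, key=lambda shot: shot[1], reverse=True)
    clusters = []
    while remaining:
        seed = remaining.pop(0)
        cluster = [seed[0]]
        unmatched = []
        for shot in remaining:
            if correlation_coefficient(seed[2], shot[2]) >= threshold:
                cluster.append(shot[0])
            else:
                unmatched.append(shot)
        remaining = unmatched
        clusters.append(cluster)
    return clusters


# Toy shots: (id, duration in seconds, feature vector of representative frame).
example_shots = [
    ("a", 5.0, [1, 2, 3, 4]),
    ("b", 12.0, [2, 4, 6, 8]),
    ("c", 8.0, [4, 3, 2, 1]),
    ("d", 3.0, [8, 6, 4, 3]),
]
example_clusters = cluster_shots(example_shots)
print(example_clusters)  # [['b', 'a'], ['c', 'd']]
```

Because each cluster is seeded by its longest member, the first id in every cluster is a natural candidate for the thumbnail, although the claims leave that choice to the ranking step.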