[Patent] Method and system for segmentation, classification, and summarization of video images


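IPC classification information
Country / Type	United States (US) Patent, granted
International Patent Classification (IPC 7th edition)	G06K-009/00 G06K-009/64 G06F-007/00 H04N-009/64
Application number	US-0556349 (2000-04-24)
Inventor / Address	Gong, Yihong; Liu, Xin
Applicant / Address	NEC Corporation
Agent / Address	Sughrue Mion, PLLC
Citation information	Times cited: 51; Patents cited: 13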

Abstract

In a technique for video segmentation, classification and summarization based on the singular value decomposition, frames of the input video sequence are represented by vectors composed of concatenated histograms descriptive of the spatial distributions of colors within the video frames. The singular value decomposition maps these vectors into a refined feature space. In the refined feature space produced by the singular value decomposition, the invention uses a metric to measure the amount of information contained in each video shot of the input video sequence. The most static video shot is defined as an information unit, and the content value computed from this shot is used as a threshold to cluster the remaining frames. The clustered frames are displayed using a set of static keyframes or a summary video sequence. The video segmentation technique relies on the distance between the frames in the refined feature space to calculate the similarity between frames in the input video sequence. The input video sequence is segmented based on the values of the calculated similarities. Finally, average video attribute values in each segment are used in classifying the segments.
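The mapping described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the patented implementation: the function names, the 3x3 block grid, the 5 bins per color channel, the rank of 10, and the choice to scale the projected coordinates by the singular values are all assumptions made for the example.

```python
import numpy as np


def block_histogram_vector(frame, blocks=3, bins=5):
    """Concatenate per-block RGB histograms into one feature vector.

    frame is an H x W x 3 uint8 array. The block grid and bin count are
    illustrative assumptions, not values taken from the record.
    """
    h, w, _ = frame.shape
    parts = []
    for by in range(blocks):
        for bx in range(blocks):
            block = frame[by * h // blocks:(by + 1) * h // blocks,
                          bx * w // blocks:(bx + 1) * w // blocks]
            pixels = block.reshape(-1, 3).astype(float)
            hist, _ = np.histogramdd(pixels, bins=(bins,) * 3,
                                     range=((0, 256),) * 3)
            parts.append(hist.ravel())
    return np.concatenate(parts)


def refined_feature_space(frames, rank=10):
    """Project frames into a truncated SVD space, one row per frame."""
    # Each column of A is the feature vector of one frame.
    A = np.stack([block_histogram_vector(f) for f in frames], axis=1)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(rank, len(s))
    # Assumption: weight each retained axis by its singular value.
    return (s[:k, None] * Vt[:k]).T


def consecutive_distances(points):
    """Euclidean distance between each frame and the next one."""
    return np.linalg.norm(np.diff(points, axis=0), axis=1)
```

Large values returned by `consecutive_distances` mark candidate segment boundaries, while runs of small values indicate static content. How the distances are thresholded is left open here because the abstract does not give a rule.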

Representative claims

What is claimed is:
1. A method for summarizing a content of an input video sequence, said method comprising: (a) computing a feature vector for each frame in a set of frames from said input video sequence; (b) applying singular value decomposition to a matrix comprised of said feature vectors and projecting the matrix on a refined feature space representation, wherein positions of said projections on said refined feature space representation represent approximations of visual changes in said set of frames from said input video sequence; (c) clustering said frames of said input video sequence based upon positions of said projections on said refined feature space representation; (d) selecting a frame from each cluster to serve as a keyframe in a summarization of said input video sequence; and (e) using said clustered frames to output a motion video representative of a summary of said input video sequence, wherein said input video sequence summary is composed according to a time-length parameter Tlen and a minimum display time parameter Tmin by: locating the video shot Θi in each cluster Si having the greatest length; determining how the video shots in each cluster will be arranged according to C≦N=Tlen/Tmin, wherein C represents a number of clusters; and wherein N represents the maximum number of video shots; if C≦N, then all the video shots in each cluster is included in said input video sequence summary; and if C>N, then sort each video shot Θi from each cluster Si in descending order by length, select the first N video shots for inclusion in said input video sequence summary and assign time length Tmin to each selected video shot.
2. The method of claim 1, wherein said singular value decomposition is performed using frames selected with a fixed interval from said input video sequence.
3. The method of claim 1, wherein each column of said matrix represents a frame in said refined feature space representation.
4. The method of claim 1, wherein said feature vectors are computed using a color histogram that outputs a histogram vector.
5. The method of claim 4, wherein said histogram vector is indicative of a spatial distribution of colors in said each of said frames.
6. The method of claim 5, wherein each of said frames is divided into a plurality of blocks, each of said plurality of blocks being represented by a histogram in a color space indicative of a distribution of colors within each of said blocks.
7. The method of claim 5, wherein each of said frames is divided into a plurality of blocks and said histogram vector comprises a plurality of histograms in a color space, each of said plurality of histograms corresponding to one of said plurality of blocks.
8. The method of claim 1, wherein said selecting a frame comprises locating a frame with a feature vector that projects into a singular value that is most representative of other singular values of the cluster.
9. The method of claim 1, wherein the composition of said input video sequence summary further comprises sorting the selected video shots by their respective time codes.
10. The method of claim 9, wherein the composition of said input video sequence summary further comprises extracting a portion of selected video shot equal in length to time length Tmin and inserting each extracted portion in order to said input video sequence summary.
11. The method of claim 1, wherein said clustering of said frames further comprises using a position of the most static shot of said input video sequence to compute a value as a threshold during the clustering of said frames.
12. The method of claim 11, wherein said clustering of said frames further comprises computing a content value and using said computed content value to cluster the remaining frames by: sorting said feature vectors in said refined feature space representation in ascending order according to a distance of each of said feature vectors to an origin of said refined feature space representation; selecting a vector among said sorted feature vectors which is closest to an origin of said refined feature space representation and including said selected feature vector into a first cluster; clustering said plurality of sorted feature vectors in said refined feature space representation into a plurality of clusters according to a distance between each of said plurality of sorted feature vectors and feature vectors in each of said plurality of clusters and an amount of information in each of said plurality of clusters.
13. The method of claim 12, wherein, in said clustering of sorted feature vectors, said plurality of sorted feature vectors are clustered into said plurality of clusters such that said amount of information in each of said plurality of clusters does not exceed an amount of information in said first cluster.
14. The method of claim 12, wherein said first cluster is composed of frames based on a distance variation between said frames and an average distance between frames in said first cluster.
15. The method of claim 12, wherein each of said plurality of clusters is composed of frames based on a distance variation between said frames and an average distance between frames in said each of said plurality of clusters.
16. A computer-readable medium containing a program for summarizing a content of an input video sequence, said program comprising: (a) computing a feature vector for each frame in a set of frames from said input video sequence; (b) applying singular value decomposition to a matrix comprised of said feature vectors and projecting the matrix on a refined feature space representation, wherein positions of said projections on said refined feature space representation represent approximations of visual changes in said set of frames from said input video sequence; (c) clustering said frames of said input video sequence based upon positions of said projections on said refined feature space representation; (d) selecting a frame from each cluster to serve as a keyframe in a summarization of said input video sequence; and (e) using said clustered frames to output a motion video representative of a summary of said input video sequence, wherein said input video sequence summary is composed according to a time-length parameter Tlen and a minimum display time parameter Tmin by: locating the video shot Θi in each cluster Si having the greatest length; determining how the video shots in each cluster will be arranged according to C≦N=Tlen/Tmin, wherein C represents a number of clusters; and wherein N represents the maximum number of video shots; if C≦N, then all the video shots in each cluster is included in said input video sequence summary; and if C>N, then sort each video shot Θi from each cluster Si in descending order by length, select the first N video shots for inclusion in said input video sequence summary and assign time length Tmin to each selected video shot.
17. The computer-readable medium of claim 16, wherein said singular value decomposition is performed using frames selected with a fixed interval from said input video sequence.
18. The computer-readable medium of claim 16, wherein each column of said matrix represents a frame in said refined feature space representation.
19. The computer-readable medium of claim 16, wherein said feature vectors are computed using a color histogram that outputs a histogram vector.
20. The computer-readable medium of claim 19, wherein said histogram vector is indicative of a spatial distribution of colors in said each of said frames.
21. The computer-readable medium of claim 20, wherein each of said frames is divided into a plurality of blocks, each of said plurality of blocks being represented by a histogram in a color space indicative of a distribution of colors within each of said blocks.
22. The computer-readable medium of claim 20, wherein each of said frames is divided into a plurality of blocks and said histogram vector comprises a plurality of histograms in a color space, each of said plurality of histograms corresponding to one of said plurality of blocks.
23. The computer-readable medium of claim 16, wherein said selecting a frame comprises locating a frame with a feature vector that projects into a singular value that is most representative of other singular values of the cluster.
24. The computer-readable medium of claim 16, wherein the composition of said input video sequence summary further comprises sorting the selected video shots by their respective time codes.
25. The computer-readable medium of claim 24, wherein the composition of said input video sequence summary further comprises extracting a portion of selected video shot equal in length to time length Tmin and inserting each extracted portion in order to said input video sequence summary.
26. The computer-readable medium of claim 16, wherein said clustering of said frames further comprises using a position of the most static shot of said input video sequence to compute a value as a threshold during the clustering of said frames.
27. The computer-readable medium of claim 25, wherein said clustering of said frames further comprises computing a content value and using said computed content value to cluster the remaining frames by: sorting said feature vectors in said refined feature space representation in ascending order according to a distance of each of said feature vectors to an origin of said refined feature space representation; selecting a vector among said sorted feature vectors which is closest to an origin of said refined feature space representation and including said selected feature vector into a first cluster; clustering said plurality of sorted feature vectors in said refined feature space representation into a plurality of clusters according to a distance between each of said plurality of sorted feature vectors and feature vectors in each of said plurality of clusters and an amount of information in each of said plurality of clusters.
28. The computer-readable medium of claim 27, wherein, in said clustering of sorted feature vectors, said plurality of sorted feature vectors are clustered into said plurality of clusters such that said amount of information in each of said plurality of clusters does not exceed an amount of information in said first cluster.
29. The computer-readable medium of claim 27, wherein said first cluster is composed of frames based on a distance variation between said frames and an average distance between frames in said first cluster.
30. The computer-readable medium of claim 27, wherein each of said plurality of clusters is composed of frames based on a distance variation between said frames and an average distance between frames in said each of said plurality of clusters.
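The summary composition rule in step (e), together with the time-code sort and Tmin extraction of the dependent claims, can be read as the following sketch. It is one interpretation under stated assumptions, not the claimed implementation: the `Shot` type and `compose_summary` function are hypothetical names, every cluster is assumed to be non-empty, N is taken as the integer part of Tlen/Tmin, and shots shorter than Tmin are shown in full.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Shot:
    start: float   # time code of the first frame, in seconds
    length: float  # duration in seconds


def compose_summary(clusters: List[List[Shot]],
                    t_len: float,
                    t_min: float) -> List[Tuple[float, float]]:
    """Return (start, duration) pairs making up the summary video."""
    c = len(clusters)
    n = int(t_len // t_min)  # assumption: N is rounded down
    if c <= n:
        # C <= N: all shots of every cluster go into the summary.
        selected = [shot for cluster in clusters for shot in cluster]
    else:
        # C > N: take the longest shot of each cluster, sort those
        # in descending order by length, and keep the first N.
        longest = [max(cluster, key=lambda s: s.length)
                   for cluster in clusters]
        longest.sort(key=lambda s: s.length, reverse=True)
        selected = longest[:n]
    # Order the selected shots by their time codes.
    selected.sort(key=lambda s: s.start)
    # Extract a portion of length Tmin from each selected shot.
    # Assumption: a shorter shot is used in full.
    return [(s.start, min(s.length, t_min)) for s in selected]
```

For example, four clusters with Tlen = 6 and Tmin = 2 give N = 3, so only the three longest representative shots are kept and each is shown for 2 seconds in time-code order.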

Patents cited by this patent (13)

Uchihachi, Shingo; Foote, Jonathan T.; Wilcox, Lynn, Automatic video summarization using a measure of shot importance and a frame-packing method.
Jacquelyn Annette Martino ; Nevenka Dimitrova ; Jan Hermanus Elenbaas ; Job Rutgers NL, Histogram method for characterizing video content.
Lim, Joo Hwee, Method and apparatus for indexing and retrieving images using visual keywords.
Gong, Yihong, Method and apparatus for personalized multimedia summarization based upon user specified theme.
Yeo Boon-Lock ; Yeung Minerva M. ; Wolf Wayne ; Liu Bede, Method and apparatus for video browsing based on content and structure.
Wang Katherine ; Normile James, Method and system for detecting scenes and summarizing video sequences.
Warnick James ; Ferman Ahmet M. ; Gunsel Bilge ; Naphade Milind R. ; Mehrotra Rajiv, Method for content-based temporal segmentation of video.
Ratakonda Krishna, Method for hierarchical summarization and browsing of digital video.
Castelli Vittorio ; Li Chung-Sheng ; Thomasian Alexander, Multidimensional data clustering and dimension reduction for indexing and searching.
Thomas McGee ; Nevenka Dimitrova ; Jan Herman Elenbaas, Significant scene detection and frame filtering for a visual indexing system using dynamic thresholds.
Yucel Altunbasak ; HongJiang Zhang, System and method for automatically detecting shot boundary and key frame from a compressed video data.
Toklu, Candemir; Liou, Shih-Ping, System and method for selecting key-frames of video data.
Caid William R. (San Diego CA) Oing Pu (La Costa CA), System and method of context vector generation and retrieval.

Patents citing this patent (51)

Weber, Frank Elmo, Character-based automated media summarization.
Brockmann, Ronald A.; Hoeben, Maarten, Class-based intelligent multiplexing over unmanaged networks.
Petriuc, Mihai, Click distance determination.
Tankovich, Vladimir; Meyerzon, Dmitriy; Poznanski, Victor, Detection of junk in search result ranking.
Hunter, Kurt M.; Mercer, Ian Cameron; Ahlstrom, Bret, Digital video segmentation and dynamic segment labeling.
Tankovich, Vladimir; Meyerzon, Dmitriy; Taylor, Michael James, Document length as a static relevance feature for ranking search results.
Shiiyama, Hirotaka, Dynamic image digest automatic editing system and dynamic image digest automatic editing method.
Meyerzon, Dmitriy; Shnitko, Yauhen; Burges, Chris J. C.; Taylor, Michael James, Enterprise relevancy ranking using a neural network.
Betts, Christopher; Rogers, Tony, Fast searching of directories.
Mercer, Ian Cameron, Features such as titles, transitions, and/or effects which vary according to positions.
Mercer, Ian Cameron, Features such as titles, transitions, and/or effects which vary according to positions.
Robertson, Stephen; Zaragoza, Hugo; Taylor, Michael; Larimore, Stefan Isbein; Petriuc, Mihai, Field weighting in text searching.
Kumar, Mrityunjay; Loui, Alexander C.; Pillman, Bruce Harold, Identifying scene boundaries using group sparsity analysis.
Kumar, Mrityunjay; Loui, Alexander C.; Pillman, Bruce Harold, Identifying scene boundaries using group sparsity analysis.
Kumar, Mrityunjay; Loui, Alexander C.; Pillman, Bruce Harold, Identifying scene boundaries using group sparsity analysis.
Miyashita, Naoyuki, Image processing apparatus, method and program for determining arrangement of vectors on a distribution map.
Brockmann, Ronald A.; Hoeben, Maarten; Gorter, Onne; Hiddink, Gerrit, Intelligent multiplexing using class-based, multi-dimensioned decision logic for managed networks.
Badawy, Wael; Rahman, Ashiq; Rogers, Shane; Du, Shan, Leak detection.
Gordon, Donald; Pavlovskaia, Lena Y.; Landau, Airan; Lennartsson, Andreas; Cloud, Glenn M., MPEG objects and systems and methods for using MPEG objects.
Dengler, Patrick M.; Krishnan, Arvind K.; Singh, Jagdish; Sanchez, Lawrence M.; Shankar, Sai; Chittamuru, Satish Kumar; Pekic, Zoltan; Mondal, Nabarun; Kumar, Namendra; i Dalfó, Ricard Roma, Metadata driven user interface.
Villadsen, Peter; Chen, Zhaoqi; Gottumukkala, Ramakanthachary S.; Calderon, Marcos, Metadata-based eventing supporting operations on data.
Chen, William; Chen, Jau Yuen, Method and apparatus for summarizing and indexing the contents of an audio-visual presentation.
Peleg, Shmuel; Pritch, Yael; Ratovitch, Sarit; Hendel, Avishai, Methods and systems for producing a video synopsis using clustering.
Brockmann, Ronald A.; Gorter, Onne; Dev, Anuj; Hiddink, Gerritt, Overlay rendering of user interface onto source video.
Brockmann, Ronald A.; Gorter, Onne; Dev, Anuj; Hiddink, Gerritt, Overlay rendering of user interface onto source video.
Obrador, Pere; Lin, Qian, Providing a visual indication of the content of a video by analyzing a likely user intent.
Dahlby, Joshua; Marsavin, Andrey; Lawrence, Charles; Pavlovskaia, Lena Y., Providing television broadcasts over a managed network and interactive content over an unmanaged network to a client device.
Obata, Kenji; Meyerzon, Dmitriy, Proxy server using a statistical model.
Meyerzon, Dmitriy; Zaragoza, Hugo, Ranking search results using biased click distance.
Meyerzon, Dmitriy; Li, Hang, Ranking search results using feature extraction.
Meyerzon, Dmitriy; Zaragoza, Hugo, Ranking search results using language types.
Poznanski, Victor; Wang, Oivind; Holm, Fredrik; Bodd, Nicolai; Tankovich, Vladimir; Meyerzon, Dmitriy, Re-ranking search results.
Wilson, Andrew, Recognizing a movement of a pointing device.
Brockmann, Ronald A.; Dev, Anuj; Hiddink, Gerrit; Dahlby, Joshua; Pavlovskaia, Lena Y., Reduction of latency in video distribution networks using adaptive bit rates.
Newman, David A.; Silver, Adam, Scene and activity identification in video summary generation.
Tankovich, Vladimir; Li, Hang; Meyerzon, Dmitriy; Xu, Jun, Search results ranking using editing distance and document information.
Hirohata, Makoto; Imoto, Kazunori; Ohmori, Yoshihiro; Aoki, Hisashi; Uehara, Tatsuya, Signal processing apparatus and method thereof.
Matsushita, Yasuyuki; Kang, Hong-Wen; Tang, Xiaoou, Space-time video montage.
Gibbs, Simon; Hoch, Michael, System and method for data assisted chrom-keying.
Gibbs, Simon; Hoch, Michael, System and method for data assisted chroma-keying.
Brockmann, Ronald Alexander; Dev, Anuj; Hoeben, Maarten, System and method for exploiting scene graph information in construction of an encoded video sequence.
Brockmann, Ronald Alexander; Dev, Anuj; Hoeben, Maarten, System and method for exploiting scene graph information in construction of an encoded video sequence.
Hegde, Abhishek Naveen; Marwaha, Devendra K., System and method for previewing multimedia files.
Lin, Xiaofan; Zhang, Tong; Atkins, C. Brian; Vondran, Jr., Gary L.; Chen, Mei; Untulis, Charles A.; Cheatle, Stephen Philip; Lee, Dominic, System and method for producing a page using frames of a video stream.
Meyerzon, Dmitriy; Zaragoza, Hugo, System and method for ranking search results using click distance.
Merrigan, Chadd Creighton; Peltonen, Kyle G.; Meyerzon, Dmitriy; Lee, David J., System and method for scoping searches using index keys.
Avasarala, Bhargav; Aley, Douglas Frederick; Chen, Johnny Nienwei; Dudum, Andrew; Eles, Colin James; Mumm, Jonathan Ryan, Systems and methods for image recognition.
Plagne, Geraud, Video processing method and device for depth extraction.
Yamauchi, Masaki; Kimura, Masayuki, Video scene classification device and video scene classification method.
Carlson, Adam; Gray, Douglas Ryan; Kulkarni, Ashutosh Vishwas; Taylor, Colin Jon, Video segmentation techniques.
Sentinelli, Alexandro; Papariello, Francesco, Video-surveillance method, corresponding system, and computer program product.
