Device and method for automatic participant identification in a recorded multimedia stream
IPC Classification
Country/Type: United States (US) Patent, Granted
International Patent Classification (IPC, 7th edition): H04N-007/14; G06K-009/00; G10L-015/00
Application number: US-0638635 (2009-12-15)
Registration number: US-8390669 (2013-03-05)
Priority information: NO-20085227 (2008-12-15)
Inventors / Address: Catchpole, Jason; Cockerton, Craig
Applicant / Address: Cisco Technology, Inc.
Citation information: cited by 44 patents; cites 9 patents
Abstract
The present disclosure discloses a method for identifying individuals in a multimedia stream originating from a video conferencing terminal or a Multipoint Control Unit, including executing a face detection process on the multimedia stream; defining subsets including facial images of one or more individuals, where the subsets are ranked according to a probability that their respective one or more individuals will appear in a video stream; comparing a detected face to the subsets in consecutive order starting with a most probable subset, until a match is found; and storing an identity of the detected face as searchable metadata in a content database in response to the detected face matching a facial image in one of the subsets.
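The matching loop described in the abstract can be sketched as follows. This is an illustrative sketch only, not code from the patent; the names (`identify`, `similarity`, `THRESHOLD`) and the exact-match similarity placeholder are hypothetical, and a real system would compare face embeddings instead.

```python
THRESHOLD = 0.8  # assumed similarity cutoff (hypothetical)

def similarity(face_a, face_b):
    # Placeholder metric; a real system would use a face-embedding distance.
    return 1.0 if face_a == face_b else 0.0

def identify(detected_face, ranked_subsets):
    """ranked_subsets: list of (subset_name, {identity: facial_image}),
    ordered from most to least probable. Returns the first matching
    identity, which would then be stored as searchable metadata."""
    for name, subset in ranked_subsets:
        for identity, image in subset.items():
            if similarity(detected_face, image) >= THRESHOLD:
                return identity
    return None  # no match in any subset

# Subsets ordered by probability of appearing in this video stream:
subsets = [
    ("terminal_users", {"alice": "face_a"}),  # individuals tied to the terminal
    ("department",     {"bob": "face_b"}),    # same department group
    ("organization",   {"carol": "face_c"}),  # whole organization
]
print(identify("face_b", subsets))  # matches in the second subset: "bob"
```

Because the search stops at the first match, faces of likely participants are resolved without scanning the entire user database.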
Representative Claims
1. A device comprising: a receiver configured to receive a video conference format coded data stream; a conversion unit configured to convert the video conference format coded data stream to a multimedia stream in a defined multimedia streaming format and store the multimedia stream and searchable metadata in a content database; a user database including identities of known individuals stored with associated facial images; and a face recognition unit configured to detect a face in a video stream included in the multimedia stream, define subsets including facial images of one or more of the known individuals, where the subsets are ranked according to a probability that their respective one or more known individuals will appear in the video stream, compare the detected face to the subsets in consecutive order starting with a most probable subset, until a match is found, and store an identity of the detected face as searchable metadata in the content database in response to the detected face matching a facial image in one of the subsets.

2. The device according to claim 1, further comprising: a ranking unit configured to rank the subsets based on protocol information extracted from a communication session and based on groups in the user database, wherein the user database further includes one or more unique addresses of known terminals, the identities and unique addresses being associated with one or more of the groups, and at least one of the unique addresses being related to one or more of the known individuals.

3. The device according to claim 2, wherein the protocol information includes the one or more unique addresses of one or more terminals which are an origin of the video conference format coded data stream.

4. The device according to claim 1, wherein the face recognition unit is further configured to create a thumbnail image of the detected face and store the thumbnail image with the multimedia stream in the content database.

5. The device according to claim 1, further comprising a graphical user interface including a listing of available multimedia streams in the content database, the listing including a link to a stored multimedia stream and the searchable metadata associated with the multimedia stream, including the names of recognized participants.

6. The device according to claim 5, wherein the listing further includes one or more stored thumbnail images of individuals detected in each multimedia stream.

7. The device according to claim 1, wherein the video conference format coded data stream is a H.323, H.320, or Session Initiated Protocol coded data stream.

8. The device according to claim 2, wherein the face recognition unit is further configured to rank the subsets based on the one or more unique addresses according to a first subset including the known individuals related to the one or more unique addresses, a second subset including all known individuals associated with a first group also associated with the one or more unique addresses, and a third subset including all known individuals associated with a second group and associated with the one or more unique addresses.

9. The device according to claim 8, wherein the first group is selected from at least one of a department, a geographical location, a building, an office, and a floor, and wherein the second group is selected from at least one of a company, an organization, and a set of known individuals.

10. The device according to claim 8, further comprising: a ranking unit ranking a fourth subset higher than the second subset, the fourth subset including all known individuals associated with a user histogram group and associated with the one or more unique addresses of the known terminals, wherein the face recognition unit is further configured to generate a user histogram group for each terminal, wherein previous users of each terminal are associated with the user histogram group.

11. The device according to claim 8, further comprising: a ranking unit configured to rank a fifth subset higher than the second subset, the fifth subset including facial images of known users in a list of probable users, wherein the face recognition unit is further configured to generate the list of probable users of each of the known terminals at a given time, based on information from at least one of a scheduling server and a presence server.

12. A method for identifying individuals in a multimedia stream originating from a video conferencing terminal or a Multipoint Control Unit, the method comprising: executing, at a device, a face detection process on the multimedia stream; defining, at the device, subsets including facial images of one or more individuals, where the subsets are ranked according to a probability that their respective one or more individuals will appear in a video stream; comparing, at the device, a detected face to the subsets in consecutive order starting with a most probable subset, until a match is found; and storing, at the device, an identity of the detected face as searchable metadata in a content database in response to the detected face matching a facial image in one of the subsets.

13. The method according to claim 12, further comprising: receiving a video conference format coded data stream; storing, in a user database, one or more unique addresses of known terminals and identities of known individuals, wherein the identities of known individuals and unique addresses are associated with one or more groups, and wherein at least one of the unique addresses are related to one or more of the known individuals; extracting protocol information from a communication session; and ranking the subsets based on the protocol information and at least one of the groups in the user database.

14. The method according to claim 13, wherein the protocol information includes the one or more unique addresses of one or more terminals which are an origin of the video conference format coded data stream.

15. The method according to claim 12, further comprising: creating a thumbnail image of the detected face; and storing the thumbnail image with the multimedia stream in the content database.

16. The method according to claim 12, further comprising providing a graphical user interface including a listing of available multimedia streams in the content database, the listing including a link to a stored multimedia stream and the searchable metadata associated with the multimedia stream, including the names of recognized participants.

17. The method according to claim 16, wherein the listing further includes one or more stored thumbnail images of individuals detected in each multimedia stream.

18. The method according to claim 13, wherein the video conference format coded data stream is a H.323, H.320, or Session Initiated Protocol coded data stream.

19. The method according to claim 13, further comprising ranking the subsets according to a first subset including the known individuals related to the one or more unique addresses, a second subset including all known individuals associated with a first group also associated with the one or more unique addresses, and a third subset including all known individuals associated with a second group and associated with the one or more unique addresses.

20. The method according to claim 19, wherein the first group is selected from at least one of a department, a geographical location, a building, an office, and a floor, and wherein the second group is selected from at least one of a company, an organization, and a set of known individuals.

21. The method according to claim 19, further comprising: generating a user histogram group for each terminal, wherein previous users of a terminal are associated with the user histogram group; and ranking a fourth subset higher than the second subset, the fourth subset including all known individuals associated with the user histogram group and associated with the one or more unique addresses of the known terminals.

22. The method according to claim 19, further comprising: generating a list of probable users of each of the known terminals at a given time, based on information from at least one of a scheduling server and a presence server; and ranking a fifth subset higher than the second subset, the fifth subset including facial images of known users in the list of probable users.
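The subset ranking in claims 8, 10, and 11 can be illustrated as follows. This is a hypothetical sketch, not the patented implementation; the function name, the dictionary-based user database, and the example addresses are all assumptions introduced for illustration.

```python
def rank_subsets(terminal_addr, user_db, histogram, scheduled):
    """Return subsets of known individuals ordered from most to least
    probable, following the ranking described in claims 8, 10, and 11."""
    first  = user_db["by_address"].get(terminal_addr, set())  # claim 8: users tied to this address
    second = user_db["department"]                            # claim 8: first group (e.g. department)
    third  = user_db["organization"]                          # claim 8: second group (e.g. company)
    fourth = histogram.get(terminal_addr, set())              # claim 10: previous users of the terminal
    fifth  = scheduled.get(terminal_addr, set())              # claim 11: scheduled/present probable users
    # Claims 10 and 11 rank the fourth and fifth subsets above the second.
    return [first, fifth, fourth, second, third]

user_db = {
    "by_address":   {"room1@example.com": {"alice"}},
    "department":   {"alice", "bob"},
    "organization": {"alice", "bob", "carol"},
}
histogram = {"room1@example.com": {"bob"}}    # from past calls on this terminal
scheduled = {"room1@example.com": {"carol"}}  # from a scheduling/presence server
order = rank_subsets("room1@example.com", user_db, histogram, scheduled)
```

Combined with the matching loop in the abstract, this ordering means a face is first checked against the terminal's own users and expected attendees before falling back to broader department- and organization-wide groups.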
Patents cited by this patent (9)
Ishihara Ken,JPX ; Kawagoe Masahiro,JPX ; Hasegawa Ryozo,JPX, Apparatus for and method of extracting time series image information.
Ellis Michael D. (Boulder CO) Dunn Stephen M. (Boulder CO) Fellinger Michael W. (Boulder CO) Younglove Fancy B. (Boulder CO) James David M. (Fort Collins CO) Clifton David L. (Boulder CO) Land Richar, Method and apparatus for producing a signature characterizing an interval of a video signal while compensating for pictu.
Bruno Richard F. ; Gibbon David C. ; Katseff Howard P. ; Markowitz Robert E. ; Robinson Bethany S. ; Shahraray Behzad ; Stuntebeck Peter H. ; Weber Roy P., Method and apparatus for recording and indexing an audio and multimedia conference.
Shanmukhadas, Binu Kaiparambil; Kulkarni, Hrishikesh G.; Belur, Raghuram; Lakshmipathy, Sandeep, Collaborative recording of a videoconference using a recording server.
Khot, Gautam; Kulkarni, Hrishikesh G.; Ranganath, Prithvi; Belur, Raghuram; Lakshmipathy, Sandeep, Conducting a direct private videoconference within a videoconference.
Shanmukhadas, Binu Kaiparambil; Kulkarni, Hrishikesh G.; Belur, Raghuram; Lakshmipathy, Sandeep, Distributed recording or streaming of a videoconference in multiple formats.
Asati, Somnath; Naganna, Soma Shekar; Seth, Abhishek; Tomar, Vishal; Yellareddy, Shashidhar R., Face recognition in big data ecosystem using multiple recognition models.
Goyal, Ashish; Shanmukhadas, Binu Kaiparambil; Wamorkar, Vivek; King, Keith C.; Slivinski, Stefan F.; Anuar, Raphael; Pullamkottu, Boby S.; George, Sunil, Recording a videoconference using video different from the videoconference.
Shanmukhadas, Binu Kaiparambil; Goyal, Ashish; Anuar, Raphael, Streaming a videoconference from a server including boundary information for client layout adjustment.