[Paper] 비디오 시각적 관계 이해 기술 동향

권용진; 김대회; 김종희; 오성찬; 함제석; 문진영

doi:10.22648/etri.2023.j.380602

Trends in Video Visual Relationship Understanding

Electronics and Telecommunications Trends (전자통신동향분석), vol. 38, no. 6, 2023, pp. 12-21

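권용진 (Visual Intelligence Research Laboratory), 김대회 (Visual Intelligence Research Laboratory), 김종희 (Visual Intelligence Research Laboratory), 오성찬 (Visual Intelligence Research Laboratory), 함제석 (Visual Intelligence Research Laboratory), 문진영 (Visual Intelligence Research Laboratory)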

Abstract

Visual relationship understanding in computer vision enables the recognition of meaningful relationships between objects in a scene. This technology makes it possible to extract representative information from visual content. We discuss visual relationship understanding technology, focusing specifically on videos. We first introduce the concepts of visual relationship understanding in videos and then explore the latest techniques. Next, we present benchmark datasets commonly used in video visual relationship understanding. Finally, we discuss future research directions in this field.

References (38)

J. Johnson et al., "Image retrieval using scene graphs," in Proc. IEEE/CVF CVPR, (Boston, MA, USA), June 2015, pp. 3668-3678.
C. Lu et al., "Visual relationship detection with language priors," in Proc. ECCV, Oct. 2016, pp. 852-869.
R. Krishna et al., "Visual genome: Connecting language and vision using crowdsourced dense image annotations," Int. J. Comput. Vis., vol. 123, no. 1, May 2017, pp. 32-73.
J. Ji et al., "Action genome: actions as compositions of spatio-temporal scene graphs," in Proc. IEEE/CVF CVPR, June 2020, pp. 10233-10244.
Y. Zhong et al., "Comprehensive image captioning via scene graph decomposition," in Proc. ECCV, Aug. 2020, pp. 211-229.
X. Yang et al., "Auto-encoding and distilling scene graphs for image captioning," IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 5, May 2022, pp. 2313-2327.
X. Lu and Y. Gao, "Guide and interact: Scene-Graph based generation and control of video captions," Multimed. Syst., vol. 29, no. 2, Apr. 2023, pp. 797-809.
C. Zhang et al., "An empirical study on leveraging scene graphs for visual question answering," in Proc. BMVC, Sept. 2019.
L. Li et al., "Relation-aware graph attention network for visual question answering," in Proc. IEEE/CVF ICCV, Oct. 2019, pp. 10312-10321.
J. Mao et al., "Dynamic multistep reasoning based on video scene graph for video question answering," in Proc. NAACL, Jul. 2022, pp. 3894-3904.
M. Qi et al., "Online cross-modal scene retrieval by binary representation and semantic graph," in Proc. ACM MM, Oct. 2017, pp. 744-752.
M. Daum et al., "VOCAL: Video organization and interactive compositional analytics," in Proc. CIDR, Jan. 2022.
X. Chang et al., "A comprehensive survey of scene graphs: generation and application," IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 1, 2023, pp. 1-26.
O. Russakovsky et al., "ImageNet large scale visual recognition challenge," Int. J. Comput. Vis., vol. 115, no. 3, 2015, pp. 211-252.
C. Liu et al., "Beyond short-term snippet: Video relation detection with spatio-temporal global context," in Proc. IEEE/CVF CVPR, June 2020, pp. 10837-10846.
Y. Li et al., "Interventional video relation detection," in Proc. ACM MM, Oct. 2021, pp. 4091-4099.
X. Shang et al., "Video visual relation detection," in Proc. ACM MM, Oct. 2017, pp. 1300-1308.
A. Vaswani et al., "Attention is all you need," in Proc. NIPS, Dec. 2017, pp. 5998-6008.
Y.H.H. Tsai et al., "Video relationship reasoning using gated spatio-temporal energy graph," in Proc. IEEE/CVF CVPR, June 2019, pp. 10416-10425.
X. Qian et al., "Video relation detection with spatio-temporal graph," in Proc. ACM MM, Oct. 2019, pp. 84-93.
T. N. Kipf and M. Welling, "Semi-supervised classification with graph convolutional networks," in Proc. ICLR, Apr. 2017.
L. Bertinetto et al., "Fully-convolutional siamese networks for object tracking," in Proc. ECCVW, Oct. 2016, pp. 850-865.
Q. Cao et al., "3-D relation network for visual relation recognition in videos," Neurocomputing, vol. 432, 2021, pp. 91-100.
X. Shang et al., "Video visual relation detection via iterative inference," in Proc. ACM MM, Oct. 2021, pp. 3654-3663.
S. Chen et al., "Social fabric: tubelet compositions for video relation detection," in Proc. IEEE/CVF ICCV, Oct. 2021, pp. 13465-13474.
K. Gao et al., "Classification-then-grounding: Reformulating video scene graphs as temporal bipartite graphs," in Proc. IEEE/CVF CVPR, June 2022, pp. 19475-19484.
C. Lu et al., "DEBUG: A dense bottom-up grounding approach for natural language video localization," in Proc. EMNLP-IJCNLP, Nov. 2019, pp. 5144-5153.
Y. Teng et al., "Target adaptive context aggregation for video scene graph generation," in Proc. IEEE/CVF ICCV, Oct. 2021, pp. 13668-13677.
Y. Cong et al., "Spatial-temporal transformer for dynamic scene graph generation," in Proc. IEEE/CVF ICCV, Oct. 2021, pp. 16352-16363.
Y. Li et al., "Dynamic scene graph generation via anticipatory pre-training," in Proc. IEEE/CVF CVPR, June 2022, pp. 13864-13873.
S. Feng et al., "Exploiting long-term dependencies for generating dynamic scene graphs," in Proc. IEEE/CVF WACV, Jan. 2023, pp. 5119-5128.
S. Nag et al., "Unbiased scene graph generation in videos," in Proc. IEEE/CVF CVPR, June 2023, pp. 22803-22813.
L. Xu et al., "Meta spatio-temporal debiasing for video scene graph generation," in Proc. ECCV, Oct. 2022, pp. 374-390.
X. Shang et al., "Annotating objects and relations in user-generated videos," in Proc. ACM ICMR, June 2019, pp. 279-287.
B. Thomee et al., "YFCC100M: The new data in multimedia research," Commun. ACM, vol. 59, no. 2, 2016, pp. 64-73.
J. Ji et al., "Action genome: actions as compositions of spatio-temporal scene graphs," in Proc. IEEE/CVF CVPR, June 2020, pp. 10233-10244.
G. A. Sigurdsson et al., "Hollywood in homes: Crowdsourcing data collection for activity understanding," in Proc. ECCV, Oct. 2016, pp. 510-526.
J. Yang et al., "Panoptic video scene graph generation," in Proc. IEEE/CVF CVPR, June 2023, pp. 18675-18685.
