IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 5, 2021, pp. 1605-1619
Senocak, Arda (KAIST, School of Electrical Engineering, Daejeon, Republic of Korea); Oh, Tae-Hyun (POSTECH, Pohang, Korea); Kim, Junsik (KAIST, School of Electrical Engineering, Daejeon, Republic of Korea); Yang, Ming-Hsuan (University of California, Merced, CA, USA); Kweon, In So (KAIST, School of Electrical Engineering, Daejeon, Republic of Korea)
Visual events are usually accompanied by sounds in our daily lives. However, can the machines learn to correlate the visual scene and sound, as well as localize the sound source only by observing them like humans? To investigate its empirical learnability, in this work we first present a novel unsup...