PhD Defense: Natural Language Visual Grounding via Multimodal Learning
20 January 2020
Abstract: Natural language visual grounding aims to locate target objects within images given natural language queries, creating a natural communication channel between human beings, physical environments, and intelligent agents. This thesis exploits multimodal learning-based approaches to achieve natural language visual grounding without auxiliary information such as dialogues and gestures. To this end, it proposes three architectures to ground explicit natural language queries and intention-related natural language instructions.
Monday, 20 January 2020, 12:30, Seminar Room F334, Informatikum, Hamburg
Speaker: Jinpeng Mi, Informatics Dept., Univ. Hamburg