Chongyan Chen: Dissertation Defense

Event Status
Scheduled

This defense will be a virtual event, taking place on Zoom. The Zoom link for the defense will be shared one day before the event. The dissertation draft is available in UT Box.

Title: Visual Question Answering: Representing Authentic Use Cases and Ambiguity

Abstract: Visual question answering (VQA) is a popular artificial intelligence (AI) task of having a machine return the answer to any natural language question about any image. Existing datasets supporting model development have two limitations: they (1) are typically contrived and (2) do not acknowledge ambiguity, which leads to multiple valid answers. This dissertation fills these gaps. First, two new VQA datasets are introduced that are sourced from authentic use cases representing people with visual impairments and online question-answering communities. Analysis of these datasets and modern models' performance on them reveals new challenges for the research community, such as handling cases where the visual evidence occupies only a small fraction of the image and where answers are long. The second part of this dissertation introduces two new VQA datasets that enable the development of ambiguity-aware AI models, with the first visually grounding 10 answers corresponding to each visual question and the second grounding all regions to which the language in the question could refer. Novel AI tasks include predicting whether multiple answer/question groundings exist and localizing them all. By addressing authentic use cases and ambiguity, this dissertation establishes a foundation for developing VQA systems better equipped to handle the diversity and uncertainty inherent in real-world communications about visual content.

Dissertation Committee: Danna Gurari (Chair), Ying Ding (Co-Supervisor, School of Information), Kenneth R. Fleischmann (School of Information), Min Kyung Lee (School of Information), Amy Pavel (Department of Computer Science)

Date and Time
April 8, 2025, 11 a.m. to 1 p.m.
Location