New Class of PhD Graduates Advances Research Frontiers at the iSchool

Published: June 9, 2025
Anubrata Das, Siqi Yi, and Chongyan Chen

The University of Texas at Austin School of Information is proud to introduce its latest cohort of PhD graduates, who have successfully navigated the rigors of the dissertation process to emerge as new leaders in the field. These scholars are engaged in important research projects that advance, in new and exciting ways, conceptual and practical approaches to information science, where humans and technology meet. Their achievements are testament not only to their hard work but also to their imagination and foresight in tackling contemporary challenges.

One standout research project, by new PhD Anubrata Das, seeks to build transparent natural language processing tools that enable more effective human-AI collaboration, with a focus on the task of fact-checking. Another notable dissertation, by Siqi Yi, observes children’s search as learning when using voice assistant technology and offers suggestions for design tailored to children’s unique user experiences. A third exciting new project, by Chongyan Chen, pushes the boundaries of Visual Question Answering (VQA) in real-world contexts, introducing and benchmarking new VQA datasets collected from everyday scenarios, particularly benefiting visually impaired users.

As these new PhDs embark on their professional journeys, they are poised to make substantial contributions to academia, industry, and public service. Their expertise in the increasingly crucial field of information science equips them with tools to tackle challenges in diverse realms. The iSchool takes pride in their accomplishments and looks forward to celebrating their continued impact on the global stage.


Anubrata Das

Before coming to UT to pursue his PhD, Das worked with AI in industry and academic contexts, where he came to see the black-box nature of advanced natural language processing (NLP) products as a roadblock to user adoption. “As deep-learning models get really powerful, it is important that we actually understand how these things work under the hood,” he observes.

Das titled his dissertation, written under advisors Matthew Lease (iSchool) and Junyi Jessy Li (Department of Linguistics), “Towards Human-Centered and Trustworthy Natural Language Processing.” Das regards effective human-AI teaming in NLP as “an open question,” hampered by the challenge of developing trust in the opaque technology. For example, Das worked with real-world fact-checkers, who do the important work of shielding millions of people around the world from misleading information while preserving norms of freedom of expression. When an NLP tool for AI fact-checking fails, professionals in the field want to know why and how to fix it, or they won’t use the product again.

To that end, borrowing techniques from the field of computer vision, Das built ProtoTEx, a novel, interpretable NLP model that explains its classification decisions in terms of examples from the training data. Das sees it as a step toward improving interpretability and, in turn, trustworthiness.
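The general prototype-based idea can be sketched in a few lines of code. The sketch below is illustrative only and is not the published ProtoTEx implementation: it assumes a transformer-style text encoder, and names such as PrototypeClassifier and num_prototypes are hypothetical. The key point is that a prediction is computed from the input’s similarity to a small set of learned prototype vectors, each of which can be mapped back to its nearest training examples to serve as an explanation.

```python
# Illustrative sketch only -- not the published ProtoTEx code.
# It shows the core prototype idea: classify via similarity to learned
# prototype vectors, which can later be matched to nearby training examples.
import torch
import torch.nn as nn


class PrototypeClassifier(nn.Module):
    def __init__(self, encoder, hidden_dim, num_prototypes, num_classes):
        super().__init__()
        self.encoder = encoder  # e.g., a transformer text encoder (assumption)
        # Learned prototype vectors living in the encoder's embedding space.
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, hidden_dim))
        # The classifier sees only prototype similarities, so every prediction
        # can be traced back to specific prototypes and, from there, to the
        # training examples closest to them.
        self.classifier = nn.Linear(num_prototypes, num_classes)

    def forward(self, **inputs):
        # Use the first token's embedding as a sentence representation.
        hidden = self.encoder(**inputs).last_hidden_state[:, 0]  # (batch, hidden_dim)
        # Negative squared distance to each prototype acts as a similarity score.
        similarities = -torch.cdist(hidden, self.prototypes).pow(2)
        logits = self.classifier(similarities)
        return logits, similarities  # similarities support example-based explanations
```

At inference time, the prototypes most responsible for a prediction can be shown alongside the training sentences closest to them, which is the sense in which such a model explains its outcomes in terms of the training data.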

The initial idea for ProtoTEx’s design came from an undergraduate colleague at UT with experience studying how computers see the world through cameras. “We looked into the computer-vision literature for interpretability, because they were doing a lot of really good stuff at that time,” Das says. “We wanted to get the model to perform as well as the black-box models do. It was a lot of trying things out, until we got to a point where we figured out how to make this thing work.”

In the subsequent phase of his dissertation, Das endeavored to evaluate the helpfulness of ProtoTEx and similar explainable AI methodologies for human decision-making in the complex, real-world domain of content moderation. In the third phase, he experimented with localizing and removing toxic generation behavior in large language models without retraining them from scratch. Along the way, he was struck by the ongoing roadblocks to the uptake of explainable, trustworthy AI. “It's been really challenging to get positive results for human-centered experiments,” he says.

To address that real-world challenge, in a final phase of his dissertation research, Das, collaborating with other HCI researchers, organized co-design sessions with actual professionals in the content-moderation field. “We sat with fact-checkers, professionals across different countries who are responsible for curbing misinformation online,” he says. “When we sat down with them and tried to design tools, we figured out that there is a gap between what current models can do and what they're looking for.”

Co-designing with these professionals, Das, who is headed next to a postdoc at the McCombs School of Business at UT, brainstormed tools that fact-checkers would find more appealing to work alongside in human-AI teaming. “Hopefully, it can translate to real-world impact, where fact-checkers work with these AI tools and it helps us fight misinformation at scale a little better than we can today,” he says.


Siqi Yi

Yi first became interested in how children use voice assistant technology like Amazon Alexa around 2019 and 2020, when she worked with professors Ken Fleischmann and Jakki Bailey on a research project focused on digital assistant use by Latino and African-American youth. During the pandemic, Yi noticed that kids participating in distance learning had a growing need and opportunity to use AI voice technology. That compelled Yi to pursue a deeper study of children’s search as learning with these devices.

Yi wrote her dissertation, “Understanding Children’s Search as Learning with Voice Assistants in the Home Environment,” under faculty advisor Dr. Soo Young Rieh. The project featured a two-week study of a diverse population of children, aged 6 to 10, none of whom had previously had voice assistants in their homes. Yi gave each family a second-generation Google Nest Mini smart speaker and encouraged the children to use it daily and to keep a diary of their experiences. Immediately before and after the two-week study period, the children also attended sessions with Yi involving interviews, questionnaires and information search tasks.

By analyzing the device logs alongside diaries, questionnaires, interviews and monitored information search tasks, Yi was able to achieve a well-rounded analysis of the children’s learning during the search processes. “It’s hard for children to be articulate sometimes, and also hard for them to keep focused, so you have to think of creative ways to do the study,” says Yi, who credits Dr. Rieh for help with the study design. “The variety of collection methods helped me to triangulate the data I got from the children.”

Yi found that, after using voice assistants for two weeks, children were more curious to learn new information from the devices, but less confident and motivated to use them. This seeming contradiction – a reluctance to engage with the technology even as they regarded it as fascinating and potentially useful – seemed to arise from the children’s lack of confidence in communicating with the devices.

“Since it’s a commercial product for the general public, sometimes children find it difficult to interact with the smart speaker, because it can’t recognize children’s speech,” Yi says. “That can make children feel a bit upset. If it happens too much, they may think, ‘Is it because I didn’t ask a good question?’” She also notes that device responses used phrasing and concepts scraped from Wikipedia and other sources that are too advanced for elementary-grade children and can intimidate them. 

With her PhD in hand, Yi is headed back to her home country, China, where she has a job offer as a product manager at a tech company working with smart speakers. She sees her work at the iSchool as informing future smart speaker design. 

“We definitely need to incorporate more data from children’s daily conversations to train the algorithm to improve automatic speech recognition for children,” Yi says. “Also, we need to incorporate scaffolding features to guide and encourage children to learn more about the information. It’s easy for them to get some quick answers. But if they want to learn more, the current devices are not good enough.” 


Chongyan Chen

Chen’s passion for visual question answering (VQA) began during a course taught by Danna Gurari, who later became her PhD advisor. VQA is the task of teaching machines to predict the answer to any natural language question about any image, a task with applications across domains like healthcare, accessibility, and education.
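To make the task concrete, here is a minimal example of a VQA query using a publicly available model from the Hugging Face transformers library. This is an off-the-shelf baseline shown for illustration only, not one of Chen’s systems; the image file name and the printed answer are hypothetical.

```python
# Minimal VQA example with an off-the-shelf model (illustration only).
from PIL import Image
from transformers import ViltProcessor, ViltForQuestionAnswering

processor = ViltProcessor.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
model = ViltForQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")

image = Image.open("kitchen.jpg")   # hypothetical local photo
question = "What is on the counter?"

inputs = processor(image, question, return_tensors="pt")
logits = model(**inputs).logits
answer = model.config.id2label[logits.argmax(-1).item()]
print(answer)                       # e.g., "coffee maker" (hypothetical output)
```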

In Gurari’s class, Chen explored the app Be My Eyes, which connects blind users with sighted volunteers. “I pretended to be visually impaired and asked questions about photos I took of my surroundings. The kindness and helpfulness of the volunteers left a lasting impression, inspiring me to pursue VQA not just as a technical challenge but as a human-centered task aimed at improving people’s quality of life, especially for those with visual impairments,” Chen recalls. 

In her dissertation, titled “Visual Question Answering: Representing Authentic Use Cases and Ambiguity,” Chen focuses on enhancing the robustness of VQA AI systems in real-world settings. One problem she recognized early in her exploration is that most existing VQA datasets are created in contrived ways. For example, questions are often artificially generated or crowdsourced under constraints, rather than reflecting real-world needs. Consequently, VQA models trained on these datasets often underperform in real-world settings.

To bridge this gap, Chen introduced two novel VQA datasets derived from authentic use cases: one focused on answer grounding for visually impaired users and another sourced from an online question-answering platform—the first VQA dataset fully collected from real-world scenarios. Through comprehensive analysis and benchmarking, Chen identified new challenges that existing models struggle with, paving the way for training models better suited to real-world applications. 

One major insight from her research is that ambiguity is inherent in human communication, especially in real-world scenarios. “Humans often communicate ambiguously to conserve effort while conveying ideas with minimal context. This reduces cognitive load but poses challenges for AI, requiring systems that understand user intentions and adapt to ambiguity,” she explains. While most VQA research treats ambiguity as noise to be removed, Chen instead suggests embracing it to create a more robust system. 

As part of her dissertation research, Chen collected ambiguous questions and introduced two datasets designed to recognize and localize both question and answer ambiguity. Chen developed a schema to categorize the causes of ambiguity and tested existing VQA algorithms against these datasets. “It was surprising to see how poorly current models handle ambiguity, which can have serious consequences,” she says. 
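One way to see why multiple defensible answers complicate evaluation is the widely used “soft” VQA accuracy metric (Antol et al., 2015), which gives a prediction full credit only if enough human annotators gave the same answer. The sketch below shows that metric in simplified form; it is not necessarily the evaluation protocol Chen used.

```python
# Simplified version of the standard soft VQA accuracy (Antol et al., 2015).
# Shown for illustration; Chen's evaluation setup may differ.
from collections import Counter


def vqa_accuracy(prediction: str, human_answers: list[str]) -> float:
    """Full credit if at least 3 annotators gave the predicted answer."""
    counts = Counter(answer.strip().lower() for answer in human_answers)
    return min(counts[prediction.strip().lower()] / 3.0, 1.0)


# An ambiguous visual question often collects several valid human answers:
answers = ["laptop", "computer", "laptop", "notebook", "laptop",
           "computer", "laptop", "macbook", "laptop", "computer"]
print(vqa_accuracy("laptop", answers))    # 1.0  -- majority answer
print(vqa_accuracy("computer", answers))  # 1.0  -- also credited (3 annotators)
print(vqa_accuracy("notebook", answers))  # 0.33 -- minority but plausible answer
```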

“Overall, my dissertation advances our understanding of the kinds of visual questions users naturally ask in daily life and provides insights into user profiles, intentions, question types, image variability and realistic behavior. Additionally, it deepens our understanding of ambiguity in visual question answering and provides a foundation for developing systems that are better equipped to handle the diverse and ambiguous environments encountered in real-world applications,” Chen says. 

Looking ahead, Chen’s overarching research ambition is to develop a multimodal question answering system that delivers user-friendly responses—whether as short textual answers, long-form textual answers, visual answers, or a combination of textual and visual answers. This system should also address real-world challenges, including providing explanations, mitigating ambiguity, and handling multiple possible valid answers via visual grounding. After completing her PhD, she will join Google DeepMind as a research scientist, continuing her quest to bridge the gap between VQA research and real-world application.

News tags: Research
