When
Noon – 1:30 p.m., Feb. 20, 2026
Faithfulness of LLM Reasoning and Its Open Questions
Abstract: Large language models (LLMs) can generate step-by-step reasoning, and these verbalizations are increasingly treated as windows into how the model actually works. However, what an LLM displays may have little to do with the computation that produced its response. This problem is reminiscent of the gap between human verbal reports and underlying cognitive processes. Measuring this gap in LLMs with billions of parameters trained on trillions of tokens is, unsurprisingly, difficult. The dominant approach of corrupting a verbalized reasoning chain and checking whether the answer changes is confounded by knowledge stored in the model's parameters, likely leading the field to underestimate faithfulness. We take a different path, intervening directly on the model's parameters rather than on what it generates, and arrive at a more favorable picture.
I'll also discuss how the landscape is shifting. Today's reasoning traces are vastly longer than the short chains that faithfulness research has focused on, and they are central to agentic systems that revise strategies, execute actions, and operate over extended interactions, raising questions about what faithfulness measurement should even look like going forward. And given that people manage to collaborate despite imperfect introspective access to their own cognitive processes, it may be worth asking not just how faithful model reasoning is, but when unfaithfulness actually matters.
Bio: Ana Marasović is an Assistant Professor in the Kahlert School of Computing at the University of Utah. Her research interests broadly fall into natural language processing (NLP), human-centered AI, and interpretability. Previously, she was a Young Investigator at the Allen Institute for AI and the University of Washington (by courtesy), and she completed her PhD at Heidelberg University. She is a recipient of the Outstanding Paper Award at EMNLP 2025, the Best Paper Award at ACL 2023, the Best Paper Honorable Mention at ACL 2020, and the Best Paper Award at the SoCal NLP Symposium 2022. Her work was selected as a CoLM 2025 Spotlight. She is also a University of Utah One-U Responsible AI Initiative Faculty Fellow and was named a 2020 Rising Star in EECS by UC Berkeley.
Contacts
Steven Bethard