Alex Havrilla
Title: Towards a Theory and Practice of Open-ended Reasoning with Generative Models
Date: 6/13/2025
Time: 1 PM
Location:
- In-person: Skiles 202
- Remote: https://gatech.zoom.us/j/5472104648
Alexander Havrilla
Machine Learning PhD Student
School of Mathematics
Georgia Institute of Technology
Committee
1 Dr. Wenjing Liao, School of Mathematics (Advisor), Georgia Tech
2 Dr. Mark Riedl, School of Interactive Computing, Georgia Tech
3 Dr. Tuo Zhao, School of Industrial and Systems Engineering, Georgia Tech
4 Dr. Jacob Abernethy, School of Computer Science, Georgia Tech
5 Dr. David Alvarez-Melis, School of Engineering and Applied Sciences, Harvard
Abstract
Driven by advances in large language models (LLMs), the last several years have seen an explosion in AI reasoning capability. In this dissertation, we characterize two distinct types of reasoning: closed-ended reasoning and open-ended reasoning. We define closed-ended reasoning as the systematic application of a defined set of rules to reach a desired outcome. In contrast, we describe open-ended reasoning as a less structured process, often requiring the creation of new rule sets or the adaptation of existing ones, and characterized by a greater need for exploration and discovery. While LLMs increasingly excel at closed-ended reasoning, they struggle more with problems requiring the open-ended counterpart. We study both types of reasoning in three parts. First, we establish novel approximation and statistical theory for LLMs. This theory elucidates data complexity as a driving factor behind scaling laws, which in turn strongly shape downstream reasoning ability. Second, to improve reasoning ability in practice, we develop trlX, a novel reinforcement learning (RL) framework used to fine-tune LLMs on reasoning problems. Our analysis reveals the exploration ability of LLMs as a key bottleneck to further improvement via RL. Third, motivated by this bottleneck, we propose SPARQ: a self-improvement-style synthetic data generation algorithm drawing on techniques from the quality-diversity (QD) literature to improve both the correctness and diversity of LLM reasoning. We conclude by discussing open problems and future directions for better open-ended AI reasoning.