Title: Physics-Based Simulation for World Modeling Across Scales and Modalities
Date: Wednesday, April 15, 2026
Time: 1:00 PM - 3:00 PM Eastern Time (US)
Location: Coda C0908 Home Park
Zoom: https://gatech.zoom.us/j/95930183632
Committee:
Dr. Bo Zhu (Advisor) – School of Interactive Computing, Georgia Institute of Technology
Dr. Greg Turk – School of Interactive Computing, Georgia Institute of Technology
Dr. Yalong Yang – School of Interactive Computing, Georgia Institute of Technology
Duowen Chen
Ph.D. Student
School of Interactive Computing
Georgia Institute of Technology
Abstract: Physical simulation plays a central role in computer graphics and scientific computing, but existing methods often remain fragmented across scales, representations, and sensory outputs. This thesis develops a physics-grounded approach to world modeling that connects accurate simulation, expressive representation, and multimodal generation. On the simulation side, I present completed work on dynamic interface tracking, solid–fluid coupling, and compressible flow simulation, using neural implicit representations and flow-map-based transport to improve geometric fidelity, long-horizon advection accuracy, and the modeling of complex physical interactions. Building on this foundation, I propose two new directions that extend physical simulation into richer multimodal world representations. The first develops a memory-augmented framework for wildfire world modeling, where physically meaningful simulations are translated into realistic digital-twin videos and combined with retrieval and vision-language reasoning to support environment understanding and structured reporting. The second develops a framework for spatial sound generation in 3D worlds, integrating layered scene reconstruction, semantically grounded sound synthesis, source localization, and physics-guided acoustic rendering to generate audio consistent with scene geometry and listener motion. Taken together, these contributions advance a broader view of world modeling in which physical simulation is not only a tool for reproducing dynamics, but also a foundation for building coherent visual, geometric, and auditory representations across scales and modalities.