Title: Physics-Based Simulation for World Modeling Across Scales and Modalities

 

Date: Wednesday, April 15, 2026

Time: 1:00 PM - 3:00 PM Eastern Time (US)

Location: Coda C0908 Home Park

Zoom: https://gatech.zoom.us/j/95930183632

 

Committee  
Dr. Bo Zhu (Advisor) – School of Interactive Computing, Georgia Institute of Technology

Dr. Greg Turk – School of Interactive Computing, Georgia Institute of Technology

Dr. Yalong Yang – School of Interactive Computing, Georgia Institute of Technology

  

Duowen Chen

Ph.D. Student

School of Interactive Computing 

Georgia Institute of Technology

Abstract: Physical simulation plays a central role in computer graphics and scientific computing, but existing methods often remain fragmented across scales, representations, and sensory outputs. This thesis develops a physics-grounded approach to world modeling that connects accurate simulation, expressive representation, and multimodal generation. On the simulation side, I present completed work on dynamic interface tracking, solid–fluid coupling, and compressible flow simulation, which uses neural implicit representations and flow-map-based transport to improve geometric fidelity, long-horizon advection accuracy, and the modeling of complex physical interactions. Building on this foundation, I propose two new directions that extend physical simulation into richer multimodal world representations. The first is a memory-augmented framework for wildfire world modeling, in which physically meaningful simulations are translated into realistic digital-twin videos and combined with retrieval and vision-language reasoning to support environment understanding and structured reporting. The second is a framework for spatial sound generation in 3D worlds that integrates layered scene reconstruction, semantically grounded sound synthesis, source localization, and physics-guided acoustic rendering to generate audio consistent with scene geometry and listener motion. Taken together, these contributions advance a broader view of world modeling in which physical simulation is not only a tool for reproducing dynamics but also a foundation for building coherent visual, geometric, and auditory representations across scales and modalities.