Title: Scalable and Efficient Graph Learning for Dynamic, Heterogeneous, and Knowledge-Augmented Graphs
Date: Thursday, April 3, 2025
Time: 11:00 AM – 12:00 PM ET
Location (Hybrid):
- Coda C0903 Ansley
- Microsoft Teams (Meeting ID: 262 709 547 619, Passcode: rK6s9UP7)
Mingyu Guan
CS Ph.D. Student
School of Computer Science
College of Computing
Georgia Institute of Technology
Committee:
- Dr. Taesoo Kim (advisor) - School of Cybersecurity and Privacy, Georgia Institute of Technology
- Dr. Anand Iyer (co-advisor) - School of Computer Science, Georgia Institute of Technology
- Dr. Ada Gavrilovska - School of Computer Science, Georgia Institute of Technology
- Dr. Kexin Rong - School of Computer Science, Georgia Institute of Technology
- Dr. Jay Stokes - Microsoft Research
Abstract:
Graphs are a fundamental component of modern machine learning for structured data, driving advancements in areas such as recommendation systems, fraud detection, and traffic prediction. However, real-world graphs are often dynamic, heterogeneous, and knowledge-rich, presenting significant challenges in scalability, efficiency, and adaptability. This thesis addresses these challenges through innovations in dynamic graph learning, heterogeneous graph modeling, and retrieval-augmented generation with knowledge graphs.
First, this thesis introduces ReD, a system designed for efficient and scalable training of Dynamic Graph Neural Networks (DGNNs). By reusing intermediate results, incrementally computing aggregations across graph snapshots, and eliminating communication overhead in distributed training, ReD enables DGNNs to scale to massive dynamic graphs while achieving up to an order-of-magnitude speedup over existing frameworks.
Second, this thesis presents HetTree, a novel approach for scalable and expressive Heterogeneous Graph Neural Networks (HGNNs). Unlike existing methods that treat metapaths independently, HetTree models their hierarchical relationships using a semantic tree structure and enhances representation learning with a subtree attention mechanism. This approach significantly improves both efficiency and predictive performance across large-scale heterogeneous graphs.
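The incremental aggregation across snapshots that ReD builds on can be pictured with a minimal sketch; this is not ReD's actual implementation, and the function names, sum aggregation, and static node features below are simplifying assumptions. The point is that snapshot t+1 reuses snapshot t's aggregation and applies only the edge deltas instead of recomputing from scratch.

import numpy as np

def full_aggregate(edges, feats):
    # Recompute the sum of in-neighbor features for every node (first snapshot).
    agg = np.zeros_like(feats)
    for src, dst in edges:
        agg[dst] += feats[src]
    return agg

def delta_aggregate(agg_prev, added_edges, removed_edges, feats):
    # Reuse the previous snapshot's aggregation and apply only the edge deltas,
    # avoiding a full pass over the (possibly massive) new snapshot.
    agg = agg_prev.copy()
    for src, dst in added_edges:
        agg[dst] += feats[src]
    for src, dst in removed_edges:
        agg[dst] -= feats[src]
    return agg

# Hypothetical toy example: full pass on snapshot t, incremental update for t+1.
feats = np.random.rand(5, 8)                       # node features (assumed unchanged here)
agg_t = full_aggregate([(0, 1), (1, 2), (3, 2)], feats)
agg_t1 = delta_aggregate(agg_t, added_edges=[(4, 2)], removed_edges=[(0, 1)], feats=feats)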
Building upon these foundations, this thesis explores Graph-Based Retrieval-Augmented Generation (RAG) for Large Language Models (LLMs), addressing a fundamental limitation of current RAG systems: their reliance on unstructured text data and vector similarity matching. Conventional RAG pipelines often fail to capture complex connections among entities across large corpora and struggle to generate comprehensive and contextually relevant responses. By leveraging graph structures to enhance retrieval quality and infusing global knowledge into LLM-based generation, this work aims to improve response quality, coherence, and diversity in knowledge-augmented AI systems. Together, these contributions provide novel methodologies that harness graph structures to tackle dynamic, heterogeneous, and knowledge-intensive tasks, laying the groundwork for more scalable, efficient, and adaptive AI systems.
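One way to picture the retrieval side of a graph-based RAG system is the hypothetical sketch below; it is not the thesis's pipeline, and the knowledge graph, entity names, and helper functions are illustrative. Retrieval expands from the entities mentioned in a query over a knowledge graph, so the LLM prompt carries linked facts rather than isolated text chunks ranked only by vector similarity.

import networkx as nx

def build_kg(triples):
    # Build a directed knowledge graph from (head, relation, tail) triples.
    kg = nx.DiGraph()
    for head, rel, tail in triples:
        kg.add_edge(head, tail, relation=rel)
    return kg

def graph_retrieve(kg, seed_entities, hops=1):
    # Collect triples reachable within `hops` of the entities matched in the query.
    frontier, facts = set(seed_entities), []
    for _ in range(hops):
        next_frontier = set()
        for node in frontier:
            for _, nbr, data in kg.out_edges(node, data=True):
                facts.append((node, data["relation"], nbr))
                next_frontier.add(nbr)
        frontier = next_frontier
    return facts

# Hypothetical usage: facts connected to "Alice" are serialized into the LLM prompt.
kg = build_kg([("Alice", "works_at", "AcmeBank"),
               ("AcmeBank", "flagged_for", "fraud_ring_17")])
context = graph_retrieve(kg, seed_entities=["Alice"], hops=2)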