Title: Learning Dynamic Priority Scheduling Policies with Graph Attention Networks
Date: Wednesday, December 7, 2022
Time: 9 a.m. – 11 a.m. ET
Location: https://gatech.zoom.us/j/97223685875
Zheyuan Wang
PhD Candidate
School of Electrical and Computer Engineering
Georgia Institute of Technology
Committee
Dr. Matthew Gombolay (Advisor) - School of Interactive Computing, Georgia Institute of Technology
Dr. Matthieu Bloch (Co-Advisor) - School of Electrical and Computer Engineering, Georgia Institute of Technology
Dr. Sonia Chernova - School of Interactive Computing, Georgia Institute of Technology
Dr. Magnus Egerstedt - Department of Electrical Engineering and Computer Science, University of California, Irvine
Dr. Harish Ravichandar - School of Interactive Computing, Georgia Institute of Technology
Dr. Elias Khalil - Department of Mechanical and Industrial Engineering, University of Toronto
Abstract
This thesis develops novel graph attention network-based models that automatically learn scheduling policies for resource optimization problems in both deterministic and stochastic environments. The policy learning methods use imitation learning when expert demonstrations are accessible at low cost, and reinforcement learning otherwise, when reward engineering is feasible. Parameterizing the learner with graph attention networks makes the framework computationally efficient and yields scalable resource optimization schedulers that adapt to various problem structures. The thesis addresses the problem of multi-robot task allocation (MRTA) under temporospatial constraints. First, I consider robots with deterministic and homogeneous task performance and develop the RoboGNN scheduler. Then, I develop ScheduleNet, a novel heterogeneous graph attention network model, to efficiently reason about coordinating teams of heterogeneous robots. Next, I address the more challenging stochastic setting in two parts. Part 1) Scheduling with stochastic and dynamic task completion times: the MRTA problem is extended by introducing human coworkers with dynamic learning curves and stochastic task execution. I develop HybridNet, a hybrid network structure that combines a heterogeneous graph-based encoder with a recurrent schedule propagator, to carry out fast schedule generation in multi-round settings. Part 2) Scheduling with stochastic and dynamic task arrival and completion times: with an application in failure-predictive plane maintenance, I develop a heterogeneous graph-based policy optimization (HetGPO) approach that enables learning robust scheduling policies in highly stochastic environments. Through extensive experiments, the proposed framework is shown to outperform prior state-of-the-art algorithms across these applications.
My research contributes several key innovations in the design of graph-based learning algorithms for operations research.