Matthew Rines
(Advisor: Prof. Dimitri Mavris)
will defend a doctoral thesis entitled,
A Methodology for Resilience-Based Design
Of An Environmental Control and Life Support System
On
Monday, April 17 at 11:30 a.m. EDT
Microsoft Teams Link
Abstract
A space habitat provides life support to the crew during both normal operation and unexpected circumstances so that the crew can carry out the scientific duties of their mission. As space habitats become more complex and are located farther from Earth, ensuring resilient system performance and crew safety becomes increasingly challenging. These challenges make autonomous resource allocation decision-making beneficial.
An autonomous resource allocation algorithm reduces the crew's workload and the reliance on terrestrial decision-making. The hostility of space demands rapid decision-making without requiring humans in the loop. Furthermore, because not all possible situations can be planned for, the resource allocation strategy must be able to adapt and learn from experience. At the same time, the methods used to autonomously learn about changes in the environment must not compromise the safety of the crew.
To meet these goals, a methodology was devised and is herein presented for developing resilient resource allocation strategies for an environmental control and life support system (ECLSS) and integrating them into the design of space habitats. Reinforcement learning techniques were investigated to enable rapid ECLSS resource allocation in both nominal and off-nominal situations. Through hyperparameter optimization studies, agents were successfully trained to operate an oxygen pressure control assembly in simulation, with soft actor-critic (SAC) performing the task using the fewest computational resources. The methodology includes the ability to learn from data collected during operation as well as from a priori training simulations, in order to better respond to disturbances that the resource allocation algorithm had not previously experienced in simulation, and to other non-stationary dynamics. Specifically, the SAC-trained agent was further demonstrated to adapt in a simulated deployment to changes in crew oxygen consumption, cabin leak rates, and degraded injector capabilities.
The learning rate and reward function were found to have strong effects on the agent's ability to adapt to disturbances in real time. A range of learning rates was tested to find the optimal balance between responsiveness and volatility. Additionally, two reward functions, using linearly and quadratically increasing penalties, were compared under the same set of hyperparameters; the linear penalty performed much better in enabling adaptation in this scenario.
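The two penalty shapes being compared can be illustrated with a minimal sketch. The function names, the setpoint, and the scaling are illustrative assumptions for exposition only, not the reward functions implemented in the thesis:

```python
# Illustrative sketch (assumed names and values, not the thesis implementation):
# both rewards penalize deviation of cabin O2 pressure from a setpoint, one
# linearly and one quadratically.

def linear_penalty_reward(pressure: float, setpoint: float) -> float:
    """Reward decreases linearly with distance from the setpoint."""
    return -abs(pressure - setpoint)

def quadratic_penalty_reward(pressure: float, setpoint: float) -> float:
    """Reward decreases quadratically with distance from the setpoint."""
    return -((pressure - setpoint) ** 2)
```

Near the setpoint the quadratic penalty is flatter (deviations below 1 unit are penalized less than linearly), while far from the setpoint it grows much faster; this difference in gradient shape is one plausible reason the two rewards drive adaptation differently under identical hyperparameters.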
Lastly, the methods found to perform best for resilient control of the oxygen pressure control assembly were applied to a more complex habitat. In this demonstration use case, agents were trained to control the oxygen generation system, crew activities, and the oxygen pressure control assembly. An agent was successfully trained to operate these systems in a nominal simulation, but its ability to adapt in real time to unexpected disturbances requires additional configuration and is left to future work.
This research demonstrates that a resource allocation algorithm utilizing reinforcement learning can be developed to improve the resilience of an existing space habitat or one that is yet to be designed. By incorporating ECLSS decision-making algorithms early in the design process, a better design can be found earlier and more cost-effectively.
Committee
- Prof. Dimitri Mavris – School of Aerospace Engineering (advisor)
- Prof. Glenn Lightsey – School of Aerospace Engineering
- Prof. Koki Ho – School of Aerospace Engineering
- Dr. Rodney Martin – NASA Ames Research Center
- Dr. Michael Balchanos – School of Aerospace Engineering