Title: Designing from Data to Discovery: Human-Centered Machine Learning for Interpretable Scientific Data Exploration
Date: April 22, 2025
Time: 1:00PM – 3:00PM EDT
Location: CODA 233
Zoom Link:
https://gatech.zoom.us/j/96027379785?pwd=xdrixi76Bht14sta3gwZLj4VXfFLla.1
Austin P. Wright
Machine Learning PhD Student
School of Computational Science and Engineering
Georgia Institute of Technology
Committee
1 Dr. Polo Chau (Advisor), School of Computational Science and Engineering, College of Computing, Georgia Institute of Technology
2 Dr. B. Aditya Prakash, School of Computational Science and Engineering, College of Computing, Georgia Institute of Technology
3 Dr. Kai Wang, School of Computational Science and Engineering, College of Computing, Georgia Institute of Technology
4 Dr. Alex Endert, School of Interactive Computing, College of Computing, Georgia Institute of Technology
5 Dr. Scott Davidoff, Human-Computer Interaction Institute, Carnegie Mellon University
Abstract
It is often overlooked that scientific discovery is a social activity carried out by people; designing the statistical and computational tools that these people use to explore novel and complex data therefore requires a human-centered approach. While modern data mining methods purport to assist in this endeavor by making more data types amenable to visualization and analysis, these methods, when straightforwardly applied, very frequently do not solve the problems that scientists actually face in their workflows. What is needed are better human-centered principles of applied data science that take into account this divide between scientific users and existing machine learning problem formulations. This thesis contributes toward precisely that goal, using extensive embedded fieldwork to identify and understand the needs of specific groups of scientists across multiple domains, and designing new machine learning tools to address them. From this concrete basis, I develop frameworks for centering people in the meta-process of designing machine learning models within the total context of scientists’ actual processes of scientific discovery: a synthesis of machine learning (ML) theoretic and human-computer interaction (HCI) methodological frameworks. This work is therefore structured into two interrelated thrusts:
(1) Human-Centered Discovery Frameworks, where, based on embedded user research on scientists working collaboratively in context, I develop frameworks for understanding the human processes of scientific discovery, model how ML systems interact with these processes, and create guidelines for improving the design of such systems.
(2) Interpretable ML for Exploratory Science, in which I use these frameworks to collaborate with scientists and develop novel interpretable ML methods that address the particular problems of scientific users doing exploratory data analysis. Altogether, this work contributes to scholarship in ML, HCI, and scientific domains.