Name: Ananya Kumar, Ph.D. Candidate at Stanford University
Date: Thursday, March 16, 2023 at 11:00 am
Location: Scheller College of Business, Room 102
Link: This seminar is an in-person event only. However, the seminar will be recorded and uploaded to the School of Computational Science and Engineering channel on Georgia Tech MediaSpace following the presentation.
Title: Foundation Models for Robustness to Distribution Shift
Abstract: Machine learning systems are not robust: they suffer large drops in accuracy when deployed in environments different from those they were trained on. In this talk, I show that the foundation model paradigm (adapting models that are pretrained on broad unlabeled data) is a principled solution that leads to state-of-the-art robustness. I will focus on the key ingredients: how we should pretrain and adapt models for robustness. (1) First, I show that contrastive pretraining on unlabeled data learns transferable representations that improve accuracy even on domains for which we had no labels. We explain why pretraining works in a very different way from classical intuitions of collapsing representations (domain invariance). Our theory predicts phenomena on real datasets and leads to improved methods. (2) Next, I will show that the standard approach to adaptation (updating all of the model's parameters) can distort pretrained representations and perform poorly out-of-distribution. Our theoretical analysis leads to better methods for adaptation and state-of-the-art accuracies on ImageNet and in applications such as satellite remote sensing, wildlife conservation, and radiology.
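To make the contrast between the two adaptation strategies mentioned in the abstract concrete, here is a minimal PyTorch sketch, not code from the talk: the backbone, head, layer sizes, and training loop are illustrative assumptions, standing in for a pretrained encoder adapted either by training only a new linear head (leaving pretrained features untouched) or by updating every parameter (the standard approach the abstract argues can distort pretrained representations).

```python
import torch
import torch.nn as nn

# Hypothetical pretrained backbone: a tiny MLP standing in for a
# contrastively pretrained encoder; dimensions are made up for illustration.
backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64))
head = nn.Linear(64, 10)  # new task-specific classification head

def linear_probe_params(backbone, head):
    """Strategy 1: freeze the pretrained representation and train only the
    head, so the pretrained features cannot be distorted during adaptation."""
    for p in backbone.parameters():
        p.requires_grad = False
    return list(head.parameters())

def full_finetune_params(backbone, head):
    """Strategy 2: update all parameters; this is the standard approach that
    the abstract argues can hurt out-of-distribution accuracy."""
    for p in backbone.parameters():
        p.requires_grad = True
    return list(backbone.parameters()) + list(head.parameters())

# Toy labeled batch (random data, illustration only).
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))

for get_params in (linear_probe_params, full_finetune_params):
    opt = torch.optim.SGD(get_params(backbone, head), lr=1e-2)
    loss = nn.functional.cross_entropy(head(backbone(x)), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```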
Bio: Ananya Kumar is a Ph.D. candidate in the Department of Computer Science at Stanford University, advised by Percy Liang and Tengyu Ma. His work focuses on representation learning, foundation models, and reliable machine learning. His papers have been recognized with several Spotlight and Oral presentations at NeurIPS, ICML, and ICLR, and his research is supported by a Stanford Graduate Fellowship.