Title: Object-Centric Scene Understanding via Lidar-Based 3D Object Detection
Date: Friday, November 22, 2024
Time: 1 pm ET
Benjamin Wilson
Machine Learning PhD Student
Interactive Computing
Georgia Institute of Technology
Committee
Dr. James Hays (Advisor) - School of Interactive Computing, Georgia Tech
Dr. Dhruv Batra - School of Interactive Computing, Georgia Tech
Dr. Humphrey Shi - School of Interactive Computing, Georgia Tech
Dr. Judy Hoffman - School of Interactive Computing, Georgia Tech
Dr. Deva Ramanan - School of Computer Science, Carnegie Mellon University
Abstract
Autonomous driving is a challenging task that requires large-scale datasets and specialized models to accurately characterize diverse, complex scenes. This dissertation focuses on developing datasets that support the wide range of downstream tasks in autonomous driving and on designing 3D object detection models that directly process the native lidar representation, the range view.
First, I present Argoverse 2, a suite of three datasets that enable researchers to design and evaluate models for a variety of tasks, including 3D object detection. Next, I introduce a 3D object detection model that operates directly on the range view. I provide a systematic analysis of which modules matter in range-view-only 3D object detection and present new methods for localization-conditioned classification supervision and for sampling object proposals by range. Lastly, I incorporate multiview, multimodal, and temporal architectures to advance the state of the art in range-view 3D object detection.
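For readers unfamiliar with the range view, the following is a minimal sketch of how a lidar point cloud is commonly projected into a range image by spherical projection. It is an illustration only and not the implementation from the dissertation: the function name `to_range_view`, the image size, the vertical field of view, and the use of -1.0 for empty pixels are all assumptions chosen for the example.

```python
import numpy as np


def to_range_view(points, height=32, width=1024,
                  fov_up_deg=15.0, fov_down_deg=-25.0):
    """Project (N, 3) x, y, z points into a (height, width) range image.

    All defaults are illustrative assumptions, not values from any
    particular sensor. Empty pixels are marked with -1.0.
    """
    points = np.asarray(points, dtype=np.float64)
    rng = np.linalg.norm(points, axis=1)

    # Drop points at the sensor origin, where the angles are undefined.
    keep = rng > 0
    points, rng = points[keep], rng[keep]

    azimuth = np.arctan2(points[:, 1], points[:, 0])   # in [-pi, pi]
    elevation = np.arcsin(points[:, 2] / rng)          # in [-pi/2, pi/2]

    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)

    # Columns index azimuth; rows index elevation, with the top row at fov_up.
    col = np.floor(0.5 * (1.0 - azimuth / np.pi) * width).astype(np.int64)
    col = np.clip(col, 0, width - 1)
    row = np.floor((fov_up - elevation) / (fov_up - fov_down) * height)
    row = row.astype(np.int64)

    # Discard points outside the assumed vertical field of view.
    in_fov = (row >= 0) & (row < height)
    row, col, rng = row[in_fov], col[in_fov], rng[in_fov]

    # Write farthest points first so the nearest return wins each pixel.
    order = np.argsort(-rng)
    image = np.full((height, width), -1.0)
    image[row[order], col[order]] = rng[order]
    return image


if __name__ == "__main__":
    cloud = np.array([[10.0, 0.0, 0.0],   # same direction as the next point
                      [5.0, 0.0, 0.0],    # nearer, so it occupies the pixel
                      [0.0, 10.0, 1.0]])
    range_image = to_range_view(cloud)
    print(range_image.shape, int((range_image > 0).sum()))
```

Each pixel stores the distance to the nearest return along one ray, so the image is dense and compact, while objects appear at different scales depending on their range. Bird's-eye-view and voxel representations make the opposite trade-off.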