Title: Object-Centric Scene Understanding via Lidar-Based 3D Object Detection

Date: Friday, November 22, 2024

Time: 1 pm ET

Location: Remote

Benjamin Wilson

Machine Learning PhD Student

Interactive Computing

Georgia Institute of Technology

Committee

Dr. James Hays (Advisor) - School of Interactive Computing, Georgia Tech

Dr. Dhruv Batra - School of Interactive Computing, Georgia Tech

Dr. Humphrey Shi - School of Interactive Computing, Georgia Tech

Dr. Judy Hoffman - School of Interactive Computing, Georgia Tech

Dr. Deva Ramanan - School of Computer Science, Carnegie Mellon University

Abstract

Autonomous driving is a challenging task that requires large-scale datasets and specialized models to accurately characterize diverse, complex scenes. This dissertation focuses on developing datasets that support the wide range of downstream tasks in autonomous driving and on designing 3D object detection models that directly process the native lidar representation, the range view.

 

First, I present Argoverse 2, a suite of three datasets that enable researchers to design and evaluate models for a variety of tasks, including 3D object detection. Next, I introduce a 3D object detection model that directly processes the range view. I provide a systematic analysis of which modules matter in range-view-only 3D object detection and present new methods for localization-conditioned classification supervision and for sampling object proposals by range. Lastly, I incorporate multiview, multimodal, and temporal architectures to advance the state of the art in range-view 3D object detection.