Title: Deploying Deep Neural Networks at the Edge with Distribution
School of Computer Science
College of Computing
Georgia Institute of Technology
Date: Thursday, April 21, 2021
Time: 11:00 AM - 1:00 PM (EDT)
Location: Online (BlueJeans)
Meeting ID: 331 538 895
Want to dial in from a phone?
Dial one of the following numbers:
+1.408.419.1715 (United States (San Jose))
+1.408.915.6290 (United States (San Jose))
(see all numbers - https://www.bluejeans.com/numbers)
Enter the meeting ID and passcode followed by #
Connecting from a room system?
Dial: bjn.vc or 188.8.131.52 and enter your meeting ID & passcode
Committee:
Dr. Hyesoon Kim (Advisor, School of Computer Science, Georgia Institute of Technology)
Dr. Saibal Mukhopadhyay (School of Electrical and Computer Engineering, Georgia Institute of Technology)
Dr. Michael S. Ryoo (Department of Computer Science, Stony Brook University; Research Scientist, Robotics at Google)
Dr. Tushar Krishna (School of Electrical and Computer Engineering, Georgia Institute of Technology)
Dr. Alexey Tumanov (School of Computer Science, Georgia Institute of Technology)
The widespread applicability of deep neural networks (DNNs) has made edge computing an emerging trend, extending DNN capabilities to domains such as robotics, autonomous systems, and Internet-of-Things devices. Because individual edge devices have tight resource constraints, computing accurate predictions while providing fast execution is a key challenge. Moreover, modern DNNs increasingly demand more computational power than their predecessors. As a result, the current approach is to offload DNN inference computations to compute resources in the cloud. This approach not only raises privacy concerns but also relies on network infrastructure and data centers that are not scalable and do not guarantee fast execution.
My key insight is that edge devices can break their individual resource constraints by distributing the computation of DNNs across collaborating peer edge devices. In my approach, edge devices cooperate to perform single-batch inference in real time while exploiting several model-parallelism methods. Nonetheless, because communication is costly and current DNN models exhibit a single chain of dependencies, distributing and parallelizing the computations of current DNNs may not be an effective solution for edge domains. Therefore, to benefit efficiently from available computing resources with low communication overhead, I propose new handcrafted edge-tailored models that consist of several independent and narrow DNNs. Additionally, I explore an automated neural architecture search methodology and propose custom DNN architectures with low communication overhead and high parallelization opportunities. Finally, to increase reliability and reduce susceptibility to brief disconnections or the loss of a device, I propose a coded distributed computing recovery method that enables distributed DNN models on edge devices to tolerate failures without losing time-sensitive, real-time information.
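To make the two core ideas concrete, the following is a minimal sketch (not the author's actual system) of model parallelism and coded recovery for a single fully connected layer: the weight matrix is split column-wise across peer devices, each device computes one output shard, and an extra parity device holding the element-wise sum of the shards allows any single lost device's shard to be reconstructed. The device count and layer sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_devices = 4
in_dim, out_dim = 8, 12

x = rng.standard_normal(in_dim)            # single-batch input vector
W = rng.standard_normal((in_dim, out_dim)) # full layer weights

# Model parallelism: each device holds a column slice of W
# and computes only its own output shard.
shards = np.array_split(W, n_devices, axis=1)
partials = [x @ s for s in shards]

# Concatenating the shards reproduces the full layer output.
y = np.concatenate(partials)
assert np.allclose(y, x @ W)

# Coded recovery: a parity device stores the element-wise sum of the
# shards, so its output equals the sum of all partial outputs (the
# layer is linear). If one device's result is lost, it can be
# recovered from the parity output and the surviving partials.
parity_out = x @ sum(shards)
lost = 2  # suppose device 2 fails
recovered = parity_out - sum(p for i, p in enumerate(partials) if i != lost)
assert np.allclose(recovered, partials[lost])
```

This parity trick exploits linearity, which is why the abstract's handcrafted models favor independent, narrow sub-networks: they keep communication between shards low and make such recovery tractable.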