Title: Synthetic Speech Attacks Against Voice Assistants and Defenses
Date/Time: Monday, December 2, 2024, 2:00-4:00 pm EST
Location (in-person): Coda C0908 Home Park
Zoom Link: https://gatech.zoom.us/j/96401837809?pwd=YMH05b3kK4CpT73a9uNn4GCg6BHI7c.1&from=addon
Zhengxian He
Ph.D. Candidate in Computer Science
School of Cybersecurity and Privacy
Georgia Institute of Technology
Committee:
Dr. Mustaque Ahamad (Advisor), School of Cybersecurity and Privacy, Georgia Institute of Technology
Dr. Alexandra Boldyreva, School of Cybersecurity and Privacy, Georgia Institute of Technology
Dr. Saman Zonouz, School of Cybersecurity and Privacy, Georgia Institute of Technology
Dr. Frank Li, School of Cybersecurity and Privacy, Georgia Institute of Technology
Dr. Ashish Kundu, Cisco Research
Abstract:
Voice assistants have become prevalent in both home and enterprise environments, offering natural and convenient ways to interact with computing devices. However, their growing adoption has also introduced new security vulnerabilities. This dissertation investigates three critical security aspects of voice assistant systems. First, we demonstrate that attackers can synthesize malicious voice assistant commands from limited, unrelated speech samples to bypass currently available speaker verification mechanisms; in fact, they can achieve high success rates even with a lightweight unit-selection speech synthesis technique. Second, we show how malicious commands directed at voice assistants can be used to set up covert channels for data exfiltration from nearby compromised computers. We explore high-frequency modulation to make the data transfer imperceptible to humans and characterize the achievable data rates and reliability of such a channel under various conditions. Third, we develop a novel liveness detection method based on harmonic distortion analysis, which leverages physical characteristics of audio reproduction systems to effectively distinguish between live and synthetic commands while maintaining computational efficiency. Through empirical evaluation, we demonstrate the feasibility of synthetic command attacks and the effectiveness of our proposed defense mechanism. Our findings highlight significant security challenges in current voice assistant systems and provide practical approaches for enhancing their security. This research contributes to both the understanding of voice assistant vulnerabilities and the development of countermeasures against attacks that could exploit such vulnerabilities.
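To give a flavor of the first result, the sketch below shows the core idea of unit-selection synthesis: stitching a target command together from short audio units harvested from unrelated speech. This is a toy illustration, not the dissertation's implementation; real unit selection also scores candidate units by target and join cost, and the word-level unit bank here is a hypothetical input.

import numpy as np

SAMPLE_RATE = 16_000
FADE = int(0.01 * SAMPLE_RATE)  # 10 ms linear crossfade at each seam

def crossfade_concat(units: list[np.ndarray]) -> np.ndarray:
    """Join waveform units, assuming each is longer than the crossfade."""
    out = units[0].astype(np.float64)
    ramp = np.linspace(0.0, 1.0, FADE)
    for unit in units[1:]:
        unit = unit.astype(np.float64)
        # Blend the tail of the output into the head of the next unit.
        out[-FADE:] = out[-FADE:] * (1.0 - ramp) + unit[:FADE] * ramp
        out = np.concatenate([out, unit[FADE:]])
    return out

def synthesize(command: str, unit_bank: dict[str, np.ndarray]) -> np.ndarray:
    """Build a spoken command from per-word units taken from unrelated speech."""
    return crossfade_concat([unit_bank[word] for word in command.split()])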
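The covert channel of the second result can be pictured as binary frequency-shift keying near the top of the audible band, where a compromised computer's speaker emits tone bursts that a nearby voice assistant's microphone records. The carrier frequencies, symbol duration, and sample rate below are hypothetical illustrations, not the parameters evaluated in the dissertation.

import numpy as np

SAMPLE_RATE = 44_100       # samples per second
SYMBOL_SECONDS = 0.05      # duration of each bit's tone burst
FREQ_ZERO = 17_500.0       # Hz, carries a 0 bit (hypothetical choice)
FREQ_ONE = 18_500.0        # Hz, carries a 1 bit (hypothetical choice)

def modulate(bits: list[int]) -> np.ndarray:
    """Map each bit to a short near-ultrasonic tone burst."""
    t = np.arange(int(SAMPLE_RATE * SYMBOL_SECONDS)) / SAMPLE_RATE
    tones = {
        0: np.sin(2 * np.pi * FREQ_ZERO * t),
        1: np.sin(2 * np.pi * FREQ_ONE * t),
    }
    return np.concatenate([tones[b] for b in bits])

def demodulate(signal: np.ndarray) -> list[int]:
    """Recover bits by comparing spectral energy at the two carriers."""
    symbol_len = int(SAMPLE_RATE * SYMBOL_SECONDS)
    bits = []
    for start in range(0, len(signal) - symbol_len + 1, symbol_len):
        chunk = signal[start:start + symbol_len]
        spectrum = np.abs(np.fft.rfft(chunk))
        freqs = np.fft.rfftfreq(symbol_len, d=1.0 / SAMPLE_RATE)
        e0 = spectrum[np.argmin(np.abs(freqs - FREQ_ZERO))]
        e1 = spectrum[np.argmin(np.abs(freqs - FREQ_ONE))]
        bits.append(int(e1 > e0))
    return bits

if __name__ == "__main__":
    payload = [1, 0, 1, 1, 0, 0, 1, 0]
    audio = modulate(payload)
    assert demodulate(audio) == payload

At this symbol duration the toy channel carries 20 bits per second; the dissertation characterizes what rates and reliability are actually achievable under various acoustic conditions.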
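Finally, the liveness defense rests on measuring harmonic distortion: loudspeaker playback tends to add energy at integer harmonics that a live vocal path lacks. The sketch below computes a THD-like ratio of harmonic to fundamental energy and applies a threshold; the probe frequency, search band, and threshold are illustrative assumptions, not the dissertation's tuned values.

import numpy as np

SAMPLE_RATE = 16_000  # Hz, common for voice-assistant audio front ends

def harmonic_distortion(signal: np.ndarray, fundamental_hz: float,
                        n_harmonics: int = 5) -> float:
    """Ratio of energy at harmonics 2..n to energy at the fundamental."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / SAMPLE_RATE)

    def peak(f: float) -> float:
        # Strongest bin within +/- 20 Hz, tolerating slight detuning.
        band = (freqs > f - 20) & (freqs < f + 20)
        return spectrum[band].max() if band.any() else 0.0

    fund = peak(fundamental_hz)
    harm = np.sqrt(sum(peak(k * fundamental_hz) ** 2
                       for k in range(2, n_harmonics + 1)))
    return harm / fund if fund > 0 else float("inf")

def looks_replayed(signal: np.ndarray, fundamental_hz: float,
                   threshold: float = 0.05) -> bool:
    """Flag audio whose harmonic content exceeds an illustrative threshold."""
    return harmonic_distortion(signal, fundamental_hz) > threshold

if __name__ == "__main__":
    t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
    clean = np.sin(2 * np.pi * 400 * t)                   # "live"-like tone
    replayed = clean + 0.1 * np.sin(2 * np.pi * 800 * t)  # added 2nd harmonic
    print(looks_replayed(clean, 400), looks_replayed(replayed, 400))

Because the check reduces to a few FFTs and peak comparisons, it preserves the computational efficiency the abstract emphasizes.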