Title: Program Synthesis for Fuzzing in the Perspective of Programming Language Characteristics
Date: Thursday, April 20, 2023
Time: 9:30 AM – 11:00 AM EST
Location: (hybrid) CODA 0903 Ansley, and Zoom
Soyeon Park
Ph.D. Student
School of Computer Science & School of Cybersecurity and Privacy
College of Computing
Georgia Institute of Technology
Committee:
Dr. Taesoo Kim (Advisor) - School of Computer Science & School of Cybersecurity and Privacy, Georgia Institute of Technology
Dr. Alessandro Orso - School of Computer Science, Georgia Institute of Technology
Dr. Qirun Zhang - School of Computer Science, Georgia Institute of Technology
Dr. Brendan D. Saltaformaggio - School of Computer Science & School of Cybersecurity and Privacy & School of Electrical and Computer Engineering, Georgia Institute of Technology
Dr. Jiyong Jang - IBM Research
Abstract:
Fuzzing has emerged as a practical method for discovering software bugs. Guided by coverage feedback, fuzzing works well on programs that take binary or lightly structured inputs, which it feeds random or semi-structured data in order to trigger bugs. However, fuzzing programs that take heavily structured input such as program code (e.g., interpreters and compilers) requires program synthesis that takes programming language characteristics into account. Additionally, generating fuzzing harnesses for open-source libraries from their code requires a thorough understanding of programming language characteristics.
In this proposal, we first present our past work on synthesizing JavaScript programs to test JavaScript interpreters. We propose a new technique called aspect-preserving mutation, which, during mutation, stochastically preserves desirable properties, referred to as aspects, that are considered essential for reaching vulnerabilities. We demonstrate aspect preservation through two mutation strategies designed around JavaScript characteristics: structure preservation and type preservation. Using this technique, we discovered 48 high-impact bugs in widely used JavaScript interpreters.
Moreover, we will discuss two program synthesis efforts for testing the Rust programming language. Rust is a community-driven programming language that emphasizes memory safety and performance. To achieve memory safety through language features, Rust introduces unique elements that may pose challenges for synthesis. Many libraries used in Rust programs are written by the open-source community, and we found that several of them lack thorough testing before deployment. To address this issue, we propose an automatic generator of fuzzing harnesses for open-source Rust libraries. Furthermore, we will discuss a grammar-rule-based Rust code synthesizer for testing Rust compilers, built on a refined specification because no official one is provided.
Lastly, we will examine the commonalities and differences in program synthesis for fuzzing across different programming languages and testing targets (e.g., interpreters, libraries, and compilers).