Title: Leverage AI to Combat Misinformation by Evaluating Detectors and Empowering Crowds
Bing He
School of Cybersecurity and Privacy
Georgia Institute of Technology
Date: Monday, Nov 6, 2023
Time: 10:00 am - 11:00 am EST
Location: https://gatech.zoom.us/j/7088026994?pwd=bDlKVzVzVEhLZlN3MExvV1pRWWJCdz09
Committee:
Dr. Mustaque Ahamad (advisor), School of Cybersecurity and Privacy, Georgia Institute of Technology
Dr. Srijan Kumar (advisor), School of Computational Science and Engineering, Georgia Institute of Technology
Dr. Frank Li, School of Cybersecurity and Privacy, Georgia Institute of Technology
Dr. Munmun De Choudhury, School of Interactive Computing, Georgia Institute of Technology
Abstract:
Online misinformation has become a global risk with threatening real-world consequences. To combat it, existing research focuses on developing automatic machine learning (ML) methods to detect misinformation and its spreaders, and on leveraging the expertise of professionals, including journalists and fact-checkers, to annotate and debunk misinformation. However, the vulnerabilities of deep sequence embedding-based detection systems are rarely examined, and the efficacy of relying on professionals is constrained by their small numbers. To complement professionals, non-expert ordinary users (a.k.a. crowds) can act as eyes on the ground who proactively question and counter misinformation, showing promise in the fight against misinformation. Little is known, however, about how these crowds organically combat it. Concurrently, AI has progressed dramatically, demonstrating the potential to help combat misinformation.
In this proposal, we aim to use AI to investigate the aforementioned challenges and to provide insights and solutions. First, we evaluate existing deep sequence embedding-based classification models used to detect malicious users (e.g., misinformation spreaders). These models typically encode the sequence of a user's posts into a user embedding and use it to detect bad actors on social media platforms. We evaluate the robustness of these detectors by proposing a novel end-to-end AI algorithm, PETGEN (Personalized Text Generation Attack model), which simultaneously reduces the efficacy of the detection model and generates high-quality personalized posts. Second, we use advanced AI techniques to characterize the spread and textual properties of counter-misinformation generated by crowds during the COVID-19 pandemic, as well as the characteristics of the crowd users who produce it. This work provides insights into the role of crowds in countering misinformation: among other findings, 96% of counter-misinformation responses are made by crowds, but two out of three are rude and lack evidence, which may cause corrections to backfire. To address this, we first create novel datasets of paired misinformation posts and counter-misinformation responses, and then propose a reinforcement learning-based AI algorithm, MisinfoCorrect, that learns to generate a high-quality counter-misinformation response to an input misinformation post. This work illustrates the promise of AI for empowering crowds to combat misinformation. Finally, we introduce ongoing projects that characterize user reactions to crowd-generated counter-misinformation, in order to investigate what kinds of crowd-generated counter-misinformation are needed in real-world applications.
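For readers unfamiliar with this class of model, below is a minimal sketch (in PyTorch) of the kind of deep sequence embedding-based detector evaluated in the first part of the talk: one encoder summarizes the tokens of each post, a second encoder summarizes the per-post vectors into a user embedding, and a linear head classifies the user. All names, dimensions, and hyperparameters are illustrative assumptions, not the detectors studied in the proposal.

    import torch
    import torch.nn as nn

    class SequenceUserDetector(nn.Module):
        # Illustrative sequence embedding-based malicious-user detector.
        def __init__(self, vocab_size=30000, emb_dim=128, hidden_dim=256):
            super().__init__()
            self.token_emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
            self.post_encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
            self.user_encoder = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
            self.classifier = nn.Linear(hidden_dim, 2)  # benign vs. malicious

        def forward(self, posts):
            # posts: (batch, num_posts, num_tokens) integer token ids
            b, p, t = posts.shape
            tokens = self.token_emb(posts.view(b * p, t))   # embed tokens
            _, post_vec = self.post_encoder(tokens)         # one vector per post
            post_seq = post_vec.squeeze(0).view(b, p, -1)   # restore post order
            _, user_vec = self.user_encoder(post_seq)       # user embedding
            return self.classifier(user_vec.squeeze(0))     # per-user logits

    # Example: 4 users, 10 posts each, 20 tokens per post.
    logits = SequenceUserDetector()(torch.randint(1, 30000, (4, 10, 20)))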
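The robustness question that PETGEN poses can then be stated concretely: does appending an adversarially generated post to a user's history flip the detector's label? The check below is a hypothetical helper written against the sketch above (BENIGN is an assumed label convention); PETGEN itself additionally optimizes the generated text end to end so that it stays personalized, fluent, and consistent with the user's posting history.

    BENIGN = 0  # assumed label convention for the sketch above

    def flips_label(detector, user_posts, generated_post):
        # user_posts: (1, num_posts, num_tokens); generated_post: (1, 1, num_tokens)
        perturbed = torch.cat([user_posts, generated_post], dim=1)  # append post
        return detector(perturbed).argmax(dim=-1).item() == BENIGN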
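Similarly, the reinforcement learning idea behind a generator like MisinfoCorrect can be sketched as a policy-gradient loop: sample a candidate counter-response, score it with reward signals (e.g., politeness, refutation strength, and evidence, qualities the analysis above suggests crowd responses often lack), and update the generator toward high-reward responses. The policy.sample interface and the reward function below are assumptions for illustration, not the authors' implementation.

    def politeness_reward(misinfo_post, response):
        # Placeholder reward: a real system would score politeness,
        # refutation strength, and evidence with trained classifiers.
        return 1.0 if "please" in response.lower() else 0.0

    def reinforce_step(policy, optimizer, misinfo_post, reward_fns):
        # Sample a counter-response plus per-token log-probabilities;
        # policy.sample is an assumed interface.
        response, log_probs = policy.sample(misinfo_post)
        reward = sum(fn(misinfo_post, response) for fn in reward_fns)
        # REINFORCE-style update: raise the likelihood of high-reward text.
        loss = -reward * log_probs.sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return response, reward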