Title: Towards Fine-grained Multi-Attribute Control Using Language Models

Date: Wednesday, 17th April, 2024

Time: 2:30 PM to 4:15 PM ET (11:30 AM - 1:15 PM PT)

Location: Virtual Zoom Link 

Meeting ID: 951 6076 2750
Passcode: 299490


Ashutosh Baheti

Computer Science Ph.D. Candidate

School of Interactive Computing
College of Computing

Georgia Institute of Technology

https://abaheti95.github.io/

Committee:

Prof. Mark Riedl (Advisor) -- School of Interactive Computing, Georgia Institute of Technology

Prof. Alan Ritter (Co-Advisor) -- School of Interactive Computing, Georgia Institute of Technology

Prof. Dhruv Batra -- School of Interactive Computing, Georgia Institute of Technology

Prof. Munmun de Choudhury -- School of Interactive Computing, Georgia Institute of Technology

Prof. Maarten Sap -- Language Technologies Institute, Carnegie Mellon University

Abstract

As we increasingly rely on powerful language models, ensuring their safe and effective operation necessitates extensive research in controllable text generation. Even state-of-the-art language models struggle to generate the most accurate or desired output on the first attempt. Inspired by recent developments in self-correction for large language models and by new reinforcement learning methods, we aim to train smaller language models as fine-grained editors that iteratively edit outputs to satisfy threshold constraints over multiple classifier-based attributes.

In this thesis, I first present a study of the contextual offensive behavior of pretrained large language models and curate a high-quality dataset for toxicity detection. Next, I introduce a novel offline RL algorithm that uses arbitrary numeric scores as rewards during training to optimize any user-desired LM behavior by filtering out suboptimal data. Finally, I design an offline RL framework for a fine-grained multi-attribute controllability task, where the goal is to guide the language model to generate output sequences that satisfy user-defined threshold-based attribute constraints. The language model may take multiple edits to reach the desired attributes. Experiments on both natural language and protein sequences demonstrate the versatility and effectiveness of our approach.