Title: Building and Evaluating Controllable Models for Text Simplification
Date: Friday, August 4, 2023
Time: 2.30pm - 4.30pm EST
Location: https://gatech.zoom.us/j/92877588273
Mounica Maddela
PhD Student in Computer Science
School of Interactive Computing
College of Computing
Georgia Institute of Technology
Committee
Dr. Wei Xu (Advisor), School of Interactive Computing, Georgia Tech
Dr. Alan Ritter, School of Interactive Computing, Georgia Tech
Dr. Mark Riedl, School of Interactive Computing, Georgia Tech
Dr. Colin Cherry, Google Research
Dr. Y-Lan Boureau, Meta AI Research
Abstract
Although existing natural language generation (NLG) systems have made great progress in generating fluent text that is indistinguishable from human-written text, they still lack the capability to adapt to specific constraints or attributes that are crucial for practical applications. An emerging trend in NLG is the development of controllable text generation methods, which control attributes of the generated text such as sentiment, formality, politeness, and topic.
In this dissertation, I focus on controllable text generation for Automatic Text Simplification (ATS). ATS aims to improve the readability of texts by using simpler grammar and word choices while preserving the original meaning. It is an audience-dependent task because readability constraints vary with the target population. Controllability is therefore essential for ATS systems to generate text that adheres to diverse readability constraints. An ideal automatic simplification system should be able to control attributes of the generated text, such as syntactic structure, length, readability level, and word choice, so that they suit the situation. However, existing simplification systems lack the capability to adapt to different readability constraints.
To address these issues, I develop two novel controllable approaches for ATS: a sentence simplification system that combines linguistic rules with Transformer models to generate simplified sentences at different readability levels, and a lexical simplification system that leverages human judgments of word complexity to replace complex words with simpler phrases. Finally, I propose LENS, the first supervised automatic evaluation metric for ATS, which captures multiple simplification styles and outperforms existing metrics in evaluating controllable simplification systems. To train and evaluate LENS, I create SIMPEVAL, the first metric evaluation benchmark that incorporates different types of simplification operations.