Conference (International) Diamonds in the Rough: Generating Fluent Sentences from Early-Stage Drafts for Academic Writing Assistance
Takumi Ito*1*2, Tatsuki Kuribayashi*1*2, Hayato Kobayashi*3*4, Ana Brassard*4*1, Masato Hagiwara*5, Jun Suzuki*1*4, and Kentaro Inui*1*4 (*1 Tohoku University, *2 Langsmith Inc., *3 Yahoo Japan Corporation, *4 RIKEN, *5 Octanove Labs LLC)
The 12th International Conference on Natural Language Generation (INLG 2019)
The writing process consists of several stages, such as drafting, revising, editing, and proofreading. Studies on writing assistance, such as grammatical error correction (GEC), have mainly focused on sentence editing and proofreading, where surface-level issues such as typographical, spelling, or grammatical errors are corrected. We broaden this focus to include the earlier revising stage, where sentences require adjustments to the information they contain or major rewriting, and propose Sentence-level Revision (SentRev) as a new writing assistance task. Well-performing systems for this task can help inexperienced authors by producing fluent, complete sentences from their rough, incomplete drafts. For developing and evaluating SentRev models, we build a new, freely available, crowdsourced evaluation dataset consisting of incomplete sentences authored by non-native writers, paired with their final versions extracted from published academic papers. We also establish baseline performance on SentRev using this newly built evaluation dataset.