Word-level Text Generation from Language Models

Author(s):  
Ponrudee Netisopakul ◽  
Usanisa Taoto


2018 ◽
Vol 6 ◽  
pp. 451-465 ◽  
Author(s):  
Daniela Gerz ◽  
Ivan Vulić ◽  
Edoardo Ponti ◽  
Jason Naradowsky ◽  
Roi Reichart ◽  
...  

Neural architectures are prominent in the construction of language models (LMs). However, word-level prediction is typically agnostic of subword-level information (characters and character sequences) and operates over a closed vocabulary, consisting of a limited word set. Indeed, while subword-aware models boost performance across a variety of NLP tasks, previous work did not evaluate the ability of these models to assist next-word prediction in language modeling tasks. Such subword-level informed models should be particularly effective for morphologically-rich languages (MRLs) that exhibit high type-to-token ratios. In this work, we present a large-scale LM study on 50 typologically diverse languages covering a wide variety of morphological systems, and offer new LM benchmarks to the community, while considering subword-level information. The main technical contribution of our work is a novel method for injecting subword-level information into semantic word vectors, integrated into the neural language modeling training, to facilitate word-level prediction. We conduct experiments in the LM setting where the number of infrequent words is large, and demonstrate strong perplexity gains across our 50 languages, especially for morphologically-rich languages. Our code and data sets are publicly available.
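The core idea, injecting subword-level information into word vectors, can be illustrated with a minimal fastText-style sketch: a word's vector is composed from the vectors of its character n-grams, so rare and unseen inflections of a known stem still receive informative representations. This is a toy illustration under assumed names and sizes, not the paper's actual method.

```python
import zlib
import numpy as np

def char_ngrams(word, n_min=3, n_max=5):
    """Character n-grams of a word, with boundary markers added."""
    w = f"<{word}>"
    return [w[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

class SubwordEmbedder:
    """Toy subword-aware embedder: a word's vector is the mean of the
    vectors of its character n-grams, hashed into a fixed-size table."""

    def __init__(self, dim=8, buckets=1000, seed=0):
        rng = np.random.default_rng(seed)
        self.table = rng.normal(size=(buckets, dim))
        self.buckets = buckets

    def embed(self, word):
        # crc32 gives a hash that is stable across Python runs.
        idx = [zlib.crc32(g.encode()) % self.buckets
               for g in char_ngrams(word)]
        return self.table[idx].mean(axis=0)
```

Because morphologically related forms share character n-grams, their vectors overlap in the n-gram table, which is exactly why such schemes help in high type-to-token-ratio languages.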


2020 ◽  
Vol 10 (11) ◽  
pp. 297
Author(s):  
Gary A. Troia ◽  
Julie S. Brehmer ◽  
Kaitlin Glause ◽  
Heather L. Reichmuth ◽  
Frank Lawrence

Data were collected for this study early in the school year to analyze the direct and indirect effects of word-level literacy skills (word recognition, spelling, and written vocabulary use) and handwriting fluency on writing quality across three genres of typewritten papers. We further explored whether typing fluency and text generation fluency mediated the effects. Finally, we examined whether there was any difference in the effects across three writing genres. Fourth and fifth graders (N = 175) from 21 typical classrooms in 12 different Midwestern U.S. schools participated. Regression path analyses were employed and revealed that word-level literacy skills had both significant direct and serial indirect effects on quality, via typing fluency and then text generation fluency (text length) when controlling for handwriting fluency. Further, handwriting fluency had no direct effect when controlling for word-level literacy skills but did have a significant serial indirect effect on writing quality via typing fluency then text generation fluency. Results indicate that handwriting fluency matters, even when composing on the computer. Stronger transcription fluency, particularly by hand, leads to higher quality writing, likely because less cognitive effort is devoted to transcription. This study adds to limited research on the cross-modal effects of transcription on writing quality.
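The indirect effect in a path model of this kind is estimated as the product of the path coefficients along the mediating chain. A simplified sketch on synthetic data, with one mediator step per regression and hypothetical variable names standing in for the study's measures, follows; the actual study used regression path analysis over more variables and controls.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500
# Synthetic data mirroring (part of) the path model:
# word-level skill -> typing fluency -> text length (generation fluency).
skill = rng.normal(size=n)
typing = 0.6 * skill + rng.normal(scale=0.5, size=n)
length = 0.5 * typing + rng.normal(scale=0.5, size=n)

def slope(x, y):
    """OLS slope of y on x (with an intercept term)."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

a = slope(skill, typing)   # path: skill -> typing fluency
b = slope(typing, length)  # path: typing fluency -> text length
indirect = a * b           # indirect effect as the product of paths
```

In the study's serial mediation, this product extends across both mediators (typing fluency, then text generation fluency), and significance is assessed with proper standard errors rather than point estimates.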


2020 ◽  
Vol 10 (4) ◽  
pp. 1340
Author(s):  
Heewoong Park ◽  
Jonghun Park

The task of sentence completion, which aims to infer the missing text of a given sentence, has been used to assess the reading comprehension of machines as well as humans. In this work, we conducted a comprehensive study of various approaches to sentence completion based on neural language models, which have advanced considerably in recent years. First, we revisited the recurrent neural network language model (RNN LM), achieving highly competitive results with an appropriate network structure and hyper-parameters. This paper presents a bidirectional version of the RNN LM, which surpassed the previous best results on the Microsoft Research (MSR) Sentence Completion Challenge and the Scholastic Aptitude Test (SAT) sentence completion questions. In parallel with directly applying the RNN LM to sentence completion, we also employed a supervised learning framework that fine-tunes a large pre-trained transformer-based LM on a few sentence-completion examples. By fine-tuning a pre-trained BERT model, this work established state-of-the-art results on the MSR and SAT sets. Furthermore, we performed similar experiments on newly collected cloze-style questions in the Korean language. The experimental results reveal that simply applying multilingual BERT models to the Korean dataset was not satisfactory, which leaves room for further research.
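The common mechanic behind LM-based sentence completion is to fill the blank with each candidate and keep the candidate whose completed sentence the LM scores highest. A toy sketch with an add-one-smoothed bigram LM (standing in for the paper's RNN and BERT models) makes the scoring loop concrete:

```python
import math
from collections import Counter

# Tiny toy corpus standing in for real training data.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)
V = len(unigrams)

def log_prob(tokens):
    """Add-one-smoothed bigram log-probability of a token sequence."""
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + V))
        for a, b in zip(tokens, tokens[1:])
    )

def complete(template, candidates):
    """Fill the blank ('_') with each candidate; keep the best scorer."""
    return max(candidates,
               key=lambda w: log_prob([w if t == "_" else t
                                       for t in template]))
```

A bidirectional model improves on this by scoring the blank with both left and right context, which is exactly what the paper's bidirectional RNN LM and BERT fine-tuning exploit.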


Author(s):  
Jianing Li ◽  
Yanyan Lan ◽  
Jiafeng Guo ◽  
Jun Xu ◽  
Xueqi Cheng

Neural language models based on recurrent neural networks (RNNLM) have significantly improved the performance of text generation, yet the quality of the generated text, as measured by the Turing Test pass rate, remains far from satisfactory. Some researchers propose using adversarial training or reinforcement learning to improve quality; however, such methods usually introduce great challenges in the training and parameter tuning processes. Through our analysis, we find that the problem with RNNLM stems from the use of maximum likelihood estimation (MLE) as the objective function, which requires the generated distribution to precisely recover the true distribution. This requirement favors high generation diversity, which restricts generation quality. That is not suitable when the overall quality is low, since high generation diversity usually indicates many errors rather than diverse good samples. In this paper, we propose to achieve differentiated distribution recovery, DDR for short. The key idea is to make the optimal generation probability proportional to the β-th power of the true probability, where β > 1. In this way, the generation quality can be greatly improved by sacrificing the diversity contributed by noise and rare patterns. Experiments on synthetic data and two public text datasets show that our DDR method achieves a more flexible quality-diversity trade-off and a higher Turing Test pass rate than baseline methods including RNNLM, SeqGAN and LeakGAN.
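The target transformation itself, making the generation probability proportional to the β-th power of the true probability, is a power-sharpening of a distribution, closely related to low-temperature sampling. A minimal sketch on a toy distribution shows the effect: probability mass shifts from rare patterns toward high-probability modes.

```python
import numpy as np

def sharpen(p, beta):
    """DDR-style target: generation probability proportional to the
    beta-th power of the true probability (beta > 1 sharpens)."""
    q = np.asarray(p, dtype=float) ** beta
    return q / q.sum()  # renormalize to a valid distribution

true_p = np.array([0.5, 0.3, 0.1, 0.05, 0.05])  # toy "true" distribution
ddr_p = sharpen(true_p, beta=2.0)
```

With β = 2, the mode's probability grows while the tail shrinks, which is precisely the quality-for-diversity trade the paper describes; β = 1 recovers plain MLE behavior.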


2018 ◽  
Author(s):  
Avery Hiebert ◽  
Cole Peterson ◽  
Alona Fyshe ◽  
Nishant Mehta

2012 ◽  
Vol 19 (2) ◽  
pp. 135-146 ◽  
Author(s):  
Eder Miranda de Novais ◽  
Ivandré Paraboni

Author(s):  
Leonardo F. R. Ribeiro ◽  
Martin Schmitt ◽  
Hinrich Schütze ◽  
Iryna Gurevych

2020 ◽  
Vol 34 (05) ◽  
pp. 8303-8310
Author(s):  
Yuan Li ◽  
Chunyuan Li ◽  
Yizhe Zhang ◽  
Xiujun Li ◽  
Guoqing Zheng ◽  
...  

Learning to generate text with a given label is a challenging task because natural language sentences are highly variable and ambiguous, which makes it difficult to trade off sentence quality against label fidelity. In this paper, we present CARA to alleviate the issue, where two auxiliary classifiers work simultaneously to ensure that (1) the encoder learns disentangled features and (2) the generator produces label-related sentences. Two practical techniques are further proposed to improve performance: annealing the learning signal from the auxiliary classifier, and enhancing the encoder with pre-trained language models. To establish a comprehensive benchmark fostering future research, we consider a suite of four datasets and systematically reproduce three representative methods. CARA shows consistent improvement over previous methods on the task of label-conditional text generation, and achieves state-of-the-art results on the task of attribute transfer.
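One of the two practical techniques, annealing the learning signal from the auxiliary classifier, typically amounts to scaling the classifier's loss term by a schedule over training steps. The sketch below assumes a linear ramp-up; the abstract does not specify CARA's actual schedule or direction, so the shape and all names here are assumptions.

```python
def annealed_weight(step, total_steps, w_max=1.0):
    """Hypothetical linear annealing schedule for the auxiliary
    classifier's learning signal: ramp its weight from 0 up to w_max."""
    return w_max * min(1.0, step / total_steps)

def total_loss(recon_loss, clf_loss, step, total_steps):
    """Combine the generator's reconstruction loss with the annealed
    auxiliary-classifier loss at a given training step."""
    return recon_loss + annealed_weight(step, total_steps) * clf_loss
```

Starting the classifier weight low lets the generator first learn fluent text before label fidelity pressure kicks in; the reverse direction (annealing the signal away) is equally plausible from the abstract alone.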

