Word-level Text Generation from Language Models

Author(s):  
Ponrudee Netisopakul ◽  
Usanisa Taoto


2018 ◽
Vol 6 ◽  
pp. 451-465 ◽  
Author(s):  
Daniela Gerz ◽  
Ivan Vulić ◽  
Edoardo Ponti ◽  
Jason Naradowsky ◽  
Roi Reichart ◽  
...  

Neural architectures are prominent in the construction of language models (LMs). However, word-level prediction is typically agnostic of subword-level information (characters and character sequences) and operates over a closed vocabulary, consisting of a limited word set. Indeed, while subword-aware models boost performance across a variety of NLP tasks, previous work did not evaluate the ability of these models to assist next-word prediction in language modeling tasks. Such subword-level informed models should be particularly effective for morphologically-rich languages (MRLs) that exhibit high type-to-token ratios. In this work, we present a large-scale LM study on 50 typologically diverse languages covering a wide variety of morphological systems, and offer new LM benchmarks to the community, while considering subword-level information. The main technical contribution of our work is a novel method for injecting subword-level information into semantic word vectors, integrated into the neural language modeling training, to facilitate word-level prediction. We conduct experiments in the LM setting where the number of infrequent words is large, and demonstrate strong perplexity gains across our 50 languages, especially for morphologically-rich languages. Our code and data sets are publicly available.
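The core idea, injecting subword-level information into word vectors, can be illustrated with a minimal fastText-style sketch: a word's vector is composed from the vectors of its character n-grams, so rare and unseen inflections of a known stem still receive informative representations. This is a toy illustration under assumed names and sizes, not the paper's actual method.

```python
import zlib
import numpy as np

def char_ngrams(word, n_min=3, n_max=5):
    """Character n-grams of a word, with boundary markers added."""
    w = f"<{word}>"
    return [w[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

class SubwordEmbedder:
    """Toy subword-aware embedder: a word's vector is the mean of the
    vectors of its character n-grams, hashed into a fixed-size table."""

    def __init__(self, dim=8, buckets=1000, seed=0):
        rng = np.random.default_rng(seed)
        self.table = rng.normal(size=(buckets, dim))
        self.buckets = buckets

    def embed(self, word):
        # crc32 gives a hash that is stable across Python runs.
        idx = [zlib.crc32(g.encode()) % self.buckets
               for g in char_ngrams(word)]
        return self.table[idx].mean(axis=0)
```

Because morphologically related forms share character n-grams, their vectors overlap in the n-gram table, which is exactly why such schemes help in high type-to-token-ratio languages.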


2020 ◽  
Vol 10 (11) ◽  
pp. 297
Author(s):  
Gary A. Troia ◽  
Julie S. Brehmer ◽  
Kaitlin Glause ◽  
Heather L. Reichmuth ◽  
Frank Lawrence

Data were collected for this study early in the school year to analyze the direct and indirect effects of word-level literacy skills (word recognition, spelling, and written vocabulary use) and handwriting fluency on writing quality across three genres of typewritten papers. We further explored whether typing fluency and text generation fluency mediated the effects. Finally, we examined whether there was any difference in the effects across three writing genres. Fourth and fifth graders (N = 175) from 21 typical classrooms in 12 different Midwestern U.S. schools participated. Regression path analyses were employed and revealed that word-level literacy skills had both significant direct and serial indirect effects on quality, via typing fluency and then text generation fluency (text length) when controlling for handwriting fluency. Further, handwriting fluency had no direct effect when controlling for word-level literacy skills but did have a significant serial indirect effect on writing quality via typing fluency then text generation fluency. Results indicate that handwriting fluency matters, even when composing on the computer. Stronger transcription fluency, particularly by hand, leads to higher quality writing, likely because less cognitive effort is devoted to transcription. This study adds to limited research on the cross-modal effects of transcription on writing quality.
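The indirect effect in a path model of this kind is estimated as the product of the path coefficients along the mediating chain. A simplified sketch on synthetic data, with one mediator step per regression and hypothetical variable names standing in for the study's measures, follows; the actual study used regression path analysis over more variables and controls.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500
# Synthetic data mirroring (part of) the path model:
# word-level skill -> typing fluency -> text length (generation fluency).
skill = rng.normal(size=n)
typing = 0.6 * skill + rng.normal(scale=0.5, size=n)
length = 0.5 * typing + rng.normal(scale=0.5, size=n)

def slope(x, y):
    """OLS slope of y on x (with an intercept term)."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

a = slope(skill, typing)   # path: skill -> typing fluency
b = slope(typing, length)  # path: typing fluency -> text length
indirect = a * b           # indirect effect as the product of paths
```

In the study's serial mediation, this product extends across both mediators (typing fluency, then text generation fluency), and significance is assessed with proper standard errors rather than point estimates.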


2020 ◽  
Vol 10 (4) ◽  
pp. 1340
Author(s):  
Heewoong Park ◽  
Jonghun Park

The task of sentence completion, which aims to infer the missing text of a given sentence, has been used to assess the reading comprehension of machines as well as humans. In this work, we conducted a comprehensive study of various approaches to sentence completion based on neural language models, which have advanced considerably in recent years. First, we revisited the recurrent neural network language model (RNN LM), achieving highly competitive results with an appropriate network structure and hyper-parameters. This paper presents a bidirectional version of the RNN LM, which surpassed the previous best results on the Microsoft Research (MSR) Sentence Completion Challenge and the Scholastic Aptitude Test (SAT) sentence completion questions. In parallel with directly applying the RNN LM to sentence completion, we also employed a supervised learning framework that fine-tunes a large pre-trained transformer-based LM on a few sentence-completion examples. By fine-tuning a pre-trained BERT model, this work established state-of-the-art results on the MSR and SAT sets. Furthermore, we performed similar experiments on newly collected cloze-style questions in the Korean language. The experimental results reveal that simply applying multilingual BERT models to the Korean dataset was not satisfactory, which leaves room for further research.
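The common mechanic behind LM-based sentence completion is to fill the blank with each candidate and keep the candidate whose completed sentence the LM scores highest. A toy sketch with an add-one-smoothed bigram LM (standing in for the paper's RNN and BERT models) makes the scoring loop concrete:

```python
import math
from collections import Counter

# Tiny toy corpus standing in for real training data.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)
V = len(unigrams)

def log_prob(tokens):
    """Add-one-smoothed bigram log-probability of a token sequence."""
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + V))
        for a, b in zip(tokens, tokens[1:])
    )

def complete(template, candidates):
    """Fill the blank ('_') with each candidate; keep the best scorer."""
    return max(candidates,
               key=lambda w: log_prob([w if t == "_" else t
                                       for t in template]))
```

A bidirectional model improves on this by scoring the blank with both left and right context, which is exactly what the paper's bidirectional RNN LM and BERT fine-tuning exploit.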


Author(s):  
Jianing Li ◽  
Yanyan Lan ◽  
Jiafeng Guo ◽  
Jun Xu ◽  
Xueqi Cheng

Neural language models based on recurrent neural networks (RNNLM) have significantly improved the performance of text generation, yet the quality of the generated text, as measured by the Turing Test pass rate, remains far from satisfactory. Some researchers propose using adversarial training or reinforcement learning to improve quality; however, such methods usually introduce great challenges in the training and parameter tuning processes. Through our analysis, we find that the problem with RNNLM stems from the use of maximum likelihood estimation (MLE) as the objective function, which requires the generated distribution to precisely recover the true distribution. This requirement favors high generation diversity, which restricts generation quality. That is not suitable when the overall quality is low, since high generation diversity usually indicates many errors rather than diverse good samples. In this paper, we propose to achieve differentiated distribution recovery, DDR for short. The key idea is to make the optimal generation probability proportional to the β-th power of the true probability, where β > 1. In this way, the generation quality can be greatly improved by sacrificing the diversity contributed by noise and rare patterns. Experiments on synthetic data and two public text datasets show that our DDR method achieves a more flexible quality-diversity trade-off and a higher Turing Test pass rate than baseline methods including RNNLM, SeqGAN and LeakGAN.
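The target transformation itself, making the generation probability proportional to the β-th power of the true probability, is a power-sharpening of a distribution, closely related to low-temperature sampling. A minimal sketch on a toy distribution shows the effect: probability mass shifts from rare patterns toward high-probability modes.

```python
import numpy as np

def sharpen(p, beta):
    """DDR-style target: generation probability proportional to the
    beta-th power of the true probability (beta > 1 sharpens)."""
    q = np.asarray(p, dtype=float) ** beta
    return q / q.sum()  # renormalize to a valid distribution

true_p = np.array([0.5, 0.3, 0.1, 0.05, 0.05])  # toy "true" distribution
ddr_p = sharpen(true_p, beta=2.0)
```

With β = 2, the mode's probability grows while the tail shrinks, which is precisely the quality-for-diversity trade the paper describes; β = 1 recovers plain MLE behavior.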


2018 ◽  
Author(s):  
Avery Hiebert ◽  
Cole Peterson ◽  
Alona Fyshe ◽  
Nishant Mehta

2012 ◽  
Vol 19 (2) ◽  
pp. 135-146 ◽  
Author(s):  
Eder Miranda de Novais ◽  
Ivandré Paraboni

Author(s):  
Leonardo F. R. Ribeiro ◽  
Martin Schmitt ◽  
Hinrich Schütze ◽  
Iryna Gurevych

2020 ◽  
Vol 34 (05) ◽  
pp. 8303-8310
Author(s):  
Yuan Li ◽  
Chunyuan Li ◽  
Yizhe Zhang ◽  
Xiujun Li ◽  
Guoqing Zheng ◽  
...  

Learning to generate text with a given label is a challenging task because natural language sentences are highly variable and ambiguous, which makes it difficult to trade off sentence quality against label fidelity. In this paper, we present CARA to alleviate the issue, where two auxiliary classifiers work simultaneously to ensure that (1) the encoder learns disentangled features and (2) the generator produces label-related sentences. Two practical techniques are further proposed to improve performance: annealing the learning signal from the auxiliary classifier, and enhancing the encoder with pre-trained language models. To establish a comprehensive benchmark fostering future research, we consider a suite of four datasets and systematically reproduce three representative methods. CARA shows consistent improvement over previous methods on the task of label-conditional text generation, and achieves state-of-the-art results on the task of attribute transfer.
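One of the two practical techniques, annealing the learning signal from the auxiliary classifier, typically amounts to scaling the classifier's loss term by a schedule over training steps. The sketch below assumes a linear ramp-up; the abstract does not specify CARA's actual schedule or direction, so the shape and all names here are assumptions.

```python
def annealed_weight(step, total_steps, w_max=1.0):
    """Hypothetical linear annealing schedule for the auxiliary
    classifier's learning signal: ramp its weight from 0 up to w_max."""
    return w_max * min(1.0, step / total_steps)

def total_loss(recon_loss, clf_loss, step, total_steps):
    """Combine the generator's reconstruction loss with the annealed
    auxiliary-classifier loss at a given training step."""
    return recon_loss + annealed_weight(step, total_steps) * clf_loss
```

Starting the classifier weight low lets the generator first learn fluent text before label fidelity pressure kicks in; the reverse direction (annealing the signal away) is equally plausible from the abstract alone.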

