Pushing on Text Readability Assessment: A Transformer Meets Handcrafted Linguistic Features

Author(s):  
Bruce W. Lee ◽  
Yoo Sung Jang ◽  
Jason Lee


2008 ◽  
Vol 41 (3) ◽  
pp. 409-429 ◽  
Author(s):  
Scott A. Crossley ◽  
Danielle S. McNamara

This paper follows up on the work of Crossley, Louwerse, McCarthy & McNamara (2007), who conducted an exploratory study of the linguistic differences between simplified and authentic texts found in beginner-level English as a Second Language (ESL) textbooks using the computational tool Coh-Metrix. The purpose of this study is to provide a more comprehensive study of second language (L2) reading texts than that of Crossley et al. (2007) by investigating the differences between the linguistic structures of a larger and more selective corpus of intermediate reading texts. This study is important because advocates of both approaches to ESL text construction cite linguistic features, syntax, and discourse structures as essential elements of text readability, yet only the Crossley et al. (2007) study has measured the differences between these text types and their implications for L2 learners. This research replicates the methods of the earlier study. The findings provide a more thorough understanding of the linguistic features that characterize simplified and authentic texts. This work will enable material developers, publishers, and reading researchers to more accurately judge the value of simplified and authentic L2 texts as well as improve measures for matching readers to texts.
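Coh-Metrix itself is a closed research tool, but the general flavour of the handcrafted surface features such analyses report can be sketched in a few lines. The function below is an illustrative stand-in only; these are generic features, not Coh-Metrix's actual indices:

```python
import re

def handcrafted_features(text):
    """A few surface readability features of the kind handcrafted-feature
    tools report (illustrative stand-ins, not Coh-Metrix's actual indices)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    return {
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "avg_word_length": sum(len(w) for w in words) / max(len(words), 1),
        "type_token_ratio": len(set(words)) / max(len(words), 1),
    }

# Simplified texts tend to show shorter sentences, shorter words,
# and lower lexical variety than authentic texts.
print(handcrafted_features("The cat sat. The cat ran."))
```

Comparing such feature vectors across a simplified corpus and an authentic corpus is the basic shape of the analysis the study describes.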


2014 ◽  
Vol 165 (2) ◽  
pp. 163-193 ◽  
Author(s):  
Felice Dell’Orletta ◽  
Simonetta Montemagni ◽  
Giulia Venturi

In this paper, we tackle three under-researched issues in the automatic readability assessment literature: the evaluation of text readability in less-resourced languages, at the level of sentences (as opposed to documents), and across textual genres. Different solutions to these issues have been tested by using and refining READ‑IT, the first advanced readability assessment tool for Italian, which combines traditional raw text features with lexical, morpho-syntactic and syntactic information. In READ‑IT, readability assessment is carried out with respect to both documents and sentences, the latter constituting an important novelty of the proposed approach: READ‑IT shows high accuracy in the document classification task and promising results in the sentence classification scenario. By comparing the results of two versions of READ‑IT, adopting a classification-based versus a ranking-based approach, we also show that readability assessment is strongly influenced by textual genre; for this reason, a genre-oriented notion of readability is needed. With classification-based approaches, reliable results can only be achieved with genre-specific models: since this is far from being a workable solution, especially for less-resourced languages, a new ranking method for readability assessment is proposed, based on the notion of distance.
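The abstract does not spell out the distance-based ranking, so the following is only one plausible sketch: score each text by how far its (toy) feature vector lies from the centroid of an easy-to-read reference corpus, then order texts by that distance. The feature choice and distance metric here are assumptions, not READ‑IT's actual implementation:

```python
import math
import re

def feature_vector(text):
    """Toy two-dimensional feature vector: average sentence length and
    average word length (illustrative features, not READ-IT's)."""
    sents = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    return (len(words) / max(len(sents), 1),
            sum(len(w) for w in words) / max(len(words), 1))

def rank_by_distance(texts, easy_reference):
    """Order texts from easiest to hardest by the Euclidean distance of
    their feature vector from the centroid of an easy-to-read reference
    corpus -- one hypothetical realization of a distance-based ranking."""
    vecs = [feature_vector(t) for t in easy_reference]
    centroid = tuple(sum(xs) / len(xs) for xs in zip(*vecs))
    return sorted(texts, key=lambda t: math.dist(feature_vector(t), centroid))
```

A ranking of this kind needs only one reference corpus of easy language rather than labeled training data per genre, which is the advantage the abstract points to for less-resourced settings.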


2014 ◽  
Vol 165 (2) ◽  
pp. 194-222 ◽  
Author(s):  
Sowmya Vajjala ◽  
Detmar Meurers

Readability assessment can play a role in the evaluation of a simplification algorithm as well as in the identification of what to simplify. While some previous research used traditional readability formulas to evaluate text simplification, there is little research into the utility of readability assessment for identifying and analyzing sentence level targets for text simplification. We explore this aspect in our paper by first constructing a readability model that is generalizable across corpora and across genres and later adapting this model to make sentence-level readability judgments. First, we report on experiments establishing that the readability model integrating a broad range of linguistic features works well at a document level, performing on par with the best systems on a standard test corpus. Next, the model is confirmed to be transferable to different text genres. Moving from documents to sentences, we investigate the model’s ability to correctly identify the difference in reading level between a sentence and its human simplified version. We conclude that readability models can be useful for identifying simplification targets for human writers and for evaluating machine generated simplifications.
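A sentence-level check of the kind described (does the model score a human-simplified sentence as easier than its original?) can be sketched with a toy scorer. The features, weights, and function names below are hypothetical illustrations, not the authors' model:

```python
import re

def toy_readability_score(sentence):
    """Higher score = harder to read. A stand-in for a trained
    sentence-level readability model; features and weights are
    illustrative only."""
    words = re.findall(r"[A-Za-z]+", sentence)
    avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
    return 0.5 * len(words) + 2.0 * avg_word_len

def simplification_helped(original, simplified):
    """True if the scorer judges the simplified version easier to read."""
    return toy_readability_score(simplified) < toy_readability_score(original)
```

Evaluating a model by how often it ranks the simplified member of each original/simplified sentence pair as easier is the kind of sentence-level test the paper describes.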

