The Statistical Analysis in the Problem of the Author Identification of a Natural Language Text

2017 ◽  
Vol 17 (06) ◽  
pp. 131-146 ◽  
Author(s):  
E Tikhomirova
2015 ◽  
Vol 41 (3) ◽  
pp. 481-502 ◽  
Author(s):  
Kumiko Tanaka-Ishii ◽  
Shunsuke Aihara

This article presents a mathematical and empirical verification of computational constancy measures for natural language text. A constancy measure characterizes a given text by taking an invariant value for any text size larger than a certain amount. The study of such measures has a 70-year history dating back to Yule's K, whose originally intended application was author identification. We examine various measures proposed since Yule and reconsider the results reported so far, thereby giving an overview of the study of constancy measures. We then explain how K is essentially equivalent to an approximation of the second-order Rényi entropy, indicating its significance within language science. We then empirically examine constancy measure candidates within this new, broader context. The approximated higher-order entropy exhibits stable convergence across different languages and kinds of text. We also show, however, that it cannot identify authors, contrary to Yule's intention. Lastly, we apply K to two unknown scripts, the Voynich manuscript and Rongorongo, and show how the results support previous hypotheses about these scripts.
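
To make the quantities concrete, here is a minimal Python sketch (a toy illustration, not the authors' implementation) that computes Yule's K from the frequency spectrum of a tokenized text and the empirical second-order Rényi entropy H2 from the same type frequencies. The last line checks the identity behind the equivalence noted above: for the empirical distribution, K = 10^4 (sum_i p_i^2 - 1/N) = 10^4 (2^(-H2) - 1/N).

```python
# Minimal sketch of Yule's K and the empirical second-order Renyi entropy.
# The toy sentence is only for demonstration; any tokenized text works.
from collections import Counter
import math

def yules_k(tokens):
    """Yule's K = 10^4 * (sum_m m^2 * V(m) - N) / N^2,
    where V(m) is the number of word types occurring exactly m times."""
    n = len(tokens)
    freqs = Counter(tokens)                 # type -> frequency
    spectrum = Counter(freqs.values())      # frequency m -> number of types V(m)
    s2 = sum(m * m * vm for m, vm in spectrum.items())
    return 1e4 * (s2 - n) / (n * n)

def renyi_entropy_2(tokens):
    """Empirical second-order Renyi entropy H2 = -log2 sum_i p_i^2."""
    n = len(tokens)
    p2 = sum((c / n) ** 2 for c in Counter(tokens).values())
    return -math.log2(p2)

tokens = "the cat sat on the mat and the dog sat on the rug".split()
k = yules_k(tokens)
h2 = renyi_entropy_2(tokens)
# Sanity check of the relation: K == 10^4 * (2^(-H2) - 1/N) for empirical p_i
print(k, 1e4 * (2 ** -h2 - 1 / len(tokens)))
```

On longer texts both quantities stabilize as the sample grows, which is the convergence property the article examines empirically.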


Author(s):  
Matheus C. Pavan ◽  
Vitor G. Santos ◽  
Alex G. J. Lan ◽  
Joao Martins ◽  
Wesley Ramos Santos ◽  
...  

2012 ◽  
Vol 30 (1) ◽  
pp. 1-34 ◽  
Author(s):  
Antonio Fariña ◽  
Nieves R. Brisaboa ◽  
Gonzalo Navarro ◽  
Francisco Claude ◽  
Ángeles S. Places ◽  
...  

Author(s):  
S.G. Antonov

This article discusses aspects of applying word forms of natural language text to the problem of error correction. The merits and drawbacks of two known approaches to the problem, a deterministic one and a probability-based one, are discussed. The construction principles of the natural language corpus used in the probabilistic approach are described. The article concludes that these approaches need to be used in combination, depending on the properties of the texts.
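
As a rough illustration of the two approaches being contrasted (a toy sketch with an invented mini-lexicon and mini-corpus, not the author's system), a deterministic corrector can pick the dictionary word form closest in edit distance, while a probabilistic corrector ranks the close candidates by their frequency in a reference corpus:

```python
# Toy contrast of deterministic (dictionary + edit distance) and
# probabilistic (corpus-frequency ranking) error correction.
from collections import Counter

def edit_distance(a, b):
    """Classic Levenshtein distance via a one-row dynamic program."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[-1]

LEXICON = {"forms", "forum", "form", "from", "for"}          # invented mini-lexicon
CORPUS = "from the forms from the form for the forum from".split()  # invented mini-corpus
FREQ = Counter(CORPUS)

def correct_deterministic(word):
    # Pick any lexicon entry with minimal edit distance (ties broken arbitrarily).
    return min(LEXICON, key=lambda w: edit_distance(word, w))

def correct_probabilistic(word, max_dist=2):
    # Among close candidates, prefer the one most frequent in the corpus.
    cands = [w for w in LEXICON if edit_distance(word, w) <= max_dist]
    return max(cands, key=lambda w: FREQ[w]) if cands else word

print(correct_deterministic("frm"))   # 'from' or 'form': both at distance 1
print(correct_probabilistic("frm"))   # 'from': the more frequent candidate
```

For a misspelling such as "frm", both "from" and "form" lie at edit distance 1, so the deterministic rule must break the tie arbitrarily, whereas the corpus frequencies resolve it; this text-dependent behaviour is the kind of situation that motivates combining the approaches.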


2022 ◽  
Vol 40 (1) ◽  
pp. 1-43
Author(s):  
Ruqing Zhang ◽  
Jiafeng Guo ◽  
Lu Chen ◽  
Yixing Fan ◽  
Xueqi Cheng

Question generation is an important yet challenging problem in Artificial Intelligence (AI), which aims to generate natural and relevant questions from various input formats, e.g., natural language text, structured databases, knowledge bases, and images. In this article, we focus on question generation from natural language text, which has received tremendous interest in recent years due to widespread applications such as data augmentation for question answering systems. Over the past decades, many different question generation models have been proposed, from traditional rule-based methods to advanced neural network-based methods. Since a large variety of research works have been proposed, we believe it is the right time to summarize the current status, learn from existing methodologies, and gain some insights for future development. In contrast to existing reviews, in this survey we try to provide a more comprehensive taxonomy of question generation tasks from three different perspectives, i.e., the type of the input context text, the target answer, and the generated question. We take a deep look into existing models from different dimensions to analyze their underlying ideas, major design principles, and training strategies. We compare these models through benchmark tasks to obtain an empirical understanding of the existing techniques. Moreover, we discuss what is missing in the current literature and what the promising and desired future directions are.
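
For readers unfamiliar with the task, the following toy sketch (a deliberately naive rule-based baseline, not any specific system from the survey) shows the answer-aware setting: given a context sentence and a target answer span, the answer is replaced with a wh-word chosen by a crude heuristic and the sentence is re-punctuated as a question.

```python
# Naive answer-aware question generation: wh-substitution over an answer span.
import re

def wh_word(answer: str) -> str:
    """Crude answer-type heuristic: a 4-digit year -> 'When',
    a capitalized span -> 'Who', anything else -> 'What'."""
    if re.fullmatch(r"\d{4}", answer):
        return "When"
    if answer[:1].isupper():
        return "Who"
    return "What"

def generate_question(context: str, answer: str) -> str:
    """Replace the answer span with a wh-word and re-punctuate as a question."""
    if answer not in context:
        raise ValueError("answer span must occur in the context sentence")
    stem = context.replace(answer, wh_word(answer), 1).rstrip(". ")
    return stem + "?"

print(generate_question("Marie Curie discovered polonium in 1898.", "Marie Curie"))
# -> "Who discovered polonium in 1898?"
print(generate_question("Marie Curie discovered polonium in 1898.", "1898"))
# -> "Marie Curie discovered polonium in When?"  (no syntactic movement is performed)
```

The second example shows where such surface rules break down, which is exactly the gap that motivated the move from rule-based to neural question generation models surveyed in the article.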

