Toward an optimal code for communication: The case of scientific English

2019 ◽  
Vol 0 (0) ◽  
Author(s):  
Stefania Degaetano-Ortlieb ◽  
Elke Teich

Abstract: We present a model of the linguistic development of scientific English from the mid-seventeenth to the late-nineteenth century, a period that witnessed significant political and social changes, including the evolution of modern science. There is a wealth of descriptive accounts of scientific English, from both a synchronic and a diachronic perspective, but only a few attempts at a unified explanation of its evolution. The explanation we offer here is a communicative one: while external pressures (specialization, diversification) push for an increase in expressivity, communicative concerns pull toward convergence on particular options (conventionalization). What emerges over time is a code which is optimized for written, specialist communication, relying on specific linguistic means to modulate information content. As we show, this is achieved by the systematic interplay between lexis and grammar. The corpora we employ are the Royal Society Corpus (RSC) and, for comparative purposes, the Corpus of Late Modern English (CLMET). We build various diachronic, computational n-gram language models of these corpora and then apply formal measures of information content (here: relative entropy and surprisal) to detect the linguistic features significantly contributing to diachronic change, estimate the (changing) level of information of features and capture the time course of change.
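The two measures named in the abstract, relative entropy and surprisal, can be illustrated with a minimal sketch. This is not the authors' pipeline: the token sequences are made-up toy data standing in for two time periods, the add-alpha smoothing is an assumption, and the function names (`unigram_dist`, `relative_entropy`, `bigram_surprisal`) are hypothetical.

```python
import math
from collections import Counter

def unigram_dist(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a shared vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def relative_entropy(p, q):
    """Relative entropy (KL divergence) D(p || q) in bits; same support assumed."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p)

def bigram_surprisal(tokens, vocab, alpha=1.0):
    """Return a function giving -log2 P(word | previous word), add-alpha smoothed."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter(tokens[:-1])
    def surprisal(prev, word):
        prob = (pair_counts[(prev, word)] + alpha) / (
            context_counts[prev] + alpha * len(vocab)
        )
        return -math.log2(prob)
    return surprisal

# Toy, invented token sequences standing in for an early and a late period.
early = "the air is a fluid and the air is heavy".split()
late = "the pressure of the gas is the pressure measured".split()
vocab = sorted(set(early) | set(late))

p_early = unigram_dist(early, vocab)
p_late = unigram_dist(late, vocab)

# How much the late period diverges from the early one, in bits.
divergence = relative_entropy(p_late, p_early)

# Per-word terms of the sum indicate which words contribute most to the change.
contributions = {w: p_late[w] * math.log2(p_late[w] / p_early[w]) for w in vocab}
top_word = max(contributions, key=contributions.get)

# Surprisal of a word in context under the late-period bigram model.
surprisal = bigram_surprisal(late, vocab)
```

Comparing `surprisal("the", "pressure")` with `surprisal("the", "fluid")` under the late-period model shows the intended reading: a continuation that has become conventional in the period carries less information than one that has not.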

2010 ◽  
Vol 17 (4) ◽  
pp. 455-483 ◽  
Author(s):  
YUQING GUO ◽  
HAIFENG WANG ◽  
JOSEF VAN GENABITH

Abstract: This paper presents a general-purpose, wide-coverage, probabilistic sentence generator based on dependency n-gram models. This is particularly interesting as many semantic or abstract syntactic input specifications for sentence realisation can be represented as labelled bi-lexical dependencies or typed predicate-argument structures. Our generation method captures the mapping between semantic representations and surface forms by linearising a set of dependencies directly, rather than via the application of grammar rules as in more traditional chart-style or unification-based generators. In contrast to conventional n-gram language models over surface word forms, we exploit structural information and various linguistic features inherent in the dependency representations to constrain the generation space and improve the generation quality. A series of experiments shows that dependency-based n-gram models generalise well to different languages (English and Chinese) and representations (LFG and CoNLL). Compared with state-of-the-art generation systems, our general-purpose sentence realiser is highly competitive with the added advantages of being simple, fast, robust and accurate.


Author(s):  
Vitaly Kuznetsov ◽  
Hank Liao ◽  
Mehryar Mohri ◽  
Michael Riley ◽  
Brian Roark

2020 ◽  
Author(s):  
Grant P. Strimel ◽  
Ariya Rastrow ◽  
Gautam Tiwari ◽  
Adrien Piérard ◽  
Jon Webb

Author(s):  
ROMAN BERTOLAMI ◽  
HORST BUNKE

Current multiple classifier systems for unconstrained handwritten text recognition do not provide a straightforward way to utilize language model information. In this paper, we describe a generic method to integrate a statistical n-gram language model into the combination of multiple offline handwritten text line recognizers. The proposed method first builds a word transition network and then rescores this network with an n-gram language model. Experimental evaluation conducted on a large dataset of offline handwritten text lines shows that the proposed approach improves the recognition accuracy over a reference system as well as over the original combination method that does not include a language model.
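The rescoring step described above can be sketched as follows. This is a simplified illustration, not the method from the paper: it assumes the word transition network has a confusion-network shape (a sequence of slots, each holding alternative words with a recognizer log score), a bigram model, and a log-linear combination controlled by a hypothetical `lm_weight` parameter. The toy probabilities are invented.

```python
import math

def rescore(network, bigram_logprob, lm_weight=1.0):
    """Viterbi search over a confusion-network-style word transition network.

    network: list of slots; each slot is a list of (word, recognizer_log_score).
    bigram_logprob(prev, word): log P(word | prev) from the language model.
    Returns the word sequence with the best combined score.
    """
    # best[word] = (total score, best word sequence ending in that word)
    best = {"<s>": (0.0, [])}
    for slot in network:
        new_best = {}
        for word, rec_score in slot:
            candidates = (
                (score + rec_score + lm_weight * bigram_logprob(prev, word),
                 path + [word])
                for prev, (score, path) in best.items()
            )
            new_best[word] = max(candidates, key=lambda c: c[0])
        best = new_best
    return max(best.values(), key=lambda c: c[0])[1]

# Invented bigram probabilities; unseen pairs fall back to a small floor value.
TOY_LM = {
    ("<s>", "the"): 0.5,
    ("the", "cat"): 0.2,
    ("the", "cut"): 0.01,
    ("cat", "sat"): 0.3,
    ("cut", "sat"): 0.01,
}

def toy_bigram_logprob(prev, word):
    return math.log(TOY_LM.get((prev, word), 1e-4))

# The recognizers slightly prefer "cut", but the language model favours "cat".
network = [
    [("the", math.log(0.9)), ("she", math.log(0.1))],
    [("cut", math.log(0.55)), ("cat", math.log(0.45))],
    [("sat", math.log(0.7)), ("sad", math.log(0.3))],
]

with_lm = rescore(network, toy_bigram_logprob)
without_lm = rescore(network, toy_bigram_logprob, lm_weight=0.0)
```

Setting `lm_weight=0.0` reproduces a combination that ignores the language model, which makes the effect of rescoring easy to compare on the same network.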


2021 ◽  
pp. 001946462110645
Author(s):  
Sandipan Baksi

Science journalism in Hindi originated in the late nineteenth century. Hindi literary periodicals provided the first platform for science to be discussed along with literature. The onset of the twentieth century witnessed a remarkable advance in Hindi literary writing, and science writing also flourished with this advance. A remarkable overlap and a complementary relationship between the development of Hindi literature and Hindi commentaries on sciences is evident. Equally important in this context was the backdrop provided by a politically contentious process of evolution of a ‘modern’, ‘standard’ Hindi, and by the anti-colonial freedom movement, yoked to the idea of cultural and economic nationalism. The article surveys certain popular periodicals that regularly published essays and commentaries on science and scientific subjects. These periodicals were instrumental in shaping the popular discourses on science. The article also underlines an overwhelming effort by the intelligentsia to seek a philosophical commensurability between modern science and ‘traditional’ schools of thought. It concludes that the predominance of these characteristics in Hindi science journalism was a reflection of the agenda of the Hindi intelligentsia, shaped by linguistic nationalism framed alongside or in conjunction with a revivalist perspective.


Author(s):  
Mijail Kabadjov ◽  
Josef Steinberger ◽  
Ralf Steinberger ◽  
Massimo Poesio ◽  
Bruno Pouliquen

2008 ◽  
Author(s):  
Ahmad Emami ◽  
Imed Zitouni ◽  
Lidia Mangu
