OldSlavNet: A scalable Early Slavic dependency parser trained on modern language data

2021 ◽  
Vol 8 ◽  
pp. 100063
Author(s):  
Nilo Pedrazzini ◽  
Hanne Martine Eckhoff
Author(s):  
Jarich Hoekstra

Abstract In this paper I investigate the early language contact between North Frisian and Danish. Since we have no direct evidence for this language contact apart from the layer of medieval Danish interferences in Modern North Frisian, the question arises whether it is possible to say anything about the specific type of language contact that took place in the Middle Ages on the basis of the modern language data and with the help of language contact theory. Taking the lead of van Coetsem’s language contact theory, I discuss two phenomena in the (morpho)syntax of Modern North Frisian, the placement of directional particles and the inventory of verbal particles, and argue that they point to a language contact situation in which a considerable number of Danish-speakers shifted to North Frisian.


PMLA ◽  
1935 ◽  
Vol 50 (4) ◽  
pp. 1343-1343

The fifty-second meeting of the Modern Language Association of America was held, on the invitation of the University of Cincinnati, at Cincinnati, Ohio, Monday, Tuesday, and Wednesday, December 30 and 31, 1935, and January 1, 1936. The Association headquarters were in the Netherland Plaza Hotel, where all meetings were held except those of Tuesday morning and afternoon. These took place at the University of Cincinnati. Registration cards at headquarters were signed by about 900, though a considerably larger number of members were in attendance. The Local Committee estimated the attendance at not less than 1400. This Committee consisted of Professor Frank W. Chandler, Chairman; Professor Edwin H. Zeydel; Professor Phillip Ogden; Mr. John J. Rowe (for the Directors); and Mr. Joseph S. Graydon (for the Alumni).


2020 ◽  
Vol 51 (2) ◽  
pp. 479-493
Author(s):  
Jenny A. Roberts ◽  
Evelyn P. Altenberg ◽  
Madison Hunter

Purpose The results of automatic machine scoring of the Index of Productive Syntax from the Computerized Language ANalysis (CLAN) tools of the Child Language Data Exchange System of TalkBank (MacWhinney, 2000) were compared to manual scoring to determine the accuracy of the machine-scored method. Method Twenty transcripts of 10 children from archival data of the Weismer Corpus from the Child Language Data Exchange System at 30 and 42 months were examined. Measures of absolute point difference and point-to-point accuracy were compared, as well as points erroneously given and missed. Two new measures for evaluating automatic scoring of the Index of Productive Syntax were introduced: Machine Item Accuracy (MIA) and Cascade Failure Rate. These measures further analyze points erroneously given and missed. Differences in total scores, subscale scores, and individual structures were also reported. Results Mean absolute point difference between machine and hand scoring was 3.65, point-to-point agreement was 72.6%, and MIA was 74.9%. There were large differences in subscales, with Noun Phrase and Verb Phrase subscales generally providing greater accuracy and agreement than Question/Negation and Sentence Structures subscales. There were significantly more erroneous than missed items in machine scoring, attributed to problems of mistagging of elements, imprecise search patterns, and other errors. Cascade failure resulted in an average of 4.65 points lost per transcript. Conclusions The CLAN program showed relatively inaccurate outcomes in comparison to manual scoring on both traditional and new measures of accuracy. Recommendations for improvement of the program include accounting for second exemplar violations and applying cascaded credit, among other suggestions.
It was proposed that research on machine-scored syntax routinely report accuracy measures detailing erroneous and missed scores, including MIA, so that researchers and clinicians are aware of the limitations of a machine-scoring program. Supplemental Material https://doi.org/10.23641/asha.11984364
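
As a rough illustration of the two traditional measures named in the abstract above, the sketch below computes an absolute point difference and a point-to-point agreement percentage for machine versus hand scores. The definitions used here are assumptions, not taken from the article: absolute point difference is the absolute gap between total scores, and point-to-point agreement is the share of items given the same score by both methods. The function names and the transcript data are hypothetical, and MIA and Cascade Failure Rate are not sketched because the abstract does not define them.

```python
from statistics import mean


def absolute_point_difference(machine_total, hand_total):
    """Absolute gap between machine and hand total scores (assumed definition)."""
    return abs(machine_total - hand_total)


def point_to_point_agreement(machine_items, hand_items):
    """Percentage of items scored identically by both methods (assumed definition)."""
    if len(machine_items) != len(hand_items):
        raise ValueError("both score lists must cover the same items")
    if not machine_items:
        raise ValueError("at least one item is required")
    agreements = sum(m == h for m, h in zip(machine_items, hand_items))
    return 100.0 * agreements / len(machine_items)


# Hypothetical item-level scores for two transcripts (not data from the study).
transcripts = [
    {"machine": [2, 1, 0, 2], "hand": [2, 2, 1, 1]},
    {"machine": [1, 1, 2, 0], "hand": [1, 1, 2, 0]},
]

# Mean absolute point difference across transcripts.
mean_abs_diff = mean(
    absolute_point_difference(sum(t["machine"]), sum(t["hand"]))
    for t in transcripts
)

# Point-to-point agreement for each transcript.
agreement_per_transcript = [
    point_to_point_agreement(t["machine"], t["hand"]) for t in transcripts
]

print(mean_abs_diff)
print(agreement_per_transcript)
```

Note that matching totals can hide item-level disagreement, which is one reason to report both measures side by side.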


1991 ◽  
Vol 30 (04) ◽  
pp. 275-283
Author(s):  
P. M. Pietrzyk

Abstract: Much information about patients is stored in free text. Hence, the computerized processing of medical language data has been a well-known goal of medical informatics, resulting in different paradigms. In Göttingen, a Medical Text Analysis System for German (abbr. MediTAS) has been under development for some time, trying to combine and extend these paradigms. This article concentrates on the automated syntax analysis of German medical utterances. The investigated text material consists of 8,790 distinct utterances extracted from the summary sections of about 18,400 cytopathological findings reports. The parsing is based upon a new approach called Left-Associative Grammar (LAG) developed by Hausser. By considerably extending the LAG approach, most of the grammatical constructions occurring in the text material could be covered.


2020 ◽  
Vol 13 (2) ◽  
pp. 189-210
Author(s):  
Artemis Alexiadou

This paper discusses the formation of synthetic compounds with proper names. While these are possible in English, Greek disallows such formations. However, earlier stages of the language allowed such compounds, and in the modern language formations of this type are possible as long as they contain heads that are either bound roots or root-derived nominals of Classical Greek origin. The paper builds on the following ingredients: a) proper names are phrases; b) synthetic compounding in Modern Greek involves incorporation, and thus proper names cannot incorporate; c) by contrast, English synthetic compounds involve phrasal movement, and thus proper names can appear within compounds in this language. It is shown that in earlier Greek, proper names had the same status as their English counterparts, hence the possibility of synthetic compounds with proper names. It is further argued that the formations that involve bound/archaic roots are actually cases of either root compounding or root affixation and not synthetic compounds.


2019 ◽  
Vol 113 (1) ◽  
pp. 9-30
Author(s):  
Kateřina Rysová ◽  
Magdaléna Rysová ◽  
Michal Novák ◽  
Jiří Mírovský ◽  
Eva Hajičová

Abstract In the paper, we present the EVALD (Evaluator of Discourse) applications for automated essay scoring. EVALD is the first tool of this type for Czech. It evaluates texts written by both native and non-native speakers of Czech. We first describe the history and current state of automated essay scoring, illustrated by examples of systems for other languages, mainly English. Then we focus on the methodology of creating the EVALD applications and describe the datasets used for testing as well as the supervised training that EVALD builds on. Furthermore, we analyze in detail a sample of newly acquired language data: texts written by non-native speakers reaching the threshold level of Czech language acquisition required, e.g., for permanent residence in the Czech Republic. In this analysis we focus on linguistic differences between the available text levels. We present the feature set used by EVALD and, based on the analysis, extend it with new spelling features. Finally, we evaluate the overall performance of several variants of EVALD and analyze the collected results.

