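Creating New Letters with Diacritical Marks in Texts by Learners of Latvian as a Foreign Language (Jaunu burtu veidošana ar diakritiskajām zīmēm latviešu valodas kā svešvalodas apguvēju tekstos)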

Author(s):  
Inga Kaija

A Latvian learner corpus, “LaVA”, is being built at the Institute of Mathematics and Computer Science, University of Latvia. The corpus includes texts written by beginner learners in their first two semesters of learning Latvian as a foreign language. The texts are written by hand and digitized afterwards, so that learners do not have to master a foreign keyboard in addition to writing itself. One feature that cannot be digitized is the new letters that learners create by adding diacritical marks in ways not used in the standard Latvian alphabet. Since learning the letters and diacritical marks of a language is an essential step in learning to write in it, this study aims to find instances of such newly made letters and to discuss basic quantitative measures in order to define hypotheses and areas of interest for further research. Altogether 322 texts were searched, and 175 examples were found. The number of examples found in 2nd semester texts was less than half the number found in 1st semester texts, but the percentage of texts containing examples was higher than expected: more than 33 % in the 1st semester and almost 20 % in the 2nd semester. This suggests that the phenomenon is quite common but tends to decline in the second semester. The corpus does not provide any data on later semesters, so it cannot be predicted when such instances become a rare, individual feature rather than a common one. The average number of examples per text is not high, though. Counting only the texts where at least one example was found, the average is 2.136 in the 1st semester and 1.690 in the 2nd semester. Given that the lowest possible value here is 1, these values should not be considered high.
Therefore, using diacritical marks to make new letters, while a common feature of the Latvian interlanguage, could be characterized as incidental rather than systematic. However, that does not exclude the possibility of certain patterns in usage. The data collected so far already show that there are some words, such as garšo, viņš, ļoti, četri, where examples were found in more than one author’s text. Unsuitable diacritical marks are also sometimes found next to letters for which the same marks would be suitable. This should be explored more thoroughly using qualitative methods. The corpus keeps growing; its expected size upon completion is 1000 texts. Once that size is reached, it would be useful to repeat the study and check whether the larger amount of data still supports the same assumptions. The larger sample would also allow a more detailed quantitative analysis of each letter, diacritical mark, placement of the diacritical mark, and the metadata collected for the corpus, such as the authors’ gender, native language and other spoken languages.
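The two measures reported in this abstract (the share of texts containing at least one example, and the mean number of examples per text counted only over those texts) can be sketched as follows. This is a minimal illustration, not the study's procedure: the function name `diacritic_measures` is hypothetical and the counts are invented toy values, not LaVA data.

```python
def diacritic_measures(examples_per_text):
    """Return (share of texts with >= 1 example, mean examples among those texts).

    `examples_per_text` holds one integer per text: the number of
    non-standard letters found in that text.
    """
    with_examples = [n for n in examples_per_text if n > 0]
    if not examples_per_text or not with_examples:
        return 0.0, 0.0
    share = len(with_examples) / len(examples_per_text)
    # The mean is taken only over texts with at least one example,
    # so its lowest possible value is 1.
    mean = sum(with_examples) / len(with_examples)
    return share, mean


# Toy counts (hypothetical, NOT corpus data): one entry per text.
toy_counts = [0, 2, 0, 1, 3, 0]
share, mean = diacritic_measures(toy_counts)
print(f"{share:.1%} of texts contain examples; mean per such text = {mean:.3f}")
```

Because texts without examples are excluded from the mean, a value close to 1 indicates that most affected texts contain only a single instance.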

2018 ◽  
pp. 4-7
Author(s):  
S. I. Zenko

The article raises the problem of classifying the concepts of computer science and informatics studied at secondary school. The effectiveness of methods for teaching pupils these concepts depends on how this problem is solved. The author proposes to classify the concepts of school informatics from four positions: the cross-subject basis; the content lines of the educational subject "Informatics"; the logical and structural interrelations and interactions of the studied concepts; and the etymology of foreign-language and translated words in the definitions of the concepts. The first classification distinguishes general and special concepts; the second, inter-content and intra-content concepts; the third, stable (steady), expanding, key and auxiliary concepts; the fourth, concepts-nouns, concepts-verbs, concepts-adjectives and concepts that combine parts of speech.


2018 ◽  
Vol 22 (22) ◽  
pp. 13
Author(s):  
Maria Rosario Bautista Zambrana

This paper aims to analyse the extent to which the textbook for German as a foreign language DaF kompakt A1 (Sander et al., 2011) complies with the recommendations of the Common European Framework of Reference for Languages (Council of Europe, 2001) (hereafter CEFR) with respect to lexical competence and sociolinguistic competence in receptive and productive activities, specifically with regard to phraseological units. We have focused on sentential formulae and fixed frames present in a corpus containing the textbook materials, and we have checked whether those fixed expressions correspond to the phraseological and sociolinguistic competences that the Framework expects of an A1-level student of German. To this end, we have compiled a corpus of the textbook's receptive and productive materials, made up of three subcorpora: one for the written texts, one for the oral texts, and a third containing exercises. We have performed a quantitative analysis (by means of AntConc 3.4.4 [Anthony, 2016] and kfNgram [Fletcher, 2007]) and a qualitative one. Our results suggest that the textbook complies with the recommendations of the CEFR.


2016 ◽  
Vol 9 (9) ◽  
pp. 139 ◽  
Author(s):  
Katsunori Kotani ◽  
Takehiko Yoshimi ◽  
Hiroaki Nanjo ◽  
Hitoshi Isahara

In order to develop effective teaching methods and computer-assisted language teaching systems for learners of English as a foreign language who need to study the basic linguistic competences for writing, pronunciation, reading, and listening, it is necessary to first investigate which vocabulary and grammar they have or have not yet learned. Identifying such vocabulary and grammar requires a learner corpus for analyzing the accuracy and fluency of learners’ linguistic competences. However, it is difficult to use previous learner corpora for this purpose because they have not compiled all the types of linguistic data that we need. Therefore, this study aimed to solve this problem by designing and developing a new learner corpus that compiles linguistic data regarding the accuracy and fluency of the four basic linguistic competences of writing, pronunciation, reading, and listening. The reliability and validity of the learner corpus were partially confirmed, and practical application of the learner corpus is reported here as case studies.


2021 ◽  
pp. 73-84
Author(s):  
Viktoria Ilse

APPLIED INTERDISCIPLINARITY: STRENGTHENING CAREER ORIENTATION IN COMPUTER SCIENCE STUDIES BY APPLYING THE ACTION-ORIENTED TEACHING METHODS OF FOREIGN LANGUAGE DIDACTICS

Can teaching methods simply be adapted in an interdisciplinary context? Does a teaching method from foreign language didactics also work in first-semester computer science teaching? If so, what could this process look like in practice? These questions are addressed in this article.


Author(s):  
Noelia Navarro Gil ◽  
Helena Roquet Pugès

This paper explores the use of adversative Linking Adverbials (LAs) in the academic writing of advanced English as a Foreign Language (EFL) learners with different linguistic backgrounds. The learner corpus used in this study consists of 50 argumentative texts, which are contrasted with a native corpus: the American university students’ corpus (LOCNESS). Liu’s (2008) comprehensive list of adversative LAs has been used for the analysis. Findings reveal that both non-native (NNS) and native speakers of English (NS) use similar types of adversative LAs, but NNS regularly place them in sentence-initial and sometimes in paragraph-initial position, which often results in punctuation issues and misuse. A total of 9 LAs were found to be either overused (e.g., nevertheless) or underused (e.g., actually) by NNS. The analysis performed according to L1 has yielded unexpected results in terms of preference, frequency, and placement of adversative LAs. The so-called ‘teaching effect’ is considered one of the main factors influencing the learners’ choices.


2020 ◽  
Vol 6 (1) ◽  
pp. 72-103 ◽  
Author(s):  
Nicolas Ballier ◽  
Stéphane Canu ◽  
Caroline Petitjean ◽  
Gilles Gasso ◽  
Carlos Balhana ◽  
...  

This paper discusses machine learning techniques for the prediction of Common European Framework of Reference (CEFR) levels in a learner corpus. We summarise the CAp 2018 Machine Learning (ML) competition, a classification task of the six CEFR levels, which map linguistic competence in a foreign language onto six reference levels. The goal of this competition was to produce a machine learning system to predict learners’ competence levels from written productions comprising between 20 and 300 words and a set of characteristics computed for each text extracted from the French component of the EFCAMDAT data (Geertzen et al., 2013). Together with the description of the competition, we provide an analysis of the results and methods proposed by the participants and discuss the benefits of this kind of competition for the learner corpus research (LCR) community. The main findings address the methods used and lexical bias introduced by the task.


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Oleksandr Kapranov

This article presents and discusses a quantitative investigation of discourse markers (henceforth DMs) in a corpus of peer reviews of academic essays in didactics written by a group of future teachers of English as a Foreign Language (EFL). In total, 12 future EFL teachers at an intermediate level of EFL proficiency (henceforth participants) took part in the study. The participants were instructed to form dyads and write peer reviews of each other’s academic essays on a range of topics in EFL didactics. Two corpora were used in the study: the corpus of the participants’ academic essays in EFL didactics and the corpus of peer reviews thereof. The corpora were analysed using WordSmith (Scott 2008) in order to establish the frequencies of DM use per 1000 words. The results of the quantitative analysis indicated that the participants employed a repertoire of stylistically neutral DMs in their peer reviews that was quantitatively similar to that of the academic essays. These findings are discussed further in the article.
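The per-1000-words normalisation mentioned in this abstract can be sketched as below. This is only an illustration of the arithmetic, not WordSmith's implementation; the function name `per_thousand` is hypothetical and the figures are invented, not taken from the study.

```python
def per_thousand(marker_count, total_words):
    """Normalise a raw count to a frequency per 1000 running words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return marker_count * 1000 / total_words


# Toy figures (hypothetical): 18 occurrences of a marker in a 4500-word corpus.
print(per_thousand(18, 4500))  # 4.0
```

Normalising in this way makes counts from corpora of different sizes, such as an essay corpus and a peer-review corpus, directly comparable.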


2020 ◽  
Vol 11 (1) ◽  
pp. 315-330
Author(s):  
Sameena Malik ◽  
Huang Qin ◽  
Said Muhammad Khan ◽  
Khalid Ahmed
