Large-scale study of speech acts' development in early childhood

2021 ◽  
Author(s):  
Mitja Nikolaus ◽  
Eliot Maes ◽  
Jeremy Auguste ◽  
Laurent Prévot ◽  
Abdellah Fourtassi

Studies of children's language use in the wild (e.g., in the context of child-caregiver social interaction) have been slowed by the time- and resource-consuming task of hand annotating utterances for communicative intents/speech acts. Existing studies have typically focused on investigating rather small samples of children, raising the question of how their findings generalize both to larger and more representative populations and to a richer set of interaction contexts. Here we propose a simple automatic model for speech act labeling in early childhood based on the INCA-A coding scheme (Ninio, Snow, Pan, & Rollins, 1994). After validating the model against ground truth labels, we automatically annotated the entire English-language data from the CHILDES corpus. The major theoretical result was that earlier findings generalize quite well at a large scale. Further, we introduced two complementary measures for the age of acquisition of speech acts, which allow us to rank different speech acts according to their order of emergence in production and comprehension. Our model will be shared with the community so that researchers can use it with their data to investigate various questions related to language use in both typical and atypical populations of children.

2021 ◽  
Author(s):  
Mitja Nikolaus ◽  
Juliette Maes ◽  
Jeremy Auguste ◽  
Laurent Prévot ◽  
Abdellah Fourtassi

Studies of children's language use in the wild (e.g., in the context of child-caregiver social interaction) have been slowed by the time- and resource-consuming task of hand annotating utterances for communicative intents/speech acts. Existing studies have typically focused on investigating rather small samples of children, raising the question of how their findings generalize both to larger and more representative populations and to a richer set of interaction contexts. Here we propose a simple automatic model for speech act labeling in early childhood based on the INCA-A coding scheme (Ninio et al., 1994). After validating the model against ground truth labels, we automatically annotated the entire English-language data from the CHILDES corpus. The major theoretical result was that earlier findings generalize quite well at a large scale. Our model will be shared with the community so that researchers can use it with their data to investigate various questions related to the development of language use.


Symmetry ◽  
2020 ◽  
Vol 12 (11) ◽  
pp. 1832
Author(s):  
Tomasz Hachaj ◽  
Patryk Mazurek

Deep learning-based feature extraction methods and transfer learning have become common approaches in the field of pattern recognition. Deep convolutional neural networks trained using triplet-based loss functions allow for the generation of face embeddings, which can be directly applied to face verification and clustering. Knowledge about the ground truth of face identities might improve the effectiveness of the final classification algorithm; however, it is also possible to use ground truth clusters previously discovered using an unsupervised approach. The aim of this paper is to evaluate the potential improvement of classification results of state-of-the-art supervised classification methods trained with and without ground truth knowledge. In this study, we use two sufficiently large data sets containing more than 200,000 “taken in the wild” images, each with various resolutions, visual quality, and face poses which, in our opinion, guarantee the statistical significance of the results. We examine several clustering and supervised pattern recognition algorithms and find that knowledge about the ground truth has a very small influence on the Fowlkes–Mallows score (FMS) of the classification algorithm. In the case of the classification algorithm that obtained the highest accuracy in our experiment, the FMS improved by only 5.3% (from 0.749 to 0.791) in the first data set and by 6.6% (from 0.652 to 0.718) in the second data set. Our results show that, except in highly secure systems in which face verification is a key component, face identities discovered by unsupervised approaches can be safely used for training supervised classifiers. We also found that the Silhouette Coefficient (SC) of unsupervised clustering is positively correlated with the Adjusted Rand Index, V-measure score, and Fowlkes–Mallows score, so the SC can be used as an indicator of clustering performance when the ground truth of face identities is not known.
All of these conclusions are important for large-scale face verification problems, because skipping the verification of people’s identities before supervised training saves considerable time and resources.
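For readers unfamiliar with the Fowlkes–Mallows score mentioned in the abstract above, the following is a minimal sketch of its standard pair-counting definition, FMS = TP / sqrt((TP + FP) * (TP + FN)), where pairs of samples are counted as true positives when they share a cluster in both labelings. The function name `fowlkes_mallows` and the toy labels are illustrative assumptions; this is not the authors' code, and the abstract does not say which implementation they used.

```python
import math
from itertools import combinations


def fowlkes_mallows(labels_true, labels_pred):
    """Pair-counting Fowlkes-Mallows score (hypothetical helper, not from the paper)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    tp = fp = fn = 0
    # Examine every unordered pair of samples once.
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1  # pair grouped together in both labelings
        elif same_pred:
            fp += 1  # grouped together only in the predicted labeling
        elif same_true:
            fn += 1  # grouped together only in the ground truth
    denom = math.sqrt((tp + fp) * (tp + fn))
    # Return 0.0 when no pair is grouped together in either labeling.
    return tp / denom if denom else 0.0


# Toy labels (made up for illustration): six samples, three true per cluster.
truth = [0, 0, 0, 1, 1, 1]
found = [0, 0, 1, 1, 2, 2]
print(round(fowlkes_mallows(truth, found), 4))
```

Because the score only compares which pairs are grouped together, it does not depend on how cluster identifiers are named, which is what makes it usable for comparing an unsupervised clustering against ground truth identities.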


2019 ◽  
Vol 1 (1) ◽  
pp. 42-75 ◽  
Author(s):  
Douglas Biber

Abstract Douglas Biber, Regents’ Professor of Applied Linguistics at Northern Arizona University, authors this article exploring the connections between register and a text-linguistic approach to language variation. He has spent the last 30 years pursuing a research program that explores the inherent link between register and language use, including at the phraseological, grammatical, and lexico-grammatical levels. His seminal book Variation across Speech and Writing (1988, Cambridge University Press) launched multi-dimensional (MD) analysis, a comprehensive framework and methodology for the large-scale study of register variation. This approach was innovative in taking a text-linguistic approach to characterize language use across situations of use through the quantitative and functional analysis of linguistic co-occurrence patterns and underlying dimensions of language use. MD analysis is now used widely to study register variation over time, in general and specialized registers, in learner language, and across a range of languages. In 1999, the Longman Grammar of Spoken and Written English (Biber et al.) became the first comprehensive descriptive reference book to systematically consider register variation in describing the grammatical and lexico-grammatical patterns of use in English. Douglas Biber’s quantitative linguistic research has consistently demonstrated the importance of register as a predictor of language variation. In his own words, “register always matters” (Gray 2013: 360, Interview with Douglas Biber, English Language & Linguistics).


2020 ◽  
Vol 11 (5) ◽  
pp. 841
Author(s):  
Raifu O. Farinde ◽  
Wasiu A. Oyedokun-Alli

The main goal of language teaching is that, at the end of the period of learning, the learners should be able to communicate effectively in that language. The main source of language is language use. The students must therefore be given plenty of opportunity to use the language. This is where the principles of pragmatics come into language teaching. Pragmatics provides ample opportunities for the students to learn the English language communicatively and practically. In this study, I shall focus particularly on the application of pragmatics to language teaching, with emphasis on Gricean pragmatics and Searle’s speech acts. The question of why pragmatics should be assigned a more prominent place in the language teaching syllabus is also addressed.


2019 ◽  
Vol 30 (3) ◽  
pp. 516-538
Author(s):  
Lénia Marques ◽  
Nigel Williams

This article investigates the similarities and differences in the tangible and intangible elements (factors and language use) contributing to placemaking in Airbnb English language reviews in Paris (59,057 reviews), Barcelona (19,291 reviews) and London (30,403 reviews). This paper provides new insights into the narrative construction of reputational capital, which is connected to placemaking strategies. A combined quantitative approach using large-scale text analysis enabled the analysis of review content and style. Patterns in word usage were identified. Findings suggest that tangible and intangible elements work together in the discourse, contributing to the place-narrative built on the host’s reputational capital. The host-guest interaction is the main aspect of the reviews, followed by the importance of transport and local amenities. Cities have different profiles in the composition of the word clusters, which indicates differences in the guests’ perceived experience.


2020 ◽  
pp. 1-28
Author(s):  
Brian A. COLLINS ◽  
Claudio O. TOPPELBERG

Abstract Young Latino children of immigrants typically speak primarily Spanish at home and are exposed to varying amounts of English. As a result, they often enter school with a wide range of proficiencies in each language. The current study investigated family background, language use at home and early childhood settings as predictors of Spanish and English language proficiencies among Latino dual language children (N = 228). Findings demonstrated that divergent sets of predictors were associated with either Spanish or English proficiencies at kindergarten and second grade. Sociocultural variables (parent origin, gender, home language use, home literacy practices, and language use in early childhood settings) predicted children's Spanish proficiency, while socioeconomic variables (poverty, and maternal and paternal education) predicted children's English proficiency, with little to no overlap in these predictions. These results suggest that different supports are required for proficiency in Spanish and in English, highlighting the importance of sociocultural and socioeconomic factors.


Corpora ◽  
2019 ◽  
Vol 14 (3) ◽  
pp. 327-349
Author(s):  
Craig Frayne

This study uses the two largest available American English language corpora, Google Books and the Corpus of Historical American English (COHA), to investigate relations between ecology and language. The paper introduces ecolinguistics as a promising theme for corpus research. While some previous ecolinguistic research has used corpus approaches, there is a case to be made for quantitative methods that draw on larger datasets. Building on other corpus studies that have made connections between language use and environmental change, this paper investigates whether linguistic references to other species have changed in the past two centuries and, if so, how. The methodology consists of two main parts: an examination of the frequency of common names of species followed by aspect-level sentiment analysis of concordance lines. Results point to both opportunities and challenges associated with applying corpus methods to ecolinguistic research.


2020 ◽  
Vol 5 (3) ◽  
pp. 77-81
Author(s):  
Sayyora Azimova ◽  

This article is devoted to the pragmatic interpretation of the illocutionary action of the speech act “expression of refusals”. The article discusses different ways of reflecting cases of denial. This article was written not only for English language professionals, but also for use in aggressive conflicts and their pragmatic resolution, which naturally occur in the process of communication in all other languages.


2020 ◽  
Vol 6 (3) ◽  
pp. 227-231
Author(s):  
Sayyora Azimova ◽  

This article is devoted to the pragmatic interpretation of the illocutionary action of the speech act “expression of refusals”. The article discusses different ways of reflecting cases of denial. This article was written not only for English language professionals, but also for use in aggressive conflicts and their pragmatic resolution, which naturally occur in the process of communication in all other languages.

