Building theories of consistency and variability in children’s language development: A large-scale data approach

2021 ◽  
Author(s):  
Angeline Tsui ◽  
Virginia A. Marchman ◽  
Michael C. Frank

Young children typically begin learning words during their first two years of life. On the other hand, they also vary substantially in their language learning. Similarities and differences in language learning call for a quantitative theory that can predict and explain which aspects of early language are consistent and which are variable. However, current developmental research practices limit our ability to build such quantitative theories because of small sample sizes and challenges related to reproducibility and replicability. In this chapter, we suggest that three approaches – meta-analysis, multi-site collaborations, and secondary data aggregation – can together address some of the limitations of current research in the developmental area. We review the strengths and limitations of each approach and end by discussing the potential impacts of combining these three approaches.

Author(s):  
Tianye Jia ◽  
Congying Chu ◽  
Yun Liu ◽  
Jenny van Dongen ◽  
Evangelos Papastergios ◽  
...  

Abstract DNA methylation, which is modulated by both genetic factors and environmental exposures, may offer a unique opportunity to discover novel biomarkers of disease-related brain phenotypes, even when measured in tissues other than the brain, such as blood. A few studies of small sample sizes have revealed associations between blood DNA methylation and neuropsychopathology; however, large-scale epigenome-wide association studies (EWAS) are needed to investigate the utility of DNA methylation profiling as a peripheral marker for the brain. Here, in an analysis of eleven international cohorts, totalling 3337 individuals, we report epigenome-wide meta-analyses of blood DNA methylation with volumes of the hippocampus, thalamus and nucleus accumbens (NAcc), three subcortical regions selected for their associations with disease and heritability and volumetric variability. Analyses of individual CpGs revealed genome-wide significant associations with hippocampal volume at two loci. No significant associations were found for analyses of thalamus and nucleus accumbens volumes. Cluster-based analyses revealed additional differentially methylated regions (DMRs) associated with hippocampal volume. DNA methylation at these loci affected expression of proximal genes involved in learning and memory, stem cell maintenance and differentiation, fatty acid metabolism and type-2 diabetes. These DNA methylation marks, their interaction with genetic variants and their impact on gene expression offer new insights into the relationship between epigenetic variation and brain structure and may provide the basis for biomarker discovery in neurodegeneration and neuropsychiatric conditions.


2017 ◽  
Vol 24 (4) ◽  
pp. 799-805 ◽  
Author(s):  
Jean Louis Raisaro ◽  
Florian Tramèr ◽  
Zhanglong Ji ◽  
Diyue Bu ◽  
Yongan Zhao ◽  
...  

Abstract The Global Alliance for Genomics and Health (GA4GH) created the Beacon Project as a means of testing the willingness of data holders to share genetic data in the simplest technical context—a query for the presence of a specified nucleotide at a given position within a chromosome. Each participating site (or “beacon”) is responsible for assuring that genomic data are exposed through the Beacon service only with the permission of the individual to whom the data pertains and in accordance with the GA4GH policy and standards. While recognizing the inference risks associated with large-scale data aggregation, and the fact that some beacons contain sensitive phenotypic associations that increase privacy risk, the GA4GH adjudged the risk of re-identification based on the binary yes/no allele-presence query responses as acceptable. However, recent work demonstrated that, given a beacon with specific characteristics (including relatively small sample size and an adversary who possesses an individual’s whole genome sequence), the individual’s membership in a beacon can be inferred through repeated queries for variants present in the individual’s genome. In this paper, we propose three practical strategies for reducing re-identification risks in beacons. The first two strategies manipulate the beacon such that the presence of rare alleles is obscured; the third strategy budgets the number of accesses per user for each individual genome. Using a beacon containing data from the 1000 Genomes Project, we demonstrate that the proposed strategies can effectively reduce re-identification risk in beacon-like datasets.
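The yes/no allele-presence query and the third strategy (a per-user access budget for each individual genome) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `BudgetedBeacon`, the flat count-based budget, and the data layout are all assumptions made for illustration, and the accounting scheme in the paper itself may differ.

```python
from collections import defaultdict


class BudgetedBeacon:
    """Toy beacon that answers yes/no allele-presence queries.

    Hypothetical simplification: each user may draw on each individual's
    genome at most `budget` times; after that, the individual no longer
    contributes to answers given to that user.
    """

    def __init__(self, genomes, budget):
        # genomes: dict mapping individual id -> set of (chrom, pos, allele)
        self.genomes = genomes
        self.budget = budget
        # (user, individual) -> number of "yes" answers that used this genome
        self.spent = defaultdict(int)

    def query(self, user, chrom, pos, allele):
        variant = (chrom, pos, allele)
        # Individuals who carry the variant and still have budget for this user.
        carriers = [
            ind
            for ind, variants in self.genomes.items()
            if variant in variants and self.spent[(user, ind)] < self.budget
        ]
        # Charge the budget of every individual whose data supported the answer.
        for ind in carriers:
            self.spent[(user, ind)] += 1
        return bool(carriers)


beacon = BudgetedBeacon(
    genomes={"ind1": {("1", 100, "A"), ("1", 200, "T")}},
    budget=1,
)
first = beacon.query("alice", "1", 100, "A")   # ind1 still has budget
second = beacon.query("alice", "1", 200, "T")  # ind1's budget for alice is spent
other = beacon.query("bob", "1", 200, "T")     # budgets are tracked per user
```

In this sketch an exhausted budget makes the beacon answer as if the individual were absent, which limits how many of one person's variants a single user can confirm through repeated queries.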


2021 ◽  
Vol 11 (1) ◽  
pp. 6650-6655
Author(s):  
A. Alghamdi ◽  
T. Alsubait ◽  
A. Baz ◽  
H. Alhakami

Big data have attracted significant attention in recent years because of their hidden potential to improve human life, especially when applied in healthcare. Big data is a reasonable collection of useful information that allows new breakthroughs or understandings. This paper reviews the use and effectiveness of data analytics in healthcare, examining secondary data sources such as books, journals, and other reputable publications between 2000 and 2020, using a strict keyword search strategy. Large-scale data have proven to be of great importance in healthcare, and therefore there is a need for advanced forms of data analytics, such as diagnostic data and descriptive analysis, for improving healthcare outcomes. The utilization of large-scale data can form the backbone of predictive analytics, which is the baseline for future individual outcome prediction.


2021 ◽  
Author(s):  
Stephan Meylan ◽  
Jessica Mankewitz ◽  
Sammy Floyd ◽  
Hugh Rabagliati ◽  
Mahesh Srinivasan

Because words have multiple meanings, language users must often choose appropriate meanings according to the context of use. How this potential ambiguity affects first language learning, especially word learning, is unknown. Here, we present the first large-scale study of how children are exposed to, and themselves use, ambiguous words in their actual language learning environments. We tag 180,000 words in two longitudinal child language corpora with word senses from WordNet, focusing on ages between 9 and 51 months and limiting the analysis to words from a popular parental vocabulary report. We then compare the diversity of sense usage in adult speech around children to that observed in a sample of adult-directed language, as well as the diversity of sense usage in children's own productions. To accomplish this, we use a Bayesian model-based estimate of sense entropy, a measure of diversity that takes into account the uncertainty inherent in small sample sizes. This reveals that sense diversity in caregivers' speech to children is similar to that observed in a sample of adult-directed written material, and that children's use of nouns, but not verbs, is similarly diverse to that of adults. Finally, we show that sense entropy is a significant predictor of vocabulary development: children begin to produce words with a higher diversity of adult sense usage at later ages. We discuss the implications of our findings for theories of word learning.
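As a rough illustration of sense entropy, the sketch below computes the Shannon entropy of a word's sense counts. The abstract does not specify the Bayesian model, so the optional symmetric Dirichlet pseudo-count `alpha` used here is only a simple stand-in for handling small samples, not the authors' estimator; the function name and the example sense tags are hypothetical.

```python
import math
from collections import Counter


def sense_entropy(counts, alpha=0.0):
    """Shannon entropy (in bits) of a word's sense distribution.

    counts: mapping from sense label to observed count.
    alpha:  assumed symmetric Dirichlet pseudo-count added to every sense;
            alpha=0 gives the plain plug-in estimate.
    """
    total = sum(counts.values()) + alpha * len(counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = (count + alpha) / total
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical sense tags for one word in a small corpus sample.
tags = ["bat_animal", "bat_animal", "bat_animal", "bat_sports"]
counts = Counter(tags)

plain = sense_entropy(counts)               # plug-in estimate
smoothed = sense_entropy(counts, alpha=1.0)  # pulled toward the uniform distribution
```

A word used in a single sense has entropy 0, and a word split evenly between two senses has entropy of 1 bit. Smoothing pulls small-sample estimates toward the uniform distribution, which is one simple way to reflect uncertainty about rarely observed senses.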


2009 ◽  
Vol 28 (11) ◽  
pp. 2737-2740
Author(s):  
Xiao ZHANG ◽  
Shan WANG ◽  
Na LIAN

2016 ◽  
Author(s):  
John W. Williams ◽  
Simon Goring ◽  
Eric Grimm ◽  
Jason McLachlan

2008 ◽  
Vol 9 (10) ◽  
pp. 1373-1381 ◽  
Author(s):  
Ding-yin Xia ◽  
Fei Wu ◽  
Xu-qing Zhang ◽  
Yue-ting Zhuang
