Linguistic Proficiency: A Quantitative Approach to Immigrant and Heritage Speakers of Danish

Author(s):  
Jan Heegård Petersen ◽  
Gert Foget Hansen ◽  
Jacob Thøgersen ◽  
Karoline Kühl

Abstract. This paper presents a corpus-based quantitative study of the linguistic proficiency of approx. 300 immigrant and heritage speakers of Danish in North America and Argentina. It asks whether linguistic proficiency is connected to ‘immigrant generation’ (i.e. the difference between speakers who migrated as adults with a fully acquired language competence and foreign-born heritage speakers), to the sociocultural setting, or to both. The large database at hand provides a rare opportunity to compare developments within the same minority language in different places, representing different sociocultural settings for the immigrant or heritage speakers and, accordingly, different language ecologies. The study relies on the Corpus of American Danish (1.6 million tokens, including both words and non-word utterances). Based on this data set, the paper uses Factor Analysis to explore the distribution of 13 linguistic and non-linguistic variables representing linguistic proficiency (i.e. Danish words, L2 words, word-internal codeswitching, type-token ratio, empty and filled pauses, self-interruption, lengthening, speech rate, word length, run length and the ratio of main clauses to subclauses). On an empirically solid basis, the paper concludes that (a) the sociolinguistic setting is the crucial factor in the development of linguistic proficiency and (b) linguistic proficiency is a non-universal cognitive phenomenon.
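
One of the variables listed above, the type-token ratio, is conventionally the number of distinct word forms divided by the total number of tokens. The following is a minimal sketch under that assumption; the abstract does not say how the authors tokenized or normalized their data, so the function name, the lowercasing step and the sample sentence are all hypothetical.

```python
def type_token_ratio(tokens):
    """Return distinct tokens / total tokens, or 0.0 for empty input."""
    if not tokens:
        return 0.0
    # Assumption: case-insensitive comparison; the paper may normalize differently.
    normalized = [t.lower() for t in tokens]
    return len(set(normalized)) / len(normalized)


# Hypothetical sample: 7 tokens, 5 distinct types.
sample = "jeg er glad og jeg er her".split()
ratio = type_token_ratio(sample)
```

The ratio depends on text length, since longer samples repeat more words, so values are usually compared only across samples of similar size.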

2018 ◽  
Vol 52 (1) ◽  
pp. 201-210 ◽  
Author(s):  
Semra Sevi ◽  
Vincent Arel-Bundock ◽  
André Blais

Abstract. We study data on the gender of more than 21,000 unique candidates in all Canadian federal elections since 1921, when the first women ran for seats in Parliament. This large data set allows us to compute precise estimates of the difference in the electoral fortunes of men and women candidates. When accounting for party effects and time trends, we find that the difference between the vote shares of men and women is substantively negligible (±0.5 percentage point). This gender gap was larger in the 1920s (±2.5 percentage points), but it is now statistically indistinguishable from zero. Our results have important normative implications: political parties should recruit and promote more women candidates because they remain underrepresented in Canadian politics and because they do not suffer from a substantial electoral penalty.


2019 ◽  
Vol 56 (6) ◽  
pp. 851-887 ◽  
Author(s):  
Bianca E. Bersani ◽  
Adam W. Pittman

Objective: This study reassesses the generational disparity in immigrant offending. Patterns and predictors of offending are compared using traditional peer-based models and an alternative within-family (parent–child dyad) model. Method: The National Longitudinal Survey of Youth (1979; NLSY79) and NLSY-Child and Young Adult (NLSY_CYA) data are merged to create an intergenerational data set to compare generational disparities in immigrant offending across peers and within families. Differences in self-reported offending (prevalence and variety) by immigrant generation are assessed using a combination of descriptive analyses (χ² and analysis of variance) and regression models. Results: While NLSY_CYA children generally are at a greater risk of offending compared with the NLSY79 mothers, the difference in offending is greatest between first-generation mom and second-generation child dyads. Disparities in offending are driven in large part by exceedingly low levels of offending among first-generation immigrants. Conclusion: Although the factors driving an increase in offending between parent–child generations are not unique to immigrants, they are amplified in immigrant families. Whereas the second generation is remarkably similar to their U.S.-born counterparts in terms of their involvement in crime, suggesting a high degree of swift integration, the greater involvement in crime among the children of immigrants compared to their foreign-born mothers suggests a decline in well-being across successive generations.


Geology ◽  
2020 ◽  
Vol 48 (7) ◽  
pp. 718-722
Author(s):  
Jason S. Alexander ◽  
Brandon J. McElroy ◽  
Snehalata Huzurbazar ◽  
Marissa L. Murr

Abstract Accurate estimation of paleo–streamflow depth from outcrop is important for estimation of channel slopes, water discharges, sediment fluxes, and basin sizes of ancient river systems. Bar-scale inclined strata deposited from slipface avalanching on fluvial bar margins are assumed to be indicators of paleodepth insofar as their thickness approaches but does not exceed formative flow depths. We employed a unique, large data set from a prolonged bank-filling flood in the sandy, braided Missouri River (USA) to examine scaling between slipface height and measures of river depth during the flood. The analyses demonstrated that the most frequent slipface height observations underestimate study-reach mean flow depth at peak stage by a factor of 3, but maximum values are approximately equal to mean flow depth. At least 70% of the error is accounted for by the difference between slipface base elevation and mean bed elevation, while the difference between crest elevation and water surface accounts for ∼30%. Our analysis provides a scaling for bar-scale inclined strata formed by avalanching and suggests risk of systematic bias in paleodepth estimation if mean thickness measurements of these deposits are equated to mean bankfull depth.


2020 ◽  
Vol 20 (11) ◽  
pp. 6291-6303
Author(s):  
Guy Dagan ◽  
Philip Stier

Abstract. Aerosol effects on cloud properties and the atmospheric energy and radiation budgets are studied through ensemble simulations over two month-long periods during the NARVAL campaigns (Next-generation Aircraft Remote-Sensing for Validation Studies, December 2013 and August 2016). For each day, two simulations are conducted with low and high cloud droplet number concentrations (CDNCs), representing low and high aerosol concentrations, respectively. This large data set, which is based on a large spread of co-varying realistic initial conditions, enables robust identification of the effect of CDNC changes on cloud properties. We show that increases in CDNC drive a reduction in the top-of-atmosphere (TOA) net shortwave flux (more reflection) and a decrease in the lower-tropospheric stability for all cases examined, while the TOA longwave flux and the liquid and ice water path changes are generally positive. However, changes in cloud fraction or precipitation, that could appear significant for a given day, are not as robustly affected, and, at least for the summer month, are not statistically distinguishable from zero. These results highlight the need for using a large sample of initial conditions for cloud–aerosol studies for identifying the significance of the response. In addition, we demonstrate the dependence of the aerosol effects on the season, as it is shown that the TOA net radiative effect is doubled during the winter month as compared to the summer month. By separating the simulations into different dominant cloud regimes, we show that the difference between the different months emerges due to the compensation of the longwave effect induced by an increase in ice content as compared to the shortwave effect of the liquid clouds. The CDNC effect on the longwave flux is stronger in the summer as the clouds are deeper and the atmosphere is more unstable.


2019 ◽  
Author(s):  
Guy Dagan ◽  
Philip Stier

Abstract. Aerosol effects on cloud properties and the atmospheric energy and radiation budgets are studied through ensemble simulations over two month-long periods during the NARVAL campaigns (December 2013 and August 2016). For each day, two simulations are conducted with low and high cloud droplet number concentrations (CDNC), representing low and high aerosol concentrations, respectively. This large data set, which is based on a large spread of co-varying realistic initial conditions, enables robust identification of the effect of CDNC changes on cloud properties. We show that increases in CDNC drive a reduction in the top-of-atmosphere (TOA) net shortwave flux (more reflection) and a decrease in the lower-tropospheric stability for all cases examined, while the TOA longwave flux and the liquid and ice water path changes are generally positive. However, changes in cloud fraction or precipitation, which could appear significant for a given day, are not as robustly affected and, at least for the summer month, are not statistically distinguishable from zero. These results highlight the need for a large sample of initial conditions in cloud–aerosol studies in order to identify the significance of the response. In addition, we demonstrate the dependence of the aerosol effects on the season: the TOA net radiative effect is doubled during the winter month as compared to the summer month. By separating the simulations into different dominant cloud regimes, we show that the difference between the months emerges due to the compensation of the longwave effect induced by an increase in ice content as compared to the shortwave effect of the liquid clouds. The CDNC effect on the longwave flux is stronger in the summer as the clouds are deeper and the atmosphere is more unstable.


Author(s):  
Jules S. Jaffe ◽  
Robert M. Glaeser

Although difference Fourier techniques are standard in X-ray crystallography, only very recently have electron crystallographers been able to take advantage of this method. We have combined a high-resolution data set for frozen, glucose-embedded purple membrane (PM) with a data set collected from PM prepared in the frozen hydrated state in order to visualize any differences in structure due to the different methods of preparation. The increased contrast of protein-ice versus protein-glucose may prove to be an advantage of the frozen hydrated technique for visualizing those parts of bacteriorhodopsin that are embedded in glucose. In addition, surface groups of the protein may be disordered in glucose and ordered in the frozen state. The sensitivity of the difference Fourier technique to small changes in structure provides an ideal method for testing this hypothesis.


2020 ◽  
Vol 39 (5) ◽  
pp. 6419-6430
Author(s):  
Dusan Marcek

To forecast time series data, two methodological frameworks are considered: statistical modelling and computational intelligence modelling. The statistical approach is based on the theory of invertible ARIMA (Auto-Regressive Integrated Moving Average) models with the Maximum Likelihood (ML) estimation method. As a competing tool to statistical forecasting models, we use the popular classic neural network (NN) of perceptron type. To train the NN, the Back-Propagation (BP) algorithm and heuristics such as genetic and micro-genetic algorithms (GA and MGA) are implemented on the large data set. A comparative analysis of the selected learning methods is performed and evaluated. From the experiments performed, we find that the optimal population size is likely to be 20, which gives the lowest training time among all NNs trained by the evolutionary algorithms, while the prediction accuracy is lower but still acceptable to managers.


2019 ◽  
Vol 21 (9) ◽  
pp. 662-669 ◽  
Author(s):  
Junnan Zhao ◽  
Lu Zhu ◽  
Weineng Zhou ◽  
Lingfeng Yin ◽  
Yuchen Wang ◽  
...  

Background: Thrombin is the central protease of the vertebrate blood coagulation cascade, which is closely related to cardiovascular diseases. The inhibitory constant Ki is the most significant property of thrombin inhibitors. Method: This study was carried out to predict Ki values of thrombin inhibitors based on a large data set by using machine learning methods. Because it can find non-intuitive regularities in high-dimensional datasets, machine learning can be used to build effective predictive models. A total of 6554 descriptors for each compound were collected and an efficient descriptor selection method was chosen to find the appropriate descriptors. Four different methods, namely multiple linear regression (MLR), K Nearest Neighbors (KNN), Gradient Boosting Regression Tree (GBRT) and Support Vector Machine (SVM), were implemented to build prediction models with these selected descriptors. Results: The SVM model was the best among these methods, with R² = 0.84 and MSE = 0.55 for the training set and R² = 0.83 and MSE = 0.56 for the test set. Several validation methods, such as the y-randomization test and applicability domain evaluation, were adopted to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be employed for rapid estimation of the inhibitory constant, which is helpful for designing novel thrombin inhibitors.
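
The two metrics quoted in the results above, the coefficient of determination (R²) and the mean squared error (MSE), can be sketched as follows. This assumes the usual textbook definitions; the abstract does not state which variant the authors used, and the function names and toy values are hypothetical, not taken from the study.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values only: a predictor that always returns the mean of y_true.
observed = [1.0, 2.0, 3.0, 4.0]
mean_only = [2.5, 2.5, 2.5, 2.5]
baseline_r2 = r2(observed, mean_only)
baseline_mse = mse(observed, mean_only)
```

A predictor that always returns the mean scores an R² of zero, which is why training and test values close to each other and well above zero are read as a sign of a model that generalizes.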


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ruolan Zeng ◽  
Jiyong Deng ◽  
Limin Dang ◽  
Xinliang Yu

Abstract. A three-descriptor quantitative structure–activity/toxicity relationship (QSAR/QSTR) model was developed for the skin permeability of a sufficiently large data set consisting of 274 compounds, by applying a support vector machine (SVM) together with a genetic algorithm. The optimal SVM model has a coefficient of determination R² of 0.946 and a root mean square (rms) error of 0.253 for the training set of 139 compounds, and an R² of 0.872 and an rms error of 0.302 for the test set of 135 compounds. Compared with other models reported in the literature, our SVM model shows better statistical performance while dealing with more samples in the test set. Thus, an SVM algorithm was successfully applied to develop a nonlinear QSAR model for skin permeability.


Author(s):  
Lior Shamir

Abstract Several recent observations using large data sets of galaxies showed a non-random distribution of the spin directions of spiral galaxies, even when the galaxies are too far from each other to have gravitational interaction. Here, a data set of $\sim8.7\cdot10^3$ spiral galaxies imaged by the Hubble Space Telescope (HST) is used to test and profile a possible asymmetry between galaxy spin directions. The asymmetry between galaxies with opposite spin directions is compared to the asymmetry of galaxies from the Sloan Digital Sky Survey (SDSS). The two data sets contain different galaxies at different redshift ranges, and each data set was annotated using a different annotation method. The results show that both data sets show a similar asymmetry in the COSMOS field, which is covered by both telescopes. Fitting the asymmetry of the galaxies to cosine dependence shows a dipole axis with probabilities of $\sim2.8\sigma$ and $\sim7.38\sigma$ in HST and SDSS, respectively. The most likely dipole axis identified in the HST galaxies is at $(\alpha=78^\circ,\delta=47^\circ)$ and is well within the $1\sigma$ error range compared to the location of the most likely dipole axis in the SDSS galaxies with $z>0.15$, identified at $(\alpha=71^\circ,\delta=61^\circ)$.

