Correlation analysis among single nucleotide polymorphisms in thirteen language genes and culture/education parameters from twenty-six countries

2021 ◽  
Author(s):  
Bo Sun ◽  
Changlu Guo ◽  
Zhizhou Zhang

Language is a vital feature of any human culture, but whether language gene polymorphisms correlate meaningfully with cultural characteristics over the long-run evolution of human languages remains largely uninvestigated. This study is an attempt to find evidence bearing on that question. The basic data collected include 13 language genes and 111 randomly selected single nucleotide polymorphisms (SNPs) within them, SNP profiles, 29 culture/education parameters, and estimated cultural context values for 26 representative countries. To perform principal component analysis (PCA) in the search for correlations, SNP genotypes, cultural context and all other culture/education parameters had to be represented as numerical values. The preliminary results are: (1) the 111 SNPs form several clusters of correlated groups, with positive and negative correlations among them; (2) a low cultural context level significantly influences the correlational patterns among the 111 SNPs in the PCA diagram; and (3) among the 29 culture/education parameters, several basic characteristics of a language (the numbers of alphabet letters, vowels, consonants and dialects) show the weakest correlations with the 111 SNPs of the 13 language genes.
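As a rough illustration of the numerical-encoding step mentioned in the abstract above, the sketch below codes each genotype as the number of copies of a chosen allele (0, 1 or 2) and computes a Pearson correlation against a numeric parameter. The additive coding, the function names and the toy values are all assumptions made for illustration; the abstract does not say which encoding or correlation measure the authors used.

```python
from math import sqrt


def encode_genotype(genotype, counted_allele):
    """Additive coding (an assumption): copies of counted_allele, 0 to 2."""
    return sum(1 for allele in genotype if allele == counted_allele)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy data, invented for illustration only (not from the study).
genotypes = ["AA", "AG", "GG", "AG"]
parameter = [1.0, 2.0, 3.0, 2.0]
coded = [encode_genotype(g, "G") for g in genotypes]
r = pearson(coded, parameter)
```

Once every SNP and every parameter is a numeric column of this kind, the resulting table can be passed to any standard PCA routine.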

1970 ◽  
Vol 46 (3) ◽  
pp. 302-312
Author(s):  
A.A. Zwane ◽  
A. Maiwashe ◽  
M.L. Makgahlela ◽  
A. Choudhury ◽  
J.F. Taylor ◽  
...  

Access to genotyping assays enables the identification of informative markers that discriminate between cattle breeds. Identification of these markers can assist in breed assignment, improvement and conservation. The objective of this study was to identify breed informative markers to discriminate between three South African indigenous cattle breeds. Data from BovineSNP50 and GeneSeek Genomic Profiler (GGP-80K) assays were generated for Afrikaner, Drakensberger and Nguni, and were analysed for their genetic differentiation. Hereford and Angus were included as outgroups. Breeds were differentiated using principal component analysis (PCA). Single-nucleotide polymorphisms (SNPs) within the breeds were determined when minor allele frequency (MAF) was ≥ 0.05. Breed-specific SNPs were identified using Reynolds Fst and extended Lewontin and Krakauer's (FLK) statistics. These SNPs were validated using three African breeds, namely N’Dama, Kuri and Zebu from Madagascar. PCA discriminated among the breeds. A larger number of polymorphic SNPs was detected in Drakensberger (73%) than in Afrikaner (56%) and Nguni (65%). No substantial numbers of informative SNPs (Fst ≥ 0.6) were identified among indigenous breeds. Eleven SNPs were validated as discriminating the indigenous breeds from other African breeds. This is because the SNPs on BovineSNP50 and GGP-80K assays were ascertained as being common in European taurine breeds. The lower MAF and SNP informativeness observed in this study limit the application of these assays in breed assignment, and could have other implications for genome-wide studies in South African indigenous breeds. Sequencing should therefore be considered to discover new SNPs that are common among indigenous South African breeds and also SNPs that discriminate among these indigenous breeds.
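The two thresholds quoted in the abstract above (MAF ≥ 0.05 for a polymorphic SNP, Fst ≥ 0.6 for an informative one) can be illustrated with a minimal sketch. The function names are hypothetical, and `simple_fst` is a deliberately simplified Wright-style estimate with equal breed weights and no sample-size correction; it is not the Reynolds or FLK statistic the study reports using.

```python
def minor_allele_frequency(allele_counts):
    """MAF for a biallelic SNP given (count_allele_1, count_allele_2)."""
    a, b = allele_counts
    total = a + b
    if total == 0:
        raise ValueError("no alleles observed")
    return min(a, b) / total


def is_polymorphic(allele_counts, threshold=0.05):
    """Apply the MAF >= 0.05 criterion stated in the abstract."""
    return minor_allele_frequency(allele_counts) >= threshold


def simple_fst(freqs):
    """Wright-style Fst = (H_T - H_S) / H_T from per-breed allele frequencies.

    Simplification (an assumption): equal weight per breed and no
    sample-size correction, so this is not Reynolds' estimator or FLK.
    """
    if not freqs:
        raise ValueError("need at least one breed frequency")
    p_bar = sum(freqs) / len(freqs)
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # SNP is fixed for the same allele in every breed
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total
```

For example, a SNP fixed for opposite alleles in two breeds (`simple_fst([0.0, 1.0])`) reaches the maximum value of 1, while identical frequencies in every breed give 0.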


2020 ◽  
Vol 14 (6) ◽  
pp. 1405-1424
Author(s):  
Paul Adjei Kwakwa

Purpose: This study aims to fill a gap in existing studies of the drivers of carbon dioxide (CO2) emissions. The author investigates the long-run effects of energy types, urbanization, financial development, and the interaction between urbanization and financial development on CO2 emissions. Design/methodology/approach: The Stochastic Impacts by Regression on Population, Affluence and Technology (STIRPAT) model served as the framework for empirical modeling. Using annual time-series data for Tunisia, the autoregressive distributed lag bounds test was used to examine the cointegration of the variables. Fully modified ordinary least squares was then used to estimate the emission effect of the explanatory variables. Further investigations were done using principal component analysis and variance decomposition analysis. Findings: Income, urbanization, trade and financial development exert upward pressure on CO2 emissions. However, the interaction between urbanization and financial development reduces CO2 emissions. Furthermore, primary energy use, energy intensity, electricity consumption and fossil fuel consumption have positive effects on carbon emission, while combustible renewables and waste, and electricity production from natural gas, have negative effects on carbon emission. Practical implications: The results suggest that the financial sector's authorities can combat carbon emission by properly regulating the development and activities of the financial sector in urban areas in Tunisia. Promoting the development and use of cleaner energy is recommended to help reduce carbon emission. Policymakers need to promote an environmentally friendly economic growth and development agenda. Originality/value: The contribution of this study to the environmental degradation literature is that it offers evidence from Tunisia, which has not received much empirical attention. It also examines the effect of various forms of energy usage on carbon emission. To the best of the author's knowledge, this is the first study to examine the interaction effect between urbanization and financial development on carbon emission. This study is also among the earliest, if not the first, to use principal component analysis in predicting the carbon emission effect of energy variables.


2017 ◽  
Author(s):  
Kridsadakorn Chaichoompu ◽  
Fentaw Abegaz Yazew ◽  
Sissades Tongsima ◽  
Philip James Shaw ◽  
Anavaj Sakuntabhai ◽  
...  

Background: Resolving population genetic structure is challenging, especially when dealing with closely related or geographically confined populations. Although Principal Component Analysis (PCA)-based methods and genomic variation with single nucleotide polymorphisms (SNPs) are widely used to describe shared genetic ancestry, improvements can be made, especially when fine-scale population structure is the target. Results: This work presents an R package called IPCAPS, which uses SNP information for resolving possibly fine-scale population structure. The IPCAPS routines are built on the iterative pruning Principal Component Analysis (ipPCA) framework that systematically assigns individuals to genetically similar subgroups. In each iteration, the tool is able to detect and eliminate outliers, thereby avoiding severe misclassification errors. Conclusions: IPCAPS supports different measurement scales for variables used to identify substructure. Hence, panels of gene expression and methylation data can be accommodated as well. The tool can also be applied in patient sub-phenotyping contexts. IPCAPS is developed in R and is freely available from bio3.giga.ulg.ac.be/ipcaps


2012 ◽  
Vol 461 ◽  
pp. 753-756
Author(s):  
Chong Xing ◽  
Yao Wang ◽  
You Zhou ◽  
Yan Chun Liang

Non-coding RNA prediction has recently become one of the most important research topics in bioinformatics. In this paper, on the basis of principal component analysis, we present a tRNA prediction strategy using a least squares support vector machine (LS-SVM). The appearance frequencies of single nucleotides and 2-nucleotides, together with (G-C)% and (A-T)%, were chosen as input features. Tests showed a prediction accuracy of 90.51% on a prokaryotic tRNA dataset. The experimental results indicate that the method is effective for prokaryotic ncRNA prediction.
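The input features named in the abstract above (single-nucleotide frequencies, 2-nucleotide frequencies, (G-C)% and (A-T)%) can be computed along the following lines. This is a minimal sketch under stated assumptions: the function name is hypothetical, U is treated as T, 2-nucleotides are counted over overlapping windows, and the later PCA and LS-SVM stages are not shown.

```python
from itertools import product

BASES = "ACGT"


def sequence_features(seq):
    """Return mono- and dinucleotide frequencies plus GC% and AT% for a sequence."""
    seq = seq.upper().replace("U", "T")  # assumption: treat RNA U as T
    if not seq or any(base not in BASES for base in seq):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T/U")
    n = len(seq)
    # Single-nucleotide frequencies.
    features = {base: seq.count(base) / n for base in BASES}
    # Overlapping dinucleotide windows (an assumption about the counting scheme).
    pairs = [seq[i:i + 2] for i in range(n - 1)]
    for first, second in product(BASES, repeat=2):
        dinuc = first + second
        features[dinuc] = pairs.count(dinuc) / len(pairs) if pairs else 0.0
    features["GC%"] = 100.0 * (features["G"] + features["C"])
    features["AT%"] = 100.0 * (features["A"] + features["T"])
    return features
```

Each sequence yields a fixed-length vector of 22 values (4 + 16 + 2), which is the kind of input a dimensionality-reduction step followed by a classifier would expect.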


2014 ◽  
Author(s):  
Gad Abraham ◽  
Michael Inouye

Principal component analysis (PCA) is routinely used to analyze genome-wide single-nucleotide polymorphism (SNP) data, for detecting population structure and potential outliers. However, the size of SNP datasets has increased immensely in recent years and PCA of large datasets has become a time-consuming task. We have developed flashpca, a highly efficient PCA implementation based on randomized algorithms, which delivers identical accuracy in extracting the top principal components compared with existing tools, in substantially less time. We demonstrate the utility of flashpca on both HapMap3 and a large Immunochip dataset. For the latter, flashpca performed PCA of 15,000 individuals up to 125 times faster than existing tools, with identical results, and PCA of 150,000 individuals using flashpca completed in 4 hours. The increasing size of SNP datasets will make tools such as flashpca essential as traditional approaches will not adequately scale. This approach will also help to scale other applications that leverage PCA or eigen-decomposition to substantially larger datasets.


2021 ◽  
Author(s):  
Zongxuan Liu ◽  
Wei Xia ◽  
Bo Sun ◽  
Changlu Guo ◽  
Zhizhou Zhang

Human language diversity, as a biological phenotype, should be genetically linked with language gene polymorphism. At the same time, this phenotype is historically shaped by local geographical and social factors. How many language gene polymorphisms correlate directly with geography/society characteristics over the long-run evolution of human languages is an interesting question that remains largely uninvestigated. This study selected a series of geography/society factors (13 geographical factors and 21 social factors) from 26 countries and 111 single nucleotide polymorphisms (SNPs) randomly selected from 13 language genes. Principal component analysis (PCA) was performed to explore their potential correlations. Preliminary but interesting results were obtained as follows: (1) most geographical parameters are concentrated in one cluster in the PCA diagram, which contains 12 parameters that are positively correlated with each other; (2) the PCA diagrams divide the social parameters into four clusters, among which positive and negative correlations exist; (3) the strongest positive correlations were observed at one of the ATP2C2 gene SNPs (ATP-1: rs78371901), the strongest negative correlations were found at one of the NFXL1 gene SNPs (NFX-6: rs1440228), and the weakest correlations with language gene SNPs were observed for four geography/society factors: aash (annual average rainfall), fore (forest coverage), pden (population density of the country) and rway (runway traffic mode).

