A population genomics approach to uncover the CNVs, and their evolutionary significance, hidden in reduced‐representation sequencing data sets

2020 ◽  
Vol 29 (24) ◽  
pp. 4749-4753
Author(s):  
Anna Tigano
2018 ◽  
Vol 49 (6) ◽  
pp. 579-591 ◽  
Author(s):  
Zhe Zhang ◽  
Qianqian Zhang ◽  
Qian Xiao ◽  
Hao Sun ◽  
Hongding Gao ◽  
...  

2015 ◽  
Author(s):  
Thomas F Cooke ◽  
Muh-Ching Yee ◽  
Marina Muzzio ◽  
Alexandra Sockell ◽  
Ryan Bell ◽  
...  

Reduced representation sequencing methods such as genotyping-by-sequencing (GBS) enable low-cost measurement of genetic variation without the need for a reference genome assembly. These methods are widely used in genetic mapping and population genetics studies, especially with non-model organisms. Variant calling error rates, however, are higher in GBS than in standard sequencing, in particular due to restriction site polymorphisms, and few computational tools exist that specifically model and correct these errors. We developed a statistical method to remove errors caused by restriction site polymorphisms, implemented in the software package GBStools. We evaluated it in several simulated data sets, varying in number of samples, mean coverage and population mutation rate, and in two empirical human data sets (N = 8 and N = 63 samples). In our simulations, GBStools improved genotype accuracy more than commonly used filters such as Hardy-Weinberg equilibrium p-values. GBStools is most effective at removing genotype errors in data sets over 100 samples when coverage is 40X or higher, and the improvement is most pronounced in species with high genomic diversity. We also demonstrate the utility of GBS and GBStools for human population genetic inference in Argentine populations and reveal widely varying individual ancestry proportions and an excess of singletons, consistent with recent population growth.
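For context, the Hardy-Weinberg equilibrium filter named above as a common baseline can be sketched as follows. This is not the GBStools method, only a minimal illustration of a per-site chi-square test on genotype counts. The function name `hwe_chi2_pvalue` and the example counts are hypothetical and are not taken from the paper.

```python
import math

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) p-value for departure from Hardy-Weinberg
    proportions at a biallelic site (hypothetical helper)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    # Allele frequency of the first allele, estimated from genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expectation (monomorphic sites).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))

# Illustrative counts only: a site with no heterozygotes at all is a
# strong departure from equilibrium and would be flagged by the filter.
p_value = hwe_chi2_pvalue(50, 0, 50)
flagged = p_value < 1e-6  # the threshold is an arbitrary assumption
```

A site with a p-value below the chosen threshold would be discarded. A large-sample chi-square approximation is used here for brevity; an exact test may be preferable for small samples.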


2019 ◽  
Author(s):  
Matthias H. Weissensteiner ◽  
Ignas Bunikis ◽  
Ana Catalán ◽  
Kees-Jan Francoijs ◽  
Ulrich Knief ◽  
...  

Structural variation (SV) accounts for a substantial part of genetic mutations segregating across eukaryotic genomes with important medical and evolutionary implications. Here, we characterized SV across evolutionary time scales in the songbird genus Corvus using de novo assembly and read mapping approaches. Combining information from short-read (N = 127) and long-read re-sequencing data (N = 31) as well as from optical maps (N = 16) revealed a total of 201,738 insertions, deletions and inversions. Population genetic analysis of SV in the Eurasian crow speciation model revealed an evolutionarily young (~530,000 years) cis-acting 2.25-kb retrotransposon insertion reducing expression of the NDP gene with consequences for premating isolation. Our results attest to the wealth of SV segregating in natural populations and demonstrate its evolutionary significance.


BMC Genomics ◽  
2014 ◽  
Vol 15 (1) ◽  
pp. 16 ◽  
Author(s):  
Maja P Greminger ◽  
Kai N Stölting ◽  
Alexander Nater ◽  
Benoit Goossens ◽  
Natasha Arora ◽  
...  

BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Belinda Wright ◽  
Katherine A. Farquharson ◽  
Elspeth A. McLennan ◽  
Katherine Belov ◽  
Carolyn J. Hogg ◽  
...  

PLoS Genetics ◽  
2016 ◽  
Vol 12 (2) ◽  
pp. e1005631 ◽  
Author(s):  
Thomas F. Cooke ◽  
Muh-Ching Yee ◽  
Marina Muzzio ◽  
Alexandra Sockell ◽  
Ryan Bell ◽  
...  

2021 ◽  
Vol 12 (2) ◽  
pp. 317-334
Author(s):  
Omar Alaqeeli ◽  
Li Xing ◽  
Xuekui Zhang

The classification tree is a widely used machine learning method. It has multiple implementations as R packages: rpart, ctree, evtree, tree, and C5.0. The details of these implementations differ, and hence their performance varies from one application to another. We are interested in their performance in classifying cells using single-cell RNA-sequencing data. In this paper, we conducted a benchmark study using 22 single-cell RNA-sequencing data sets. Using cross-validation, we compared the packages’ prediction performance based on their Precision, Recall, F1-score, and Area Under the Curve (AUC). We also compared the Complexity and Run-time of these R packages. Our study shows that rpart and evtree have the best Precision; evtree is the best in Recall, F1-score, and AUC; C5.0 prefers more complex trees; tree is consistently much faster than the others, although its complexity is often higher.
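The metrics named in this abstract can be illustrated with a minimal sketch for the binary case. The study itself used R packages and cross-validation, so the Python function `precision_recall_f1` and the toy labels below are assumptions for illustration only, not the authors' code or data.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1-score for one positive class
    (hypothetical helper, binary case)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators by reporting 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Toy labels only: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

In a cross-validated benchmark, these values would be computed on each held-out fold and then averaged per package.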

