Suitability of different mapping algorithms for genome-wide polymorphism scans with Pool-Seq data

2016 ◽  
Author(s):  
Robert Kofler ◽  
Anna Maria Langmüller ◽  
Pierre Nouhaud ◽  
Kathrin Anna Otte ◽  
Christian Schlötterer

Abstract The cost-effectiveness of sequencing pools of individuals (Pool-Seq) provides the basis for the popularity and widespread use of this method for many research questions, ranging from unravelling the genetic basis of complex traits to the clonal evolution of cancer cells. Because the accuracy of Pool-Seq could be affected by many potential sources of error, several studies determined, for example, the influence of the sequencing technology, the library preparation protocol, and mapping parameters. Nevertheless, the impact of the mapping tools has not yet been evaluated. Using simulated and real Pool-Seq data, we demonstrate a substantial impact of the mapping tools leading to characteristic false positives in genome-wide scans. The problem of false positives was particularly pronounced when data with different read lengths and insert sizes were compared. Out of 14 evaluated algorithms, novoalign, bwa mem and clc4 are most suitable for mapping Pool-Seq data. Nevertheless, no single algorithm is sufficient for avoiding all false positives. We show that the intersection of the results of two mapping algorithms provides a simple, yet effective strategy to eliminate false positives. We propose that the implementation of a consistent Pool-Seq bioinformatics pipeline building on the recommendations of this study can substantially increase the reliability of Pool-Seq results, in particular when libraries generated with different protocols are being compared.
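The intersection strategy mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each mapping tool's downstream scan has already produced a set of candidate loci; the function name `consensus_candidates`, the `(chromosome, position)` representation, and the example loci are hypothetical and not taken from the study.

```python
from typing import Set, Tuple

# A locus is identified by (chromosome, 1-based position); this encoding is an assumption.
Locus = Tuple[str, int]


def consensus_candidates(calls_a: Set[Locus], calls_b: Set[Locus]) -> Set[Locus]:
    """Keep only loci flagged as candidates under both mapping algorithms."""
    return calls_a & calls_b


# Hypothetical outlier loci from the same reads mapped with two different tools.
mapper_a = {("2L", 1045), ("2L", 88210), ("3R", 5120)}
mapper_b = {("2L", 1045), ("3R", 5120), ("X", 77)}

shared = consensus_candidates(mapper_a, mapper_b)
print(sorted(shared))  # [('2L', 1045), ('3R', 5120)]
```

Loci reported by only one tool, here `("2L", 88210)` and `("X", 77)`, are dropped as possible mapper-specific artefacts, at the cost of also discarding any true signal that only one tool recovers.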

1993 ◽  
Vol 22 (2) ◽  
pp. 150-158 ◽  
Author(s):  
Dennis Wichelns ◽  
Jeffrey D. Kline

This paper examines the economic impact of selected farmland characteristics on the appraised value of development rights. Price elasticities are estimated for the size and location of farmland parcels, the amount of road frontage, the existence of panoramic views, and the distance to urban centers. Estimated elasticities suggest that parcel characteristics have a substantial impact on the cost of preserving farmland. For example, the per-acre cost of development rights is estimated to be 53 percent higher on farmland parcels that have a panoramic view of water than on parcels that have no water view. Similarly, the per-acre cost of development rights on a typical 25-acre farm is estimated to be 90 percent higher than on a typical 150-acre farm. Results suggest that the net social benefits obtained through farmland preservation programs may be enhanced by considering the impact of farmland characteristics on the marginal costs of purchasing development rights, when selecting among a set of candidate farms.


2017 ◽  
Author(s):  
Marie Verbanck ◽  
Chia-Yen Chen ◽  
Benjamin Neale ◽  
Ron Do

Abstract A fundamental assumption in inferring causality of an exposure on complex disease using Mendelian randomization (MR) is that the genetic variant used as the instrumental variable cannot have pleiotropic effects. Violation of this ‘no pleiotropy’ assumption can cause severe bias. Emerging evidence has supported a role for pleiotropy amongst disease-associated loci identified from GWA studies. However, the impact and extent of pleiotropy on MR is poorly understood. Here, we introduce a method called the Mendelian Randomization Pleiotropy RESidual Sum and Outlier (MR-PRESSO) test to detect and correct for pleiotropy in multi-instrument summary-level MR testing. We show using simulations that existing approaches are less sensitive to the detection of pleiotropy when it occurs in a subset of instrumental variables, as compared to MR-PRESSO. Next, we show that pleiotropy is widespread in MR, occurring in 41% of significant causal relationships (out of 4,250 MR tests total) from pairwise comparisons of 82 complex traits and diseases from summary level genome-wide association data. We demonstrate that pleiotropy causes distortion between -168% and 189% of the causal estimate in MR. Furthermore, pleiotropy induces false positive causal relationships (defined as those causal estimates that were no longer statistically significant in the pleiotropy corrected MR test but were previously significant in the naive MR test) in up to 10% of the MR tests using a P < 0.05 cutoff that is commonly used in MR studies. Finally, we show that MR-PRESSO can correct for distortion in the causal estimate in most cases. Our results demonstrate that pleiotropy is widespread and pervasive, and must be properly corrected for in order to maintain the validity of MR.


2020 ◽  
Author(s):  
Wesley Warren ◽  
Tyler Boggs ◽  
Richard Borowsky ◽  
Brian Carlson ◽  
Estephany Ferrufino ◽  
...  

Abstract Identifying the genetic factors that underlie complex traits is central to understanding the mechanistic underpinnings of evolution. In nature, adaptation to severe environmental change, such as encountered following colonization of caves, has dramatically altered genomes of species over varied time spans. Genomic sequencing approaches have identified mutations associated with troglomorphic trait evolution, but the functional impacts of these mutations remain poorly understood. The Mexican Tetra, Astyanax mexicanus, is abundant in the surface waters of northeastern Mexico, and also inhabits at least 30 different caves in the region. Cave-dwelling A. mexicanus morphs are well adapted to subterranean life and many populations appear to have evolved troglomorphic traits independently, while the surface-dwelling populations can be used as a proxy for the ancestral form. Here we present a high-resolution, chromosome-level surface fish genome, enabling the first genome-wide comparison between surface fish and cavefish populations. Using this resource, we performed quantitative trait locus (QTL) mapping analyses for pigmentation and eye size and found new candidate genes for eye loss such as dusp26. We used CRISPR gene editing in A. mexicanus to confirm the essential role of a gene within an eye size QTL, rx3, in eye formation. We also generated the first genome-wide evaluation of deletion variability that includes an analysis of the impact on protein-coding genes across cavefish populations to gain insight into this potential source of cave adaptation. The new surface fish genome reference now provides a more complete resource for comparative, functional, developmental and genetic studies of drastic trait differences within a species.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Ashley S. Ling ◽  
El Hamidi Hay ◽  
Samuel E. Aggrey ◽  
Romdhane Rekaya

Abstract Background: Use of genomic information has resulted in an undeniable improvement in prediction accuracies and an increase in genetic gain in animal and plant genetic selection programs in spite of oversimplified assumptions about the true biological processes. Even for complex traits, a large portion of markers do not segregate with or effectively track genomic regions contributing to trait variation; yet it is not clear how genomic prediction accuracies are impacted by such potentially nonrelevant markers. In this study, a simulation was carried out to evaluate genomic predictions in the presence of markers unlinked with trait-relevant QTL. Further, we compared the ability of the population statistic FST and absolute estimated marker effect as preselection statistics to discriminate between linked and unlinked markers and the corresponding impact on accuracy. Results: We found that the accuracy of genomic predictions decreased as the proportion of unlinked markers used to calculate the genomic relationships increased. Using all, only linked, and only unlinked marker sets yielded prediction accuracies of 0.62, 0.89, and 0.22, respectively. Furthermore, it was found that prediction accuracies are severely impacted by unlinked markers with large spurious associations. FST-preselected marker sets of 10 k and larger yielded accuracies 8.97 to 17.91% higher than those achieved using preselection by absolute estimated marker effects, despite selecting 5.1 to 37.7% more unlinked markers and explaining 2.4 to 5.0% less of the genetic variance. This was attributed to false positives selected by absolute estimated marker effects having a larger spurious association with the trait of interest and more negative impact on predictions. The Pearson correlation between FST scores and absolute estimated marker effects was 0.77 and 0.27 among only linked and only unlinked markers, respectively. The sensitivity of FST scores to detect truly linked markers is comparable to absolute estimated marker effects but the consistency between the two statistics regarding false positives is weak. Conclusion: Identification and exclusion of markers that have little to no relevance to the trait of interest may significantly increase genomic prediction accuracies. The population statistic FST presents an efficient and effective tool for preselection of trait-relevant markers.
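As a rough illustration of FST-based marker preselection, the sketch below scores each biallelic marker with Wright's fixation index computed from subpopulation allele frequencies and keeps the top-ranked markers. The abstract does not state which FST estimator was used or how the subpopulations were defined, so the equal-weight heterozygosity formula, the function names `fst` and `preselect`, and the example frequencies are all assumptions.

```python
from typing import List, Sequence


def fst(freqs: Sequence[float]) -> float:
    """Wright's F_ST for one biallelic marker, given the allele frequency in
    each subpopulation (subpopulations weighted equally; an assumption)."""
    p_bar = sum(freqs) / len(freqs)
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity in the pooled population
    if h_total == 0:
        return 0.0  # marker is fixed everywhere, so there is no differentiation to measure
    h_sub = sum(2 * p * (1 - p) for p in freqs) / len(freqs)  # mean within-subpopulation heterozygosity
    return (h_total - h_sub) / h_total


def preselect(marker_freqs: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k markers with the highest F_ST scores."""
    scores = [fst(f) for f in marker_freqs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical allele frequencies for three markers in two subpopulations.
markers = [[0.5, 0.5], [0.0, 1.0], [0.2, 0.8]]
print(preselect(markers, 2))  # [1, 2]
```

The selected marker indices would then be used to build the genomic relationship matrix; that step is outside the scope of this sketch.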


Open Biology ◽  
2020 ◽  
Vol 10 (1) ◽  
pp. 190221 ◽  
Author(s):  
R. V. Broekema ◽  
O. B. Bakker ◽  
I. H. Jonkers

Over the past 15 years, genome-wide association studies (GWASs) have enabled the systematic identification of genetic loci associated with traits and diseases. However, due to resolution issues and methodological limitations, the true causal variants and genes associated with traits remain difficult to identify. In this post-GWAS era, many biological and computational fine-mapping approaches now aim to solve these issues. Here, we review fine-mapping and gene prioritization approaches that, when combined, will improve the understanding of the underlying mechanisms of complex traits and diseases. Fine-mapping of genetic variants has become increasingly sophisticated: initially, variants were simply overlapped with functional elements, but now the impact of variants on regulatory activity and direct variant-gene 3D interactions can be identified. Moreover, gene manipulation by CRISPR/Cas9, the identification of expression quantitative trait loci and the use of co-expression networks have all increased our understanding of the genes and pathways affected by GWAS loci. However, despite this progress, limitations including the lack of cell-type- and disease-specific data and the ever-increasing complexity of polygenic models of traits pose serious challenges. Indeed, the combination of fine-mapping and gene prioritization by statistical, functional and population-based strategies will be necessary to truly understand how GWAS loci contribute to complex traits and diseases.


2021 ◽  
Vol 37 (S1) ◽  
pp. 34-34
Author(s):  
Kate Halsby ◽  
Bryony Langford ◽  
Anna Pagotto ◽  
Harriet Tuson ◽  
Shuk-Li Collings ◽  
...  

Introduction: The importance of patient-centered outcome (PCO) evidence is increasingly recognized, but its inclusion in Health Technology Assessment (HTA) submissions remains inconsistent. We explored the impact of PCO evidence on HTA decision-making. Methods: A framework was developed to assess the impact of PCO evidence (excluding EQ-5D) on HTA appraisals. An impact rating was determined by reviewing company, committee and Evidence Review Group (ERG) opinion. This was applied to publicly available appraisal documents (National Institute for Health and Care Excellence [NICE]: 8; Scottish Medicines Consortium [SMC]: 2) in a pilot study. The framework was then refined and applied to a larger dataset. Results: PCO evidence had ‘substantial impact’ in 3/8 NICE and 1/2 SMC appraisals, and ‘some impact’ in those remaining. PCO evidence informed the cost-effectiveness model in 2/8 NICE and 1/2 SMC submissions, and was considered superior to EQ-5D evidence in one NICE and one SMC submission. The ERG considered PCO evidence relevant to decision-making in 5/8 NICE appraisals. PCO evidence was mentioned in guidance for 7/10 appraisals (deemed relevant in 5/10). In one assessment, committee comments were notably more favorable than ERG comments. Larger dataset analysis results provided further insights to the pilot study. Conclusions: The framework allows a systematic approach to evaluating the impact of PCO evidence on HTA appraisals. BL, AP, DGB and NY are employees of Symmetron Ltd, which received funding from Pfizer UK in connection with the development of this manuscript. KH, HT, SLC and JB are employees of Pfizer UK. This study was sponsored by Pfizer UK.


Author(s):  
Alvaro N. Barbeira ◽  
Yanyu Liang ◽  
Rodrigo Bonazzola ◽  
Gao Wang ◽  
Heather E. Wheeler ◽  
...  

Abstract The integration of transcriptomic studies and GWAS (genome-wide association studies) via imputed expression has seen extensive application in recent years, enabling the functional characterization and causal gene prioritization of GWAS loci. However, the techniques for imputing transcriptomic traits from DNA variation remain underdeveloped. Furthermore, associations found when linking eQTL studies to complex traits through methods like PrediXcan can lead to false positives due to linkage disequilibrium between distinct causal variants. Therefore, the best prediction performance models may not necessarily lead to more reliable causal gene discovery. With the goal of improving discoveries without increasing false positives, we develop and compare multiple transcriptomic imputation approaches using the most recent GTEx release of expression and splicing data on 17,382 RNA-sequencing samples from 948 post-mortem donors in 54 tissues. We find that informing prediction models with posterior causal probability from fine-mapping (dap-g) and borrowing information across tissues (mashr) lead to better performance in terms of number and proportion of significant associations that are colocalized and the proportion of silver standard genes identified as indicated by precision-recall and ROC (Receiver Operating Characteristic) curves. All prediction models are made publicly available at predictdb.org. Author summary: Integrating molecular biology information with genome-wide association studies (GWAS) sheds light on the mechanisms tying genetic variation to complex traits. However, associations found when linking eQTL studies to complex traits through methods like PrediXcan can lead to false positives due to linkage disequilibrium of distinct causal variants. By integrating fine-mapping information into the models, and leveraging the widespread tissue-sharing of eQTLs, we improve the proportion of likely causal genes among significant gene-trait associations, as well as the prediction of “ground truth” genes.


Pathogens ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1604
Author(s):  
Nelisiwe Mkize ◽  
Azwihangwisi Maiwashe ◽  
Kennedy Dzama ◽  
Bekezela Dube ◽  
Ntanganedzeni Mapholi

Understanding the biological mechanisms underlying tick resistance in cattle holds the potential to facilitate genetic improvement through selective breeding. Genome-wide association studies (GWAS) are popular in research on unraveling genetic determinants underlying complex traits such as tick resistance. To date, various studies have been published on single nucleotide polymorphisms (SNPs) associated with tick resistance in cattle. The discovery of SNPs related to tick resistance has led to the mapping of associated candidate genes. Despite the success of these studies, information on genetic determinants associated with tick resistance in cattle is still limited. This warrants further studies. In Africa, genotyping is still relatively expensive; thus, conducting GWAS is a challenge, as the minimum number of animals recommended cannot be genotyped. These population size and genotype cost challenges may be overcome through the establishment of collaborations. Thus, the current review discusses GWAS as a tool to uncover SNPs associated with tick resistance, by focusing on the study design, association analysis, factors influencing the success of GWAS, and the progress on cattle tick resistance studies.


2014 ◽  
Vol 84 (5-6) ◽  
pp. 244-251 ◽  
Author(s):  
Robert J. Karp ◽  
Gary Wong ◽  
Marguerite Orsi

Abstract. Introduction: Foods dense in micronutrients are generally more expensive than those with higher energy content. These cost-differentials may put low-income families at risk of diminished micronutrient intake. Objectives: We sought to determine differences in the cost for iron, folate, and choline in foods available for purchase in a low-income community when assessed for energy content and serving size. Methods: Sixty-nine foods listed in the menu plans provided by the United States Department of Agriculture (USDA) for low-income families were considered, in 10 domains. The cost and micronutrient content for-energy and per-serving of these foods were determined for the three micronutrients. Exact Kruskal-Wallis tests were used for comparisons of energy costs; Spearman rho tests for comparisons of micronutrient content. Ninety families were interviewed in a pediatric clinic to assess the impact of food cost on food selection. Results: Significant differences between domains were shown for energy density with both cost-for-energy (p < 0.001) and cost-per-serving (p < 0.05) comparisons. All three micronutrient contents were significantly correlated with cost-for-energy (p < 0.01). Both iron and choline contents were significantly correlated with cost-per-serving (p < 0.05). Of the 90 families, 38 (42 %) worried about food costs; 40 (44 %) had chosen foods of high caloric density in response to that fear, with 29 of the 40 families both experiencing worry and making such food selections. Conclusion: Adjustments to USDA meal plans using cost-for-energy analysis showed differentials for both energy and micronutrients. These differentials were reduced using cost-per-serving analysis, but were not eliminated. A substantial proportion of low-income families are vulnerable to micronutrient deficiencies.
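The two cost bases compared in this abstract can be illustrated with a small arithmetic sketch. It assumes that cost-for-energy means price per fixed amount of energy and cost-per-serving means price divided by the number of servings; the `Food` class, the 100 kcal basis, and the example prices and quantities are hypothetical and not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Food:
    name: str
    price: float     # price of the whole package, in dollars (hypothetical)
    kcal: float      # energy in the whole package
    servings: float  # number of servings in the whole package


def cost_for_energy(food: Food, per_kcal: float = 100.0) -> float:
    """Price paid per `per_kcal` kilocalories of the food."""
    return food.price / food.kcal * per_kcal


def cost_per_serving(food: Food) -> float:
    """Price paid per serving of the food."""
    return food.price / food.servings


# Hypothetical items: a low-energy vegetable and an energy-dense staple.
spinach = Food("spinach", price=2.00, kcal=100, servings=4)
rice = Food("rice", price=2.00, kcal=3600, servings=20)

for item in (spinach, rice):
    print(item.name, round(cost_for_energy(item), 3), round(cost_per_serving(item), 3))
```

With these made-up numbers the vegetable costs about 36 times more than the staple per 100 kcal but only 5 times more per serving, which mirrors the direction of the reported finding that differentials shrink, without disappearing, under cost-per-serving analysis.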

