An Improved Genome Assembly of the European Aspen Populus tremula

2019 ◽  
Author(s):  
Bastian Schiffthaler ◽  
Nicolas Delhomme ◽  
Carolina Bernhardsson ◽  
Jerry Jenkins ◽  
Stefan Jansson ◽  
...  

ABSTRACT: The genome assembly of the European aspen Populus tremula proved difficult for a short-read-based strategy due to high genomic variation. As a consequence, the fragmented sequence impedes studies that benefit from highly contiguous data, particularly genome-wide association studies (GWAS) and comparative genomics. Here we present an updated assembly based on long-read sequences, optical mapping and genetic mapping. This assembly, henceforth referred to as Potra V2, is assembled into 19 contiguous chromosomes, which provides a powerful tool for future association studies. The genome sequence and any feature files are available from the PopGenIE resource.

2021 ◽  
Vol 4 (4) ◽  
pp. e202000902 ◽  
Author(s):  
Robert A Player ◽  
Ellen R Forsyth ◽  
Kathleen J Verratti ◽  
David W Mohr ◽  
Alan F Scott ◽  
...  

Reference genome fidelity is critically important for genome-wide association studies, yet most reference genomes differ widely from the study population. A typical whole-genome sequencing approach uses short-read technologies, resulting in fragmented assemblies with regions of ambiguity. Further information is lost by economic necessity when genotyping populations, as lower-resolution technologies such as genotyping arrays are commonly used. Here, we present a phased reference genome for Canis lupus familiaris using high molecular weight DNA-sequencing technologies. We tested wet laboratory and bioinformatic approaches to demonstrate a minimum workflow to generate the 2.4 gigabase genome for a Labrador Retriever. The de novo assembly required eight Oxford Nanopore R9.4 flowcells (∼23X depth) and a 10X Genomics library run on the equivalent of one lane of an Illumina NovaSeq S1 flowcell (∼88X depth), bringing the cost of generating a nearly complete reference genome to less than $10K (USD). Mapping of short-read data from 10 Labrador Retrievers against this reference resulted in 1% more aligned reads versus the current reference (CanFam3.1, P < 0.001), and a 15% reduction in variant calls, increasing the chance of identifying true, low-effect-size variants in genome-wide association studies. We believe that by incorporating the cost to produce a full genome assembly into any large-scale genotyping project, an investigator can improve study power, decrease costs, and optimize the overall scientific value of their study.
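
As a rough check on the sequencing volumes implied by the depths quoted in this abstract, depth of coverage is the total number of sequenced bases divided by the genome size. The sketch below assumes the abstract's rounded figures (a 2.4 gigabase genome, about 23X Nanopore depth, about 88X Illumina depth); the function name `required_bases` is hypothetical and not taken from the paper.

```python
# Depth of coverage = total sequenced bases / genome size,
# so total bases = genome size * depth.
# All inputs are the rounded figures from the abstract (assumptions).

GENOME_SIZE_BP = 2_400_000_000  # ~2.4 gigabase Labrador Retriever genome


def required_bases(genome_size_bp: int, depth: int) -> int:
    """Return the total bases needed to reach `depth`-fold coverage."""
    return genome_size_bp * depth


nanopore_bases = required_bases(GENOME_SIZE_BP, 23)  # ~23X long reads
illumina_bases = required_bases(GENOME_SIZE_BP, 88)  # ~88X linked short reads

print(f"Nanopore: {nanopore_bases / 1e9:.1f} Gb")  # prints "Nanopore: 55.2 Gb"
print(f"Illumina: {illumina_bases / 1e9:.1f} Gb")  # prints "Illumina: 211.2 Gb"
```

These are back-of-the-envelope totals only; actual yield per flowcell or lane varies between runs.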



2020 ◽  
Vol 6 (11) ◽  
Author(s):  
Katie Saund ◽  
Evan S. Snitkin

Bacterial genome-wide association studies (bGWAS) capture associations between genomic variation and phenotypic variation. Convergence-based bGWAS methods identify genomic mutations that occur independently multiple times on the phylogenetic tree in the presence of phenotypic variation more often than is expected by chance. This work introduces hogwash, an open-source R package that implements three algorithms for convergence-based bGWAS. Hogwash additionally contains two burden testing approaches to perform gene or pathway analysis to improve power and increase convergence detection for related but weakly penetrant genotypes. To identify optimal use cases, we applied hogwash to data simulated with a variety of phylogenetic signals and convergence distributions. These simulated data are publicly available and contain the relevant metadata regarding convergence and phylogenetic signal for each phenotype and genotype. Hogwash is available for download from GitHub.


2019 ◽  
Author(s):  
Biyue Tan ◽  
Pär K. Ingvarsson

Summary: Genome-wide association studies (GWAS) are a powerful and widely used approach to decipher the genetic control of complex traits. A major challenge for dissecting quantitative traits in forest trees is statistical power. In this study, we use a population consisting of 1123 samples from two successive generations that have been phenotyped for growth and wood property traits and genotyped using the EuChip60K chip, yielding 37,832 informative SNPs. We use multi-locus GWAS models to assess both additive and dominance effects to identify markers associated with growth and wood property traits in the eucalypt hybrids. Additive and dominance association models identified 78 and 82 significant SNPs across all traits, respectively, which captured between 39 and 86% of the genomic-based heritability. We also used SNPs identified from the GWAS and SNPs selected using less stringent significance thresholds to evaluate predictive abilities in a genomic selection framework. Genomic selection models based on the top 1% of SNPs captured a substantially greater proportion of the genetic variance of traits than models trained on all SNPs. The prediction ability of estimated breeding values was significantly improved for all traits using either the top 1% of SNPs or SNPs identified using a relaxed p-value threshold (p < 10^-3). This study highlights the added value of also considering dominance effects for identifying genomic regions controlling growth traits in trees. Moreover, integrating GWAS results into genomic selection methods provides enhanced power relative to discrete associations for identifying genomic variation potentially useful in tree breeding.
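
The two SNP pre-selection rules mentioned in this summary (a relaxed p-value threshold and the top 1% of SNPs by significance) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the rounding rule for the 1% cut, and the toy p-values are all assumptions made for the example.

```python
import math


def select_relaxed(pvalues: dict[str, float], threshold: float = 1e-3) -> list[str]:
    """Keep SNPs whose GWAS p-value is strictly below the relaxed threshold."""
    return [snp for snp, p in pvalues.items() if p < threshold]


def select_top_fraction(pvalues: dict[str, float], fraction: float = 0.01) -> list[str]:
    """Keep the most significant `fraction` of SNPs, ranked by ascending p-value.

    Rounding up and keeping at least one SNP is an assumption of this sketch.
    """
    n_keep = max(1, math.ceil(len(pvalues) * fraction))
    ranked = sorted(pvalues, key=pvalues.get)
    return ranked[:n_keep]


# Hypothetical toy p-values, for illustration only (not data from the study).
toy = {"snp_a": 2e-6, "snp_b": 4e-4, "snp_c": 0.03, "snp_d": 0.5}

relaxed = select_relaxed(toy)                     # SNPs with p < 10^-3
top_half = select_top_fraction(toy, fraction=0.5) # top 50% here, since the toy set is tiny
```

The selected SNP identifiers would then be passed to whatever genomic selection model is in use; that step is outside the scope of this sketch.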


2012 ◽  
Vol 24 (4) ◽  
pp. 1195-1214 ◽  
Author(s):  
Scott I. Vrieze ◽  
William G. Iacono ◽  
Matt McGue

Abstract: This article serves to outline a research paradigm to investigate main effects and interactions of genes, environment, and development on behavior and psychiatric illness. We provide a historical context for candidate gene studies and genome-wide association studies, including benefits, limitations, and expected payoffs. Using substance use and abuse as our driving example, we then turn to the importance of etiological psychological theory in guiding genetic, environmental, and developmental research, as well as the utility of refined phenotypic measures, such as endophenotypes, in the pursuit of etiological understanding and focused tests of genetic and environmental associations. Phenotypic measurement has received considerable attention in the history of psychology and is informed by psychometrics, whereas the environment remains relatively poorly measured and is often confounded with genetic effects (i.e., gene–environment correlation). Genetically informed designs, which are no longer limited to twin and adoption studies thanks to ever-cheaper genotyping, are required to understand environmental influences. Finally, we outline the vast amount of individual difference in structural genomic variation, most of which remains to be leveraged in genetic association tests. Although the genetic data can be massive and burdensome (tens of millions of variants per person), we argue that improved understanding of genomic structure and function will provide investigators with new tools to test specific a priori hypotheses derived from etiological psychological theory, much like current candidate gene research but with less confusion and more payoff than candidate gene research has to date.

