Omics-Facilitated Crop Improvement for Climate Resilience and Superior Nutritive Value

Novel crop improvement approaches, including those that facilitate the exploitation of crop wild relatives and underutilized species harboring much-needed natural allelic variation, are indispensable if we are to develop climate-smart crops with enhanced abiotic and biotic stress tolerance, higher nutritive value, and superior traits of agronomic importance. Top among these approaches are the “omics” technologies, including genomics, transcriptomics, proteomics, metabolomics, phenomics, and their integration, whose deployment has been vital in revealing several key genes, proteins and metabolic pathways underlying numerous traits of agronomic importance, and in aiding marker-assisted breeding in major crop species. Here, citing several relevant examples, we appraise our understanding of recent developments in omics technologies and how they are driving our quest to breed climate-resilient crops. Large-scale genome resequencing, pan-genomes and genome-wide association studies are aiding the identification and analysis of species-level genome variations, whilst RNA-sequencing-driven transcriptomics has provided unprecedented opportunities for conducting crop abiotic and biotic stress response studies. Meanwhile, single-cell transcriptomics is slowly becoming an indispensable tool for decoding cell-specific stress responses, although several technical and experimental design challenges still need to be resolved. Additionally, the refinement of conventional techniques and the advent of modern, high-resolution proteomics technologies have necessitated a gradual shift from general descriptive studies of plant protein abundances to large-scale analysis of protein-metabolite interactions. Metabolomics in particular is currently receiving special attention, owing to the role metabolites play as metabolic intermediates and their close links to phenotypic expression. Further, high-throughput phenomics applications are driving the targeting of new research domains such as root system architecture analysis and the exploration of plant root-associated microbes for improved crop health and climate resilience. Overall, coupling these multi-omics technologies to modern plant breeding and genetic engineering methods ensures an all-encompassing approach to developing nutritionally rich and climate-smart crops whose productivity can sustainably and sufficiently meet current and future food, nutrition and energy demands.

Towards precision medicine: The application of omics technologies in asthma management

F1000Research ◽

10.12688/f1000research.14309.2 ◽

2018 ◽

Vol 7 ◽

pp. 423 ◽

Cited By ~ 2

Author(s):

Chiara Scelfo ◽

Carla Galeone ◽

Francesca Bertolini ◽

Marco Caminati ◽

Patrizia Ruggiero ◽

...

Keyword(s):

Large Scale ◽

Association Studies ◽

Real Life ◽

Diagnostic Tools ◽

Chronic Obstructive ◽

Genome Wide Association Studies ◽

Pharmacological Treatments ◽

Omics Technologies ◽

Clinical Biomarkers ◽

Bronchial Inflammation

Asthma is a chronic obstructive respiratory disease characterised by bronchial inflammation. Its biological and clinical features have been widely explored and a number of pharmacological treatments are currently available. However, several aspects of the pathophysiological background of asthma remain unclear, and this represents a limitation of the traditional asthma phenotype approach. In this scenario, the identification of new molecular and clinical biomarkers may help to better understand the disease, define specific diagnostic tools and highlight relevant novel targets for pharmacological treatments. Omics technologies offer innovative research tools for addressing the above-mentioned goals. However, there is still a lot to do both in basic research and in clinical application. Recently, genome-wide association studies, microRNAs and proteomics have been contributing to enrich the available data for the identification of new asthma biomarkers. A precise approach to the patient with asthma, particularly with severe uncontrolled asthma, requires new and specific therapeutic targets, but also proper tools able to guide the clinician in tailoring the treatment. In addition, there is a need for predictors of treatment response, particularly in the field of biological drugs, whose sustainability implies a correct and precise selection of patients. Translating acquired omics knowledge into clinical practice may address the unmet needs described above, but large-scale studies are required in order to confirm their relevance and effectiveness in daily practice. Thus, in our opinion, the application of omics is still lagging in the real-life setting.

Optimized permutation testing for information theoretic measures of multi-gene interactions

BMC Bioinformatics ◽

10.1186/s12859-021-04107-6 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

James M. Kunert-Graf ◽

Nikita A. Sakhanenko ◽

David J. Galas

Keyword(s):

Large Scale ◽

Permutation Test ◽

Association Studies ◽

Genome Wide Association Studies ◽

Permutation Testing ◽

Exact Test ◽

Information Theoretic ◽

Information Theoretic Measures ◽

Full Analysis ◽

Computational Bottleneck

Background: Permutation testing is often considered the “gold standard” for multi-test significance analysis, as it is an exact test requiring few assumptions about the distribution being computed. However, it can be computationally very expensive, particularly in its naive form in which the full analysis pipeline is re-run after permuting the phenotype labels. This can become intractable in multi-locus genome-wide association studies (GWAS), in which the number of potential interactions to be tested is combinatorially large. Results: In this paper, we develop an approach for permutation testing in multi-locus GWAS, specifically focusing on SNP–SNP-phenotype interactions using multivariable measures that can be computed from frequency count tables, such as those based on information theory. We find that the computational bottleneck in this process is the construction of the count tables themselves, and that this step can be eliminated at each iteration of the permutation testing by transforming the count tables directly. This leads to a speed-up by a factor of over 10^3 for a typical permutation test compared to the naive approach. Additionally, this approach is insensitive to the number of samples, making it suitable for datasets with a large number of samples. Conclusions: The proliferation of large-scale datasets with genotype data for hundreds of thousands of individuals enables new and more powerful approaches for the detection of multi-locus genotype-phenotype interactions. Our approach significantly improves the computational tractability of permutation testing for these studies. Moreover, our approach is insensitive to the large number of samples in these modern datasets. The code for performing these computations and replicating the figures in this paper is freely available at https://github.com/kunert/permute-counts.
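
To make the "naive form" of permutation testing concrete, the following is a minimal sketch, not the authors' optimized method. It assumes a single SNP and a single phenotype (the paper targets SNP–SNP-phenotype interactions), uses mutual information as the information-theoretic measure, and rebuilds the count table from scratch after every shuffle of the phenotype labels. The function names are hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(genotypes, phenotypes):
    """Mutual information in bits, computed from a frequency count table."""
    n = len(genotypes)
    joint = Counter(zip(genotypes, phenotypes))  # the count table
    g_counts = Counter(genotypes)
    p_counts = Counter(phenotypes)
    mi = 0.0
    for (g, p), c in joint.items():
        mi += (c / n) * math.log2(c * n / (g_counts[g] * p_counts[p]))
    return mi


def naive_permutation_pvalue(genotypes, phenotypes, n_perm=1000, seed=0):
    """Naive permutation test: shuffle labels, rebuild the table, recompute."""
    rng = random.Random(seed)
    observed = mutual_information(genotypes, phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Rebuilding the count table here is the step the abstract
        # identifies as the computational bottleneck.
        if mutual_information(genotypes, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

In this sketch the cost of each permutation grows with the number of samples, because the table is recounted every time. The abstract's approach avoids that recount by transforming the count tables directly.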

Power comparison of Cochran-Armitage trend test against allelic and genotypic tests in large-scale case-control genetic association studies

Statistical Methods in Medical Research ◽

10.1177/0962280216683979 ◽

2016 ◽

Vol 27 (9) ◽

pp. 2657-2673 ◽

Cited By ~ 2

Author(s):

Mathieu Emily

Keyword(s):

Large Scale ◽

Association Studies ◽

Disease Model ◽

Trend Test ◽

Genome Wide Association Studies ◽

Power Functions ◽

Power Comparison ◽

Powerful Test ◽

Armitage Trend Test ◽

Mode Of Inheritance

The Cochran-Armitage trend test (CA) has become a standard procedure for association testing in large-scale genome-wide association studies (GWAS). However, when the disease model is unknown, there is no consensus on which of the CA, allelic, and genotypic tests is the most powerful. In this article, we tackle the question of whether CA is best suited to single-locus scanning in GWAS and propose a power comparison of CA against allelic and genotypic tests. Our approach relies on the evaluation of the Taylor decompositions of non-centrality parameters, thus allowing an analytical comparison of the power functions of the tests. Compared to simulation-based comparison, our approach offers the advantage of simultaneously accounting for the multidimensionality of the set of features involved in power functions. Although power for CA depends on the sample size, the case-to-control ratio and the minor allelic frequency (MAF), our results first show that it is largely influenced by the mode of inheritance and a deviation from Hardy–Weinberg Equilibrium (HWE). Furthermore, when compared to other tests, CA is shown to be the most powerful test under a multiplicative disease model or when the single-nucleotide polymorphism largely deviates from HWE. In all other situations, CA lacks power and differences can be substantial, especially for the recessive mode of inheritance. Finally, our results are illustrated by the comparison of the performances of the statistics in two genome scans.
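
For readers unfamiliar with the test being compared, the following is a minimal sketch of the standard Cochran-Armitage trend statistic on a 2×3 case-control genotype table. It assumes the usual additive weights (0, 1, 2) for the three genotype classes and a 1-degree-of-freedom chi-square reference distribution; the function name and argument layout are hypothetical, and the sketch does not reproduce the article's analytical power comparison.

```python
import math


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Return (chi2, p_value) for the Cochran-Armitage trend test.

    cases / controls: genotype counts, e.g. for (AA, Aa, aa).
    weights: scores per genotype class; (0, 1, 2) is the additive coding.
    """
    totals = [r + s for r, s in zip(cases, controls)]
    R, S = sum(cases), sum(controls)
    N = R + S
    sum_tr = sum(t * r for t, r in zip(weights, cases))
    sum_tn = sum(t * n for t, n in zip(weights, totals))
    sum_t2n = sum(t * t * n for t, n in zip(weights, totals))
    # Trend statistic T and its variance under the null hypothesis.
    T = N * sum_tr - R * sum_tn
    var_T = (R * S / N) * (N * sum_t2n - sum_tn ** 2)
    chi2 = T * T / var_T
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Changing `weights` to (0, 1, 1) or (0, 0, 1) gives the dominant and recessive codings, which is one way to see why the power of the test depends on the mode of inheritance.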

GWASpro: a high-performance genome-wide association analysis server

Bioinformatics ◽

10.1093/bioinformatics/bty989 ◽

2018 ◽

Vol 35 (14) ◽

pp. 2512-2514 ◽

Cited By ~ 4

Author(s):

Bongsong Kim ◽

Xinbin Dai ◽

Wenchao Zhang ◽

Zhaohong Zhuang ◽

Darlene L Sanchez ◽

...

Keyword(s):

High Performance ◽

Large Scale ◽

Linear Mixed Model ◽

Association Studies ◽

Learning Curves ◽

Experimental Designs ◽

Genome Wide Association ◽

Supplementary Information ◽

Genome Wide Association Studies ◽

Genome Wide

Summary: We present GWASpro, a high-performance web server for the analyses of large-scale genome-wide association studies (GWAS). GWASpro was developed to provide data analyses for large-scale molecular genetic data, coupled with complex replicated experimental designs such as those found in plant science investigations, and to overcome the steep learning curves of existing GWAS software tools. GWASpro supports building complex design matrices, by which complex experimental designs that may include replications, treatments, locations and times can be accounted for in the linear mixed model. GWASpro is optimized to handle GWAS data that may consist of up to 10 million markers and 10 000 samples from replicable lines or hybrids. GWASpro provides an interface that significantly reduces the learning curve for new GWAS investigators. Availability and implementation: GWASpro is freely available at https://bioinfo.noble.org/GWASPRO. Supplementary information: Supplementary data are available at Bioinformatics online.

A description of large-scale metabolomics studies: increasing value by combining metabolomics with genome-wide SNP genotyping and transcriptional profiling

Journal of Endocrinology ◽

10.1530/joe-12-0144 ◽

2012 ◽

Vol 215 (1) ◽

pp. 17-28 ◽

Cited By ~ 18

Author(s):

Georg Homuth ◽

Alexander Teumer ◽

Uwe Völker ◽

Matthias Nauck

Keyword(s):

Blood Cells ◽

Large Scale ◽

Genetic Factors ◽

Association Studies ◽

Transcriptional Profiling ◽

Genome Wide Association Studies ◽

Protein Levels ◽

Future Developments ◽

Genome Wide ◽

Metabolome Data

The metabolome, defined as the reflection of metabolic dynamics derived from parameters measured primarily in easily accessible body fluids such as serum, plasma, and urine, can be considered the omics data pool that is closest to the phenotype because it integrates genetic influences as well as nongenetic factors. Metabolic traits can be related to genetic polymorphisms in genome-wide association studies, enabling the identification of underlying genetic factors, as well as to specific phenotypes, resulting in the identification of metabolome signatures primarily caused by nongenetic factors. Similarly, correlation of metabolome data with transcriptional and/or proteome profiles of blood cells also produces valuable data by revealing associations between metabolic changes and mRNA and protein levels. In recent years, progress in correlating genetic variation and metabolome profiles has been most impressive. This review will therefore try to summarize the most important of these studies and give an outlook on future developments.

Better estimation of SNP heritability from summary statistics provides a new understanding of the genetic architecture of complex traits

10.1101/284976 ◽

2018 ◽

Cited By ~ 6

Author(s):

Doug Speed ◽

David J Balding

Keyword(s):

Complex Traits ◽

Genetic Architecture ◽

Large Scale ◽

Association Studies ◽

Genome Wide Association Studies ◽

Summary Statistics ◽

Confounding Bias ◽

Conserved Regions ◽

Genome Wide ◽

Variation Explained

LD Score Regression (LDSC) has been widely applied to the results of genome-wide association studies. However, its estimates of SNP heritability are derived from an unrealistic model in which each SNP is expected to contribute equal heritability. As a consequence, LDSC tends to over-estimate confounding bias, under-estimate the total phenotypic variation explained by SNPs, and provide misleading estimates of the heritability enrichment of SNP categories. Therefore, we present SumHer, software for estimating SNP heritability from summary statistics using more realistic heritability models. After demonstrating its superiority over LDSC, we apply SumHer to the results of 24 large-scale association studies (average sample size 121 000). First, we show that these studies have tended to substantially over-correct for confounding, and as a result the number of genome-wide significant loci has been under-reported by about 20%. Next, we estimate enrichment for 24 categories of SNPs defined by functional annotations. A previous study using LDSC reported that conserved regions were 13-fold enriched, and found a further twelve categories with above 2-fold enrichment. By contrast, our analysis using SumHer finds that conserved regions are only 1.6-fold (SD 0.06) enriched, and that no category has enrichment above 1.7-fold. SumHer provides an improved understanding of the genetic architecture of complex traits, which enables more efficient analysis of future genetic data.
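
The equal-heritability model that the abstract criticizes can be illustrated with a minimal sketch of the basic LD score regression relationship. Under that model the expected association statistic of SNP j is E[chi2_j] = 1 + N·a + (N·h2/M)·l_j, where l_j is the LD score, N the sample size, M the number of SNPs, h2 the SNP heritability and a the confounding term. The code below is an assumption-laden simplification: it uses an unweighted least-squares fit (the LDSC software uses a weighted regression), the function name is hypothetical, and it is not SumHer's method.

```python
def ldsc_fit(chi2, ld_scores, n_samples, n_snps):
    """Unweighted fit of chi2_j = intercept + slope * l_j.

    Returns (h2, intercept), where h2 = slope * M / N under the model in
    which every SNP is expected to contribute equal heritability.
    """
    k = len(chi2)
    mean_l = sum(ld_scores) / k
    mean_c = sum(chi2) / k
    sxx = sum((l - mean_l) ** 2 for l in ld_scores)
    sxy = sum((l - mean_l) * (c - mean_c) for l, c in zip(ld_scores, chi2))
    slope = sxy / sxx
    # An intercept above 1 is what this model attributes to confounding.
    intercept = mean_c - slope * mean_l
    h2 = slope * n_snps / n_samples
    return h2, intercept
```

Because the intercept absorbs everything not explained by the assumed linear dependence on LD score, a misspecified heritability model can inflate it, which is consistent with the abstract's claim that LDSC over-estimates confounding.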

High-throughput phenotyping reveals a link between transpiration efficiency and transpiration restriction under high evaporative demand and new loci controlling water use-related traits in African rice, Oryza glaberrima Steud.

10.1101/2021.11.28.470237 ◽

2021 ◽

Author(s):

Pablo Affortit ◽

Branly Effa Effa ◽

Mame Sokhatil Ndoye ◽

Daniel Moukouanga ◽

Nathalie Luchaire ◽

...

Keyword(s):

Water Use Efficiency ◽

Water Use ◽

Association Studies ◽

Crop Improvement ◽

Oryza Glaberrima ◽

Genome Wide Association Studies ◽

Transpiration Efficiency ◽

Evaporative Demand ◽

African Rice ◽

Use Efficiency

Because water availability is the most important environmental factor limiting crop production, improving water use efficiency, the amount of carbon fixed per water used, is a major target for crop improvement. In rice, the genetic bases of transpiration efficiency, the derivation of water use efficiency at the whole-plant scale, and its putative component trait transpiration restriction under high evaporative demand, remain unknown. These traits were measured in a panel of 147 African rice Oryza glaberrima genotypes, known as potential sources of tolerance genes to biotic and abiotic stresses. Our results reveal that higher transpiration efficiency is associated with transpiration restriction in African rice. Detailed measurements in a subset of highly differentiated genotypes confirmed these associations and suggested that the root-to-shoot ratio played an important role in transpiration restriction. Genome-wide association studies identified marker-trait associations for transpiration response to evaporative demand, transpiration efficiency and its residuals, that link to genes involved in water transport and cell wall patterning. Our data suggest that root-shoot partitioning is an important component of transpiration restriction that has a positive effect on transpiration efficiency in African rice. Both traits are heritable and define targets for breeding rice with improved water use strategies.

RAFFI: Accurate and fast familial relationship inference in large scale biobank studies using RaPID

PLoS Genetics ◽

10.1371/journal.pgen.1009315 ◽

2021 ◽

Vol 17 (1) ◽

pp. e1009315

Author(s):

Ardalan Naseri ◽

Junjie Shi ◽

Xihong Lin ◽

Shaojie Zhang ◽

Degui Zhi

Keyword(s):

Large Scale ◽

Association Studies ◽

Scale Up ◽

Data Driven ◽

Genome Wide Association Studies ◽

Inference Method ◽

Genome Wide ◽

Familial Relationship ◽

Kinship Coefficients ◽

Data Driven Approach

Inference of relationships from whole-genome genetic data of a cohort is a crucial prerequisite for genome-wide association studies. Typically, relationships are inferred by computing the kinship coefficients (ϕ) and the genome-wide probability of zero IBD sharing (π0) among all pairs of individuals. Current leading methods are based on pairwise comparisons, which may not scale up to very large cohorts (e.g., sample size >1 million). Here, we propose an efficient relationship inference method, RAFFI. RAFFI leverages the efficient RaPID method to call IBD segments first, then estimates ϕ and π0 from the detected IBD segments. This inference is achieved by a data-driven approach that adjusts the estimation based on phasing quality and genotyping quality. Using simulations, we showed that RAFFI is robust against phasing/genotyping errors, admix events, and varying marker densities, and achieves higher accuracy compared to KING, the current leading method, especially for more distant relatives. When applied to the phased UK Biobank data with ~500K individuals, RAFFI is approximately 18 times faster than KING. We expect RAFFI will offer fast and accurate relatedness inference for even larger cohorts.
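
As a rough illustration of how ϕ and π0 relate to detected IBD segments, the following minimal sketch uses the textbook relationships ϕ = k1/4 + k2/2 and π0 = 1 − k1 − k2, where k1 and k2 are the fractions of the genome shared in IBD1 and IBD2 state. This is not RAFFI's estimator: the data-driven adjustment for phasing and genotyping quality is omitted, and the function name, the segment representation (lengths in cM) and the genome length parameter are all assumptions made for the example.

```python
def kinship_from_ibd(ibd1_segments_cm, ibd2_segments_cm, genome_length_cm):
    """Estimate (phi, pi0) for one pair from IBD segment lengths in cM.

    ibd1_segments_cm: lengths of segments shared on one haplotype (IBD1).
    ibd2_segments_cm: lengths of segments shared on both haplotypes (IBD2).
    genome_length_cm: total genetic map length (hypothetical parameter).
    """
    k1 = sum(ibd1_segments_cm) / genome_length_cm
    k2 = sum(ibd2_segments_cm) / genome_length_cm
    pi0 = 1.0 - k1 - k2          # fraction of the genome with zero IBD sharing
    phi = 0.25 * k1 + 0.5 * k2   # kinship coefficient
    return phi, pi0
```

With this coding a parent-child pair (k1 = 1) and an idealized full-sibling pair (k1 = 0.5, k2 = 0.25) both give ϕ = 0.25 but differ in π0, which is why the two quantities are reported together.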

297 GWAS for complex models accounting for populations structure with GBLUP and ssGBLUP

Journal of Animal Science ◽

10.1093/jas/skaa278.057 ◽

2020 ◽

Vol 98 (Supplement_4) ◽

pp. 32-32

Author(s):

Juan P Steibel ◽

Ignacio Aguilar

Keyword(s):

Hypothesis Testing ◽

Large Scale ◽

Mixed Model ◽

Prediction Models ◽

Association Studies ◽

Least Square ◽

Type I ◽

Phenotypic Variance ◽

Genome Wide Association Studies ◽

Formal Hypothesis Testing

Genomic Best Linear Unbiased Prediction (GBLUP) is the method of choice for incorporating genomic information into the genetic evaluation of livestock species. Furthermore, single-step GBLUP (ssGBLUP) has been adopted by many breeders’ associations and private entities managing large-scale breeding programs. While prediction of breeding values remains the primary use of genomic markers in animal breeding, a secondary interest focuses on performing genome-wide association studies (GWAS). The goal of GWAS is to uncover genomic regions that harbor variants that explain a large proportion of the phenotypic variance, and thus become candidates for discovering and studying causative variants. Several methods have been proposed and successfully applied for embedding GWAS into genomic prediction models. Most methods avoid formal hypothesis testing and resort to estimation of SNP effects, relying on visual inspection of graphical outputs to determine candidate regions. However, with the advent of high-throughput phenomics and transcriptomics, a more formal testing approach with automatic discovery thresholds is more appealing. In this work we present the methodological details of a method for performing formal hypothesis testing for GWAS in GBLUP models. First, we present the method and its equivalences and differences with other GWAS methods. Moreover, we demonstrate through simulation analyses that the proposed method controls the type I error rate at the nominal level. Second, we demonstrate two possible computational implementations, one based on mixed model equations for ssGBLUP and one based on the generalized least squares (GLS) equations. We show that ssGBLUP can deal with datasets with extremely large numbers of animals and markers and with multiple traits. GLS implementations are well suited for dealing with smaller numbers of animals with tens of thousands of phenotypes. Third, we show several useful extensions, such as testing multiple markers at once, testing pleiotropic effects and testing association of social genetic effects.

An atlas of polygenic risk score associations to highlight putative causal relationships across the human phenome

10.1101/467910 ◽

2018 ◽

Cited By ~ 1

Author(s):

Tom G. Richardson ◽

Sean Harrison ◽

Gibran Hemani ◽

George Davey Smith

Keyword(s):

Web Application ◽

Large Scale ◽

Complex Disease ◽

Association Studies ◽

Risk Scores ◽

Polygenic Risk Score ◽

Genome Wide Association Studies ◽

Genetic Liability ◽

Polygenic Risk ◽

The Uk

The age of large-scale genome-wide association studies (GWAS) has provided us with an unprecedented opportunity to evaluate the genetic liability of complex disease using polygenic risk scores (PRS). In this study, we have analysed 162 PRS (P<5×10^-05) derived from GWAS and 551 heritable traits from the UK Biobank study (N=334,398). Findings can be investigated using a web application (http://mrcieu.mrsoftware.org/PRS_atlas/), which we envisage will help uncover both known and novel mechanisms which contribute towards disease susceptibility. To demonstrate this, we have investigated the results from a phenome-wide evaluation of schizophrenia genetic liability. Amongst the findings were inverse associations with measures of cognitive function, for which extensive follow-up analyses using Mendelian randomization (MR) provided evidence of a causal relationship. We have also investigated the effect of multiple risk factors on disease using mediation and multivariable MR frameworks. Our atlas provides a resource for future endeavours seeking to unravel the causal determinants of complex disease.
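
A polygenic risk score of the kind described above is, in its simplest form, a weighted sum of risk-allele dosages over the variants that pass a GWAS p-value threshold. The sketch below assumes that simple form with the abstract's threshold of 5×10^-05 as the default. The function name and the tuple layout of the inputs are hypothetical, and steps such as LD clumping that a real pipeline would normally apply are omitted.

```python
def polygenic_risk_score(variants, dosages, p_threshold=5e-5):
    """Weighted sum of risk-allele dosages for one individual.

    variants: list of (variant_id, effect_size, gwas_p_value) tuples.
    dosages: dict mapping variant_id to the risk-allele dosage (0 to 2).
    p_threshold: only variants with a GWAS p-value below this are used.
    """
    score = 0.0
    for variant_id, effect_size, p_value in variants:
        if p_value < p_threshold and variant_id in dosages:
            score += effect_size * dosages[variant_id]
    return score
```

For example, with effect sizes 0.2 and −0.1 for two variants that pass the threshold and dosages of 2 and 1, the score is 0.2·2 − 0.1·1 = 0.3; a third variant with p = 0.01 would be ignored.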
