Genome-wide discovery of epistatic loci affecting antibiotic resistance using evolutionary couplings

2018 ◽  
Author(s):  
Benjamin Schubert ◽  
Rohan Maddamsetti ◽  
Jackson Nyman ◽  
Maha R. Farhat ◽  
Debora S. Marks

ABSTRACT The analysis of whole genome sequencing data should, in theory, allow the discovery of interdependent loci that cause antibiotic resistance. In practice, however, identifying this epistasis remains a challenge as the vast number of possible interactions erodes statistical power. To solve this problem, we extend a method that has been successfully used to identify epistatic residues in proteins to infer genomic loci that are strongly coupled and associated with antibiotic resistance. Our method reduces the number of tests required for an epistatic genome-wide association study and increases the likelihood of identifying causal epistasis. We discovered 38 loci and 250 epistatic pairs that influence the dose needed to inhibit growth for five different antibiotics in 1,102 isolates of Neisseria gonorrhoeae; these were confirmed in an independent dataset of 495 isolates. Many known resistance-affecting loci were recovered; however, the majority of loci occurred in unreported genes, including murE, which was associated with cefixime. About half of the novel epistasis we report involved at least one locus previously associated with antibiotic resistance, including interactions between gyrA and parC associated with ciprofloxacin. Still, many combinations involved unreported loci and genes. Our work provides a systematic identification of epistatic pairs affecting antibiotic resistance in N. gonorrhoeae and a generalizable method for epistatic genome-wide association studies.
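To make the statistical-power point in the abstract above concrete, the following minimal sketch counts the locus pairs an exhaustive epistasis scan would have to test and the per-test significance threshold that results. It assumes a plain Bonferroni correction, which is an illustrative choice and not necessarily what the authors used; the function name `pairwise_test_burden` and the figure of 10,000 loci are hypothetical and do not come from the paper.

```python
from math import comb


def pairwise_test_burden(n_loci: int, alpha: float = 0.05) -> tuple[int, float]:
    """Return (number of locus pairs, Bonferroni per-test threshold).

    Assumption: every unordered pair of loci is tested once and the
    family-wise error rate alpha is split evenly across all tests.
    """
    if n_loci < 2:
        raise ValueError("need at least two loci to form a pair")
    n_pairs = comb(n_loci, 2)  # n * (n - 1) / 2 unordered pairs
    return n_pairs, alpha / n_pairs


# Hypothetical scan over 10,000 variable loci.
n_pairs, threshold = pairwise_test_burden(10_000)
print(f"{n_pairs:,} pairs, per-test threshold {threshold:.2e}")
```

Because the number of pairs grows quadratically with the number of loci, any step that restricts testing to a small set of strongly coupled candidates relaxes the threshold considerably.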


2021 ◽  
Author(s):  
Mohsen Yoosefzadeh Najafabadi ◽  
Sepideh Torabi ◽  
Davoud Torkamaneh ◽  
Dan Tulpan ◽  
Istvan Rajcan ◽  
...  

Genome-wide association study (GWAS) is currently one of the important approaches for discovering quantitative trait loci (QTL) associated with traits of interest. However, insufficient statistical power is the limiting factor in conventional GWAS methods for characterizing quantitative traits, especially in plants with a narrow genetic base such as soybean. In this study, we evaluated the potential use of machine learning (ML) algorithms such as support vector regression (SVR) and random forest (RF) in GWAS, compared with two conventional methods, mixed linear models (MLM) and fixed and random model circulating probability unification (FarmCPU), for identifying QTL associated with soybean yield components. Important soybean yield component traits, including the number of reproductive nodes (RNP), non-reproductive nodes (NRNP), total nodes (NP), and total pods (PP) per plant, along with yield and maturity, were assessed using 227 soybean genotypes evaluated across four environments. Our results indicated that SVR-mediated GWAS outperformed RF, MLM and FarmCPU in discovering the most relevant QTL associated with the traits, supported by the functional annotation of candidate gene analyses. This study demonstrated for the first time the potential benefit of using sophisticated mathematical approaches such as ML algorithms in GWAS for identifying QTL suitable for genomic-based breeding programs.



2021 ◽  
Author(s):  
Xiaoru Sun ◽  
Hongkai Li ◽  
Yuanyuan Yu ◽  
Zhongshang Yuan ◽  
Chuandi Jin ◽  
...  

Genome-wide association study (GWAS) is fundamentally designed to detect disease-causing genes. To reduce spurious associations or improve statistical power, about 80% of GWASs have arbitrarily adjusted for demographic or clinical covariates. However, adjustment strategies in GWASs have not achieved consistent conclusions. Given that the initial aim of GWAS is to identify the causal association between a specific causal single-nucleotide polymorphism (SNP) and a disease trait, we summarized all complex relationships of the target SNP, covariate and disease trait into 15 causal diagrams according to the various roles of the covariate. Following each causal diagram, we conducted a series of theoretical justifications and statistical simulations. Our results demonstrate that it is inadvisable to adjust for any demographic or clinical covariates. We illustrate our point by applying GWASs for body mass index (BMI) and breast cancer, with and without adjustment for age and smoking status. Genetic effects and P values might vary across different strategies. Instead, adjustment for SNPs (G') should be strongly recommended when G' are in linkage disequilibrium with the target SNP and correlated with the disease trait conditional on the target SNP. Specifically, adjustment for such G' can block all the confounding paths between the target SNP and disease trait, and avoid over-adjusting for colliders or intermediaries.



2018 ◽  
Author(s):  
Matthew D. C. Neville ◽  
Jihoon Choi ◽  
Jonathan Lieberman ◽  
Qing Ling Duan

Abstract Background Candidate gene and genome-wide association studies have identified hundreds of asthma risk loci. The majority of associated variants, however, are not known to have any biological function and are believed to represent markers rather than true causative mutations. We hypothesized that many of these associated markers are in linkage disequilibrium (LD) with the elusive causative variants. Methods We compiled a comprehensive list of 447 asthma-associated variants previously reported in candidate gene and genome-wide association studies. Next, we identified all sequence variants located within the 304 unique genes using whole-genome sequencing data from the 1000 Genomes Project. Then, we calculated the LD between known asthma variants and the sequence variants within each gene. LD variants identified were then annotated to determine those that are potentially deleterious and/or functional (i.e. coding or regulatory effects on the encoded transcript or protein). Results We identified 10,048 variants in LD (r² > 0.6) with known asthma variants. Annotations of these LD variants revealed that several have potentially deleterious effects including frameshift, alternate splice site, stop-lost, and missense. Moreover, 24 of the LD variants have been reported to regulate gene expression as expression quantitative trait loci (eQTLs). Conclusions This study is proof of concept that many of the genetic loci previously associated with complex diseases such as asthma are not causative but represent markers of disease, which are in LD with the elusive causative variants. We hereby report a number of potentially deleterious and regulatory variants that are in LD with the reported asthma loci. These reported LD variants could account for the original association signals with asthma and represent the true causative mutations at these loci.
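The LD step described in the abstract above can be illustrated with a minimal sketch that computes the squared correlation between two biallelic variants from phased haplotypes and applies the 0.6 cutoff quoted in the Results. The function name `ld_r2` and the toy haplotypes are hypothetical; the abstract does not say which software or estimator the authors used, so this shows only the textbook definition.

```python
def ld_r2(hap_a: list[int], hap_b: list[int]) -> float:
    """r^2 between two biallelic variants from phased haplotypes.

    Each list holds one allele per haplotype (0 = reference, 1 = alternate).
    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), with D = pAB - pA * pB.
    """
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype vectors must be non-empty and equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    # Frequency of haplotypes carrying the alternate allele at both sites.
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic variant")
    return d * d / denom


# Toy data: eight haplotypes, invented for illustration only.
known_variant = [1, 1, 1, 0, 0, 0, 0, 0]
nearby_variant = [1, 1, 0, 0, 0, 0, 0, 0]
r2 = ld_r2(known_variant, nearby_variant)
print(round(r2, 3), r2 > 0.6)  # the 0.6 threshold is the one stated in the abstract
```

In this toy case r² is 5/9, so the pair would fall just short of the stated threshold.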



2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Daniel L. McCartney ◽  
Josine L. Min ◽  
Rebecca C. Richmond ◽  
Ake T. Lu ◽  
Maria K. Sobczyk ◽  
...  

Abstract Background Biological aging estimators derived from DNA methylation data are heritable and correlate with morbidity and mortality. Consequently, identification of genetic and environmental contributors to the variation in these measures in populations has become a major goal in the field. Results Leveraging DNA methylation and SNP data from more than 40,000 individuals, we identify 137 genome-wide significant loci, of which 113 are novel, from genome-wide association study (GWAS) meta-analyses of four epigenetic clocks and epigenetic surrogate markers for granulocyte proportions and plasminogen activator inhibitor 1 levels, respectively. We find evidence for shared genetic loci associated with the Horvath clock and expression of transcripts encoding genes linked to lipid metabolism and immune function. Notably, these loci are independent of those reported to regulate DNA methylation levels at constituent clock CpGs. A polygenic score for GrimAge acceleration showed strong associations with adiposity-related traits, educational attainment, parental longevity, and C-reactive protein levels. Conclusion This study illuminates the genetic architecture underlying epigenetic aging and its shared genetic contributions with lifestyle factors and longevity.



PLoS ONE ◽  
2018 ◽  
Vol 13 (3) ◽  
pp. e0193256 ◽  
Author(s):  
Zhaozhong Zhu ◽  
Verneri Anttila ◽  
Jordan W. Smoller ◽  
Phil H. Lee


2018 ◽  
Author(s):  
Natalie Terzikhan ◽  
Fangui Sun ◽  
Fien M. Verhamme ◽  
Hieab H.H. Adams ◽  
Daan Loth ◽  
...  

Abstract Background Although several genome-wide association studies (GWAS) have investigated the genetics of pulmonary ventilatory function, little is known about the genetic factors that influence gas exchange. Aim To investigate the heritability of, and genetic variants associated with, the diffusing capacity of the lung. Methods GWAS was performed on diffusing capacity, measured by carbon monoxide uptake (DLCO) and per alveolar volume (DLCO/VA) using the single-breath technique, in 8,372 individuals from two population-based cohort studies, the Rotterdam Study and the Framingham Heart Study. Heritability was estimated in related (n=6,246) and unrelated (n=3,286) individuals. Results Heritability of DLCO and DLCO/VA ranged between 23% and 28% in unrelated individuals and between 45% and 49% in related individuals. Meta-analysis identified a genetic variant in GPR126 that is significantly associated with DLCO/VA. Gene expression analysis of GPR126 in human lung tissue revealed a decreased expression in patients with COPD and subjects with decreased DLCO/VA. Conclusion DLCO and DLCO/VA are heritable traits, with a considerable proportion of variance explained by genetics. A functional variant in the GPR126 gene region was significantly associated with DLCO/VA. Pulmonary GPR126 expression was decreased in patients with COPD.



Agronomy ◽  
2021 ◽  
Vol 11 (12) ◽  
pp. 2386
Author(s):  
Pierre-Olivier Hébert ◽  
Martin Laforest ◽  
Dong Xu ◽  
Marie Ciotola ◽  
Mélanie Cadieux ◽  
...  

Bacterial leaf spot of lettuce, caused by Xanthomonas hortorum pv. vitians, is an economically important disease worldwide. For instance, it caused around 4 million CAD in losses in only a few months during the winter of 1992 in Florida. Because only one pesticide is registered to control this disease in Canada, the development of lettuce cultivars tolerant to bacterial leaf spot remains the most promising approach to reduce the incidence and severity of the disease in lettuce fields. The lack of information about the genetic diversity of the pathogen, however, impairs breeding programs, especially when disease resistance is tested on newly developed lettuce germplasm lines. To evaluate the diversity of X. hortorum pv. vitians, a multilocus sequence analysis was performed on 694 isolates collected in Eastern Canada through the summers of 2014 to 2017 and two isolates in 1996 and 2007. All isolates tested were clustered into five phylogroups. Six pathotypes were identified following pathogenicity tests conducted in greenhouses, but when phylogroups were compared with pathotypes, no correlation could be drawn. However, in vitro production of xanthan and xanthomonadins was investigated, and isolates with higher production of xanthomonadins were generally causing less severe symptoms on the tolerant cultivar Little Gem. Whole-genome sequencing was undertaken for 95 isolates belonging to the pathotypes identified, and de novo assembly made with reads unmapped to the reference strain’s genome sequence resulted in 694 contigs ranging from 128 to 120,795 bp. Variant calling was performed prior to genome-wide association studies computed with single-nucleotide polymorphisms (SNPs), copy-number variants and gaps. Polymorphisms with significant p-values were only found on the cultivar Little Gem. Our results allowed molecular identification of isolates likely to cause bacterial leaf spot of lettuce, using two SNPs identified through genome-wide association study.



2019 ◽  
Vol 22 (8) ◽  
pp. 1063-1069 ◽  
Author(s):  
N. S. Yudin ◽  
N. L. Podkolodnyy ◽  
T. A. Agarkova ◽  
E. V. Ignatieva

Selection by means of genetic markers is a promising approach to the eradication of infectious diseases in farm animals, especially in the absence of effective methods of treatment and prevention. Bovine leukemia virus (BLV) is spread throughout the world and represents one of the biggest problems for the livestock production and food security in Russia. However, recent genome-wide association studies have shown that sensitivity/resistance to BLV is polygenic. The aim of this study was to create a catalog of cattle genes and genes of other mammalian species involved in the pathogenesis of BLV-induced infection and to perform gene prioritization using bioinformatics methods. Based on manually collected information from a range of open sources, a total of 446 genes were included in the catalog of cattle genes and genes of other mammals involved in the pathogenesis of BLV-induced infection. The following criteria were used to prioritize 446 genes from the catalog: (1) the gene is associated with leukemia according to a genome-wide association study; (2) the gene is associated with leukemia according to a case-control study; (3) the role of the gene in leukemia development has been studied using knockout mice; (4) protein-protein interactions exist between the gene-encoded protein and either viral particles or individual viral proteins; (5) the gene is annotated with Gene Ontology terms that are overrepresented for a given list of genes; (6) the gene participates in biological pathways from the KEGG or REACTOME databases, which are over-represented for a given list of genes; (7) the protein encoded by the gene has a high number of protein-protein interactions with proteins encoded by other genes from the catalog. Based on each criterion, a rank was assigned to each gene. Then the ranks were summarized and an overall rank was determined. 
Prioritization of 446 candidate genes allowed us to identify 5 genes of interest (TNF, LTB, BOLA-DQA1, BOLA-DRB3, ATF2), which can affect the sensitivity/resistance of cattle to leukemia.
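The per-criterion ranking and overall rank described in the abstract above can be sketched as a simple rank aggregation. This is an illustration under stated assumptions, not the authors' procedure: it reads "the ranks were summarized" as summing the per-criterion ranks, treats a higher score as better, and gives tied genes the same rank. The function names, gene labels (`geneA`, `geneB`, `geneC`) and scores are all hypothetical.

```python
def rank_by_criterion(scores: dict[str, float]) -> dict[str, int]:
    """Rank genes for one criterion: rank 1 is best, ties share a rank.

    A gene's rank is 1 plus the number of genes with a strictly higher score.
    """
    values = list(scores.values())
    return {gene: 1 + sum(v > s for v in values) for gene, s in scores.items()}


def overall_ranking(criteria: list[dict[str, float]]) -> list[tuple[str, int]]:
    """Sum per-criterion ranks and sort genes by the total (lower is better)."""
    if not criteria:
        raise ValueError("at least one criterion is required")
    totals = {gene: 0 for gene in criteria[0]}
    for scores in criteria:
        if scores.keys() != totals.keys():
            raise ValueError("every criterion must score the same genes")
        for gene, rank in rank_by_criterion(scores).items():
            totals[gene] += rank
    # Break ties in the total alphabetically so the output is deterministic.
    return sorted(totals.items(), key=lambda item: (item[1], item[0]))


# Two made-up criteria: a yes/no flag and a count of interactions.
criteria = [
    {"geneA": 1, "geneB": 0, "geneC": 1},
    {"geneA": 5, "geneB": 2, "geneC": 9},
]
print(overall_ranking(criteria))
```

Summing ranks rather than raw scores keeps criteria measured on different scales, such as a binary flag and an interaction count, from dominating one another.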


