Natural Selection Shapes the Mosaic Ancestry of the Drosophila Genetic Reference Panel and the D. melanogaster Reference Genome

Mapping Intimacies ◽

10.1101/014837 ◽

2015 ◽

Cited By ~ 1

Author(s):

John E Pool

Keyword(s):

Natural Selection ◽

Reference Genome ◽

African Ancestry ◽

Genetic Research ◽

Reference Panel ◽

Genome Wide ◽

A Genome ◽

Source Populations ◽

Genomic Regions ◽

Drosophila Genetic Reference Panel

North American populations of Drosophila melanogaster are thought to derive from both European and African source populations, but despite their importance for genetic research, patterns of admixture along their genomes are essentially undocumented. Here, I infer geographic ancestry along genomes of the Drosophila Genetic Reference Panel (DGRP) and the D. melanogaster reference genome. Overall, the proportion of African ancestry was estimated to be 20% for the DGRP and 9% for the reference genome. Based on the size of admixture tracts and the approximate timing of admixture, I estimate that the DGRP population underwent roughly 13.9 generations per year. Notably, ancestry levels varied strikingly among genomic regions, with significantly less African introgression on the X chromosome, in regions of high recombination, and at genes involved in specific processes such as circadian rhythm. An important role for natural selection during the admixture process was further supported by a genome-wide signal of ancestry disequilibrium, in that many between-chromosome pairs of loci showed a deficiency of Africa-Europe allele combinations. These results support the hypothesis that admixture between partially genetically isolated Drosophila populations led to natural selection against incompatible genetic variants, and that this process is ongoing. The ancestry blocks inferred here may be relevant for the performance of reference alignment in this species, and may bolster the design and interpretation of many population genetic and association mapping studies.

Genome-wide copy number variations in a large cohort of bantu African children

BMC Medical Genomics ◽

10.1186/s12920-021-00978-z ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Feyza Yilmaz ◽

Megan Null ◽

David Astling ◽

Hung-Chun Yu ◽

Joanne Cole ◽

...

Keyword(s):

Copy Number ◽

Developmental Disorders ◽

African Ancestry ◽

22Q11.2 Deletion Syndrome ◽

Copy Number Variations ◽

Genomic Variation ◽

Deletion Syndrome ◽

Genome Wide ◽

A Genome ◽

Genomic Regions

Background: Copy number variations (CNVs) account for a substantial proportion of inter-individual genomic variation. However, a majority of genomic variation studies have focused on single-nucleotide variations (SNVs), with limited genome-wide analysis of CNVs in large cohorts, especially in populations that are under-represented in genetic studies, including people of African descent. Methods: We carried out a genome-wide copy number analysis in > 3400 healthy Bantu Africans from Tanzania. Signal intensity data from high-density (> 2.5 million probes) genotyping arrays were used for CNV calling with three algorithms: PennCNV, DNAcopy and VanillaICE. Stringent quality metrics and filtering criteria were applied to obtain high-confidence CNVs. Results: We identified over 400,000 CNVs larger than 1 kilobase (kb), for an average of 120 CNVs (SE = 2.57) per individual. We detected 866 large CNVs (≥ 300 kb), some of which overlapped genomic regions previously associated with multiple congenital anomaly syndromes, including Prader-Willi/Angelman syndrome (Type 1) and 22q11.2 deletion syndrome. Furthermore, several of the common CNVs seen in our cohort (≥ 5%) overlap genes previously associated with developmental disorders. Conclusions: These findings may help refine the phenotypic outcomes and penetrance of variations affecting genes and genomic regions previously implicated in diseases. Our study provides one of the largest datasets of CNVs from individuals of African ancestry, enabling improved clinical evaluation and disease association of CNVs observed in research and clinical studies in African populations.

Genome-wide Copy Number Variations in a Large Cohort of Bantu African Children

10.1101/2020.12.24.424207 ◽

2020 ◽

Author(s):

Feyza Yilmaz ◽

Megan Null ◽

David Astling ◽

Hung-Chun Yu ◽

Joanne Cole ◽

...

Keyword(s):

Copy Number ◽

Developmental Disorders ◽

African Ancestry ◽

Copy Number Variations ◽

Genomic Variation ◽

Deletion Syndrome ◽

Genome Wide Analysis ◽

Genome Wide ◽

A Genome ◽

Genomic Regions

Background: Copy number variations (CNVs) account for a substantial proportion of inter-individual genomic variation. However, a majority of genomic variation studies have focused on single-nucleotide variations (SNVs), with limited genome-wide analysis of CNVs in large cohorts, especially in populations that are under-represented in genetic studies, including people of African descent. Results: In this study, we carried out a genome-wide analysis in > 3400 healthy Bantu Africans from Tanzania using high-density (> 2.5 million probes) genotyping arrays. We identified over 400,000 CNVs larger than 1 kilobase (kb), for an average of 120 CNVs (SE = 2.57) per individual. We detected 866 large CNVs (≥ 300 kb), some of which overlapped genomic regions previously associated with multiple congenital anomaly syndromes, including Prader-Willi/Angelman syndrome (Type 1) and 22q11.2 deletion syndrome. Furthermore, several of the common CNVs seen in our cohort (≥ 5%) overlap genes previously associated with developmental disorders. Conclusion: These findings may help refine the phenotypic outcomes and penetrance of variations affecting genes and genomic regions previously implicated in diseases. Our study provides one of the largest datasets of CNVs from individuals of African ancestry, enabling improved clinical evaluation and disease association of CNVs observed in research and clinical studies in African populations.

Epigenome-Wide Study Identified Methylation Sites Associated with the Risk of Obesity

Nutrients ◽

10.3390/nu13061984 ◽

2021 ◽

Vol 13 (6) ◽

pp. 1984

Author(s):

Majid Nikpay ◽

Sepehr Ravati ◽

Robert Dent ◽

Ruth McPherson

Keyword(s):

Quantitative Trait Locus ◽

Quantitative Trait ◽

Rare Variants ◽

Genome Wide ◽

A Genome ◽

Genome Wide Search ◽

Risk Snps ◽

Trait Locus ◽

Genomic Regions ◽

Eqtl Data

Here, we performed a genome-wide search for methylation sites that contribute to the risk of obesity. We integrated methylation quantitative trait locus (mQTL) data with BMI GWAS information through a SNP-based multiomics approach to identify genomic regions where mQTLs for a methylation site co-localize with obesity risk SNPs. We then tested whether the identified site contributed to BMI through Mendelian randomization. We identified multiple methylation sites causally contributing to the risk of obesity. We validated these findings through a replication stage. By integrating expression quantitative trait locus (eQTL) data, we noted that lower methylation at the cg21178254 site upstream of CCNL1 contributes to obesity by increasing the expression of this gene. Higher methylation at cg02814054 increases the risk of obesity by lowering the expression of MAST3, whereas lower methylation at cg06028605 contributes to obesity by decreasing the expression of SLC5A11. Finally, we noted that rare variants within 2p23.3 impact obesity by making the cg01884057 site more susceptible to methylation, which consequently lowers the expression of POMC, ADCY3 and DNAJC27. In this study, we identify methylation sites associated with the risk of obesity and reveal the mechanism whereby a number of these sites exert their effects. This study provides a framework to perform an omics-wide association study for a phenotype and to understand the mechanism whereby a rare variant causes a disease.
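The Mendelian randomization step mentioned above can be illustrated with a minimal sketch. The abstract does not say which estimator the authors used; the Wald ratio below is one common single-instrument choice and is an assumption here, as are the function name, the field names and the example numbers. The instrument is a SNP that acts as an mQTL for the methylation site, the exposure is methylation, and the outcome is BMI.

```python
import math
from dataclasses import dataclass


@dataclass
class WaldEstimate:
    effect: float  # estimated causal effect of exposure on outcome
    se: float      # first-order (delta method) standard error
    z: float       # effect divided by its standard error


def wald_ratio(beta_exposure: float, beta_outcome: float, se_outcome: float) -> WaldEstimate:
    """Single-instrument Wald ratio (hypothetical helper, not the authors' code).

    beta_exposure: SNP effect on the exposure (e.g. methylation level, from mQTL data)
    beta_outcome:  SNP effect on the outcome (e.g. BMI, from GWAS summary statistics)
    se_outcome:    standard error of beta_outcome
    """
    if beta_exposure == 0:
        raise ValueError("the instrument must have a non-zero effect on the exposure")
    effect = beta_outcome / beta_exposure
    # First-order approximation: ignores uncertainty in beta_exposure.
    se = se_outcome / abs(beta_exposure)
    return WaldEstimate(effect=effect, se=se, z=effect / se)


if __name__ == "__main__":
    # Made-up summary statistics, for illustration only.
    print(wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02))
```

The ratio is only interpretable as a causal effect if the SNP affects BMI solely through the methylation site, which is why co-localization and replication matter in a design like the one described.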

Drosophila Gain-of-Function Mutant RTK Torso Triggers Ectopic Dpp and STAT Signaling

Genetics ◽

10.1093/genetics/164.1.247 ◽

2003 ◽

Vol 164 (1) ◽

pp. 247-258 ◽

Cited By ~ 1

Author(s):

Jinghong Li ◽

Willis X Li

Keyword(s):

Tyrosine Kinases ◽

Target Genes ◽

Tor Signaling ◽

Gain Of Function ◽

Essential Requirement ◽

Downstream Target ◽

Rtk Signaling ◽

Genome Wide ◽

A Genome ◽

Genomic Regions

Overactivation of receptor tyrosine kinases (RTKs) has been linked to tumorigenesis. To understand how a hyperactivated RTK functions differently from wild-type RTK, we conducted a genome-wide systematic survey for genes that are required for signaling by a gain-of-function mutant Drosophila RTK Torso (Tor). We screened chromosomal deficiencies for suppression of a gain-of-function mutation tor (torGOF), which led to the identification of 26 genomic regions that, when in half dosage, suppressed the defects caused by torGOF. Testing of candidate genes in these regions revealed many genes known to be involved in Tor signaling (such as those encoding the Ras-MAPK cassette, adaptor and structural molecules of RTK signaling, and downstream target genes of Tor), confirming the specificity of this genetic screen. Importantly, this screen also identified components of the TGFβ (Dpp) and JAK/STAT pathways as being required for TorGOF signaling. Specifically, we found that reducing the dosage of thickveins (tkv), Mothers against dpp (Mad), or STAT92E (aka marelle), respectively, suppressed torGOF phenotypes. Furthermore, we demonstrate that in torGOF embryos, dpp is ectopically expressed and thus may contribute to the patterning defects. These results demonstrate an essential requirement of noncanonical signaling pathways for a persistently activated RTK to cause pathological defects in an organism.

Eyes of Africa: The Genetics of Blindness: Study Design and Methodology

BMC Ophthalmology ◽

10.1186/s12886-021-02029-8 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Olusola Olawoye ◽

Chimdi Chuka-Okosa ◽

Onoja Akpa ◽

Tony Realini ◽

Michael Hauser ◽

...

Keyword(s):

Genome Wide Association Study ◽

Case Control Study ◽

African Ancestry ◽

Collaborative Study ◽

Data Set ◽

Human Heredity ◽

Genome Wide ◽

A Genome ◽

Multiplex Families ◽

Control Study

Background: This report describes the design and methodology of “Eyes of Africa: The Genetics of Blindness,” a collaborative study funded through the Human Heredity and Health in Africa (H3Africa) program of the National Institutes of Health. Methods: This is a case-control study that is collecting a large, well-phenotyped data set of glaucoma patients and controls for a genome-wide association study (GWAS). Multiplex families segregating Mendelian forms of early-onset glaucoma will also be collected for exome sequencing. Discussion: A total of 4500 cases and controls had been recruited into the study by the end of its third funded year. All of these participants have been appropriately phenotyped, and blood samples have been received from them. A recent GWAS of POAG in African individuals demonstrated a genome-wide significant association with the APBB2 locus, an association that is unique to individuals of African ancestry. This study will add to the existing knowledge and understanding of POAG in the African population.

Learning a genome-wide score of human–mouse conservation at the functional genomics level

Nature Communications ◽

10.1038/s41467-021-22653-8 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Soo Bin Kwon ◽

Jason Ernst

Keyword(s):

Mouse Model ◽

Functional Genomics ◽

Functional Genomic ◽

Transcriptomic Data ◽

Model Studies ◽

Genome Wide ◽

A Genome ◽

Important Challenge ◽

Genomic Regions ◽

Human And Mouse

Identifying genomic regions with functional genomic properties that are conserved between human and mouse is an important challenge in the context of mouse model studies. To address this, we develop a method to learn a score of evidence of conservation at the functional genomics level by integrating information from a compendium of epigenomic, transcription factor binding, and transcriptomic data from human and mouse. The method, Learning Evidence of Conservation from Integrated Functional genomic annotations (LECIF), trains neural networks to generate this score for the human and mouse genomes. The resulting LECIF score highlights human and mouse regions with shared functional genomic properties and captures correspondence of biologically similar human and mouse annotations. Analysis with independent datasets shows the score also highlights loci associated with similar phenotypes in both species. LECIF will be a resource for mouse model studies by identifying loci whose functional genomic properties are likely conserved.

Genome-Wide Analyses Identifies Known and New Markers Responsible of Chicken Plumage Color

Animals ◽

10.3390/ani10030493 ◽

2020 ◽

Vol 10 (3) ◽

pp. 493

Author(s):

Salvatore Mastrangelo ◽

Filippo Cendron ◽

Gianluca Sottile ◽

Giovanni Niero ◽

Baldassare Portolano ◽

...

Keyword(s):

Genome Wide Association Study ◽

Snp Array ◽

Fixation Index ◽

Phenotypic Traits ◽

Plumage Color ◽

Phenotypic Marker ◽

Genome Wide ◽

A Genome ◽

The Difference ◽

Genomic Regions

Through the development of high-throughput genotyping arrays, molecular markers and genes related to phenotypic traits have been identified in livestock species. In poultry, plumage color is an important qualitative trait that can be used as a phenotypic marker for breed identification. In order to assess sources of genetic variation related to the plumage color of the Polverara chicken breed (black vs. white), we carried out a genome-wide association study (GWAS) and a genome-wide fixation index (FST) scan to uncover the genomic regions involved. A total of 37 animals (17 white and 20 black) were genotyped with the Affymetrix 600 K Chicken single nucleotide polymorphism (SNP) Array. The combination of results from GWAS and FST revealed a total of 40 significant markers distributed on GGA 01, 03, 08, 12 and 21, and located within or near known genes. In addition to the well-known TYR, other candidate genes have been identified in this study, such as GRM5, RAB38 and NOTCH2. All these genes could explain the difference between the two Polverara breeds. Therefore, this study provides the basis for further investigation of the genetic mechanisms involved in plumage color in chicken.
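To make the idea of a per-marker FST scan concrete, the following is a minimal sketch for one biallelic SNP and two groups. The abstract does not state which FST estimator was used; the simple Wright-style form FST = (H_T - H_S) / H_T is an assumption, as are the function name and the example allele frequencies. Only the group sizes (17 white, 20 black) come from the text.

```python
def fst_biallelic(p1: float, n1: int, p2: float, n2: int) -> float:
    """Wright-style FST at one biallelic SNP (hypothetical helper).

    p1, p2: frequency of the same allele in group 1 and group 2
    n1, n2: number of animals sampled in each group (used as weights)
    """
    total = n1 + n2
    # Pooled allele frequency, weighted by sample size.
    p_bar = (n1 * p1 + n2 * p2) / total
    # Expected heterozygosity in the pooled sample.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # monomorphic SNP: no differentiation to measure
    # Sample-size-weighted mean of within-group expected heterozygosity.
    h_within = (n1 * 2.0 * p1 * (1.0 - p1) + n2 * 2.0 * p2 * (1.0 - p2)) / total
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Made-up allele frequencies; group sizes follow the abstract (17 white, 20 black).
    for label, p_white, p_black in [("fixed difference", 1.0, 0.0),
                                    ("moderate difference", 0.8, 0.3),
                                    ("no difference", 0.5, 0.5)]:
        print(label, round(fst_biallelic(p_white, 17, p_black, 20), 3))
```

In a scan, a value like this would be computed for every SNP on the array, and markers in the upper tail would be compared with the GWAS hits. With samples as small as those described, a bias-corrected estimator might be preferable to this plain version.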

Emerging Technologies for Genome-Wide Profiling of DNA Breakage

Frontiers in Genetics ◽

10.3389/fgene.2020.610386 ◽

2021 ◽

Vol 11 ◽

Author(s):

Matthew J. Rybin ◽

Melina Ramic ◽

Natalie R. Ricciardi ◽

Philipp Kapranov ◽

Claes Wahlestedt ◽

...

Keyword(s):

Genome Instability ◽

Dna Double Strand Breaks ◽

Single Nucleotide ◽

Strand Breaks ◽

Single Strand Breaks ◽

Genome Wide ◽

A Genome ◽

Wide Scale ◽

Nucleotide Resolution ◽

Genomic Regions

Genome instability is associated with myriad human diseases and is a well-known feature of both cancer and neurodegenerative disease. Until recently, the ability to assess DNA damage—the principal driver of genome instability—was limited to relatively imprecise methods or restricted to studying predefined genomic regions. Recently, new techniques for detecting DNA double strand breaks (DSBs) and single strand breaks (SSBs) with next-generation sequencing on a genome-wide scale with single nucleotide resolution have emerged. With these new tools, efforts are underway to define the “breakome” in normal aging and disease. Here, we compare the relative strengths and weaknesses of these technologies and their potential application to studying neurodegenerative diseases.

SNP and Haplotype Regional Heritability Mapping (SNHap-RHM): joint mapping of common and rare variation affecting complex traits

10.1101/2021.08.02.454788 ◽

2021 ◽

Author(s):

Richard F Oppong ◽

Pau Navarro ◽

Chris S Haley ◽

Sara Knott

Keyword(s):

Major Depressive Disorder ◽

Depressive Disorder ◽

Complex Traits ◽

Health Study ◽

P Value ◽

Major Depressive ◽

Genome Wide ◽

A Genome ◽

Joint Mapping ◽

Genomic Regions

We describe a genome-wide analytical approach, SNP and Haplotype Regional Heritability Mapping (SNHap-RHM), that provides regional estimates of heritability across locally defined regions in the genome. This approach utilises genomic relationship matrices (GRMs) that are based on sharing of SNP and haplotype alleles at local haplotype blocks delimited by recombination boundaries in the genome. We implemented the approach on simulated data and show that the haplotype-based regional GRMs capture variation that is complementary to that captured by SNP-based regional GRMs, thus justifying fitting the two GRMs jointly in a single analysis (SNHap-RHM). SNHap-RHM captures regions in the genome contributing to phenotypic variation that existing genome-wide analysis methods may fail to capture. We further demonstrate that there are real benefits to be gained from this approach by applying it to real data from about 20,000 individuals from the Generation Scotland: Scottish Family Health Study. We analysed height and major depressive disorder (MDD). We identified seven genomic regions that are genome-wide significant for height, and three regions significant at a suggestive threshold (p-value < 1 × 10^(-5)) for MDD. These significant regions have genes mapped to within 400 kb of them. The genes mapped for height have been reported to be associated with height in humans, while those mapped for MDD have been reported to be associated with major depressive disorder and other psychiatric phenotypes. The results show that SNHap-RHM presents an exciting new opportunity to analyse complex traits by allowing the joint mapping of novel genomic regions tagged by either SNPs or haplotypes, potentially leading to the recovery of some of the "missing" heritability.
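As a rough illustration of what a SNP-based regional relationship matrix looks like, the sketch below builds a matrix from the genotypes inside a single region. The abstract does not give the exact formula used by SNHap-RHM; the allele-frequency-standardized form here is a commonly used definition and is an assumption, as are the function name and the toy genotypes. The haplotype-based matrix is not shown.

```python
from typing import List


def regional_grm(genotypes: List[List[int]]) -> List[List[float]]:
    """SNP-based relationship matrix for one region (hypothetical helper).

    genotypes[i][k] is the allele count (0, 1 or 2) of individual i at SNP k,
    where the SNPs are those falling inside a single region or haplotype block.
    Entry (i, j) averages (x_ik - 2 p_k)(x_jk - 2 p_k) / (2 p_k (1 - p_k)) over SNPs.
    """
    n = len(genotypes)
    m = len(genotypes[0]) if n else 0
    grm = [[0.0] * n for _ in range(n)]
    used = 0  # number of polymorphic SNPs that contribute
    for k in range(m):
        column = [row[k] for row in genotypes]
        p = sum(column) / (2.0 * n)     # allele frequency at SNP k
        variance = 2.0 * p * (1.0 - p)  # expected genotype variance
        if variance == 0.0:
            continue                    # skip monomorphic SNPs
        used += 1
        centered = [g - 2.0 * p for g in column]
        for i in range(n):
            for j in range(i, n):
                grm[i][j] += centered[i] * centered[j] / variance
    if used == 0:
        raise ValueError("region contains no polymorphic SNPs")
    for i in range(n):
        for j in range(i, n):
            grm[i][j] /= used
            grm[j][i] = grm[i][j]       # fill the symmetric lower triangle
    return grm


if __name__ == "__main__":
    # Three individuals, two SNPs in the region (toy data).
    for row in regional_grm([[0, 2], [1, 1], [2, 0]]):
        print(row)
```

In a regional heritability analysis, a matrix like this would be computed per region and fitted as a random-effect covariance alongside a genome-wide matrix; the variance-component fitting itself is beyond this sketch.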

Genome assembly of the JD17 soybean provides a new reference genome for Comparative genomics

10.1101/2021.11.23.469778 ◽

2021 ◽

Author(s):

Xinxin Yi ◽

Jing Liu ◽

Shengcai Chen ◽

Hao Wu ◽

Min Liu ◽

...

Keyword(s):

Nitrogen Fixation ◽

Genome Assembly ◽

Reference Genome ◽

De Novo ◽

Genomic Analysis ◽

Comparative Genomic ◽

High Quality ◽

Genome Wide ◽

A Genome ◽

Cultivated Soybean

Cultivated soybean (Glycine max) is an important source of protein and oil. Many elite cultivars with different traits have been developed for different conditions. Each soybean strain has its own genetic diversity, and the availability of more high-quality soybean genomes can enhance comparative genomic analysis for identifying the genetic underpinnings of its unique traits. In this study, we constructed a high-quality de novo assembly of the elite soybean cultivar Jidou 17 (JD17) with chromosome contiguity and high accuracy. We annotated 52,840 gene models and reconstructed 74,054 high-quality full-length transcripts. We performed a genome-wide comparative analysis of the JD17 reference genome against three published soybean genomes (WM82, ZH13 and W05), which identified five large inversions and two large translocations specific to JD17, 20,984 to 46,912 presence/absence variants (PAVs) spanning 13.1 to 46.9 Mb, and 5 to 53 PAV clusters larger than 500 kb. Between JD17 and these three genomes, 1,695,741 to 3,664,629 SNPs and 446,689 to 800,489 indels were identified and annotated. Symbiotic nitrogen fixation (SNF) genes were identified and the effects of these variants were further evaluated. It was found that the coding sequences of 9 nitrogen fixation-related genes were greatly affected. The high-quality genome assembly of JD17 can serve as a valuable reference for soybean functional genomics research.