Chromosome X-wide Analysis of Positive Selection in Human Populations: Common and Private Signals of Selection and its Impact on Inactivated Genes and Enhancers

The ability to detect adaptive (positive) selection in the genome has opened the possibility of understanding the genetic basis of population-specific adaptations genome-wide. Here, we present an analysis of recent selective sweeps, specifically in the X chromosome, in human populations from the third phase of the 1000 Genomes Project using three different haplotype-based statistics. We describe instances of recent positive selection that fit the criteria of hard or soft sweeps, and detect a higher number of events among sub-Saharan Africans than non-Africans (Europe and East Asia). A global enrichment of neural-related processes is observed and numerous genes related to fertility appear among the top candidates, reflecting the importance of reproduction in human evolution. Commonalities with previously reported genes under positive selection are found, while particularly strong new signals are reported in specific populations or shared across different continental groups. We report an enrichment of signals in genes that escape X chromosome inactivation, which may contribute to the differentiation between sexes. We also provide evidence of a widespread presence of soft-sweep-like signatures across the chromosome and a global enrichment of highly scoring regions that overlap potential regulatory elements. Among these, enhancer-like signatures seem to present putative signals of positive selection, which might be in concordance with selection in their target genes. Also, particularly strong signals appear in regulatory regions that show differential activities, which might point to population-specific regulatory adaptations.

Chromosome X-wide analysis of positive selection in human populations: from common and private signals to selection impact on inactivated genes and enhancers-like signatures

10.1101/2021.05.24.445399 ◽

2021 ◽

Author(s):

Pablo Villegas Mirón ◽

Sandra Acosta ◽

Jessica Nye ◽

Jaume Bertranpetit ◽

Hafid Laayouni

Keyword(s):

Positive Selection ◽

X Chromosome ◽

Target Genes ◽

Regulatory Elements ◽

Human Populations ◽

Third Phase ◽

Recent Positive Selection ◽

Genome Wide ◽

Private Signals ◽

Sub Saharan

The ability to detect adaptive (positive) selection in the genome has opened the possibility of understanding the genetic bases of population-specific adaptations genome-wide. Here we present an analysis of recent selective sweeps, specifically in the X chromosome, in different human populations from the third phase of the 1000 Genomes Project using three different haplotype-based statistics. We describe numerous instances of genes under recent positive selection that fit the regimes of hard and soft sweeps, showing a higher number of detectable sweeps in sub-Saharan Africans than in non-Africans (Europe and East Asia). A global enrichment is seen in neural-related processes, while numerous genes related to fertility appear among the top candidates, reflecting the importance of reproduction in human evolution. Commonalities with previously reported genes under positive selection are found, while particularly strong new signals are reported in specific populations or shared across different continental groups. We report an enrichment of signals in genes that escape X chromosome inactivation, which may contribute to the differentiation between sexes. We also provide evidence of a widespread presence of soft-sweep-like signatures across the chromosome and a global enrichment of highly scoring regions that overlap potential regulatory elements. Among these, enhancer-like signatures seem to present putative signals of positive selection that might be in concordance with selection in their target genes. Also, particularly strong signals appear in regulatory regions that show differential activities, which might point to population-specific regulatory adaptations.

HCR-FlowFISH: A flexible CRISPR screening method to identify cis-regulatory elements and their target genes

10.1101/2020.05.11.078675 ◽

2020 ◽

Author(s):

SK Reilly ◽

SJ Gosai ◽

A Gutierrez ◽

JC Ulirsch ◽

M Kanai ◽

...

Keyword(s):

Gene Expression ◽

Target Genes ◽

Screening Method ◽

Cell Types ◽

Regulatory Elements ◽

Hybridization Chain Reaction ◽

Genome Wide ◽

Wide Range ◽

Causal Variants ◽

Endogenous Loci

CRISPR screens for cis-regulatory elements (CREs) have shown unprecedented power to endogenously characterize the non-coding genome. To characterize CREs we developed HCR-FlowFISH (Hybridization Chain Reaction Fluorescent In-Situ Hybridization coupled with Flow Cytometry), which directly quantifies native transcripts within their endogenous loci following CRISPR perturbations of regulatory elements, eliminating the need for restrictive phenotypic assays such as growth or transcript-tagging. HCR-FlowFISH accurately quantifies gene expression across a wide range of transcript levels and cell types. We also developed CASA (CRISPR Activity Screen Analysis), a hierarchical Bayesian model to identify and quantify CRE activity. Using >270,000 perturbations, we identified CREs for GATA1, HDAC6, ERP29, LMO2, MEF2C, CD164, NMU, FEN1 and the FADS gene cluster. Our methods detect subtle gene expression changes and identify CREs regulating multiple genes, sometimes at different magnitudes and directions. We demonstrate the power of HCR-FlowFISH to parse genome-wide association signals by nominating causal variants and target genes.

Addressing the Missing Heritability Problem With the Help of Regulatory Features

Evolutionary Bioinformatics ◽

10.1177/1176934319860861 ◽

2019 ◽

Vol 15 ◽

pp. 117693431986086

Author(s):

Shan-Shan Dong ◽

Yan Guo ◽

Tie-Lin Yang

Keyword(s):

Target Genes ◽

Association Studies ◽

Complex Diseases ◽

Regulatory Elements ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Susceptibility Loci ◽

Missing Heritability ◽

Genome Wide ◽

Missing Heritability Problem

Genome-wide association studies (GWASs) have successfully identified thousands of susceptibility loci for human complex diseases. However, missing heritability is still a challenging problem. Considering that most GWAS loci are located in regulatory elements, we recently developed a pipeline named functional disease-associated single-nucleotide polymorphisms (SNPs) prediction (FDSP) to predict novel susceptibility loci for complex diseases, based on the interpretation of regulatory features and published GWAS results with machine learning. When applied to type 2 diabetes and hypertension, the susceptibility loci predicted by FDSP proved capable of explaining additional heritability. In addition, potential target genes of the predicted positive SNPs were significantly enriched in disease-related pathways. Our results suggest that taking regulatory features into consideration might be a useful way to address the missing heritability problem. We hope FDSP can help with the identification of novel susceptibility loci for complex diseases.

Beyond the ENCODE project: using genomics and epigenomics strategies to study enhancer evolution

Philosophical Transactions of the Royal Society B Biological Sciences ◽

10.1098/rstb.2013.0022 ◽

2013 ◽

Vol 368 (1632) ◽

pp. 20130022 ◽

Cited By ~ 13

Author(s):

Noboru Jo Sakabe ◽

Marcelo A. Nobrega

Keyword(s):

Gene Expression ◽

Target Genes ◽

Expression Patterns ◽

Dna Interaction ◽

Regulatory Elements ◽

Specific Gene ◽

Genome Wide ◽

Identify Transcription Factor ◽

Specific Proteins ◽

Species Specific

The complex expression patterns observed for many genes are often regulated by distal transcription enhancers. Changes in the nucleotide sequences of enhancers may therefore lead to changes in gene expression, representing a central mechanism by which organisms evolve. With the development of the experimental technique of chromatin immunoprecipitation (ChIP), in which discrete regions of the genome bound by specific proteins can be identified, it is now possible to identify transcription factor binding events (putative cis-regulatory elements) in entire genomes. Comparing protein–DNA binding maps allows us, for the first time, to attempt to identify regulatory differences and infer global patterns of change in gene expression across species. Here, we review studies that used genome-wide ChIP to study the evolution of enhancers. The trend is one of high divergence of cis-regulatory elements between species, possibly compensated by extensive creation and loss of regulatory elements and rewiring of their target genes. We speculate on the meaning of the differences observed and note that although ChIP experiments identify the biochemical event of protein–DNA interaction, they cannot determine whether the event results in a biological function, and therefore more studies are required to establish the effect of divergence of binding events on species-specific gene expression.

Genome-wide detection and characterization of positive selection in human populations

Nature ◽

10.1038/nature06250 ◽

2007 ◽

Vol 449 (7164) ◽

pp. 913-918 ◽

Cited By ~ 1139

Author(s):

Pardis C. Sabeti ◽

Patrick Varilly ◽

Ben Fry ◽

Jason Lohmueller ◽

...

Keyword(s):

Positive Selection ◽

Human Populations ◽

Genome Wide

A Novel Type 2 Diabetes Locus in sub-Saharan Africans, ZRANB3, is Implicated in Beta Cell Proliferation

10.1101/646513 ◽

2019 ◽

Author(s):

Adebowale A. Adeyemo ◽

Guanjie Chen ◽

Ayo P. Doumatey ◽

Timothy L. Hostelley ◽

Carmen C. Leitch ◽

...

Keyword(s):

Type 2 Diabetes ◽

Beta Cell ◽

Pancreatic Beta Cell ◽

European Ancestry ◽

Human Populations ◽

Beta Cell Proliferation ◽

Genome Wide ◽

Significant Locus ◽

Sub Saharan

Genome analysis of diverse human populations has contributed to the identification of novel genomic loci for diseases of major clinical and public health impact. Here, we report the largest genome-wide analysis of type 2 diabetes (T2D) in sub-Saharan Africans, an understudied ancestral group. We analyzed ~18 million autosomal SNPs in 5,231 individuals from Nigeria, Ghana and Kenya. TCF7L2 rs7903156 was the most significant locus (p = 7.288 × 10^-13). We identified a novel genome-wide significant locus: ZRANB3 (Zinc Finger RANBP2-Type Containing 3, lead SNP chr2:136064024, T allele frequency = 0.034, p = 2.831 × 10^-9). Knockdown of the zebrafish ortholog resulted in a reduction in pancreatic beta cell number in the developing organism, suggesting a potential mechanism for its effect on glucose homeostasis. We also showed transferability in our study of 32 established T2D loci. Our findings provide evidence of a novel candidate T2D locus and advance understanding of the genetics of T2D in non-European ancestry populations.

Genome-Wide cis-Regulatory Element Based Discovery of Auxin-Responsive Genes in Higher Plant

Genes ◽

10.3390/genes13010024 ◽

2021 ◽

Vol 13 (1) ◽

pp. 24

Author(s):

Jianfei Wu ◽

Fan Gao ◽

Tongtong Li ◽

Haixia Guo ◽

Li Zhang ◽

...

Keyword(s):

Target Genes ◽

Regulatory Element ◽

Regulatory Elements ◽

Auxin Response ◽

Higher Plant ◽

Response Factors ◽

Auxin Response Factors ◽

Genome Wide ◽

A Genome ◽

Almost All

Auxin has a profound impact on plant physiology and participates in almost all aspects of plant development. It exerts pleiotropic effects on plant growth and differentiation by regulating the expression of auxin response genes. The classical auxin reaction is usually mediated by auxin response factors (ARFs), which bind to the auxin response element (AuxRE) in the promoter region of the target gene. Experiments have generated only a limited number of plant genes with well-characterized functions, and it is still unknown how many genes respond to exogenous auxin treatment. An economical and effective method was proposed for the genome-wide discovery of genes responsive to auxin in a model plant, Arabidopsis thaliana (A. thaliana). Our method relies on cis-regulatory-element-based targeted gene finding across different promoters in a genome. We first exploit and analyze auxin-specific cis-regulatory elements for the transcription of the target genes, and then identify putative auxin-responsive genes whose promoters contain the elements in the collection of over 25,800 promoters in the A. thaliana genome. Evaluating our result by comparing with a published database and the literature, we found that this method has an accuracy rate of 65.2% (309/474) for predicting candidate genes responsive to auxin. Chromosome distribution and annotation of the putative auxin-responsive genes predicted here were also mined. The results can markedly decrease the number of identified but merely potential auxin target genes and also provide useful clues for improving the annotation of genes that lack functional information.
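
The promoter-scanning idea described in this abstract can be illustrated with a minimal sketch. It assumes the widely cited AuxRE core motif TGTCTC as the only element scanned for; the abstract does not list the elements actually used, and the gene names and sequences below are hypothetical toy data, not the authors' pipeline.

```python
# Hypothetical sketch: flag promoters that contain an auxin response element.
# Assumption: the motif is the canonical AuxRE core TGTCTC; the source does
# not say which cis-regulatory elements were scanned for.
AUXRE_CORE = "TGTCTC"

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def has_motif(promoter: str, motif: str = AUXRE_CORE) -> bool:
    """True if the motif occurs on either strand of the promoter."""
    seq = promoter.upper()
    return motif in seq or reverse_complement(motif) in seq


def candidate_genes(promoters: dict[str, str]) -> list[str]:
    """Names of genes whose promoter carries the motif, sorted."""
    return sorted(gene for gene, seq in promoters.items() if has_motif(seq))


def accuracy(confirmed: int, predicted: int) -> float:
    """Fraction of evaluated predictions that were confirmed."""
    return confirmed / predicted


# Toy promoters (hypothetical sequences).
promoters = {
    "geneA": "AATTGTCTCGGA",  # motif on the forward strand
    "geneB": "CCGAGACATTAA",  # GAGACA is the reverse complement of TGTCTC
    "geneC": "ACGTACGTACGT",  # no motif on either strand
}
hits = candidate_genes(promoters)
reported = round(accuracy(309, 474) * 100, 1)  # figures quoted in the abstract
```

Checking both strands matters because a binding site can sit in either orientation relative to the gene. The `accuracy` helper only reproduces the arithmetic behind the 65.2% (309/474) figure quoted above.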

Genome-Wide Selection Scan in an Arabian Peninsula Population Identifies a TNKS Haplotype Linked to Metabolic Traits and Hypertension

Genome Biology and Evolution ◽

10.1093/gbe/evaa033 ◽

2020 ◽

Vol 12 (3) ◽

pp. 77-87 ◽

Cited By ~ 2

Author(s):

Muthukrishnan Eaaswarkhanth ◽

Andre Luiz Campelo dos Santos ◽

Omer Gokcumen ◽

Fahd Al-Mulla ◽

Thangavel Alphonse Thanaraj

Keyword(s):

Natural Selection ◽

Positive Selection ◽

Arabian Peninsula ◽

Integrative Approach ◽

Human Populations ◽

Extended Haplotype ◽

Major Health ◽

Fitness Advantage ◽

Genome Wide ◽

Kuwaiti Population

Despite the extreme and varying environmental conditions prevalent in the Arabian Peninsula, it has experienced several waves of human migrations following the out-of-Africa diaspora. Eventually, the inhabitants of the peninsula region adapted to the hot and dry environment. The adaptation and natural selection that shaped the extant human populations of the Arabian Peninsula region have been scarcely studied. In an attempt to explore natural selection in the region, we analyzed 662,750 variants in 583 Kuwaiti individuals. We searched for regions in the genome that display signatures of positive selection in the Kuwaiti population using an integrative approach in a conservative manner. We highlight a haplotype overlapping TNKS that showed strong signals of positive selection based on the results of the multiple selection tests conducted (integrated Haplotype Score, Cross Population Extended Haplotype Homozygosity, Population Branch Statistic, and log-likelihood ratio scores). Notably, the TNKS haplotype under selection potentially conferred a fitness advantage to the Kuwaiti ancestors for surviving in the harsh environment while posing a major health risk to present-day Kuwaitis.
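
Of the tests named in this abstract, the Population Branch Statistic has a compact closed form that can be sketched directly. The snippet below uses the standard definition, in which pairwise F_ST values are converted to branch lengths T = -ln(1 - F_ST) and combined for a target population A against two reference populations B and C. The function names and input values are illustrative assumptions; the abstract does not describe how the study computed F_ST or which reference populations it used.

```python
import math


def branch_length(fst: float) -> float:
    """Convert a pairwise F_ST value to a branch length, T = -ln(1 - F_ST)."""
    if not 0.0 <= fst < 1.0:
        raise ValueError("F_ST must lie in [0, 1)")
    return -math.log(1.0 - fst)


def pbs(fst_ab: float, fst_ac: float, fst_bc: float) -> float:
    """Population Branch Statistic for target population A.

    A is compared with two reference populations B and C:
    PBS_A = (T_AB + T_AC - T_BC) / 2.
    """
    return (
        branch_length(fst_ab) + branch_length(fst_ac) - branch_length(fst_bc)
    ) / 2.0


# Hypothetical per-locus F_ST values: A is strongly differentiated from both
# B and C, while B and C are nearly identical to each other.
example = pbs(fst_ab=0.5, fst_ac=0.5, fst_bc=0.0)
```

A large value means that allele-frequency change at the locus is concentrated on the branch leading to the target population, which is the pattern expected under population-specific positive selection.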

EnhancerAtlas 2.0: an updated resource with enhancer annotation in 586 tissue/cell types across nine species

Nucleic Acids Research ◽

10.1093/nar/gkz980 ◽

2019 ◽

Cited By ~ 8

Author(s):

Tianshun Gao ◽

Jiang Qian

Keyword(s):

Large Scale ◽

Target Gene ◽

Target Genes ◽

Cell Types ◽

Regulatory Elements ◽

Tissue Cell ◽

Normal Tissues ◽

Genome Wide ◽

Wide Range ◽

User Friendly

Enhancers are distal cis-regulatory elements that activate the transcription of their target genes. They regulate a wide range of important biological functions and processes, including embryogenesis, development, and homeostasis. As more and more large-scale technologies have been developed for enhancer identification, a comprehensive database is highly desirable for enhancer annotation based on various genome-wide profiling datasets across different species. Here, we present an updated database, EnhancerAtlas 2.0 (http://www.enhanceratlas.org/indexv2.php), covering 586 tissue/cell types that include a large number of normal tissues, cancer cell lines, and cells at different development stages across nine species. Overall, the database contains 13 494 603 enhancers, which were obtained from 16 055 datasets using 12 high-throughput experiment methods (e.g. H3K4me1/H3K27ac, DNase-seq/ATAC-seq, P300, POLR2A, CAGE, ChIA-PET, GRO-seq, STARR-seq and MPRA). The updated version is a huge expansion of the first version, which only contains the enhancers in human cells. In addition, we predicted enhancer–target gene relationships in human, mouse and fly. Finally, users can search enhancers and enhancer–target gene relationships through five user-friendly, interactive modules. We believe the new annotation of enhancers in EnhancerAtlas 2.0 will help users perform functional analyses of enhancers in various genomes.

A curated benchmark of enhancer-gene interactions for evaluating enhancer-target gene prediction methods

10.1101/745844 ◽

2019 ◽

Cited By ~ 3

Author(s):

Jill E. Moore ◽

Henry Pratt ◽

Michael Purcaro ◽

Zhiping Weng

Keyword(s):

Computational Methods ◽

Method Development ◽

Target Genes ◽

Gene Prediction ◽

Cell Types ◽

Regulatory Elements ◽

Distance Method ◽

Genome Wide ◽

Benchmark Datasets ◽

Better Than

Many genome-wide collections of candidate cis-regulatory elements (cCREs) have been defined using genomic and epigenomic data, but it remains a major challenge to connect these elements to their target genes. To facilitate the development of computational methods for predicting target genes, we developed a Benchmark of candidate Enhancer-Gene Interactions (BENGI) by integrating the Registry of cCREs we developed recently with experimentally derived genomic interactions. We used BENGI to test several published computational methods for linking enhancers with genes, including signal correlation and the supervised learning methods TargetFinder and PEP. We found that while TargetFinder was the best-performing method, it was only modestly better than a baseline distance method for most benchmark datasets when trained and tested within the same cell type, and that TargetFinder often did not outperform the distance method when applied across cell types. Our results suggest that current computational methods need to be improved and that BENGI presents a useful framework for method development and testing.
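
To make the idea of a baseline distance method concrete, here is a minimal sketch that links an enhancer to the gene with the closest transcription start site (TSS) on the same chromosome. This is one plausible reading of such a baseline, not the benchmark's exact procedure, which the abstract does not specify. The function name, coordinates, and gene labels are hypothetical.

```python
from bisect import bisect_left


def nearest_gene(enhancer_mid: int, tss: list[tuple[int, str]]) -> str:
    """Return the gene whose TSS lies closest to an enhancer midpoint.

    `tss` must be a non-empty list of (position, gene) pairs on one
    chromosome, sorted by position. On ties, the TSS with the lower
    coordinate is returned.
    """
    positions = [pos for pos, _ in tss]
    i = bisect_left(positions, enhancer_mid)
    # Only the TSS just before and just after the midpoint can be nearest.
    neighbours = tss[max(i - 1, 0): i + 1]
    return min(neighbours, key=lambda item: abs(item[0] - enhancer_mid))[1]


# Hypothetical TSS annotation for a single chromosome.
tss_table = [(1_000, "G1"), (5_000, "G2"), (9_000, "G3")]
links = {mid: nearest_gene(mid, tss_table) for mid in (100, 1_200, 7_500, 20_000)}
```

A baseline like this uses no epigenomic signal at all, which is why it is a useful yardstick: a learned predictor needs to beat it clearly to justify its extra complexity.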
