Recurrent evolution of vertebrate transcription factors by transposase capture

Science ◽  
2021 ◽  
Vol 371 (6531) ◽  
pp. eabc6405 ◽  
Author(s):  
Rachel L. Cosby ◽  
Julius Judd ◽  
Ruiling Zhang ◽  
Alan Zhong ◽  
Nathaniel Garry ◽  
...  

Genes with novel cellular functions may evolve through exon shuffling, which can assemble novel protein architectures. Here, we show that DNA transposons provide a recurrent supply of materials to assemble protein-coding genes through exon shuffling. We find that transposase domains have been captured—primarily via alternative splicing—to form fusion proteins at least 94 times independently over the course of ~350 million years of tetrapod evolution. We find an excess of transposase DNA binding domains fused to host regulatory domains, especially the Krüppel-associated box (KRAB) domain, and identify four independently evolved KRAB-transposase fusion proteins repressing gene expression in a sequence-specific fashion. The bat-specific KRABINER fusion protein binds its cognate transposons genome-wide and controls a network of genes and cis-regulatory elements. These results illustrate how a transcription factor and its binding sites can emerge.

2020 ◽  
Author(s):  
Rachel L. Cosby ◽  
Julius Judd ◽  
Ruiling Zhang ◽  
Alan Zhong ◽  
Nathaniel Garry ◽  
...  

Abstract: How genes with novel cellular functions evolve is a central biological question. Exon shuffling is one mechanism to assemble new protein architectures. Here we show that DNA transposons, which are mobile and pervasive in genomes, have provided a recurrent supply of exons and splice sites to assemble protein-coding genes in vertebrates via exon shuffling. We find that transposase domains have been captured, primarily via alternative splicing, to form new fusion proteins at least 94 times independently over ∼350 million years of tetrapod evolution. Evolution favors fusion of transposase DNA-binding domains to host regulatory domains, especially the Krüppel-associated box (KRAB), suggesting transposase capture frequently yields new transcriptional repressors. We show that four independently evolved KRAB-transposase fusion proteins repress gene expression in a sequence-specific fashion. Genetic knockout and rescue of the bat-specific KRABINER fusion gene in cells demonstrate that it binds its cognate transposons genome-wide and controls a vast network of genes and cis-regulatory elements. These results illustrate a powerful mechanism by which a transcription factor and its dispersed binding sites emerge at once from a transposon family.

One Sentence Summary: Host-transposase fusion generates novel cellular genes, including deeply conserved and lineage-specific transcription factors.


2019 ◽  
Author(s):  
Xiaomeng Zhao ◽  
Weilin Xu ◽  
Sarah Schaack ◽  
Cheng Sun

Abstract: Bumblebees (Hymenoptera: Apidae) are important pollinating insects that play pivotal roles in crop production and natural ecosystem services. Although the protein-coding sequences of bumblebees have been extensively annotated, regulatory elements such as promoters and enhancers remain poorly annotated in the bumblebee genome. To obtain a comprehensive profile of accessible chromatin regions and provide clues to possible regulatory elements in the bumblebee genome, we performed ATAC-seq (Assay for Transposase-Accessible Chromatin with high-throughput sequencing) on B. terrestris samples from four developmental stages: egg, larva, pupa, and adult. The ATAC-seq reads were mapped to the B. terrestris reference genome, and the accessible chromatin regions were identified and characterized using bioinformatic methods. Our study will provide important resources not only for uncovering regulatory elements in the bumblebee genome, but also for expanding our understanding of bumblebee biology. The ATAC-seq data generated in this study have been deposited in NCBI GEO (accession#: GSE131063).


2018 ◽  
Author(s):  
Xinchen Wang ◽  
David B. Goldstein

Abstract: Non-coding transcriptional regulatory elements are critical for controlling the spatiotemporal expression of genes. Here, we demonstrate that the number of bases in enhancers linked to a gene reflects its disease pathogenicity. Moreover, genes with redundant enhancer domains are depleted of cis-acting genetic variants that disrupt gene expression, and are buffered against the effects of disruptive non-coding mutations. Our results demonstrate that dosage-sensitive genes have evolved robustness to the disruptive effects of genetic variation by expanding their regulatory domains. This resolves a puzzle in the genetic literature about why disease genes are depleted of cis-eQTLs, suggesting that eQTL information may implicate the wrong genes at genome-wide association study loci, and establishes a framework for identifying non-coding regulatory variation with phenotypic consequences.


eLife ◽  
2015 ◽  
Vol 4 ◽  
Author(s):  
Kamesh Narasimhan ◽  
Samuel A Lambert ◽  
Ally WH Yang ◽  
Jeremy Riddell ◽  
Sanie Mnaimneh ◽  
...  

Caenorhabditis elegans is a powerful model for studying gene regulation, as it has a compact genome and a wealth of genomic tools. However, identification of regulatory elements has been limited, as DNA-binding motifs are known for only 71 of the estimated 763 sequence-specific transcription factors (TFs). To address this problem, we performed protein binding microarray experiments on representatives of canonical TF families in C. elegans, obtaining motifs for 129 TFs. Additionally, we predict motifs for many TFs that have DNA-binding domains similar to those already characterized, increasing coverage of binding specificities to 292 C. elegans TFs (∼40%). These data highlight the diversification of binding motifs for the nuclear hormone receptor and C2H2 zinc finger families and reveal unexpected diversity of motifs for T-box and DM families. Motif enrichment in promoters of functionally related genes is consistent with known biology and also identifies putative regulatory roles for unstudied TFs.
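The coverage figures quoted in this abstract can be checked with simple arithmetic. The following is a minimal sketch using only the counts stated above (71, 292, and an estimated 763 TFs); the helper name `coverage_percent` is our own illustration and does not come from the study.

```python
def coverage_percent(n_with_motif: int, n_total: int) -> float:
    """Return the share of TFs with a known or predicted motif, in percent."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_with_motif / n_total


# Counts taken from the abstract; 763 is the estimated number of
# sequence-specific TFs in C. elegans.
ESTIMATED_TFS = 763
before = coverage_percent(71, ESTIMATED_TFS)   # motifs known beforehand
after = coverage_percent(292, ESTIMATED_TFS)   # measured plus predicted motifs

print(f"before: {before:.1f}%  after: {after:.1f}%")
```

The second value comes out a little under 40%, which matches the "∼40%" stated in the abstract.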


2013 ◽  
Vol 11 (01) ◽  
pp. 1340007 ◽  
Author(s):  
EDGAR WINGENDER

By binding to cis-regulatory elements in a sequence-specific manner, transcription factors regulate the activity of nearby genes. Here, we discuss the criteria for a comprehensive classification of human TFs based on their DNA-binding domains. In particular, classification of basic leucine zipper (bZIP) and zinc finger factors is exemplarily discussed. The resulting classification can be used as a template for TFs of other biological species.


Author(s):  
Thomas Griebel ◽  
Dmitry Lapin ◽  
Barbara Kracher ◽  
Lorenzo Concia ◽  
Moussa Benhamed ◽  
...  

Abstract: Timely and specific regulation of gene expression is critical for plant responses to environmental and developmental cues. Transcriptional coregulators have emerged as important factors in gene expression control, although they lack DNA-binding domains and the mechanisms by which they are recruited to and function at the chromatin are poorly understood. Plant Topless-related 1 (TPR1), belonging to a family of transcriptional corepressors found across eukaryotes, contributes to immunity signaling in Arabidopsis thaliana and wild tobacco. We performed chromatin immunoprecipitation and sequencing (ChIP-seq) on an Arabidopsis TPR1-GFP expressing transgenic line to characterize genome-wide TPR1-chromatin associations. The analysis revealed ∼1400 genes bound by TPR1, with the majority of binding sites located at gene upstream regions. Among the TPR1 bound genes, we find not only regulators of immunity but also genes controlling growth and development. To support further analysis of TPR1-chromatin complexes and other transcriptional corepressors in plants, we provide two ways to access the processed ChIP-seq data and enable their broader use by the research community.


2020 ◽  
Vol 118 (2) ◽  
pp. e2021171118
Author(s):  
Gi Bae Kim ◽  
Ye Gao ◽  
Bernhard O. Palsson ◽  
Sang Yup Lee

A transcription factor (TF) is a sequence-specific DNA-binding protein that modulates the transcription of a set of particular genes, and thus regulates gene expression in the cell. TFs have commonly been predicted by analyzing sequence homology with the DNA-binding domains of TFs already characterized. Thus, TFs that do not show homologies with the reported ones are difficult to predict. Here we report the development of a deep learning-based tool, DeepTFactor, that predicts whether a protein in question is a TF. DeepTFactor uses a convolutional neural network to extract features of a protein. It showed high performance in predicting TFs of both eukaryotic and prokaryotic origins, resulting in F1 scores of 0.8154 and 0.8000, respectively. Analysis of the gradients of prediction score with respect to input suggested that DeepTFactor detects DNA-binding domains and other latent features for TF prediction. DeepTFactor predicted 332 candidate TFs in Escherichia coli K-12 MG1655. Among them, 84 candidate TFs belong to the y-ome, which is a collection of genes that lack experimental evidence of function. We experimentally validated the results of DeepTFactor prediction by further characterizing genome-wide binding sites of three predicted TFs, YqhC, YiaU, and YahB. Furthermore, we made available the list of 4,674,808 TFs predicted from 73,873,012 protein sequences in 48,346 genomes. DeepTFactor will serve as a useful tool for predicting TFs, which is necessary for understanding the regulatory systems of organisms of interest. We provide DeepTFactor as a stand-alone program, available at https://bitbucket.org/kaistsystemsbiology/deeptfactor.
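For readers unfamiliar with the F1 score reported above, it is the harmonic mean of precision and recall. The sketch below shows the standard definition only; the confusion-matrix counts are hypothetical values chosen for illustration and are not taken from the DeepTFactor evaluation, and the function name `f1_score` is our own.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Compute F1 = 2 * P * R / (P + R) from confusion-matrix counts."""
    if true_pos == 0:
        return 0.0  # no correct positives means precision and recall are 0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (assumption, not from the paper): 80 proteins correctly
# called TFs, 20 non-TFs wrongly called TFs, 20 true TFs missed.
example_f1 = f1_score(true_pos=80, false_pos=20, false_neg=20)
print(f"F1 = {example_f1:.4f}")
```

Because F1 ignores true negatives, it suits tasks like TF prediction, where non-TF proteins greatly outnumber TFs.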


2020 ◽  
Author(s):  
Sarthok Rasique Rahman ◽  
Jonathan Cnaani ◽  
Lisa N. Kinch ◽  
Nick V. Grishin ◽  
Heather M. Hines

Abstract

Background: In the model bumble bee species B. terrestris, both males and females exhibit black coloration on the third thoracic and first metasomal segments. We discovered a fortuitous lab-generated mutant in which this typical black coloration is replaced by yellow. As this same color variant is found in several sister lineages to B. terrestris within the Bombus s.s. subgenus, this could be a result of ancestral allele sorting.

Results: Utilizing a combination of RAD-Seq and whole-genome re-sequencing approaches, we localized the color-generating variant to a single SNP in the protein-coding sequence of a homeobox transcription factor, cut. Sanger sequencing confirmed fixation of this SNP between wildtype and yellow mutants. Protein domain analysis revealed this SNP to generate an amino acid change (Ala38Pro) that modifies the conformation of coiled-coil structural elements which lie outside the characteristic DNA binding domains. We found all Hymenopterans including B. terrestris sister lineages possess the non-mutant allele, indicating different mechanism(s) are involved in the same black to yellow transition in nature.

Conclusions: Cut is a highly pleiotropic gene important for multiple facets of development, yet this mutation generated no noticeable external phenotypic effects outside of setal characteristics. Reproductive capacity was observed to be reduced, however, with queens being less likely to mate and produce female offspring, in a manner similar to workers. Our research implicates a novel developmental player in pigmentation, and potentially caste as well, thus contributing to a better understanding of the evolution of diversity in both of these processes.


2020 ◽  
Vol 23 (2) ◽  
pp. 113-120
Author(s):  
A. Athanassiadou

Determination of the DNA sequence of the human genome, which revealed extensive genetic variation, together with the mapping of genes and the various regulatory elements of genome function within the genomic DNA, has revolutionized the way we view health and disease. The genetic complexity of the genome is manifested on different levels. The first level refers to the expression of protein-coding genes, as regulated by their individual promoters in linear proximity. The next level involves long-distance action by distant enhancers, which interact with promoters through DNA looping. This three-dimensional (3D) regulation is further developed by chromosome folding into so-called transcription factories, for fully physiological expression. Chromosome folding, mediated by specific genetic elements (insulators), adds to the genetic complexity by facilitating movements of chromatin of specific genomic regions, the so-called topologically associated domains (TADs), in support of transcription and other cellular functions. Further genetic complexity has emerged with the finding that over 75% of the genome is transcribed and that, apart from the coding genes, a plethora of RNA transcripts (the non-coding RNAs) are produced that have important regulatory roles in gene expression. The great variation in genome sequence and in the regulatory elements of the genome architecture is exploited in genome-wide association studies of disease, in the framework of Precision Medicine and of Genomic Medicine in general.


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e8041
Author(s):  
Jian Gao ◽  
Hua Peng ◽  
Fabo Chen ◽  
Mao Luo ◽  
Wenbo Li

Carmine radish produced in Chongqing is famous for containing a natural red pigment (red radish pigment). However, the anthocyanin biosynthesis transcriptome and the expression of anthocyanin biosynthesis-related genes in carmine radish have not been fully investigated. Uncovering the mechanism of anthocyanin biosynthesis in the ‘Hongxin 1’ carmine radish cultivar has become a dominant research topic in this field. In this study, a local carmine radish cultivar named ‘Hongxin 1’, which contains a high level of natural red pigment, was used to analyze transcription factors (TFs) related to anthocyanin biosynthesis during the dynamic development of fleshy roots. Based on RNA sequencing data, a total of 1,747 TFs in 64 TF families were identified according to their DNA-binding domains. Of those, approximately 71 differentially expressed transcription factors (DETFs) were commonly detected in any one stage compared with roots in the seedling stage (SS_root). Moreover, 26 transcripts of DETFs targeted by 74 miRNAs belonging to 25 miRNA families were identified, including MYB, WRKY, bHLH, ERF, GRAS, NF-YA, C2C2-Dof, and HD-ZIP. Next, eight DETF transcripts belonging to the C2C2-Dof, bHLH and ERF families and their eight corresponding miRNAs were selected for qRT-PCR to verify their functions related to anthocyanin biosynthesis during the development of carmine radish fleshy roots. Finally, we propose a putative miRNA-target regulatory model associated with anthocyanin biosynthesis in carmine radish. Our findings suggest that sucrose synthase might act as an important regulator to modulate anthocyanin biosynthesis in carmine radish by inducing several miRNAs (miR165a-5p, miR172b, miR827a, miR166g and miR1432-5p) targeting different ERFs than candidate miRNAs in the traditional WMBW complex in biological processes.

