Arylamine N-Acetyltransferases in Prokaryotic and Eukaryotic Genomes: A Survey of Public Databases

Abstract There is increasing evidence that some functionally related, co-expressed genes cluster within eukaryotic genomes. We present a novel pipeline that delineates such eukaryotic gene clusters. Using this tool for bread wheat, we uncovered 44 clusters of genes that are responsive to the fungal pathogen Fusarium graminearum. As expected, these Fusarium-responsive gene clusters (FRGCs) included metabolic gene clusters, many of which are associated with disease resistance, but hitherto not described for wheat. However, the majority of the FRGCs are non-metabolic, many of which contain clusters of paralogues, including those implicated in plant disease responses, such as glutathione transferases, MAP kinases, and germin-like proteins. 20 of the FRGCs encode nonhomologous, non-metabolic genes (including defence-related genes). One of these clusters includes the characterised Fusarium resistance orphan gene, TaFROG. Eight of the FRGCs map within 6 FHB resistance loci. One small QTL on chromosome 7D (4.7 Mb) encodes eight Fusarium-responsive genes, five of which are within a FRGC. This study provides a new tool to identify genomic regions enriched in genes responsive to specific traits of interest and applied herein it highlighted gene families, genetic loci and biological pathways of importance in the response of wheat to disease.

Tc8, a Tourist-like Transposon in Caenorhabditis elegans

Genetics ◽

10.1093/genetics/158.3.1081 ◽

2001 ◽

Vol 158 (3) ◽

pp. 1081-1088 ◽

Cited By ~ 2

Author(s):

Quang Hien Le ◽

Kime Turcotte ◽

Thomas Bureau

Keyword(s):

Caenorhabditis Elegans ◽

Expressed Sequence Tags ◽

Large Family ◽

Insertion Sequences ◽

Normal Plant ◽

Nematode Caenorhabditis Elegans ◽

Plant Genomes ◽

Plant Genes ◽

Expressed Sequence ◽

Eukaryotic Genomes

Abstract Members of the Tourist family of miniature inverted-repeat transposable elements (MITEs) are very abundant among a wide variety of plants, are frequently found associated with normal plant genes, and thus are thought to be important players in the organization and evolution of plant genomes. In Arabidopsis, the recent discovery of a Tourist member harboring a putative transposase has shed new light on the mobility and evolution of MITEs. Here, we analyze a family of Tourist transposons endogenous to the genome of the nematode Caenorhabditis elegans (Bristol N2). One member of this large family is 7568 bp in length, harbors an ORF similar to the putative Tourist transposase from Arabidopsis, and is related to the IS5 family of bacterial insertion sequences (IS). Using database searches, we found expressed sequence tags (ESTs) similar to the putative Tourist transposases in plants, insects, and vertebrates. Taken together, our data suggest that Tourist-like and IS5-like transposons form a superfamily of potentially active elements ubiquitous to prokaryotic and eukaryotic genomes.

Islands of complex DNA are widespread in Drosophila centric heterochromatin.

Genetics ◽

10.1093/genetics/141.1.283 ◽

1995 ◽

Vol 141 (1) ◽

pp. 283-303

Author(s):

M H Le ◽

D Duricka ◽

G H Karpen

Keyword(s):

DNA Sequences ◽

Structure And Function ◽

Southern Analysis ◽

Cloning And Sequencing ◽

Pulsed Field ◽

Drosophila Heterochromatin ◽

And Function ◽

Heterochromatin Structure ◽

Derivatives Of ◽

Eukaryotic Genomes

Abstract Heterochromatin is a ubiquitous yet poorly understood component of multicellular eukaryotic genomes. Major gaps exist in our knowledge of the nature and overall organization of DNA sequences present in heterochromatin. We have investigated the molecular structure of the 1 Mb of centric heterochromatin in the Drosophila minichromosome Dp1187. A genetic screen of irradiated minichromosomes yielded rearranged derivatives of Dp1187 whose structures were determined by pulsed-field Southern analysis and PCR. Three Dp1187 deletion derivatives and an inversion had one breakpoint in the euchromatin and one in the heterochromatin, providing direct molecular access to previously inaccessible parts of the heterochromatin. End-probed pulsed-field restriction mapping revealed the presence of at least three "islands" of complex DNA, Tahiti, Moorea, and Bora Bora, constituting approximately one half of the Dp1187 heterochromatin. Pulsed-field Southern analysis demonstrated that Drosophila heterochromatin in general is composed of alternating blocks of complex DNA and simple satellite DNA. Cloning and sequencing of a small part of one island, Tahiti, demonstrated the presence of a retroposon. The implications of these findings to heterochromatin structure and function are discussed.

Protocol for HSDFinder: Identifying, annotating, categorizing, and visualizing duplicated genes in eukaryotic genomes

STAR Protocols ◽

10.1016/j.xpro.2021.100619 ◽

2021 ◽

Vol 2 (3) ◽

pp. 100619

Author(s):

Xi Zhang ◽

Yining Hu ◽

David Roy Smith

Keyword(s):

Duplicated Genes ◽

Eukaryotic Genomes

CRISPR-Based Technologies for the Manipulation of Eukaryotic Genomes

Cell ◽

10.1016/j.cell.2017.04.005 ◽

2017 ◽

Vol 169 (3) ◽

pp. 559 ◽

Cited By ~ 91

Author(s):

Alexis C. Komor ◽

Ahmed H. Badran ◽

David R. Liu

Keyword(s):

Eukaryotic Genomes

Transposable Elements and Teleost Migratory Behaviour

International Journal of Molecular Sciences ◽

10.3390/ijms22020602 ◽

2021 ◽

Vol 22 (2) ◽

pp. 602

Author(s):

Elisa Carotti ◽

Federica Carducci ◽

Adriana Canapa ◽

Marco Barucca ◽

Samuele Greco ◽

...

Keyword(s):

Transposable Elements ◽

Environmental Changes ◽

Chromosomal Rearrangements ◽

Quantitative Difference ◽

Regulatory Elements ◽

Phylogenetic Position ◽

Migratory Behaviour ◽

Relative Contribution ◽

Migratory Routes ◽

Eukaryotic Genomes

Transposable elements (TEs) represent a considerable fraction of eukaryotic genomes, thereby contributing to genome size, chromosomal rearrangements, and to the generation of new coding genes or regulatory elements. An increasing number of works have reported a link between the genomic abundance of TEs and the adaptation to specific environmental conditions. Diadromy represents a fascinating feature of fish, protagonists of migratory routes between marine and freshwater for reproduction. In this work, we investigated the genomes of 24 fish species, including 15 teleosts with a migratory behaviour. The expected higher relative abundance of DNA transposons in ray-finned fish compared with the other fish groups was not confirmed by the analysis of the dataset considered. The relative contribution of different TE types in migratory ray-finned species did not show clear differences between oceanodromous and potamodromous fish. On the contrary, a remarkable relationship between migratory behaviour and the quantitative difference reported for short interspersed nuclear (retro)elements (SINEs) emerged from the comparison between anadromous and catadromous species, independently from their phylogenetic position. This aspect is likely due to the substantial environmental changes faced by diadromous species during their migratory routes.

OpenProt 2021: deeper functional annotation of the coding potential of eukaryotic genomes

Nucleic Acids Research ◽

10.1093/nar/gkaa1036 ◽

2020 ◽

Vol 49 (D1) ◽

pp. D380-D388 ◽

Cited By ~ 1

Author(s):

Marie A Brunet ◽

Jean-François Lucier ◽

Maxime Levesque ◽

Sébastien Leblanc ◽

Jean-Francois Jacques ◽

...

Keyword(s):

Confidence Score ◽

Ribosome Profiling ◽

Open Reading Frames ◽

Supporting Evidence ◽

Initial Release ◽

NCBI RefSeq ◽

Computational Resources ◽

Analysis Platform ◽

Eukaryotic Genomes ◽

Reading Frames

Abstract OpenProt (www.openprot.org) is the first proteogenomic resource supporting a polycistronic annotation model for eukaryotic genomes. It provides a deeper annotation of open reading frames (ORFs) while mining experimental data for supporting evidence using cutting-edge algorithms. This update presents the major improvements since the initial release of OpenProt. All species support recent NCBI RefSeq and Ensembl annotations, with changes in annotations being reported in OpenProt. Using the 131 ribosome profiling datasets re-analysed by OpenProt to date, non-AUG initiation starts are reported alongside a confidence score of the initiating codon. From the 177 mass spectrometry datasets re-analysed by OpenProt to date, the unicity of the detected peptides is controlled at each implementation. Furthermore, to guide the users, detectability statistics and protein relationships (isoforms) are now reported for each protein. Finally, to foster access to deeper ORF annotation independently of one’s bioinformatics skills or computational resources, OpenProt now offers a data analysis platform. Users can submit their dataset for analysis and receive the results from the analysis by OpenProt. All data on OpenProt are freely available and downloadable for each species, the release-based format ensuring a continuous access to the data. Thus, OpenProt enables a more comprehensive annotation of eukaryotic genomes and fosters functional proteomic discoveries.

Genome-wide Analysis of Alternative Pre-mRNA Splicing

Journal of Biological Chemistry ◽

10.1074/jbc.r700033200 ◽

2007 ◽

Vol 283 (3) ◽

pp. 1229-1233 ◽

Cited By ~ 80

Author(s):

Claudia Ben-Dov ◽

Britta Hartmann ◽

Josefin Lundgren ◽

Juan Valcárcel

Keyword(s):

Alternative Splicing ◽

Large Scale ◽

mRNA Splicing ◽

Diagnostic Tools ◽

Primary Transcript ◽

Genome Wide ◽

Human Genes ◽

Multicellular Organisms ◽

Eukaryotic Genomes ◽

Key Questions

Alternative splicing of mRNA precursors allows the synthesis of multiple mRNAs from a single primary transcript, significantly expanding the information content and regulatory possibilities of higher eukaryotic genomes. High-throughput enabling technologies, particularly large-scale sequencing and splicing-sensitive microarrays, are providing unprecedented opportunities to address key questions in this field. The picture emerging from these pioneering studies is that alternative splicing affects most human genes and a significant fraction of the genes in other multicellular organisms, with the potential to greatly influence the evolution of complex genomes. A combinatorial code of regulatory signals and factors can deploy physiologically coherent programs of alternative splicing that are distinct from those regulated at other steps of gene expression. Pre-mRNA splicing and its regulation play important roles in human pathologies, and genome-wide analyses in this area are paving the way for improved diagnostic tools and for the identification of novel and more specific pharmaceutical targets.

Compositional Determinants of Prion Formation in Yeast

Molecular and Cellular Biology ◽

10.1128/mcb.01140-09 ◽

2009 ◽

Vol 30 (1) ◽

pp. 319-332 ◽

Cited By ~ 113

Author(s):

James A. Toombs ◽

Blake R. McCarty ◽

Eric D. Ross

Keyword(s):

Amino Acid ◽

Amino Acid Composition ◽

Amyloid Formation ◽

Yeast Prion ◽

Hydrophobic Residues ◽

Predictive Methods ◽

Infectious Proteins ◽

β Sheet ◽

Eukaryotic Genomes

ABSTRACT Numerous prions (infectious proteins) have been identified in yeast that result from the conversion of soluble proteins into β-sheet-rich amyloid-like protein aggregates. Yeast prion formation is driven primarily by amino acid composition. However, yeast prion domains are generally lacking in the bulky hydrophobic residues most strongly associated with amyloid formation and are instead enriched in glutamines and asparagines. Glutamine/asparagine-rich domains are thought to be involved in both disease-related and beneficial amyloid formation. These domains are overrepresented in eukaryotic genomes, but predictive methods have not yet been developed to efficiently distinguish between prion and nonprion glutamine/asparagine-rich domains. We have developed a novel in vivo assay to quantitatively assess how composition affects prion formation. Using our results, we have defined the compositional features that promote prion formation, allowing us to accurately distinguish between glutamine/asparagine-rich domains that can form prion-like aggregates and those that cannot. Additionally, our results explain why traditional amyloid prediction algorithms fail to accurately predict amyloid formation by the glutamine/asparagine-rich yeast prion domains.
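
As a rough illustration of what "amino acid composition" means for a glutamine/asparagine-rich domain, the sketch below computes the Q/N fraction of a protein sequence and scans it with a sliding window. It is not the assay or the predictor described in the abstract: the function names, the window length, and the threshold are hypothetical choices made only for this example.

```python
def qn_fraction(seq: str) -> float:
    """Fraction of residues in seq that are glutamine (Q) or asparagine (N)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for aa in seq if aa in "QN") / len(seq)


def qn_rich_windows(seq: str, window: int = 40, min_fraction: float = 0.3):
    """Return (start, fraction) for every window whose Q/N fraction meets the cutoff.

    window and min_fraction are illustrative defaults, not published values.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        frac = qn_fraction(seq[start:start + window])
        if frac >= min_fraction:
            hits.append((start, frac))
    return hits
```

A composition scan of this kind only flags Q/N-rich regions; as the abstract notes, telling prion-forming domains apart from other Q/N-rich domains requires more than a simple enrichment cutoff.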

The intrinsic combinatorial organization and information theoretic content of a sequence are correlated to the DNA encoded nucleosome organization of eukaryotic genomes

Bioinformatics ◽

10.1093/bioinformatics/btv679 ◽

2015 ◽

Vol 32 (6) ◽

pp. 835-842 ◽

Cited By ~ 9

Author(s):

Filippo Utro ◽

Valeria Di Benedetto ◽

Davide F.V. Corona ◽

Raffaele Giancarlo

Keyword(s):

Closed Form ◽

DNA Sequence ◽

Chemical Properties ◽

Supplementary Information ◽

Information Theoretic ◽

Nucleosome Organization ◽

A Genome ◽

Intrinsic Complexity ◽

Mathematical Formulas ◽

Eukaryotic Genomes

Abstract Motivation: Thanks to research spanning nearly 30 years, two major models have emerged that account for nucleosome organization in chromatin: statistical and sequence specific. The first is based on elegant, easy to compute, closed-form mathematical formulas that make no assumptions of the physical and chemical properties of the underlying DNA sequence. Moreover, they need no training on the data for their computation. The latter is based on some sequence regularities but, as opposed to the statistical model, it lacks the same type of closed-form formulas that, in this case, should be based on the DNA sequence only. Results: We contribute to close this important methodological gap between the two models by providing three very simple formulas for the sequence specific one. They are all based on well-known formulas in Computer Science and Bioinformatics, and they give different quantifications of how complex a sequence is. In view of how remarkably well they perform, it is very surprising that measures of sequence complexity have not even been considered as candidates to close the mentioned gap. We provide experimental evidence that the intrinsic level of combinatorial organization and information-theoretic content of subsequences within a genome are strongly correlated to the level of DNA encoded nucleosome organization discovered by Kaplan et al. Our results establish an important connection between the intrinsic complexity of subsequences in a genome and the intrinsic, i.e. DNA encoded, nucleosome organization of eukaryotic genomes. It is a first step towards a mathematical characterization of this latter ‘encoding’. Supplementary information: Supplementary data are available at Bioinformatics online.
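
The abstract does not name its three formulas, so the sketch below should not be read as the authors' method. It shows one widely used closed-form measure of sequence complexity, the Shannon entropy of the k-mer distribution, which depends on the DNA sequence only and needs no training. The function name `kmer_entropy` and the default `k` are assumptions made for this example.

```python
from collections import Counter
from math import log2


def kmer_entropy(seq: str, k: int = 2) -> float:
    """Shannon entropy (in bits) of the overlapping k-mer distribution of seq."""
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0  # sequence shorter than k: no k-mers to count
    counts = Counter(kmers)
    total = len(kmers)
    # H = -sum(p * log2(p)) over the observed k-mer frequencies
    return -sum((c / total) * log2(c / total) for c in counts.values())
```

A homopolymer such as "AAAAAAAA" scores zero, while a sequence that uses all four bases equally reaches the maximum of 2 bits for k = 1; applying the function to sliding subsequences of a genome would give a per-position complexity profile.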
