Animal, fungi, and plant genome sequences harbour different non-canonical splice sites

2019 ◽  
Author(s):  
Katharina Frey ◽  
Boas Pucker

Abstract Most protein-encoding genes in eukaryotes contain introns, which are interwoven with exons. After transcription, introns need to be removed in order to generate the final mRNA, which can be translated into an amino acid sequence. Precise excision of introns by the spliceosome requires conserved dinucleotides, which mark the splice sites. However, there are variations of the highly conserved combination of GT at the 5’ end and AG at the 3’ end of an intron in the genome. GC-AG and AT-AC are two major non-canonical splice site combinations, which have been known for years. In recent years, various minor non-canonical splice site combinations were detected with numerous dinucleotide permutations. Here we expand systematic investigations of non-canonical splice site combinations in plants to all eukaryotes by analysing fungal and animal genome sequences. Comparisons of splice site combinations between these three kingdoms revealed several differences, such as a substantially increased CT-AC frequency in fungal genome sequences. Canonical GT-AG splice site combinations in antisense transcripts could be one explanation for this observation. In addition, high numbers of GA-AG splice site combinations were observed in Eurytemora affinis and Oikopleura dioica. A variant in one U1 snRNA isoform might allow the recognition of GA as a 5’ splice site. In-depth investigation of splice site usage based on RNA-Seq read mappings indicates a generally higher flexibility of the 3’ splice site compared to the 5’ splice site across animals, fungi, and plants.

Cells ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 458 ◽  
Author(s):  
Katharina Frey ◽  
Boas Pucker

Most protein-encoding genes in eukaryotes contain introns, which are interwoven with exons. Introns need to be removed from initial transcripts in order to generate the final messenger RNA (mRNA), which can be translated into an amino acid sequence. Precise excision of introns by the spliceosome requires conserved dinucleotides, which mark the splice sites. However, there are variations of the highly conserved combination of GT at the 5′ end and AG at the 3′ end of an intron in the genome. GC-AG and AT-AC are two major non-canonical splice site combinations, which have been known for years. Recently, various minor non-canonical splice site combinations were detected with numerous dinucleotide permutations. Here, we expand systematic investigations of non-canonical splice site combinations in plants across eukaryotes by analyzing fungal and animal genome sequences. Comparisons of splice site combinations between these three kingdoms revealed several differences, such as an apparently increased CT-AC frequency in fungal genome sequences. Canonical GT-AG splice site combinations in antisense transcripts are a likely explanation for this observation, thus indicating annotation errors. In addition, high numbers of GA-AG splice site combinations were observed in Eurytemora affinis and Oikopleura dioica. A variant in one U1 small nuclear RNA (snRNA) isoform might allow the recognition of GA as a 5′ splice site. In-depth investigation of splice site usage based on RNA-Seq read mappings indicates a generally higher flexibility of the 3′ splice site compared to the 5′ splice site across animals, fungi, and plants.
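
The antisense explanation for CT-AC can be illustrated with a small sketch: an intron with canonical GT-AG ends, when annotated on the opposite strand, reads as CT-AC. The function names and the toy intron sequence below are hypothetical and are not taken from the authors' analysis pipeline.

```python
# Illustrative sketch only; names and the toy sequence are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def splice_site_combination(intron: str) -> str:
    """Join the first and last dinucleotide of an intron, e.g. 'GT-AG'."""
    intron = intron.upper()
    return f"{intron[:2]}-{intron[-2:]}"


def classify(combination: str) -> str:
    """Sort a combination into the three groups named in the abstract."""
    if combination == "GT-AG":
        return "canonical"
    if combination in {"GC-AG", "AT-AC"}:
        return "major non-canonical"
    return "minor non-canonical"


toy_intron = "GTAAGTTTCCTTTTCAG"  # made-up intron with GT...AG ends
sense = splice_site_combination(toy_intron)
antisense = splice_site_combination(reverse_complement(toy_intron))
print(sense, classify(sense))          # combination on the transcribed strand
print(antisense, classify(antisense))  # same intron annotated on the wrong strand
```

Under these assumptions, a gene annotated on the wrong strand would turn every canonical intron into an apparent minor non-canonical one, which is consistent with the annotation-error interpretation given in the abstract.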


2018 ◽  
Author(s):  
Boas Pucker ◽  
Samuel F. Brockington

ABSTRACT Most eukaryotic genes comprise exons and introns, thus requiring the precise removal of introns from pre-mRNAs to enable protein biosynthesis. U2 and U12 spliceosomes catalyze this step by recognizing motifs on the transcript in order to remove the introns, a process which is dependent on the precise definition of exon-intron borders by splice sites, which are consequently highly conserved across species. Only very few combinations of terminal dinucleotides are frequently observed at intron ends, dominated by the canonical GT-AG splice sites on the DNA level. Here we investigate the occurrence of diverse combinations of dinucleotides at predicted splice sites. Analyzing 121 plant genome sequences based on their annotation revealed strong splice site conservation across species, annotation errors, and true biological divergence from canonical splice sites. The frequency of non-canonical splice sites clearly correlates with their divergence from canonical ones, indicating either an accumulation of probably neutral mutations or evolution towards canonical splice sites. Strong conservation across multiple species and non-random accumulation of substitutions in splice sites indicate a functional relevance of non-canonical splice sites. The average composition of splice sites across all investigated species is 98.7% for GT-AG, 1.2% for GC-AG, 0.06% for AT-AC, and 0.09% for minor non-canonical splice sites. RNA-Seq data sets of 35 species were incorporated to validate non-canonical splice site predictions through gaps in sequencing read alignments and to demonstrate the expression of affected genes. We conclude that bona fide non-canonical splice sites are present and appear to be functionally relevant in most plant genomes, if at low abundance.
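
A composition summary of the kind reported above (percentages for GT-AG, GC-AG, AT-AC, and the pooled minor combinations) could be tallied roughly as follows. This is a minimal sketch on invented toy data; the function name and input format are assumptions, and the numbers it produces are unrelated to the values reported in the abstract.

```python
from collections import Counter

# Combinations reported individually; everything else is pooled as "minor".
NAMED = ("GT-AG", "GC-AG", "AT-AC")


def composition(combinations):
    """Return the percentage of each splice site group in a list of
    combinations such as ['GT-AG', 'GC-AG', ...]."""
    counts = Counter()
    for combo in combinations:
        counts[combo if combo in NAMED else "minor"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {group: 100 * n / total for group, n in counts.items()}


# Toy input, not real annotation data.
toy = ["GT-AG"] * 8 + ["GC-AG"] + ["CT-AC"]
print(composition(toy))
```

In practice the input list would be derived from the intron coordinates of a genome annotation, one combination per annotated intron.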


2018 ◽  
Vol 28 (12) ◽  
pp. 1826-1840 ◽  
Author(s):  
Steffen Erkelenz ◽  
Stephan Theiss ◽  
Wolfgang Kaisers ◽  
Johannes Ptok ◽  
Lara Walotka ◽  
...  

2017 ◽  
Vol 5 (35) ◽  
Author(s):  
Anja Poehlein ◽  
Jan Hendrik Wübbeler ◽  
Rolf Daniel ◽  
Alexander Steinbüchel

ABSTRACT Sphingomonas mucosissima and Sphingomonas dokdonensis are Gram-negative chemoheterotrophic strictly aerobic rods or cocci. The genomes (3.453 Mb and 3.587 Mb, respectively) contain 3,279 and 3,329 predicted protein-encoding genes, respectively. The genome of S. dokdonensis harbors a 90-kb plasmid.


1989 ◽  
Vol 9 (5) ◽  
pp. 2220-2223 ◽  
Author(s):  
D Johnson ◽  
S Henikoff

In two distantly related Drosophila species, the use of alternate 5' splice sites to process an intron in pre-mRNA from homologous adenine phosphoribosyltransferase (APRT)-encoding genes led to RNAs encoding nonfunctional peptides in addition to APRT. The production of aberrantly spliced transcripts as a normal feature of gene expression supports a general model of eucaryotic gene evolution through alternative splicing and moveable splice junctions.


Author(s):  
Sabrina Sprotte ◽  
Erik Brinks ◽  
Natalia Wagner ◽  
Andrew M. Kropinski ◽  
Horst Neve ◽  
...  

Abstract The complete genome sequence of the virulent bacteriophage PMBT3, isolated on the proteolytic Pseudomonas grimontii strain MBTL2-21, showed no significant similarity to other known phage genome sequences, making this phage the first reported to infect a strain of P. grimontii. Electron microscopy revealed PMBT3 to be a member of the family Siphoviridae, with notably long and flexible whiskers. The linear, double-stranded genome of 87,196 bp has a mol% G+C content of 60.4 and contains 116 predicted protein-encoding genes. A putative tellurite resistance (terB) gene, originally reported to occur in the genome of a bacterium, was detected in the genome of phage PMBT3.


2021 ◽  
Vol 3 (2) ◽  
Author(s):  
Craig I Dent ◽  
Shilpi Singh ◽  
Sourav Mukherjee ◽  
Shikhar Mishra ◽  
Rucha D Sarwade ◽  
...  

Abstract RNA splicing, and variations in this process referred to as alternative splicing, are critical aspects of gene regulation in eukaryotes. From environmental responses in plants to being a primary link between genetic variation and disease in humans, splicing differences confer extensive phenotypic changes across diverse organisms (1–3). Regulation of splicing occurs through differential selection of splice sites in a splicing reaction, which results in variation in the abundance of isoforms and/or splicing events. However, genomic determinants that influence splice-site selection remain largely unknown. While traditional approaches for analyzing splicing rely on quantifying variant transcripts (i.e. isoforms) or splicing events (i.e. intron retention, exon skipping etc.) (4), recent approaches focus on analyzing complex/mutually exclusive splicing patterns (5–8). However, none of these approaches explicitly measure individual splice-site usage, which can provide valuable information about splice-site choice and its regulation. Here, we present a simple approach to quantify the empirical usage of individual splice sites reflecting their strength, which determines their selection in a splicing reaction. Splice-site strength/usage, as a quantitative phenotype, allows us to directly link genetic variation with usage of individual splice-sites. We demonstrate the power of this approach in defining the genomic determinants of splice-site choice through GWAS. Our pilot analysis with more than a thousand splice sites hints that sequence divergence in cis rather than trans is associated with variations in splicing among accessions of Arabidopsis thaliana. This approach allows deciphering principles of splicing and has broad implications from agriculture to medicine.
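
The abstract does not state how usage is computed, so the following is only one plausible way to express per-site usage, not the authors' formula: the fraction of reads covering a splice site that are actually spliced at it. The function name and the read-count inputs are assumptions made for illustration.

```python
import math


def splice_site_usage(spliced_reads: int, unspliced_reads: int) -> float:
    """Assumed definition: share of reads at a site that are spliced there.

    spliced_reads   -- reads whose alignment gap starts or ends at the site
    unspliced_reads -- reads that cover the site without being spliced at it
    Returns NaN when no reads cover the site.
    """
    total = spliced_reads + unspliced_reads
    if total == 0:
        return math.nan
    return spliced_reads / total


# Toy counts, not real data: 30 spliced and 10 unspliced reads.
print(splice_site_usage(30, 10))
```

A per-site value of this kind can be treated as a quantitative phenotype, which is what makes the GWAS described in the abstract possible.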

