LINE-2 transposable elements are a source for functional human microRNAs and target sites

Abstract Transposable elements (TEs) are dynamically expressed at high levels in multiple human tissues, but the function of TE-derived transcripts remains largely unknown. In this study, we identify numerous TE-derived microRNAs (miRNAs) by conducting Argonaute2 RNA Immunoprecipitation followed by small RNA sequencing (AGO2 RIP-seq) on human brain tissue. Many of these miRNAs originated from LINE-2 (L2) elements, which entered the human genome around 100-300 million years ago. We found that L2-miRNAs derive from the 3’ end of the L2 consensus sequence and thus share very similar sequences, indicating that they could target transcripts with L2s in their 3’UTR. In line with this, we found that many protein-coding genes carry fragments of L2-derived sequences in their 3’UTR, which serve as target sites for L2-miRNAs. L2-miRNAs and targets were generally ubiquitously expressed at low levels in multiple human tissues, suggesting a role for this network in buffering transcriptional levels of housekeeping genes. Interestingly, we also found evidence that this network is perturbed in glioblastoma. In summary, our findings uncover a TE-based post-transcriptional network that shapes transcriptional regulation in human cells.

Annotation of snoRNA abundance across human tissues reveals complex snoRNA-host gene relationships

Genome Biology ◽

10.1186/s13059-021-02391-2 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Étienne Fafard-Couture ◽

Danny Bergeron ◽

Sonia Couture ◽

Sherif Abou-Elela ◽

Michelle S. Scott

Keyword(s):

Housekeeping Genes ◽

Host Gene ◽

RNA Modification ◽

Human Tissues ◽

RNA Seq ◽

Healthy Human ◽

Protein Coding ◽

Conservation Level ◽

Nucleolar RNAs ◽

Host Genes

Abstract Background Small nucleolar RNAs (snoRNAs) are mid-size non-coding RNAs required for ribosomal RNA modification, implying a ubiquitous tissue distribution linked to ribosome synthesis. However, increasing numbers of studies identify extra-ribosomal roles of snoRNAs in modulating gene expression, suggesting more complex snoRNA abundance patterns. Therefore, there is a great need for mapping the snoRNome in different human tissues as the blueprint for snoRNA functions. Results We used a low structure bias RNA-Seq approach to accurately quantify snoRNAs and compare them to the entire transcriptome in seven healthy human tissues (breast, ovary, prostate, testis, skeletal muscle, liver, and brain). We identify 475 expressed snoRNAs categorized in two abundance classes that differ significantly in their function, conservation level, and correlation with their host gene: 390 snoRNAs are uniformly expressed and 85 are enriched in the brain or reproductive tissues. Most tissue-enriched snoRNAs are embedded in lncRNAs and display strong correlation of abundance with them, whereas uniformly expressed snoRNAs are mostly embedded in protein-coding host genes and are mainly non- or anticorrelated with them. Fifty-nine percent of the non-correlated or anticorrelated protein-coding host gene/snoRNA pairs feature dual-initiation promoters, compared to only 16% of the correlated non-coding host gene/snoRNA pairs. Conclusions Our results demonstrate that snoRNAs are not a single homogeneous group of housekeeping genes but include highly regulated tissue-enriched RNAs. Indeed, our work indicates that the architecture of snoRNA host genes varies to uncouple the host and snoRNA expressions in order to meet the different snoRNA abundance levels and functional needs of human tissues.

Dynamics and modulation of the human snoRNome

10.1101/2021.02.11.430834 ◽

2021 ◽

Author(s):

Étienne Fafard-Couture ◽

Danny Bergeron ◽

Sonia Couture ◽

Sherif Abou Elela ◽

Michelle S Scott

Keyword(s):

Expression Patterns ◽

Housekeeping Genes ◽

Host Gene ◽

RNA Modification ◽

Human Tissues ◽

Healthy Human ◽

Protein Coding ◽

Conservation Level ◽

Nucleolar RNAs ◽

Host Genes

Abstract Background Small nucleolar RNAs (snoRNAs) are mid-size non-coding RNAs required for ribosomal RNA modification, implying a ubiquitous tissue distribution linked to ribosome synthesis. However, increasing numbers of studies identify extra-ribosomal roles of snoRNAs in modulating gene expression, suggesting more complex snoRNA expression patterns. Therefore, there is a great need for mapping the snoRNome in different human tissues as the blueprint for snoRNA functions. Results We used a low structure bias RNA-Seq approach to accurately quantify snoRNAs and compare them to the entire transcriptome in seven healthy human tissues (breast, ovary, prostate, testis, skeletal muscle, liver and brain). We identified 475 expressed snoRNAs categorized in two abundance classes that differ significantly in their function, conservation level and correlation with their host gene: 390 snoRNAs are uniformly expressed and 85 are enriched in the brain or reproductive tissues. Most tissue-enriched snoRNAs are embedded in lncRNAs and display strong correlation of abundance with them, whereas uniformly expressed snoRNAs are mostly embedded in protein-coding host genes and are mainly non- or anticorrelated with them. 59% of the non-correlated or anticorrelated protein-coding host gene/snoRNA pairs feature dual-initiation promoters, as opposed to only 16% of the correlated non-coding host gene/snoRNA pairs. Conclusions Our results demonstrate that snoRNAs are not a single homogeneous group of housekeeping genes but include highly regulated tissue-enriched RNAs. Indeed, our work indicates that the architecture of snoRNA host genes varies to uncouple the host and snoRNA expressions in order to meet the different snoRNA abundance levels and functional needs of human tissues.

Characterization of the nuclear and cytosolic transcriptomes in human brain tissue reveals new insights into the subcellular distribution of RNA transcripts

Scientific Reports ◽

10.1038/s41598-021-83541-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Ammar Zaghlool ◽

Adnan Niazi ◽

Åsa K. Björklund ◽

Jakub Orzechowski Westholm ◽

Adam Ameur ◽

...

Keyword(s):

Human Brain ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Adult Brain ◽

Sequencing Data ◽

Human Brain Tissue ◽

Protein Coding ◽

RNA Transcripts ◽

Nuclear RNA ◽

The Impact

Abstract Transcriptome analysis has mainly relied on analyzing RNA sequencing data from whole cells, overlooking the impact of subcellular RNA localization and its influence on our understanding of gene function and on the interpretation of gene expression signatures in cells. Here, we separated cytosolic and nuclear RNA from human fetal and adult brain samples and performed a comprehensive analysis of cytosolic and nuclear transcriptomes. There are significant differences in RNA expression for protein-coding and lncRNA genes between cytosol and nucleus. We show that transcripts encoding the nuclear-encoded mitochondrial proteins are significantly enriched in the cytosol compared to the rest of protein-coding genes. Differential expression analysis between fetal and adult frontal cortex shows that results obtained from the cytosolic RNA differ from results using nuclear RNA both at the level of transcript types and the number of differentially expressed genes. Our data provide a resource for the subcellular localization of thousands of RNA transcripts in the human brain and highlight differences in using the cytosolic or the nuclear transcriptomes for expression analysis.

The tandem repeat domain in the Listeria monocytogenes ActA protein controls the rate of actin-based motility, the percentage of moving bacteria, and the localization of vasodilator-stimulated phosphoprotein and profilin.

The Journal of Cell Biology ◽

10.1083/jcb.135.3.647 ◽

1996 ◽

Vol 135 (3) ◽

pp. 647-660 ◽

Cited By ~ 150

Author(s):

G A Smith ◽

J A Theriot ◽

D A Portnoy

Keyword(s):

Listeria Monocytogenes ◽

Actin Polymerization ◽

Consensus Sequence ◽

Wild Type ◽

Movement Rate ◽

Bacterial Surface ◽

Rate Dependent ◽

Repeat Domain ◽

Low Levels ◽

Rate Of Movement

The ActA protein is responsible for the actin-based movement of Listeria monocytogenes in the cytosol of eukaryotic cells. Analysis of mutants in which we varied the number of proline-rich repeats (PRR; consensus sequence DFPPPPTDEEL) revealed a linear relationship between the number of PRRs and the rate of movement, with each repeat contributing approximately 2-3 microns/min. Mutants lacking all functional PRRs (generated by deletion or point mutation) moved at rates 30% of wild-type. Indirect immunofluorescence indicated that the PRRs were directly responsible for binding of vasodilator-stimulated phosphoprotein (VASP) and for the localization of profilin at the bacterial surface. The long repeats, which are interdigitated between the PRRs, increased the frequency with which actin-based motility occurred by a mechanism independent of the PRRs, VASP, and profilin. Lastly, a mutant which expressed low levels of ActA exhibited a phenotype indicative of a threshold; there was a very low percentage of moving bacteria, but when movement did occur, it was at wild-type rates. These results indicate that the ActA protein directs at least three separable events: (1) initiation of actin polymerization that is independent of the repeat region; (2) initiation of movement dependent on the long repeats and the amount of ActA; and (3) movement rate dependent on the PRRs.

Rhizobium paknamense sp. nov., isolated from lesser duckweeds (Lemna aequinoctialis)

International Journal of Systematic and Evolutionary Microbiology ◽

10.1099/ijs.0.051888-0 ◽

2013 ◽

Vol 63 (Pt_10) ◽

pp. 3823-3828 ◽

Cited By ~ 21

Author(s):

Chokchai Kittiwongwattana ◽

Chitti Thawai

Keyword(s):

DNA Hybridization ◽

Type Species ◽

Sequence Similarity ◽

Housekeeping Genes ◽

rRNA Gene ◽

Content Type ◽

Link Type ◽

Low Levels ◽

Sequence Similarity Analysis ◽

pH Range

A Gram-stain-negative, rod-shaped bacterium was isolated and designated strain L6-8T during a study of endophytic bacterial communities in lesser duckweed (Lemna aequinoctialis). Cells of strain L6-8T were motile with peritrichous flagella. The analysis of the nearly complete 16S rRNA gene sequence indicated that strain L6-8T was phylogenetically related to species of the genus Rhizobium. Its closest relatives were Rhizobium borbori DN316T (97.6 %), Rhizobium oryzae Alt 505T (97.3 %) and Rhizobium pseudoryzae J3-A127T (97.0 %). The sequence similarity analysis of housekeeping genes recA, glnII, atpD and gyrB showed low levels of sequence similarity (<91.5 %) between strain L6-8T and other species of the genus Rhizobium with validly published names. The pH range for growth was 4.0–9.0 (optimum 6.0–7.0), and the temperature range for growth was 20–45 °C (optimum 30 °C). Strain L6-8T tolerated NaCl up to 2 % (w/v) (optimum 1 % NaCl). The predominant components of cellular fatty acids were C19 : 0 cyclo ω8c (31.32 %), summed feature 8 (C18 : 1ω7c and/or C18 : 1ω6c; 25.39 %) and C16 : 0 (12.03 %). The DNA G+C content of strain L6-8T was 60.4 mol% (Tm). nodC and nifH were not amplified in strain L6-8T. DNA–DNA relatedness between strain L6-8T and R. borbori DN316T, R. oryzae Alt505T and R. pseudoryzae J3-A127T was between 11.2 and 18.3 %. Based on the sequence similarity analyses, phenotypic, biochemical and physiological characteristics and DNA–DNA hybridization, strain L6-8T could be readily distinguished from its closest relatives and represents a novel species of the genus Rhizobium, for which the name Rhizobium paknamense sp. nov. is proposed. The type strain is L6-8T (= NBRC 109338T = BCC 55142T).

The integration preference of Sleeping Beauty at non-TA site is related to the transposon end sequences

10.21203/rs.2.19101/v1 ◽

2019 ◽

Author(s):

Yiting Zhou ◽

Guangwei Ma ◽

Jiawen Yang ◽

Yabin Guo

Keyword(s):

Site Selection ◽

Genomic DNA ◽

Consensus Sequence ◽

Mouse Cell ◽

Sleeping Beauty ◽

Consensus Sequences ◽

Target Site Selection ◽

Target Sites ◽

End Sequences ◽

Selection Of

Abstract Background: The Sleeping Beauty (SB) transposon was thought to integrate strictly into TA dinucleotides. Recently, we found that SB also integrates into non-TA sites at a lower frequency. Here we further studied the non-TA integration of SB. Results: 1) SB can integrate into non-TA sites in HEK293T cells as well as in mouse cell lines. 2) Both the hyperactive transposase SB100X and the traditional SB11 catalyze integrations at non-TA sites. 3) The consensus sequence of the non-TA target sites occurs only on the opposite side of the sequenced junction between the transposon end and the genomic sequence, indicating that the integrations at non-TA sites are mainly aberrant integrations. 4) The consensus sequence of the non-TA target sites corresponds to the transposon end sequence. When the transposon end sequence is mutated, the consensus sequence changes too. Conclusion: The interaction between the SB transposon end and genomic DNA may be involved in the target site selection of the SB integrations at non-TA sites.

Characterization of the nuclear and cytosolic transcriptomes in human brain tissue reveals new insights into the subcellular distribution of RNA transcripts

10.1101/2020.04.08.031419 ◽

2020 ◽

Author(s):

Ammar Zaghlool ◽

Adnan Niazi ◽

Åsa K. Björklund ◽

Jakub Orzechowski Westholm ◽

Adam Ameur ◽

...

Keyword(s):

Gene Expression ◽

Subcellular Localization ◽

Human Brain ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Adult Brain ◽

Sequencing Data ◽

Human Brain Tissue ◽

Protein Coding ◽

The Impact

Abstract Transcriptome analysis has mainly relied on analyzing RNA sequencing data from whole cells, overlooking the impact of subcellular RNA localization and its influence on our understanding of gene function, and interpretation of gene expression signatures in cells. Here, we performed a comprehensive analysis of cytosolic and nuclear transcriptomes in human fetal and adult brain samples. We show significant differences in RNA expression for protein-coding and lncRNA genes between cytosol and nucleus. Transcripts displaying differential subcellular localization belong to particular functional categories and display tissue-specific localization patterns. We also show that transcripts encoding the nuclear-encoded mitochondrial proteins are significantly enriched in the cytosol compared to the rest of protein-coding genes. Further investigation of the use of the cytosolic or the nuclear transcriptome for differential gene expression analysis indicates important differences in results depending on the cellular compartment. These differences were manifested at the level of transcript types and the number of differentially expressed genes. Our data provide a resource of RNA subcellular localization in the human brain and highlight differences in using the cytosolic or the nuclear transcriptomes for differential expression analysis.

Genome-wide identification of MITE-derived microRNAs and their targets in bread wheat

10.21203/rs.3.rs-236927/v1 ◽

2021 ◽

Author(s):

Juan Manuel Crescente ◽

Diego Zavallo ◽

Mariana del Vas ◽

Sebastian Asurmendi ◽

Marcelo Helguera ◽

...

Keyword(s):

Gene Expression ◽

Triticum Aestivum ◽

Transposable Elements ◽

Small RNA ◽

Translational Repression ◽

High Homology ◽

Genome Wide ◽

Target Sites ◽

The Common ◽

Non Coding RNAs

Abstract Plant microRNAs (miRNAs) are a class of small non-coding RNAs that are 20–24 nucleotides in length and can repress gene expression at the post-transcriptional level by target degradation or translational repression. There is increasing evidence that some microRNAs can be derived from a group of non-autonomous class II transposable elements called Miniature Inverted-repeat Transposable Elements (MITEs) in plants. We used public small RNA and degradome libraries and the common wheat (Triticum aestivum) genome to screen for miRNA production sites and target sites. We also created a comprehensive wheat MITE database by combining known elements with newly identified ones. We found high homology between MITEs and 14% of all the miRNA production sites in wheat. Furthermore, we show that MITE-derived miRNAs have a preference for target degradation sites with MITE insertions in 3' UTR regions in wheat.

Bioinformatics Methods for Studying MicroRNA and ARE-Mediated Regulation of Post-Transcriptional Gene Expression

International Journal of Knowledge Discovery in Bioinformatics ◽

10.4018/jkdb.2010070106 ◽

2010 ◽

Vol 1 (3) ◽

pp. 97-112 ◽

Cited By ~ 3

Author(s):

Richipal Singh Bindra ◽

Jason T. L. Wang ◽

Paramjeet Singh Bagga

Keyword(s):

miRNA Target ◽

Target Prediction ◽

Untranslated Regions ◽

Regulatory RNA ◽

RNA Motifs ◽

Protein Coding ◽

RNA Molecules ◽

Cellular Processes ◽

Target Sites ◽

Transcriptional Gene Regulation

MicroRNAs (miRNAs) are short single-stranded RNA molecules of 21-22 nucleotides known to regulate post-transcriptional expression of protein-coding genes involved in most cellular processes. Prediction of miRNA targets is a challenging bioinformatics problem. AU-rich elements (AREs) are regulatory RNA motifs found in the 3’ untranslated regions (UTRs) of mRNAs, and they play dominant roles in the regulated decay of short-lived human mRNAs via specific interactions with proteins. In this paper, the authors review several miRNA target prediction tools and data sources, as well as computational methods used for the prediction of AREs. The authors discuss the connection between miRNA and ARE-mediated post-transcriptional gene regulation. Finally, a data mining method for identifying the co-occurrences of miRNA target sites in ARE-containing genes is presented.
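To make the idea of co-occurring miRNA target sites and AREs concrete, the following is a minimal sketch, not the method presented in the paper. It assumes the simplest common definitions: an ARE is flagged by the core pentamer AUUUA, and a candidate miRNA site is a 7-mer in the 3’ UTR that is perfectly complementary to miRNA nucleotides 2-8 (the seed). The function names and the example UTR are hypothetical, and the miRNA string is used only as an illustrative input.

```python
import re

ARE_CORE = "AUUUA"  # assumption: core ARE pentamer used as the only ARE signal


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A/C/G/U only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def seed_site(mirna):
    """7-mer expected in the mRNA: complement of miRNA nucleotides 2-8."""
    return reverse_complement(mirna[1:8])


def find_all(motif, seq):
    """0-based start positions of motif in seq, overlapping matches included."""
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]


def co_occurrence(utr, mirna):
    """Report ARE core motifs and seed-match sites found in one 3' UTR."""
    utr = utr.upper().replace("T", "U")
    mirna = mirna.upper().replace("T", "U")
    return {
        "are": find_all(ARE_CORE, utr),
        "seed": find_all(seed_site(mirna), utr),
    }


# Hypothetical 3' UTR fragment and an illustrative miRNA sequence.
result = co_occurrence("GGAUUUAUUUACCCUACCUCAGG", "UGAGGUAGUAGGUUGUAUAGUU")
print(result)
```

A UTR would count as a co-occurrence when both lists are non-empty. Real prediction tools add further criteria such as site context, conservation, and pairing energy, which this sketch leaves out.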

Long-read assembly of a Great Dane genome highlights the contribution of GC-rich sequence and mobile elements to canine genomes

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.2016274118 ◽

2021 ◽

Vol 118 (11) ◽

pp. e2016274118 ◽

Cited By ~ 1

Author(s):

Julia V. Halo ◽

Amanda L. Pendleton ◽

Feichen Shen ◽

Aurélien J. Doucet ◽

Thomas Derrien ◽

...

Keyword(s):

Consensus Sequence ◽

GC Content ◽

Open Reading Frames ◽

Reference Sequence ◽

Transcription Start ◽

Protein Coding ◽

Short Interspersed Elements ◽

Technological Advances ◽

Retrotransposon Activity ◽

Great Dane

Technological advances have allowed improvements in genome reference sequence assemblies. Here, we combined long- and short-read sequence resources to assemble the genome of a female Great Dane dog. This assembly has improved continuity compared to the existing Boxer-derived (CanFam3.1) reference genome. Annotation of the Great Dane assembly identified 22,182 protein-coding gene models and 7,049 long noncoding RNAs, including 49 protein-coding genes not present in the CanFam3.1 reference. The Great Dane assembly spans the majority of sequence gaps in the CanFam3.1 reference and illustrates that 2,151 gaps overlap the transcription start site of a predicted protein-coding gene. Moreover, a subset of the resolved gaps, which have an 80.95% median GC content, localize to transcription start sites and recombination hotspots more often than expected by chance, suggesting the stable canine recombinational landscape has shaped genome architecture. Alignment of the Great Dane and CanFam3.1 assemblies identified 16,834 deletions and 15,621 insertions, as well as 2,665 deletions and 3,493 insertions located on secondary contigs. These structural variants are dominated by retrotransposon insertion/deletion polymorphisms and include 16,221 dimorphic canine short interspersed elements (SINECs) and 1,121 dimorphic long interspersed element-1 sequences (LINE-1_Cfs). Analysis of sequences flanking the 3′ end of LINE-1_Cfs (i.e., LINE-1_Cf 3′-transductions) suggests multiple retrotransposition-competent LINE-1_Cfs segregate among dog populations. Consistent with this conclusion, we demonstrate that a canine LINE-1_Cf element with intact open reading frames can retrotranspose its own RNA and that of a SINEC_Cf consensus sequence in cultured human cells, implicating ongoing retrotransposon activity as a driver of canine genetic variation.
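The median GC content quoted for the resolved gaps is a simple per-sequence statistic. The sketch below shows one way such a figure could be computed. It is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the input is a plain list of DNA strings, and ambiguous bases such as N are excluded from the denominator.

```python
from statistics import median


def gc_percent(seq):
    """GC content of one DNA sequence as a percentage of its A/C/G/T bases."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")  # ignore N and other codes
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def median_gc(sequences):
    """Median of the per-sequence GC percentages."""
    return median(gc_percent(s) for s in sequences)


# Toy sequences, for illustration only.
print(median_gc(["GGCC", "GCAT", "AATT"]))
```

Taking the median rather than the mean keeps a few extremely GC-rich or GC-poor sequences from dominating the summary.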
