dbMTS: a comprehensive database of putative human microRNA target site SNVs and their functional predictions

2019 ◽  
Author(s):  
Chang Li ◽  
Michael D. Swartz ◽  
Bing Yu ◽  
Yongsheng Bai ◽  
Xiaoming Liu

Abstract microRNAs (miRNAs) are short non-coding RNAs that can repress the expression of protein-coding messenger RNAs (mRNAs) by binding to the 3’UTR of the target. Genetic mutations such as single nucleotide variants (SNVs) in the 3’UTR of an mRNA can disrupt this regulatory effect. In this study, we present dbMTS, the database for miRNA target site (MTS) SNVs, which includes all potential MTS SNVs in the 3’UTRs of the human genome along with hundreds of functional annotations. This database can help researchers easily identify putative SNVs that affect miRNA targeting and facilitate the prioritization of their functional importance. dbMTS is freely available at: https://sites.google.com/site/jpopgen/dbNSFP.

2018 ◽  
Vol 16 (02) ◽  
pp. 1840013 ◽  
Author(s):  
Oxana A. Volkova ◽  
Yury V. Kondrakhin ◽  
Timur A. Kashapov ◽  
Ruslan N. Sharipov

RNA plays an important role in the life of the cell and of the organism as a whole. Besides the well-established protein-coding RNAs (messenger RNAs, mRNAs), long non-coding RNAs (lncRNAs) have recently gained the attention of researchers. Although lncRNAs have been classified as non-coding, some authors have reported the presence of corresponding sequences in ribosome profiling data (Ribo-seq). Ribo-seq is a powerful experimental technique for characterizing RNA translation in cells, with a focus on initiation (harringtonine, lactimidomycin) and elongation (cycloheximide). By exploiting translation starts obtained from Ribo-seq experiments, we developed a novel position weight matrix model for the prediction of translation starts. This model allowed us to discriminate between human mRNAs and lncRNAs with 96% accuracy. When the same model was used to predict putative ORFs in RNAs, we discovered that the majority of lncRNAs contained only small ORFs ([Formula: see text][Formula: see text]nt), in contrast to mRNAs.
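As a rough illustration of what a position weight matrix model for translation starts involves, the sketch below builds a log-odds matrix from aligned start-site windows and scans a sequence for the best-scoring window. It is a generic textbook construction, not the authors' model: the function names, the uniform background, the pseudocount, and the three toy training windows are all assumptions made for this example.

```python
import math


def build_pwm(sites, pseudocount=1.0, alphabet="ACGT"):
    """Build a log-odds position weight matrix from equal-length aligned sites.

    Assumes a uniform background distribution over the alphabet.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    background = 1.0 / len(alphabet)
    pwm = []
    for pos in range(length):
        # Start every count at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in alphabet}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background) for base in alphabet}
        )
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds scores of a window as wide as the matrix."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, offset) of the highest-scoring window in the sequence."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


if __name__ == "__main__":
    # Hypothetical aligned windows around annotated start codons (toy data).
    training_sites = ["ACCATGG", "GCCATGG", "ACCATGA"]
    pwm = build_pwm(training_sites)
    print(best_hit(pwm, "TTTACCATGGTT"))
```

A classifier along the lines described would then threshold or otherwise combine such scores; how the published model does this is not stated in the abstract.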


2020 ◽  
Vol 48 (W1) ◽  
pp. W287-W291
Author(s):  
Milad Miladi ◽  
Martin Raden ◽  
Sven Diederichs ◽  
Rolf Backofen

Abstract RNA molecules fold into complex structures as a result of intramolecular interactions between their nucleotides. The function of many non-coding RNAs and some cis-regulatory elements of messenger RNAs depends strongly on their fold. Single-nucleotide variants (SNVs) and other types of mutations can disrupt the native function of an RNA element by altering its base pairing pattern. Identifying the effect of a mutation on an RNA’s structure is, therefore, a crucial step in evaluating the impact of mutations on the post-transcriptional regulation and function of RNAs within the cell. Even though a single nucleotide variation can have a striking impact on structure formation, interpreting and comparing that impact usually requires expertise and meticulous effort. Here, we present MutaRNA, a web server for visualization and interpretation of mutation-induced changes in RNA structure in an intuitive and integrative fashion. To this end, probabilities of base pairing and position-wise unpaired probabilities of wildtype and mutated RNA sequences are computed and compared. Differential heatmap-like dot plot representations in combination with circular plots and arc diagrams help to identify local structure aberrations, which are otherwise hidden in standard outputs. Finally, MutaRNA provides a comprehensive and comparative overview of the mutation-induced changes in base pairing potentials and accessibility. The MutaRNA web server is freely available at http://rna.informatik.uni-freiburg.de/MutaRNA.
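The comparison of base pairing and unpaired probabilities can be sketched as follows. This is only a minimal illustration, not MutaRNA's implementation: it assumes the base-pair probability matrices have already been computed by some folding tool, and the function names and the 4-nucleotide toy matrices are invented for the example. The one fact relied on is that the unpaired probability of position i equals one minus the sum of its pairing probabilities with all other positions.

```python
def unpaired_probabilities(bpp):
    """Position-wise probability of being unpaired: 1 - sum_j P(i pairs with j).

    `bpp` is a symmetric square matrix of base-pair probabilities.
    """
    n = len(bpp)
    return [1.0 - sum(bpp[i][j] for j in range(n) if j != i) for i in range(n)]


def differential(wildtype_bpp, mutant_bpp):
    """Element-wise difference (mutant minus wildtype) of two probability matrices.

    Negative entries mark base pairs weakened by the mutation, positive ones
    mark pairs that became more likely.
    """
    return [
        [mut - wt for wt, mut in zip(wt_row, mut_row)]
        for wt_row, mut_row in zip(wildtype_bpp, mutant_bpp)
    ]


if __name__ == "__main__":
    # Toy matrices (made-up values): the mutation weakens the pair (0, 3).
    wildtype = [
        [0.0, 0.0, 0.0, 0.9],
        [0.0, 0.0, 0.8, 0.0],
        [0.0, 0.8, 0.0, 0.0],
        [0.9, 0.0, 0.0, 0.0],
    ]
    mutant = [
        [0.0, 0.0, 0.0, 0.2],
        [0.0, 0.0, 0.8, 0.0],
        [0.0, 0.8, 0.0, 0.0],
        [0.2, 0.0, 0.0, 0.0],
    ]
    print(differential(wildtype, mutant)[0][3])
    print(unpaired_probabilities(mutant))
```

The difference matrix is the kind of quantity a differential dot plot would display, and the change in unpaired probabilities corresponds to the change in accessibility.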


2019 ◽  
Author(s):  
Arjun A. Rao ◽  
Ada A. Madejska ◽  
Jacob Pfeil ◽  
Benedict Paten ◽  
Sofie R. Salama ◽  
...  

Abstract Somatic mutations in cancers affecting protein coding genes can give rise to potentially therapeutic neoepitopes. These neoepitopes can guide Adoptive Cell Therapies (ACTs) and Peptide Vaccines (PVs) to selectively target tumor cells using autologous patient cytotoxic T-cells. Currently, researchers have to independently align their data, call somatic mutations and haplotype the patient’s HLA to use existing neoepitope prediction tools. We present ProTECT, a fully automated, reproducible, scalable, and efficient end-to-end analysis pipeline to identify and rank therapeutically relevant tumor neoepitopes in terms of immunogenicity starting directly from raw patient sequencing data, or from pre-processed data. The ProTECT pipeline encompasses alignment, HLA haplotyping, mutation calling (single nucleotide variants, short insertions and deletions, and gene fusions), peptide:MHC (pMHC) binding prediction, and ranking of final candidates. We demonstrate ProTECT on 326 samples from the TCGA Prostate Adenocarcinoma cohort, and compare it with published tools. ProTECT can be run on a standalone computer, a local cluster, or on a compute cloud using a Mesos backend. ProTECT is highly scalable and can process TCGA data in under 30 minutes per sample when run in large batches. ProTECT is freely available at https://www.github.com/BD2KGenomics/protect.


2021 ◽  
Author(s):  
Roberta Esposito ◽  
Andres Lanzos ◽  
Taisia Polidori ◽  
Hugo Guillen-Ramirez ◽  
Bernard Merlin ◽  
...  

Tumour DNA contains thousands of single nucleotide variants (SNVs) in non-protein-coding regions, yet it remains unclear which are driver mutations that promote cell fitness. Amongst the most highly mutated non-coding elements are long noncoding RNAs (lncRNAs), which can promote cancer and may be targeted therapeutically. We here searched for evidence that driver mutations may act through alteration of lncRNA function. Using an integrative driver discovery algorithm, we analysed single nucleotide variants (SNVs) from 2583 primary tumours and 3527 metastases to reveal 54 candidate driver lncRNAs (FDR<0.1). Their relevance is supported by enrichment for previously-reported cancer genes and by clinical and genomic features. Using knockdown and transgene overexpression, we show that tumour SNVs in two novel lncRNAs can boost cell fitness. Researchers have noted particularly high yet unexplained mutation rates in the iconic cancer lncRNA, NEAT1. We apply in cellulo mutagenesis by CRISPR-Cas9 to identify vulnerable regions of NEAT1 where SNVs reproducibly increase cell fitness in both transformed and normal backgrounds. In particular, mutations in the 5-prime region of NEAT1 alter ribonucleoprotein assembly and boost the population of subnuclear paraspeckles. Together, this work reveals function-altering somatic lncRNA mutations as a new route to enhanced cell fitness during transformation and metastasis.


Author(s):  
Daniel R Mende ◽  
Ivica Letunic ◽  
Oleksandr M Maistrenko ◽  
Thomas S B Schmidt ◽  
Alessio Milanese ◽  
...  

Abstract Microbiology depends on the availability of annotated microbial genomes for many applications. Comparative genomics approaches have been a major advance, but consistent and accurate annotations of genomes can be hard to obtain. In addition, newer concepts such as the pan-genome concept are still being implemented to help answer biological questions. Hence, we present proGenomes2, which provides 87 920 high-quality genomes in a user-friendly and interactive manner. Genome sequences and annotations can be retrieved individually or by taxonomic clade. Every genome in the database has been assigned to a species cluster and most genomes could be accurately assigned to one or multiple habitats. In addition, general functional annotations and specific annotations of antibiotic resistance genes and single nucleotide variants are provided. In short, proGenomes2 provides threefold more genomes, enhanced habitat annotations, updated taxonomic and functional annotation and improved linkage to the NCBI BioSample database. The database is available at http://progenomes.embl.de/.


Biomolecules ◽  
2020 ◽  
Vol 10 (3) ◽  
pp. 475
Author(s):  
Javier Murillo ◽  
Flavio Spetale ◽  
Serge Guillaume ◽  
Pilar Bulacio ◽  
Ignacio Garcia Labari ◽  
...  

Single nucleotide variants (SNVs) occurring in a protein coding gene may disrupt its function in multiple ways. Predicting this disruption has been recognized as an important problem in bioinformatics research. Many tools, hereafter p-tools, have been designed to perform these predictions, and many of them are now in common use in scientific research and even in clinical applications. This highlights the importance of understanding the semantics of their outputs. To shed light on this issue, two questions are formulated: (i) do p-tools provide similar predictions? (inner consistency), and (ii) are these predictions consistent with the literature? (outer consistency). To answer these, six p-tools are evaluated with exhaustive SNV datasets from the BRCA1 gene. Two indices, called $K_{all}$ and $K_{strong}$, are proposed to quantify the inner consistency of pairs of p-tools, while the outer consistency is quantified by standard information retrieval metrics. While the inner consistency analysis reveals that most of the p-tools are not consistent with each other, the outer consistency analysis reveals that they are characterized by a low prediction performance. Although this result highlights the need to improve the prediction performance of individual p-tools, the inner consistency results pave the way to the systematic design of truly diverse ensembles of p-tools that can overcome the limitations of individual members.


2016 ◽  
Vol 40 (1-2) ◽  
pp. 219-229 ◽  
Author(s):  
Yan Pan ◽  
Chen Li ◽  
Jing Chen ◽  
Kai Zhang ◽  
Xiaoyuan Chu ◽  
...  

To date, only up to 2% of stably transcribed genes are known to be protein-coding, whereas the vast majority give rise to non-coding RNAs (ncRNAs). These ncRNAs, also known as non-messenger RNAs (nmRNAs) or functional RNAs (fRNAs), include transfer RNAs, ribosomal RNAs, microRNAs and long non-coding RNAs (lncRNAs). With the advance of high-resolution microarrays and massively parallel sequencing technology, lncRNAs have attracted growing attention and have been found to play important roles in the tumorigenesis and progression of human cancers. Long intergenic non-protein coding RNA, regulator of reprogramming (linc-ROR), was first discovered in induced pluripotent stem cells (iPSCs), where it was controlled by the key pluripotency factors Oct4, Sox2 and Nanog. Linc-ROR has been shown to be dysregulated in many types of cancer, including breast cancer (BC), pancreatic cancer (PC), hepatocellular cancer (HCC), endometrial cancer (EC), and nasopharyngeal carcinoma (NPC). Linc-ROR also functions as a regulatory molecule in a large number of biological processes. However, the underlying mechanisms of its contribution to carcinogenesis remain to be elucidated. In this review, we focus on the characteristics of linc-ROR and its roles in different types of human cancer.


2012 ◽  
Vol 108 (10) ◽  
pp. 599-604 ◽  
Author(s):  
Seema Dangwal ◽  
Thomas Thum

Summary Platelets are important for maintaining primary haemostasis and play a key role in the pathology of thrombotic and occlusive vascular disorders such as acute coronary syndrome or stroke. Despite lacking a nucleus and genomic DNA, platelets possess diverse types of RNAs, ranging from protein coding messenger RNAs to small non-coding RNAs, inherited from their parent megakaryocytes. Indeed, platelets are capable of using their own translational machinery to synthesise proteins upon activation, suggesting the possibility of post-transcriptional gene regulation in platelets. MicroRNAs (miRNAs) are highly conserved, tiny non-coding RNAs that exert fine-tuned control of protein expression by complementary sequence recognition, binding and translational repression of protein coding mRNA transcripts. Multiple functional aspects of miRNAs, as well as their expression in platelets and megakaryocytes, underscore a role in platelet biology. Changes in miRNA expression patterns have been noted during platelet genesis and activation. In the present review we highlight recently identified megakaryocytic/platelet miRNAs and discuss their role in platelet biogenesis and in functions essential to maintaining haemostasis in the body.


2018 ◽  
Author(s):  
Yi-Fei Huang ◽  
Adam Siepel

Abstract A central challenge in human genomics is to understand the cellular, evolutionary, and clinical significance of genetic variants. Here we introduce a unified population-genetic and machine-learning model, called Linear Allele-Specific Selection InferencE (LASSIE), for estimating the fitness effects of all potential single-nucleotide variants, based on polymorphism data and predictive genomic features. We applied LASSIE to 51 high-coverage genome sequences annotated with 33 genomic features, and constructed a map of allele-specific selection coefficients across all protein-coding sequences in the human genome. We show that this map is informative about both human evolution and disease.


Author(s):  
Hecun Zou ◽  
Lan-Xiang Wu ◽  
Lihong Tan ◽  
Fei-Fei Shang ◽  
Hong-Hao Zhou
