MiRACLe: an individual-specific approach to improve microRNA-target prediction based on a random contact model

Author(s):  
Pan Wang ◽  
Qi Li ◽  
Nan Sun ◽  
Yibo Gao ◽  
Jun S Liu ◽  
...  

Abstract Deciphering microRNA (miRNA) targets is important for understanding the function of miRNAs as well as for miRNA-based diagnostics and therapeutics. Given the highly cell-specific nature of miRNA regulation, recent computational approaches typically exploit expression data to identify the most physiologically relevant target messenger RNAs (mRNAs). Although effective, those methods usually require a large sample size to infer miRNA–mRNA interactions, thus limiting their applications in personalized medicine. In this study, we developed a novel miRNA target prediction algorithm called miRACLe (miRNA Analysis by a Contact modeL). It integrates sequence characteristics and RNA expression profiles into a random contact model, and determines the target preferences by the relative probability of effective contacts in an individual-specific manner. Evaluation by a variety of measures shows that fitting TargetScan, a frequently used prediction tool, into the framework of miRACLe can improve its predictive power by a significant margin and consistently outperform other state-of-the-art methods in prediction accuracy, regulatory potential and biological relevance. Notably, the superiority of miRACLe is robust to various biological contexts, types of expression data and validation datasets, and the computation process is fast and efficient. Additionally, we show that the model can be readily applied to other sequence-based algorithms, such as DIANA-microT-CDS, miRanda-mirSVR and MirTarget4, to improve their predictive power. MiRACLe is publicly available at https://github.com/PANWANG2014/miRACLe.
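
The abstract describes ranking targets by the relative probability of effective contacts, computed from sequence characteristics and one individual's expression profile. The toy sketch below illustrates that general idea only. It assumes the contact probability of a pair is proportional to the product of the miRNA abundance, the mRNA abundance and a non-negative sequence-based weight; that functional form, the function name `contact_scores` and all inputs are hypothetical and are not taken from the miRACLe paper or its code.

```python
import numpy as np

def contact_scores(mirna_expr, mrna_expr, seq_weight):
    """Relative contact probability for every miRNA-mRNA pair in one sample.

    mirna_expr: shape (M,), non-negative miRNA abundances (hypothetical input)
    mrna_expr:  shape (G,), non-negative mRNA abundances (hypothetical input)
    seq_weight: shape (M, G), non-negative sequence-based weights, 0 = no site
    """
    mirna_expr = np.asarray(mirna_expr, dtype=float)
    mrna_expr = np.asarray(mrna_expr, dtype=float)
    seq_weight = np.asarray(seq_weight, dtype=float)
    # Assumed model: contacts scale with both abundances and the site weight.
    raw = np.outer(mirna_expr, mrna_expr) * seq_weight
    total = raw.sum()
    if total == 0:
        return np.zeros_like(raw)
    # Normalize so that the scores over all pairs sum to 1.
    return raw / total

# Made-up numbers for two miRNAs and three mRNAs in a single sample.
scores = contact_scores(
    mirna_expr=[10.0, 1.0],
    mrna_expr=[5.0, 5.0, 0.0],
    seq_weight=[[1.0, 0.5, 1.0], [1.0, 0.0, 1.0]],
)
```

Because the scores depend only on one sample's expression values, a ranking of this kind can be produced per individual rather than from a large cohort, which is the property the abstract emphasizes.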

2020 ◽  
Vol 21 (S8) ◽  
Author(s):  
Giorgio Bertolazzi ◽  
Panayiotis V. Benos ◽  
Michele Tumminello ◽  
Claudia Coronnello

Abstract MicroRNAs are small non-coding RNAs that post-transcriptionally regulate the expression levels of messenger RNAs. MicroRNA regulatory activity depends on the recognition of binding sites located on mRNA molecules. ComiR is a web tool designed to predict the targets of a set of microRNAs, starting from their expression profile. ComiR was trained with information on binding sites in the 3′ UTR, using a reliable dataset containing the targets of endogenously expressed microRNAs in D. melanogaster S2 cells. This dataset was obtained by comparing the results of two different experimental approaches, i.e., inhibition and immunoprecipitation of the AGO1 protein, a component of the microRNA-induced silencing complex. In this work, we tested whether including coding-region binding sites in the ComiR algorithm improves the performance of the tool in predicting microRNA targets. We focused the analysis on D. melanogaster and updated the underlying ComiR database with the currently available releases of mRNA and microRNA sequences. We find that the ComiR algorithm trained with coding-region information is more effective in predicting microRNA targets than the algorithm trained with 3′ UTR information. On the other hand, we show that 3′ UTR-based predictions are complementary to coding-region-based predictions, which suggests that both should be considered in a comprehensive analysis. Furthermore, we observed that the lists of targets obtained by analyzing data from only one experimental approach, that is, inhibition or immunoprecipitation of AGO1, are not reliable enough to test the performance of our microRNA target prediction algorithm.
Further analysis will be conducted to investigate the effectiveness of the tool with data from other species, provided that validated datasets, obtained by comparing RISC protein inhibition and immunoprecipitation experiments, become available for the same samples. Finally, we propose to upgrade the existing ComiR web tool by including the model trained on coding regions alongside the 3′ UTR-based one.


2014 ◽  
Vol 13s7 ◽  
pp. CIN.S16348 ◽  
Author(s):  
Zixing Wang ◽  
Wenlong Xu ◽  
Haifeng Zhu ◽  
Yin Liu

MicroRNAs (miRNAs) are small regulatory RNAs that play key gene-regulatory roles in diverse biological processes, particularly in cancer development. Therefore, inferring miRNA targets is an essential step to fully understanding the functional properties of miRNA actions in regulating tumorigenesis. Bayesian linear regression modeling has been proposed for identifying the interactions between miRNAs and mRNAs on the basis of the integrated sequence information and matched miRNA and mRNA expression data; however, this approach does not use the full spectrum of available features of putative miRNA targets. In this study, we integrated four important sequence and structural features of miRNA targeting with paired miRNA and mRNA expression data to improve miRNA-target prediction in a Bayesian framework. We have applied this approach to a gene-expression study of liver cancer patients and examined the posterior probability of each miRNA-mRNA interaction being functional in the development of liver cancer. Our method achieved better performance, in terms of the number of true targets identified, than did other methods.


2010 ◽  
Vol 08 (04) ◽  
pp. 763-788 ◽  
Author(s):  
YUN ZHENG ◽  
WEIXIONG ZHANG

Many recent studies have shown that access of animal microRNAs (miRNAs) to their complementary sites in target mRNAs is determined by several sequence-specific determinants beyond the seed regions in the 5′ end of miRNAs. These factors have been related to the repressive power of miRNAs and used in some programs to predict the efficacy of miRNA complementary sites. However, these factors have not been systematically examined regarding their capacities for improving miRNA target prediction. We develop a new miRNA target prediction algorithm, called Hitsensor, by incorporating many sequence-specific features that determine complementarities between miRNAs and their targets, in addition to the canonical seed regions in the 5′ ends of miRNAs. We evaluate the performance of our algorithm on 720 known animal miRNA:target pairs in four species, Homo sapiens, Mus musculus, Drosophila melanogaster and Caenorhabditis elegans. Our experimental results show that Hitsensor outperforms five popular existing algorithms, indicating that our unique scheme for quantifying the determinants of complementary sites is effective in improving the performance of a miRNA target prediction algorithm. We also examine the effectiveness of miRNA-mediated repression for the predicted targets by using a published quantitative protein expression dataset of miR-223 knockout in mouse neutrophils. Hitsensor identifies more targets than the existing algorithms, and the predicted targets of Hitsensor show comparable protein level changes to those of the existing algorithms.
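
For readers unfamiliar with the canonical seed region that the abstract takes as its starting point, here is a minimal sketch of seed-match detection. It assumes the common convention that the seed is miRNA nucleotides 2 to 8 and that a site is a perfect Watson-Crick reverse complement of it. The function name `seed_sites` and the short target sequence are hypothetical, and none of Hitsensor's additional sequence-specific features or its scoring scheme are reproduced.

```python
def seed_sites(mirna, utr):
    """Return 0-based start positions in `utr` that pair with miRNA nt 2-8."""
    complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
    # Accept DNA or RNA alphabets by converting T to U.
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    seed = mirna[1:8]  # nucleotides 2-8 (7 nt), counted from the 5' end
    # The target site is the reverse complement of the seed.
    site = "".join(complement[base] for base in reversed(seed))
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)  # allow overlapping matches
    return positions

# let-7a sequence; the target fragment below is invented for illustration.
let7a = "UGAGGUAGUAGGUUGUAUAGUU"
hits = seed_sites(let7a, "AAACUACCUCAGGG")  # -> [3]
```

A seed-only search of this kind is the baseline; the abstract's point is that determinants beyond the seed carry additional predictive signal.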


2019 ◽  
Vol 48 (D1) ◽  
pp. D127-D131 ◽  
Author(s):  
Yuhao Chen ◽  
Xiaowei Wang

Abstract MicroRNAs (miRNAs) are small noncoding RNAs that act as master regulators in many biological processes. miRNAs function mainly by downregulating the expression of their gene targets. Thus, accurate prediction of miRNA targets is critical for characterization of miRNA functions. To this end, we have developed an online database, miRDB, for miRNA target prediction and functional annotations. Recently, we have performed major updates to miRDB. Specifically, by employing an improved algorithm for miRNA target prediction, we now present updated transcriptome-wide target prediction data in miRDB, including 3.5 million predicted targets regulated by 7000 miRNAs in five species. Further, we have implemented the new prediction algorithm in a web server, allowing custom target prediction with user-provided sequences. Another new database feature is the prediction of cell-specific miRNA targets. miRDB now hosts the expression profiles of over 1000 cell lines and presents target prediction data that are tailored for specific cell models. Finally, a new web query interface has been added to miRDB for prediction of miRNA functions by integrative analysis of target prediction and Gene Ontology data. All data in miRDB are freely accessible at http://mirdb.org.


Author(s):  
Zihai Qin ◽  
Junji Li ◽  
Ye Zhang ◽  
Yufei Xiao ◽  
Xiaoning Zhang ◽  
...  

Abstract MicroRNAs (miRNAs) are small noncoding RNAs (18∼24 nt) that function in many biological processes in plants. Although Eucalyptus trees are widely planted across the world, our understanding of miRNA regulation in the somatic embryogenesis (SE) of Eucalyptus is still poor. Here we report, for the first time, the miRNA profiles of differentiated and dedifferentiated tissues of two Eucalyptus species and identify miRNAs involved in SE of Eucalyptus. Stem and tissue-culture-induced callus were obtained from the subculture seedlings of E. camaldulensis and E. grandis x urophylla, and were used as differentiated and dedifferentiated samples, respectively. Small RNA sequencing generated 304.2 million clean reads for the Eucalyptus samples (n = 3) and identified 888 miRNA precursors (197 known and 691 novel) for Eucalyptus. These miRNAs were mainly distributed on chromosomes Chr03, Chr05 and Chr08, and can produce 46 miRNA clusters. We then identified 327 and 343 differentially expressed miRNAs (DEmiRs) in the dedifferentiation process of E. camaldulensis and E. grandis x urophylla, respectively. DEmiRs shared by the two Eucalyptus species, such as MIR156, MIR159, MIR160, MIR164, MIR166, MIR169, MIR171, MIR399 and MIR482, might be involved in the development of embryonic callus. Notably, we identified 81 up-regulated and 67 down-regulated miRNAs specific to E. camaldulensis, which might be associated with its high embryogenic potential. Target prediction and functional analysis showed that they might be involved in longevity-regulating and plant hormone signal transduction pathways. Further, using the gene expression profiles, we observed negative regulation of miRNA∼target pairs, such as MIR160∼ARF18, MIR396∼GRF6, MIR166∼ATHB15/HD-ZIP and MIR156/MIR157∼SPL1. Interestingly, transcription factors such as WRKY, MYB, GAMYB, TCP4 and PIL1 were found to be regulated by the DEmiRs.
The genes encoding PIL1 and RPS21C, regulated by up-regulated miRNAs (e.g., egd-N-miR63-5p, egd-N-miR63-5p and MIR169), were down-regulated exclusively in the dedifferentiation of E. camaldulensis. This is the first study of miRNA regulation in the dedifferentiation process of Eucalyptus, and it provides a valuable resource for future studies. More importantly, it will improve our understanding of miRNA regulation during the somatic embryogenesis of Eucalyptus and benefit the Eucalyptus breeding program.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Gilad Ben Or ◽  
Isana Veksler-Lublinsky

Abstract Background MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression post-transcriptionally via base-pairing with complementary sequences on messenger RNAs (mRNAs). Due to the technical challenges involved in the application of high-throughput experimental methods, datasets of direct bona fide miRNA targets exist only for a few model organisms. Machine learning (ML)-based target prediction models were successfully trained and tested on some of these datasets. There is a need to further apply the trained models to organisms in which experimental training data are unavailable. However, it is largely unknown how the features of miRNA–target interactions evolve and whether some features have remained fixed during evolution, raising questions regarding the general, cross-species applicability of currently available ML methods. Results We examined the evolution of miRNA–target interaction rules and used data science and ML approaches to investigate whether these rules are transferable between species. We analyzed eight datasets of direct miRNA–target interactions in four species (human, mouse, worm, cattle). Using ML classifiers, we achieved high accuracy for intra-dataset classification and found that the most influential features of all datasets overlap significantly. To explore the relationships between datasets, we measured the divergence of their miRNA seed sequences and evaluated the performance of cross-dataset classification. We found that both measures coincide with the evolutionary distance between the compared species. Conclusions The transferability of miRNA–targeting rules between species depends on several factors, the most associated factors being the composition of seed families and evolutionary distance. Furthermore, our feature-importance results suggest that some miRNA–target features have evolved while others remained fixed during the evolution of the species. 
Our findings lay the foundation for the future development of target prediction tools that could be applied to “non-model” organisms for which minimal experimental data are available. Availability and implementation The code is freely available at https://github.com/gbenor/TPVOD.
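
The abstract reports measuring the divergence between the miRNA seed sequences of different datasets but does not say which measure was used. As a hedged illustration, the sketch below compares two seed-frequency distributions with the Jensen-Shannon divergence. The choice of that measure, the seed definition (nucleotides 2 to 8) and the function names are assumptions made for this example and are not taken from the TPVOD code.

```python
import math
from collections import Counter

def seed_distribution(mirnas):
    """Relative frequency of each seed (nt 2-8) in a list of miRNA sequences."""
    counts = Counter(seq[1:8] for seq in mirnas)
    total = sum(counts.values())
    return {seed: n / total for seed, n in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    keys = set(p) | set(q)
    # Mixture distribution used as the common reference.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # Kullback-Leibler divergence of `a` from the mixture `m`.
        return sum(pa * math.log2(pa / m[k]) for k, pa in a.items() if pa > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)

# Invented miRNA sequences standing in for two datasets.
dataset_a = seed_distribution(["UGAGGUAGUAGG", "UGAGGUAGUUUU"])
dataset_b = seed_distribution(["UAAAGUGCUUAU", "UAAAGUGCAAAA"])
```

A low value indicates that two datasets are dominated by the same seed families, which the abstract links to how well targeting rules transfer between them.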


2015 ◽  
Vol 14s5 ◽  
pp. CIN.S30563 ◽  
Author(s):  
Xuepo Ma ◽  
Ying Zhu ◽  
Yufei Huang ◽  
Tony Tegeler ◽  
Shou-Jiang Gao ◽  
...  

Motivation Among many large-scale proteomic quantification methods, 18O/16O labeling requires neither a specific amino acid in peptides nor label incorporation through several cell cycles, as in metabolic labeling; it does not cause significant elution time shifts between heavy- and light-labeled peptides, and its dynamic range of quantification is larger than that of tandem mass spectrometry-based quantification methods. These properties give 18O/16O labeling the maximum flexibility in application. However, 18O/16O labeling introduces large quantification variations due to varying labeling efficiency. A processing pipeline that ensures the reliable identification of differentially expressed proteins (DEPs) is lacking. This motivates us to develop a quantitative proteomic approach based on 18O/16O labeling and apply it to Kaposi sarcoma-associated herpesvirus (KSHV) microRNA (miR) target prediction. KSHV is a human pathogenic γ-herpesvirus strongly associated with the development of B-cell proliferative disorders, including primary effusion lymphoma. Recent studies suggest that miRs have evolved a highly complex network of interactions with the cellular and viral transcriptomes, and relatively few KSHV miR targets have been characterized at the functional level. While the new miR target prediction method, photoactivatable ribonucleoside-enhanced cross-linking and immunoprecipitation (PAR-CLIP), allows the identification of thousands of miR targets, the link between miRs and their targets still cannot be determined. We propose to apply the developed proteomic approach to establish such links. Method We integrate several 18O/16O data processing algorithms that we published recently and identify the messenger RNAs of downregulated proteins as potential targets in KSHV miR-transfected human embryonic kidney 293T cells.
Various statistical tests are employed for picking DEPs, and we select the best test by examining the enrichment of PAR-CLIP-reported targets with a seed match to the miRs of interest among the top-ranked DEPs returned by each statistical test. Subsequently, the list of DEPs picked by the selected statistical test is filtered with the criteria that they must have downregulated gene expression, must have been reported as targets by the miR target prediction algorithm SVMicrO, and must have been reported as targets by PAR-CLIP. Result We test the developed approach on the problem of finding targets of KSHV miR-K1. The RNAs of three DEPs are identified as miR-K1 targets, among which RAB23 and HNRNPU are novel. Results from both Western blotting and luciferase reporter assays confirm the novel targets. These results show that the developed quantitative approach based on 18O/16O labeling can be combined with genomic, PAR-CLIP, and target prediction algorithms for the confident identification of KSHV miR targets. The developed approach could also be applied in other applications.
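
The final filtering step in the Method amounts to requiring that a DEP satisfy all three criteria at once, that is, a set intersection. The sketch below shows that logic only. The function name `filter_candidates` and the placeholder gene identifiers are hypothetical, and the upstream steps (18O/16O quantification, the statistical test, the prediction algorithm and PAR-CLIP) are assumed to have already produced the input lists.

```python
def filter_candidates(deps, downregulated_mrna, predicted_targets, parclip_targets):
    """Keep DEPs that pass every criterion, preserving the DEP ranking order.

    deps:               ranked gene IDs of differentially expressed proteins
    downregulated_mrna: gene IDs with downregulated gene expression
    predicted_targets:  gene IDs reported by a target prediction algorithm
    parclip_targets:    gene IDs reported as targets by PAR-CLIP
    """
    # A gene must appear in all three supporting sets.
    required = (
        set(downregulated_mrna) & set(predicted_targets) & set(parclip_targets)
    )
    return [gene for gene in deps if gene in required]

# Placeholder identifiers, not real results.
selected = filter_candidates(
    deps=["GENE_C", "GENE_A", "GENE_B", "GENE_D"],
    downregulated_mrna=["GENE_A", "GENE_B", "GENE_C"],
    predicted_targets=["GENE_A", "GENE_C", "GENE_D"],
    parclip_targets=["GENE_C", "GENE_A"],
)
```

Requiring agreement across independent lines of evidence trades recall for confidence, which matches the small number of targets the Result section reports.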


2021 ◽  
Author(s):  
Gilad Ben Or ◽  
Isana Veksler-Lublinsky

Abstract Background MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression post-transcriptionally via base-pairing with complementary sequences on messenger RNAs (mRNAs). Due to the technical challenges involved in the application of high-throughput experimental methods, datasets of direct bona-fide miRNA targets exist only for a few model organisms. Machine learning (ML) based target prediction methods were successfully trained and tested on some of these datasets. There is a need to further apply the trained models to organisms where experimental training data is unavailable. However, it is largely unknown how the features of miRNA-target interactions evolve and whether there are features that have been fixed during evolution, questioning the general applicability of these ML methods across species. Results In this paper, we examined the evolution of miRNA-target interaction rules and used data science and ML approaches to investigate whether these rules are transferable between species. We analyzed eight datasets of direct miRNA-target interactions in four organisms (human, mouse, worm, cattle). Using ML classifiers, we achieved high accuracy for intra-dataset classification and found that the most influential features of all datasets significantly overlap. To explore the relationships between datasets we measured the divergence of their miRNA seed sequences and evaluated the performance of cross-datasets classification. We showed that both measures coincide with the evolutionary distance of the compared organisms. Conclusions Our results indicate that the transferability of miRNA-targeting rules between organisms depends on several factors, the most associated factors being the composition of seed families and evolutionary distance. Furthermore, our feature importance results suggest that some miRNA-target features have been evolving while some have been fixed during evolution.
Our study lays the foundation for the future developments of target prediction tools that could be applied to “non-model” organisms for which minimal experimental data is available. Availability and implementation The code is freely available at https://github.com/gbenor/TPVOD

