Multitask regression for condition-specific prioritization of miRNA targets in transcripts

2016 ◽  
Author(s):  
Azim Dehghani Amirabad ◽  
Marcel H Schulz

Deregulation of miRNAs is implicated in many diseases, in particular cancer, where miRNAs can act as tumour suppressors or oncogenes. As sequence-based miRNA target predictions do not provide condition-specific information, many algorithms combine expression data for miRNAs and genes to prioritize miRNA targets. However, common strategies prioritize miRNA-gene associations, although a miRNA may target only a subset of the alternative transcripts produced by a gene. Thus, current approaches are suboptimal. Here we address the problem of transcript-based rather than gene-based miRNA target prioritization. We show how methods developed for gene-expression-based miRNA target prioritization can be applied to transcripts. In addition, we introduce a new multitask learning (MTL) method that uses structured-sparsity-inducing regularization to improve learning accuracy. As shown with simulated data, the new MTL approach performs especially favorably in small-sample-size settings, for genes with many transcripts, and with noisy transcript expression level estimates. In an analysis of real liver cancer RNA-seq data, we show that the MTL approach better predicts transcript expression and outperforms simpler approaches for miRNA target prediction.
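
As a rough illustration of what a multitask regression with structured-sparsity regularization can look like, the objective below uses a row-wise group (l2,1) penalty. This is a generic textbook form and only an assumption: the abstract does not state which penalty the authors use. All symbols are introduced here for illustration: X is an n x p matrix of miRNA expression over n samples, y_t is the expression of transcript t of one gene, and W is the p x T coefficient matrix with one column per transcript.

```latex
\min_{W \in \mathbb{R}^{p \times T}}
  \; \frac{1}{2} \sum_{t=1}^{T} \lVert y_t - X w_t \rVert_2^2
  \; + \; \lambda \sum_{j=1}^{p} \lVert W_{j\cdot} \rVert_2
```

Here w_t is column t of W and W_{j.} is row j, i.e. the coefficients of miRNA j across all transcripts of the gene. Penalizing whole rows tends to drop a miRNA from all transcripts at once, which is one way of sharing information across the regression tasks.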


2019 ◽  
Vol 14 (5) ◽  
pp. 432-445 ◽  
Author(s):  
Muniba Faiza ◽  
Khushnuma Tanveer ◽  
Saman Fatihi ◽  
Yonghua Wang ◽  
Khalid Raza

Background: MicroRNAs (miRNAs) are small non-coding RNAs that control gene expression at the post-transcriptional level through complementary base pairing with the target mRNA, leading to mRNA degradation and blocking of translation. Many dysfunctions of these small regulatory molecules have been linked to the development and progression of several diseases. Therefore, it is necessary to reliably predict potential miRNA targets. Objective: A large number of computational prediction tools have been developed that provide a faster way to find putative miRNA targets, but their results are often inconsistent. Hence, finding reliable, functional miRNA targets is still a challenging task. Also, each tool uses different algorithms, and it is difficult for biologists to know which tool is the best choice for their study. Methods: We analyzed eleven miRNA target predictors on Drosophila melanogaster and Homo sapiens, applying empirical methods to assess their accuracy and performance using experimentally validated, high-confidence mature miRNAs and their targets. In addition, this paper describes miRNA target prediction algorithms and discusses common features of frequently used target prediction tools. Results: The results show that MicroT, microRNA and CoMir are the best-performing tools on Drosophila melanogaster, while TargetScan and miRmap perform well for Homo sapiens. The predicted results of each tool were combined in order to improve performance on both datasets, but no significant improvement in true positives was observed. Conclusion: The currently available miRNA target prediction tools suffer greatly from a large number of false positives. Therefore, computational prediction of significant targets with high statistical confidence is still an open challenge.
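
The idea of combining the predictions of several tools and counting true positives against validated targets can be sketched as follows. This is a minimal sketch under assumptions: the abstract does not say how predictions were combined, so a simple vote threshold is used here (1 vote gives the union, one vote per tool gives the intersection), and the tool names, miRNA names and gene names are hypothetical toy data.

```python
def combine_predictions(tool_predictions, min_votes=1):
    """Return the (miRNA, target) pairs predicted by at least `min_votes` tools.

    `tool_predictions` maps a tool name to a set of (miRNA, target) pairs.
    """
    votes = {}
    for pairs in tool_predictions.values():
        for pair in pairs:
            votes[pair] = votes.get(pair, 0) + 1
    return {pair for pair, n in votes.items() if n >= min_votes}


def count_true_positives(predicted, validated):
    """Number of predicted pairs that are also experimentally validated."""
    return len(predicted & validated)


# Hypothetical toy data, for illustration only.
tool_predictions = {
    "tool_a": {("miR-1", "geneX"), ("miR-1", "geneY")},
    "tool_b": {("miR-1", "geneX"), ("miR-2", "geneZ")},
}
validated = {("miR-1", "geneX"), ("miR-2", "geneZ")}

union = combine_predictions(tool_predictions, min_votes=1)
consensus = combine_predictions(tool_predictions, min_votes=2)
```

In this toy case the union keeps both validated pairs but also one unvalidated pair, while the consensus drops the unvalidated pair at the cost of one validated pair.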


2018 ◽  
Vol 16 (04) ◽  
pp. 1850013 ◽  
Author(s):  
Mohammad Mohebbi ◽  
Liang Ding ◽  
Russell L. Malmberg ◽  
Cory Momany ◽  
Khaled Rasheed ◽  
...  

miRNAs are involved in many critical cellular activities through binding to their mRNA targets, e.g. in cell proliferation, differentiation, death, growth control, and developmental timing. Accurate prediction of miRNA targets can assist efficient experimental investigation of the functional roles of miRNAs. Their prediction, however, remains a challenging task due to the lack of experimental data on the tertiary structure of miRNA-target binding duplexes. In particular, correlations of nucleotides in the binding duplexes may not be limited to the canonical Watson-Crick base pairs (BPs), as has been assumed; methods based on secondary structure prediction (typically minimum free energy (MFE)) have had only mixed success. In this work, we characterized miRNA binding duplexes with a graph model to capture the correlations between pairs of nucleotides of a miRNA and its target sequences. We developed machine learning algorithms to train the graph model to predict the target sites of miRNAs. In particular, because imbalance between positive and negative samples can significantly degrade the performance of machine learning methods, we designed a novel method to re-sample the available dataset to produce more informative data for the learning process. We evaluated our model and miRNA target prediction method on human miRNAs and target data obtained from miRTarBase, a database of experimentally verified miRNA-target interactions. Our method achieved a sensitivity of 86% with a false positive rate below 13%. On the test data, our method outperforms the state-of-the-art methods miRanda and RNAhybrid by a significant margin. The source code, test sets and model files are all available at http://rna-informatics.uga.edu/?f=software&p=GraB-miTarget .
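
The two figures reported above, sensitivity and false positive rate, follow from the counts of a binary confusion matrix. The sketch below uses the standard definitions on made-up labels; the function names and toy data are assumptions for illustration and are not taken from the paper's software.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, fp, tn) for binary labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fn, fp, tn


def sensitivity(tp, fn):
    """Fraction of real targets that were predicted: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def false_positive_rate(fp, tn):
    """Fraction of non-targets that were wrongly predicted: FP / (FP + TN)."""
    return fp / (fp + tn) if (fp + tn) else 0.0


# Hypothetical toy labels: 1 = true target site, 0 = non-target.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
sens = sensitivity(tp, fn)
fpr = false_positive_rate(fp, tn)
```

Because sensitivity uses only the positive samples and the false positive rate uses only the negative samples, neither number changes when the ratio of positives to negatives in the test set changes, which makes the pair convenient for imbalanced data.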


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Tongjun Gu ◽  
Xiwu Zhao ◽  
William Bradley Barbazuk ◽  
Ji-Hyun Lee

Background: microRNAs (miRNAs) have been shown to play essential roles in a wide range of biological processes. Many computational methods have been developed to identify targets of miRNAs. However, the majority of these methods depend on pre-defined features that require considerable effort and resources to compute and often prove suboptimal at predicting miRNA targets. Results: We developed a novel hybrid deep learning-based (DL-based) approach that is capable of predicting miRNA targets at a higher accuracy. This approach integrates convolutional neural networks (CNNs) that excel in learning spatial features and recurrent neural networks (RNNs) that discern sequential features. Therefore, our approach has the advantage of learning both the intrinsic spatial and sequential features of miRNA:target. The inputs for our approach are raw sequences of miRNAs and genes that can be obtained effortlessly. We applied our approach to two human datasets from recent miRNA target prediction studies and trained two models. We demonstrated that the two models consistently outperform the previous methods according to evaluation metrics on test datasets. Comparing our approach with currently available alternatives on independent datasets shows that our approach delivers substantial improvements in performance. We also show with multiple lines of evidence that our approach is more robust than other methods on small datasets. Our study is the first to perform comparisons across multiple existing DL-based approaches to miRNA target prediction. Furthermore, we examined the contribution of a Max pooling layer between the CNN and RNN and demonstrated that it improves the performance of all our models. Finally, a unified model was developed that is robust in fitting different input datasets. Conclusions: We present a new DL-based approach for predicting miRNA targets and demonstrate that our approach outperforms the current alternatives. We supplied an easy-to-use tool, miTAR, at https://github.com/tjgu/miTAR. Furthermore, our analysis results support that Max Pooling generally benefits the hybrid models and potentially prevents overfitting for hybrid models.


2020 ◽  
Author(s):  
Tongjun Gu ◽  
Xiwu Zhao ◽  
William Bradley Barbazuk ◽  
Ji-Hyun Lee

microRNAs (miRNAs) are a major type of small RNA that alter gene expression at the post-transcriptional or translational level. They have been shown to play important roles in a wide range of biological processes. Many computational methods have been developed to predict targets of miRNAs in order to understand miRNAs' function. However, the majority of the methods depend on a set of pre-defined features that require considerable effort and resources to compute, and these methods often do not perform effectively in predicting miRNA targets. Therefore, we developed a novel hybrid deep learning-based approach that is capable of predicting miRNA targets at a higher accuracy. Our approach integrates two deep learning methods: convolutional neural networks (CNNs) that excel in learning spatial features, and recurrent neural networks (RNNs) that discern sequential features. By combining CNNs and RNNs, our approach has the advantage of learning both the intrinsic spatial and sequential features of miRNA:target. The inputs for the approach are raw miRNA and gene sequences. Data from the two latest miRNA target prediction studies were used in our study: the DeepMirTar dataset and the miRAW dataset. Two models were obtained by training on the two datasets separately. The models achieved a higher accuracy than the methods developed in the previous studies: 0.9787 vs. 0.9348 for the DeepMirTar dataset; 0.9649 vs. 0.935 for the miRAW dataset. We also calculated a series of model evaluation metrics including sensitivity, specificity, F-score and Brier Score. Our approach consistently outperformed the current methods. In addition, we compared our approach with previously developed deep learning methods and found an overall better performance. Lastly, a unified model for both datasets was developed with an accuracy higher than the current methods (0.9545). We named the unified model miTAR for miRNA target prediction. The source code and executable are available at https://github.com/tjgu/miTAR.
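
Three of the evaluation metrics named above (accuracy, F-score and Brier score) can be computed from predicted probabilities as sketched below. This is a minimal sketch on made-up numbers, not the miTAR evaluation code: it assumes the F-score is the F1 score and that a probability of 0.5 or more counts as a predicted target, neither of which is stated in the abstract.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy data: 1 = target, 0 = non-target.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.1]
# Assumed decision threshold of 0.5.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
brier = brier_score(y_true, y_prob)
```

Unlike accuracy and F1, the Brier score is computed on the probabilities themselves, so it also penalizes a model that is correct but poorly calibrated; lower values are better.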


Author(s):  
Lianbo Yu ◽  
Parul Gulati ◽  
Soledad Fernandez ◽  
Michael Pennell ◽  
Lawrence Kirschner ◽  
...  

Gene expression microarray experiments with few replications lead to great variability in estimates of gene variances. Several Bayesian methods have been developed to reduce this variability and to increase power. Thus far, moderated t methods assumed a constant coefficient of variation (CV) for the gene variances. We provide evidence against this assumption, and extend the method by allowing the CV to vary with gene expression. Our CV varying method, which we refer to as the fully moderated t-statistic, was compared to three other methods (ordinary t, and two moderated t predecessors). A simulation study and a familiar spike-in data set were used to assess the performance of the testing methods. The results showed that our CV varying method had higher power than the other three methods, identified a greater number of true positives in spike-in data, fit simulated data under varying assumptions very well, and in a real data set better identified higher expressing genes that were consistent with functional pathways associated with the experiments.
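
For orientation, the moderated t-statistic that this work builds on is commonly written as below (the empirical Bayes form used by earlier moderated t methods). The notation is an assumption introduced here and is not taken from the abstract: s_g^2 is the sample variance of gene g with d_g degrees of freedom, s_0^2 and d_0 are the prior variance and prior degrees of freedom estimated from all genes, \hat{\beta}_g is the estimated log fold change, and v_g is its unscaled variance factor.

```latex
\tilde{s}_g^2 = \frac{d_0 s_0^2 + d_g s_g^2}{d_0 + d_g},
\qquad
\tilde{t}_g = \frac{\hat{\beta}_g}{\tilde{s}_g \sqrt{v_g}}
```

Under the null hypothesis, \tilde{t}_g follows a t-distribution with d_0 + d_g degrees of freedom. The extension described in the abstract lets the prior depend on the gene's expression level instead of being constant; its exact form is not given in the abstract and is not reproduced here.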


2019 ◽  
Vol 48 (D1) ◽  
pp. D127-D131 ◽  
Author(s):  
Yuhao Chen ◽  
Xiaowei Wang

MicroRNAs (miRNAs) are small noncoding RNAs that act as master regulators in many biological processes. miRNAs function mainly by downregulating the expression of their gene targets. Thus, accurate prediction of miRNA targets is critical for characterization of miRNA functions. To this end, we have developed an online database, miRDB, for miRNA target prediction and functional annotations. Recently, we have performed major updates for miRDB. Specifically, by employing an improved algorithm for miRNA target prediction, we now present updated transcriptome-wide target prediction data in miRDB, including 3.5 million predicted targets regulated by 7000 miRNAs in five species. Further, we have implemented the new prediction algorithm in a web server, allowing custom target prediction with user-provided sequences. Another new database feature is the prediction of cell-specific miRNA targets. miRDB now hosts the expression profiles of over 1000 cell lines and presents target prediction data that are tailored for specific cell models. Finally, a new web query interface has been added to miRDB for prediction of miRNA functions by integrative analysis of target prediction and Gene Ontology data. All data in miRDB are freely accessible at http://mirdb.org.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Jing Liu ◽  
Xiaonan Liu ◽  
Siju Zhang ◽  
Shanshan Liang ◽  
Weijiang Luan ◽  
...  

Background: In plants, microRNAs (miRNAs) are pivotal regulators of plant development and stress responses. Different computational tools and web servers have been developed for plant miRNA target prediction; however, in silico prediction normally contains false positive results. In addition, many plant miRNA target prediction servers lack information for miRNA-triggered phased small interfering RNAs (phasiRNAs). Creating a comprehensive and relatively high-confidence plant miRNA target database is much needed. Results: Here, we report TarDB, an online database that collects three categories of relatively high-confidence plant miRNA targets: (i) cross-species conserved miRNA targets; (ii) degradome/PARE (Parallel Analysis of RNA Ends) sequencing supported miRNA targets; (iii) miRNA-triggered phasiRNA loci. TarDB provides a user-friendly interface that enables users to easily search, browse and retrieve miRNA targets and miRNA initiated phasiRNAs in a broad variety of plants. TarDB has a comprehensive collection of reliable plant miRNA targets containing previously unreported miRNA targets and miRNA-triggered phasiRNAs even in the well-studied model species. Most of these novel miRNA targets are relevant to lineage-specific or species-specific miRNAs. TarDB data is freely available at http://www.biosequencing.cn/TarDB. Conclusions: In summary, TarDB serves as a useful web resource for exploring relatively high-confidence miRNA targets and miRNA-triggered phasiRNAs in plants.


2015 ◽  
Author(s):  
Wilson Wen Bin Goh ◽  
Limsoon Wong

Proteomics is poised to play critical roles in clinical research. However, due to limited coverage and high noise, integration with powerful analysis algorithms is necessary. In particular, network-based algorithms can improve selection of reproducible features in spite of incomplete proteome coverage, technical inconsistency or high inter-sample variability. We define analytical reliability on three benchmarks: precision/recall rates, feature-selection stability and cross-validation accuracy. Using these, we demonstrate the insufficiencies of the commonly used Student's t-test and hypergeometric enrichment. Given advances in sample sizes, quantitation accuracy and coverage, we are now able to introduce and evaluate Rank-Based Network Approaches (RBNAs) for the first time in proteomics. These include SNET (SubNETwork), FSNET (FuzzySNET) and PFSNET (PairedFSNET). We also introduce, for the first time, PPFSNET (samplePairedPFSNET), which is a paired-sample variant of PFSNET. RBNAs (particularly PFSNET and PPFSNET) excelled on all three benchmarks and can make consistent and reproducible predictions even in the small-sample-size scenario (n=4). Given these qualities, RBNAs represent an important advancement in network biology and are expected to see practical usage, particularly in clinical biomarker and drug target prediction.
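
Hypergeometric enrichment, one of the two baseline methods mentioned above, is usually computed as an upper-tail probability. The sketch below is the standard over-representation test and is not code from the paper; the function and parameter names are hypothetical. It asks how likely it is to see at least k members of a protein set among n selected proteins, when the set has K members in a background of N proteins.

```python
from math import comb


def hypergeom_enrichment_pvalue(N, K, n, k):
    """Upper-tail hypergeometric probability P(X >= k).

    N: background size, K: size of the protein set,
    n: number of selected proteins, k: selected proteins that are in the set.
    """
    upper = min(K, n)  # the overlap cannot exceed either group size
    total = comb(N, n)  # number of ways to choose the n selected proteins
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy example: background of 10 proteins, a set of 4, 3 selected, 2 in the set.
p_value = hypergeom_enrichment_pvalue(N=10, K=4, n=3, k=2)
```

For the toy numbers the tail sums to (6 * 6 + 4 * 1) / 120, so the p-value is 1/3. The test looks only at the size of the overlap with a fixed list of selected proteins, so its outcome depends on how that list was chosen.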

