miRNA prediction
Recently Published Documents


TOTAL DOCUMENTS

19
(FIVE YEARS 5)

H-INDEX

5
(FIVE YEARS 1)

Genes ◽  
2021 ◽  
Vol 12 (8) ◽  
pp. 1280
Author(s):  
Dashuai Fan ◽  
Yuangen Yao ◽  
Ming Yi

MicroRNAs (miRNAs) are short non-coding ribonucleic acid molecules that regulate gene expression. The computational identification of plant miRNAs is of great significance for understanding their biological functions. In our previous studies, we first proposed and then further developed a set of knowledge-based energy features to construct two plant pre-miRNA prediction tools (plantMirP and riceMirP). However, these two tools cannot be used for miRNA prediction from next-generation sequencing (NGS) data. To further improve prediction performance and accessibility, plantMirP2 has been developed. Based on the latest dataset, plantMirP2 achieves promising performance: an area under the curve (AUC) of 0.9968, an accuracy of 0.9754, a sensitivity of 0.9675, and a specificity of 0.9876. Comparisons with other plant pre-miRNA tools show that plantMirP2 performs better. Finally, a webserver and a stand-alone version of plantMirP2 are available.
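The four metrics quoted in this abstract can be made concrete with a minimal sketch. The functions below (`binary_metrics` and `pairwise_auc`) and the toy labels and scores are hypothetical illustrations, not part of plantMirP2, and they do not reproduce the reported values. The AUC is computed in its pairwise-ranking form: the fraction of positive/negative pairs in which the positive example receives the higher score.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for 0/1 labels (1 = pre-miRNA)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty class) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


def pairwise_auc(pos_scores, neg_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is quadratic in the number of
    examples, which is fine for a small illustration.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    print(binary_metrics(y_true, y_pred))
    print(pairwise_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
```

Sensitivity and specificity are each computed within one class, so neither depends on how many negatives accompany each positive. Accuracy does depend on that ratio.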


2020 ◽  
Vol 12 (4) ◽  
pp. 395-413
Author(s):  
Tianyang Yu ◽  
Na Xu ◽  
Neshatul Haque ◽  
Chang Gao ◽  
Wenhua Huang ◽  
...  

Genes ◽  
2020 ◽  
Vol 11 (6) ◽  
pp. 662
Author(s):  
Huiyu Zhang ◽  
Hua Wang ◽  
Yuangen Yao ◽  
Ming Yi

Rice microRNAs (miRNAs) are important post-transcriptional regulatory factors and play vital roles in many biological processes, such as growth, development, and stress resistance. Identifying these molecules is the basis for dissecting their regulatory functions. Various machine learning techniques have been developed to identify precursor miRNAs (pre-miRNAs). However, no tool has been implemented specifically for rice pre-miRNAs. This study aims to improve the prediction performance for rice pre-miRNAs by constructing novel features with high discriminatory power and training a model on species-specific data. PlantMirP-rice, a stand-alone random forest-based miRNA prediction tool, achieves a promising accuracy of 93.48% on independent (unseen) rice data. Comparisons with other competitive pre-miRNA prediction methods demonstrate that plantMirP-rice performs better than existing tools for rice and other plant pre-miRNA classification.


Data in Brief ◽  
2019 ◽  
Vol 25 ◽  
pp. 104209 ◽  
Author(s):  
L.A. Bugnon ◽  
C. Yones ◽  
J. Raad ◽  
D.H. Milone ◽  
G. Stegmayer

2018 ◽  
Author(s):  
R.J. Peace ◽  
M. Sheikh Hassani ◽  
J.R. Green

Abstract: Methods for the de novo identification of microRNA (miRNA) have been developed using a range of sequence-based features. With the increasing availability of next-generation sequencing (NGS) transcriptome data, there is a need for miRNA identification that integrates both NGS transcript expression-based patterns and advanced genomic sequence-based methods. While miRDeep2 does examine the predicted secondary structure of putative miRNA sequences, it does not leverage many of the sequence-based features used in state-of-the-art de novo methods. Meanwhile, other NGS-based methods, such as miRanalyzer, place an emphasis on sequence-based features without leveraging advanced expression-based features reflecting miRNA biosynthesis. This represents an opportunity to combine the strengths of NGS-based analysis with recent advances in de novo sequence-based miRNA prediction. Here we develop a method, microRNA Prediction using Integrated Evidence (miPIE), which integrates both expression-based and sequence-based features to achieve significantly improved miRNA prediction performance. Feature selection identifies the 20 most discriminative features, 3 of which reflect strictly expression-based information. Evaluation using precision-recall curves, for six NGS data sets representing six diverse species, demonstrates substantial improvements in prediction performance compared to miRDeep2 and miRanalyzer. The individual contributions of expression-based and sequence-based features are also examined, and we demonstrate that their combination is more effective than either alone.
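As a rough illustration of the precision-recall evaluation mentioned in this abstract, the sketch below traces a precision-recall curve by lowering the decision threshold one candidate at a time. The function name `precision_recall_points` and the toy data are hypothetical; this is not the miPIE evaluation code, and it assumes all scores are distinct (ties are not grouped).

```python
def precision_recall_points(labels, scores):
    """Return (recall, precision) after each candidate, ranked by score.

    labels: 1 for a true miRNA, 0 otherwise.
    scores: classifier confidence, higher means more miRNA-like.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive example is required")

    # Rank candidates from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    points = []
    tp = fp = 0
    for _score, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        recall = tp / total_pos
        precision = tp / (tp + fp)
        points.append((recall, precision))
    return points


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labels = [1, 0, 1, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.4, 0.2]
    for recall, precision in precision_recall_points(labels, scores):
        print(f"recall={recall:.2f} precision={precision:.2f}")
```

Precision-recall curves are a common choice when true miRNAs are rare among candidates, because precision directly reflects how many of the reported hits are real.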


2018 ◽  
Vol 20 (5) ◽  
pp. 1607-1620 ◽  
Author(s):  
Georgina Stegmayer ◽  
Leandro E Di Persia ◽  
Mariano Rubiolo ◽  
Matias Gerard ◽  
Milton Pividori ◽  
...  

Abstract

Motivation: The importance of microRNAs (miRNAs) is now widely recognized because these short segments of RNA play roles in almost all biological processes. The computational prediction of novel miRNAs involves training a classifier to identify the sequences most likely to be precursors of miRNAs (pre-miRNAs). The main difficulty is that well-known pre-miRNAs are usually few in comparison with the hundreds of thousands of candidate sequences in a genome, which results in high class imbalance. This imbalance strongly influences most standard classifiers; if it is not properly addressed in the model and the experiments, the reported performance can be completely unrealistic and the classifier will not work properly for pre-miRNA prediction. Another important issue is that most of the machine learning (ML) approaches already used (supervised methods) require both positive and negative examples. The selection of positive examples is straightforward (well-known pre-miRNAs). However, it is difficult to build a representative set of negative examples because they should be sequences with hairpin structure that do not contain a pre-miRNA.

Results: This review provides a comprehensive study and comparative assessment of methods from two ML approaches to the prediction of novel pre-miRNAs: supervised and unsupervised training. We present and analyze the ML proposals that have appeared in the literature during the past 10 years. They have been compared in several prediction tasks involving two model genomes and increasing imbalance levels. This work provides a review of existing ML approaches for pre-miRNA prediction and fair comparisons of the classifiers with the same features and data sets, instead of just a review of published software tools. The results and the discussion can help the community select the most adequate bioinformatics approach for the prediction task at hand. The comparative results suggest that, from low to mid-imbalance levels between classes, supervised methods can be the best. However, at very high imbalance levels, closer to real case scenarios, models including unsupervised and deep learning can provide better performance.
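The effect of class imbalance on reported performance can be illustrated with a short calculation. The sketch below holds a classifier's sensitivity and specificity fixed and varies the number of negative candidates per true pre-miRNA; the function name `precision_at_imbalance` and the 0.95 operating point are assumptions chosen for illustration, not figures from the review.

```python
def precision_at_imbalance(sensitivity, specificity, negatives_per_positive):
    """Expected precision when each positive comes with N negatives.

    For one positive and N negatives:
      expected true positives  = sensitivity
      expected false positives = (1 - specificity) * N
    """
    true_pos = sensitivity
    false_pos = (1.0 - specificity) * negatives_per_positive
    if true_pos + false_pos == 0:
        return float("nan")  # nothing is predicted positive
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical classifier: 95% sensitivity and 95% specificity.
    for ratio in (1, 10, 100, 1000):
        p = precision_at_imbalance(0.95, 0.95, ratio)
        print(f"1:{ratio:<5} precision = {p:.4f}")
```

With balanced classes this hypothetical classifier has a precision of 0.95, but at 1,000 negatives per positive fewer than 2% of its predicted pre-miRNAs would be real. This is why results measured on balanced test sets can look unrealistically good for genome-wide prediction.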


2018 ◽  
Vol 11 (1) ◽  
pp. 17-24 ◽  
Author(s):  
Sasti Gopal Das ◽  
Hirak Jyoti Chak ◽  
Abhijit Datta

2017 ◽  
Vol 1 (Special Issue-Supplement) ◽  
pp. 230-230
Author(s):  
Pooja Viswam ◽  
Manuel Philip ◽  
M.J. Jayasankar ◽  
Jubina Benny ◽  
C.P. Rajadurai ◽  
...  

2017 ◽  
Vol 14 (6) ◽  
pp. 1316-1326 ◽  
Author(s):  
Georgina Stegmayer ◽  
Cristian Yones ◽  
Laura Kamenetzky ◽  
Diego H. Milone
