iRNAD: a computational tool for identifying D modification sites in RNA sequence

2019 ◽  
Vol 35 (23) ◽  
pp. 4922-4929 ◽  
Author(s):  
Zhao-Chun Xu ◽  
Peng-Mian Feng ◽  
Hui Yang ◽  
Wang-Ren Qiu ◽  
Wei Chen ◽  
...  

Abstract Motivation Dihydrouridine (D) is a common RNA post-transcriptional modification found in eukaryotes, bacteria and a few archaea. The modification can promote the conformational flexibility of individual nucleotide bases, and its levels are increased in cancerous tissues. Therefore, it is necessary to detect D in RNA to further understand its functional roles. Since wet-experimental techniques for this purpose are time-consuming and laborious, it is urgent to develop computational models to identify D modification sites in RNA. Results We constructed a predictor, called iRNAD, for identifying D modification sites in RNA sequences. In this predictor, the RNA samples derived from five species were encoded by nucleotide chemical property and nucleotide density. A support vector machine was used to perform the classification. The final model achieved an overall accuracy of 96.18% with an area under the receiver operating characteristic curve of 0.9839 in the jackknife cross-validation test. Furthermore, we performed a series of validations from several aspects and demonstrated the robustness and reliability of the proposed model. Availability and implementation A user-friendly web-server called iRNAD is freely accessible at http://lin-group.cn/server/iRNAD, which will provide convenience and guidance to users for further studying D modification.
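The encoding step named in the abstract (nucleotide chemical property plus nucleotide density) can be sketched as follows. This is a minimal illustration, not the authors' code: the three-bit property codes (ring structure, functional group, hydrogen bonding) and the prefix-based density are assumptions taken from how this encoding is commonly described, and the function name `encode_ncp_nd` is hypothetical.

```python
# Assumed property codes: (ring structure, functional group, hydrogen bonding).
# These values are an assumption, not taken from the abstract.
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "U": (0, 0, 1)}


def encode_ncp_nd(seq):
    """Return a flat feature list with 4 values per nucleotide.

    For each position i (1-based) the features are the three property
    bits followed by the density of that nucleotide in seq[:i].
    """
    counts = {}
    features = []
    for i, base in enumerate(seq.upper(), start=1):
        counts[base] = counts.get(base, 0) + 1
        density = counts[base] / i  # occurrences so far / prefix length
        features.extend(NCP[base])
        features.append(density)
    return features


if __name__ == "__main__":
    print(encode_ncp_nd("AUA"))
```

The resulting vector could then be passed to any classifier, for example a support vector machine as in the abstract.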

2020 ◽  
Vol 20 (8) ◽  
pp. 592-601
Author(s):  
Zhe Ju ◽  
Shi-Yun Wang

Neddylation is a highly dynamic and reversible post-translational modification which has been found to be involved in various biological processes and closely associated with many diseases. The accurate identification of neddylation sites is necessary to elucidate the underlying molecular mechanisms of neddylation. As the traditional experimental methods are time-consuming and expensive, it is desirable to develop computational methods to predict neddylation sites. In this study, a novel predictor named NeddPred is proposed to predict lysine neddylation sites. An effective feature extraction method, bi-profile Bayes encoding, is employed to encode neddylation sites. Moreover, a fuzzy support vector machine algorithm is proposed to solve the class imbalance and noise problem in the prediction of neddylation sites. As illustrated by 10-fold cross-validation, NeddPred achieves an excellent performance with a Matthews correlation coefficient of 0.7082 and an area under the receiver operating characteristic curve of 0.9769. Independent tests show that NeddPred significantly outperforms the existing neddylation site predictor NeddyPreddy. Therefore, NeddPred can be a complement to the existing tools for the prediction of neddylation sites. A user-friendly web-server for NeddPred is established at 123.206.31.171/NeddPred/.
Introduction: Neddylation is a highly dynamic and reversible post-translational modification. The abnormality of neddylation has previously been shown to be closely related to some human diseases. The detection of neddylation sites is essential for elucidating the regulation mechanisms of protein neddylation. Objective: As the detection of lysine neddylation sites by the traditional experimental method is often expensive and time-consuming, it is imperative to design computational methods to identify neddylation sites. Methods: In this study, a bioinformatics tool named NeddPred is developed to identify underlying protein neddylation sites. A bi-profile Bayes feature extraction is used to encode neddylation sites and a fuzzy support vector machine model is utilized to overcome the problem of noise and class imbalance in the prediction. Results: NeddPred achieved a Matthews correlation coefficient of 0.7082 and an area under the receiver operating characteristic curve of 0.9769. Independent tests show that NeddPred significantly outperforms the existing lysine neddylation site predictor NeddyPreddy. Conclusion: Therefore, NeddPred can be a complement to the existing tools for the prediction of neddylation sites. A user-friendly web-server for NeddPred is accessible at 123.206.31.171/NeddPred/.
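For readers unfamiliar with the Matthews correlation coefficient reported above, the following is a small sketch of how it is computed from confusion-matrix counts. The function name and the convention of returning 0.0 for an undefined denominator are choices made here for illustration; the counts in the usage line are invented and are not NeddPred results.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (an assumed convention
    for the degenerate case where a row or column of the matrix is empty).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Invented counts, for illustration only.
    print(matthews_corrcoef(tp=40, tn=45, fp=5, fn=10))
```

Unlike accuracy, this score stays informative when the positive and negative classes differ greatly in size, which is why it is often reported for imbalanced site-prediction problems.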


2021 ◽  
Vol 22 (24) ◽  
pp. 13607
Author(s):  
Zhou Huang ◽  
Yu Han ◽  
Leibo Liu ◽  
Qinghua Cui ◽  
Yuan Zhou

MicroRNAs (miRNAs) are associated with various complex human diseases and some miRNAs can be directly involved in the mechanisms of disease. Identifying disease-causative miRNAs can provide novel insight into disease pathogenesis from a miRNA perspective and facilitate disease treatment. To date, various computational models have been developed to predict general miRNA–disease associations, but few models are available to further prioritize causal miRNA–disease associations from non-causal associations. Therefore, in this study, we constructed a Levenshtein-Distance-Enhanced miRNA–Disease Causal Association Predictor (LE-MDCAP) to predict potential causal miRNA–disease associations. Specifically, Levenshtein distance matrices covering the sequence, expression and functional miRNA similarities were introduced to enhance the previous Gaussian interaction profile kernel-based similarity matrix. LE-MDCAP integrated miRNA similarity matrices, a disease semantic similarity matrix and known causal miRNA–disease associations to make predictions. For the regular causal vs. non-disease association discrimination task, LE-MDCAP achieved an area under the receiver operating characteristic curve (AUROC) of 0.911 and 0.906 in 10-fold cross-validation and independent test, respectively. More importantly, LE-MDCAP prominently outperformed the previous MDCAP model in distinguishing causal versus non-causal miRNA–disease associations (AUROC 0.820 vs. 0.695). Case studies performed on diabetic retinopathy and hsa-mir-361 also validated the accuracy of our model. In summary, LE-MDCAP could be useful for screening causal miRNA–disease associations from general miRNA–disease associations.
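The Levenshtein distance underlying the matrices mentioned above is the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. A minimal dynamic-programming sketch is given below; how LE-MDCAP applies the distance to expression and functional profiles is not described in the abstract, so this only illustrates the distance itself on plain strings, and the example sequences are made up.

```python
def levenshtein(a, b):
    """Edit distance between sequences a and b (unit cost per edit)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Made-up short RNA strings, for illustration only.
    print(levenshtein("UGAGGUAG", "UGAGCUAGU"))
```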


2021 ◽  
Vol 11 (3) ◽  
pp. 199
Author(s):  
Fajar Javed ◽  
Syed Omer Gilani ◽  
Seemab Latif ◽  
Asim Waris ◽  
Mohsin Jamil ◽  
...  

Perinatal depression and anxiety are defined as the mental health problems a woman faces during pregnancy, around childbirth, and after child delivery. While these often occur in women and affect all family members including the infant, they can easily go undetected and underdiagnosed. The prevalence rates of antenatal depression and anxiety worldwide, especially in low-income countries, are extremely high. The vast majority suffer from mild to moderate depression, with the risk of impaired child–mother relationship and infant health, and a few women end up taking their own lives. Owing to high costs and non-availability of resources, it is almost impossible to diagnose every pregnant woman for depression/anxiety, whereas under-detection can have a lasting impact on mother and child’s health. This work proposes a multi-layer perceptron based neural network (MLP-NN) classifier to predict the risk of depression and anxiety in pregnant women. We trained and evaluated our proposed system on a Pakistani dataset of 500 women in their antenatal period. ReliefF was used for feature selection before classifier training. Evaluation metrics such as accuracy, sensitivity, specificity, precision, F1 score, and area under the receiver operating characteristic curve were used to evaluate the performance of the trained model. Multilayer perceptron and support vector classifier achieved an area under the receiver operating characteristic curve of 88% and 80% for antenatal depression and 85% and 77% for antenatal anxiety, respectively. The system can be used as a facilitator for screening women during their routine visits in the hospital’s gynecology and obstetrics departments.


2018 ◽  
Vol 26 (1) ◽  
pp. 141-155 ◽  
Author(s):  
Li Luo ◽  
Fengyi Zhang ◽  
Yao Yao ◽  
RenRong Gong ◽  
Martina Fu ◽  
...  

Surgery cancellations waste scarce operative resources and hinder patients’ access to operative services. In this study, the Wilcoxon and chi-square tests were used for predictor selection, and three machine learning models – random forest, support vector machine, and XGBoost – were used for the identification of surgeries with high risks of cancellation. The optimal performances of the identification models were as follows: sensitivity − 0.615; specificity − 0.957; positive predictive value − 0.454; negative predictive value − 0.904; accuracy − 0.647; and area under the receiver operating characteristic curve − 0.682. Of the three models, the random forest model achieved the best performance. Thus, the effective identification of surgeries with high risks of cancellation is feasible with stable performance. Models and sampling methods significantly affect the performance of identification. This study is a new application of machine learning for the identification of surgeries with high risks of cancellation and facilitation of surgery resource management.
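The performance measures listed above all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the example counts are illustrative assumptions and do not reproduce the study's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    def ratio(num, den):
        # Report None instead of raising when a metric is undefined.
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),               # true positive rate
        "specificity": ratio(tn, tn + fp),               # true negative rate
        "ppv": ratio(tp, tp + fp),                       # positive predictive value
        "npv": ratio(tn, tn + fn),                       # negative predictive value
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # Invented counts, for illustration only.
    for name, value in classification_metrics(tp=8, fp=5, tn=85, fn=2).items():
        print(name, value)
```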


Author(s):  
Shiqian He ◽  
Liang Kong ◽  
Jing Chen

Accurate detection of N6-methyladenine (6mA) sites by biochemical experiments will help to reveal their biological functions, but these wet experiments are laborious and expensive. Therefore, it is necessary to introduce a powerful computational model to identify the 6mA sites on a genomic scale, especially for plant genomes. In view of this, we proposed a model called iDNA6mA-Rice-DL for the effective identification of 6mA sites in the rice genome, which is an intelligent computing model based on a deep learning method. Traditional machine learning methods require features to be prepared in advance for analysis. However, our proposed model automatically encodes and extracts key DNA features through an embedded layer and several groups of dense layers. We use an independent dataset to evaluate the generalization ability of our model. An area under the receiver operating characteristic curve (auROC) of 0.98 with an accuracy of 95.96% was obtained. The experimental results demonstrate that our model had good performance in predicting 6mA sites in the rice genome. A user-friendly local web server has been established. The Docker image of the local web server can be freely downloaded at https://hub.docker.com/r/his1server/idna6ma-rice-dl .


Stroke ◽  
2019 ◽  
Vol 50 (4) ◽  
pp. 837-844 ◽  
Author(s):  
Carlina E. van Donkelaar ◽  
Nicolaas A. Bakker ◽  
Jaqueline Birks ◽  
Nic J.G.M. Veeger ◽  
Jan D.M. Metzemaekers ◽  
...  

Background and Purpose— Early prediction of clinical outcome after aneurysmal subarachnoid hemorrhage (aSAH) is still lacking accuracy. In this observational cohort study, we aimed to develop and validate an accurate bedside prediction model for clinical outcome after aSAH, to aid decision-making at an early stage. Methods— For the development of the prediction model, a prospectively kept single-center cohort of 1215 aSAH patients, admitted between 1998 and 2014, was used. For temporal validation, a prospective cohort of 224 consecutive aSAH patients from the same center, admitted between 2015 and 2017, was used. External validation was performed using the ISAT (International Subarachnoid Aneurysm Trial) database (2143 patients). Primary outcome measure was poor functional outcome 2 months after aSAH, defined as modified Rankin Scale score 4–6. The model was constructed using multivariate regression analyses. Performance of the model was examined in terms of discrimination and calibration. Results— The final model included 4 predictors independently associated with poor outcome after 2 months: age, World Federation of Neurosurgical Societies grade after resuscitation, aneurysm size, and Fisher grade. Temporal validation showed high discrimination (area under the receiver operating characteristic curve, 0.90; 95% CI, 0.85–0.94), external validation showed fair to good discrimination (area under the receiver operating characteristic curve, 0.73; 95% CI, 0.70–0.76). The model showed satisfactory calibration in both validation cohorts. The SAFIRE grading scale was derived from the final model: size of the aneurysm, age, Fisher grade, world federation of neurosurgical societies after resuscitation. Conclusions— The SAFIRE grading scale is an accurate, generalizable, and easily applicable model for early prediction of clinical outcome after aSAH.
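Discrimination as reported above is measured by the area under the receiver operating characteristic curve, which equals the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. A minimal sketch of that pairwise definition follows; the scores and labels in the usage example are invented and unrelated to the SAFIRE cohorts.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels use 1 for the outcome of interest and 0 otherwise.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Invented predicted risks and outcomes, for illustration only.
    print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

The quadratic loop is fine for a sketch; larger cohorts would normally use a rank-based computation instead.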


Genes ◽  
2020 ◽  
Vol 11 (8) ◽  
pp. 898 ◽  
Author(s):  
Mobeen Ur Rehman ◽  
Kil To Chong

DNA N6-methyladenine (6mA) is part of numerous biological processes including DNA repair, DNA replication, and DNA transcription. The 6mA modification sites have a great impact on biological function. Biochemical experiments have been carried out to identify these sites and have demonstrated good results. However, they proved not to be a practical solution when assessed in terms of cost and time. This led researchers to develop computational models to fulfill the requirement of modification identification. Accordingly, we have developed a computational model following Chou’s 5-steps rule. The Neural Network (NN) model uses convolution layers to extract high-level features from the binary-encoded sequence. These extracted features were then interpreted by a Long Short-Term Memory (LSTM) layer. The proposed architecture showed higher performance compared to state-of-the-art techniques. The proposed model is evaluated on Mus musculus, Rice, and “Combined-species” genomes with 5- and 10-fold cross-validation. Further, a user-friendly web server is publicly available and can be accessed freely.
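The binary sequence encoding that feeds the convolution layers is not specified in the abstract. One common choice, assumed here, is one-hot encoding, in which each nucleotide becomes a four-element binary vector. The mapping order and the all-zero vector for unknown bases are illustrative assumptions.

```python
# Assumed column order A, C, G, T; any fixed order would work.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def one_hot_encode(seq):
    """Encode a DNA string as a list of 4-element binary rows.

    Characters outside A/C/G/T (for example N) map to an all-zero row.
    """
    return [list(ONE_HOT.get(base, (0, 0, 0, 0))) for base in seq.upper()]


if __name__ == "__main__":
    for row in one_hot_encode("ACGTN"):
        print(row)
```

The resulting length-by-4 matrix is the usual input shape for a one-dimensional convolution over the sequence.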


2014 ◽  
Vol 2014 ◽  
pp. 1-12 ◽  
Author(s):  
Qianqian Han ◽  
Bo Yan ◽  
Guobao Ning ◽  
B. Yu

An improved SVM model is presented in this paper to forecast the dry bulk freight index (BDI), which is a powerful tool for operators and investors to manage the market trend and avoid price risk in the shipping industry. The BDI is influenced by many factors, especially random incidents in the dry bulk market, which makes the BDI difficult to forecast. Therefore, to eliminate the impact of random incidents in the dry bulk market, a wavelet transform is adopted to denoise the BDI data series, and a combined model of wavelet transform and support vector machine is developed to forecast the BDI. The BDI data from 2005 to 2012 are used to test the proposed model. The 84 prior consecutive monthly BDI data are the inputs of the model, and the last 12 monthly BDI data are the outputs of the model. The parameters of the model are optimized by a genetic algorithm and the final model is confirmed through SVM training. This paper compares the forecasting result of the proposed method with three other forecasting methods. The result shows that the proposed method has higher accuracy and could be used to forecast the short-term trend of the BDI.
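The abstract does not state which wavelet or thresholding rule was used for denoising. As a rough illustration of the idea only, the sketch below applies a single-level Haar transform with soft thresholding of the detail coefficients; the wavelet family, the single decomposition level, the threshold value and the sample series are all assumptions made here.

```python
import math


def haar_denoise(series, threshold):
    """One-level Haar wavelet denoising with soft thresholding.

    Each consecutive pair of values is split into an approximation
    (scaled sum) and a detail (scaled difference). Details are shrunk
    towards zero by `threshold`, then the pair is reconstructed.
    """
    if len(series) % 2:
        raise ValueError("series length must be even for this sketch")
    root2 = math.sqrt(2.0)
    out = []
    for k in range(0, len(series), 2):
        approx = (series[k] + series[k + 1]) / root2
        detail = (series[k] - series[k + 1]) / root2
        # Soft threshold: reduce the magnitude, keep the sign.
        detail = math.copysign(max(abs(detail) - threshold, 0.0), detail)
        out.append((approx + detail) / root2)
        out.append((approx - detail) / root2)
    return out


if __name__ == "__main__":
    # Invented monthly values, for illustration only.
    print(haar_denoise([1200.0, 1260.0, 1100.0, 1500.0], threshold=50.0))
```

With a threshold of zero the series is returned unchanged (up to rounding), and with a very large threshold each pair collapses to its mean, which shows the smoothing effect in its two extremes.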


2012 ◽  
Vol 229-231 ◽  
pp. 2276-2279
Author(s):  
Yu An Pan ◽  
Xuan Xiao ◽  
Pu Wang

Antimicrobial peptides (AMPs) are potent, broad-spectrum antibiotics which demonstrate potential as novel therapeutic agents. Because it is both time-consuming and laborious to identify new AMPs by experiment, this paper tries to resolve this problem by pattern recognition. The work has two major parts. Firstly, up to six kinds of physicochemical property values are selected to code the AMP sequence as a physical-chemical property matrix (PCM), and then an auto and cross covariance transformation is performed to extract features from the PCM for AMP sequence expression. Secondly, these feature vectors are input to a powerful Support Vector Machine (SVM) classifier for training and for recognition of new query AMPs. For a newly constructed AMP benchmark dataset, an overall classification accuracy of about 96% has been achieved through rigorous Leave-One-Out cross-validation. For convenience, a user-friendly web server, AMPpred, has been established at http://icpr.jci.jx.cn/bioinfo/AMPpred. It is anticipated that this on-line predictor may become a useful bioinformatics tool for molecular biology and drug development. Also, its novel approach will further stimulate the development of methods for predicting peptide attributes.
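The auto and cross covariance transformation mentioned above turns a variable-length property matrix into a fixed-length feature vector by averaging products of mean-centred property values at positions a fixed lag apart. A minimal sketch under assumed conventions is shown below: the matrix has one row per residue and one column per property, covariances are normalised by the number of residue pairs, and the maximum lag is a free parameter. The exact variant used for AMPpred is not given in the abstract.

```python
def acc_features(pcm, max_lag):
    """Auto and cross covariance features of a property matrix.

    pcm is a list of rows (one per residue), each holding the same
    number of property values. For every lag 1..max_lag and every
    ordered property pair (p1, p2) the mean-centred covariance is
    computed; p1 == p2 gives auto covariance, otherwise cross covariance.
    """
    n = len(pcm)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    n_props = len(pcm[0])
    means = [sum(row[p] for row in pcm) / n for p in range(n_props)]
    feats = []
    for lag in range(1, max_lag + 1):
        for p1 in range(n_props):
            for p2 in range(n_props):
                total = sum(
                    (pcm[i][p1] - means[p1]) * (pcm[i + lag][p2] - means[p2])
                    for i in range(n - lag)
                )
                feats.append(total / (n - lag))
    return feats


if __name__ == "__main__":
    # Invented two-property matrix for a four-residue peptide.
    pcm = [[1.0, 0.5], [2.0, 0.1], [4.0, 0.9], [5.0, 0.3]]
    print(acc_features(pcm, max_lag=2))
```

The output length is max_lag times the squared number of properties, independent of peptide length, which is what makes the vectors usable as SVM inputs.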


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Yemei Liu ◽  
Pei Yang ◽  
Yong Pi ◽  
Lisha Jiang ◽  
Xiao Zhong ◽  
...  

Abstract Background We aimed to construct an artificial intelligence (AI) guided identification of suspicious bone metastatic lesions from whole-body bone scintigraphy (WBS) images using convolutional neural networks (CNNs). Methods We retrospectively collected the 99mTc-MDP WBS images with confirmed bone lesions from 3352 patients with malignancy. A total of 14,972 bone lesions were delineated manually by physicians and annotated as benign or malignant. The lesion-based differentiating performance of the proposed network was evaluated by fivefold cross validation, and compared with three other popular CNN architectures for medical imaging. The average sensitivity, specificity, accuracy and the area under the receiver operating characteristic curve (AUC) were calculated. To explore the outcomes of this study further, we conducted subgroup analyses of the classifying ability of the CNN, including lesion burden number and tumor type. Results In the fivefold cross validation, our proposed network reached the best average accuracy (81.23%) in identifying suspicious bone lesions compared with InceptionV3 (80.61%), VGG16 (81.13%) and DenseNet169 (76.71%). Additionally, the CNN model's lesion-based average sensitivity and specificity were 81.30% and 81.14%, respectively. Based on the lesion burden numbers of each image, the AUC was 0.847 in the few group (lesion number n ≤ 3), 0.838 in the medium group (n = 4–6), and 0.862 in the extensive group (n > 6). For the three major primary tumor types, the CNN-based lesion identifying AUC value was 0.870 for lung cancer, 0.900 for prostate cancer, and 0.899 for breast cancer. Conclusion The CNN model shows potential in distinguishing benign from malignant bone lesions in whole-body bone scintigraphic images.
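Fivefold cross validation, as used above, partitions the samples into five disjoint folds and lets each fold serve once as the held-out test set. The sketch below shows one simple way to generate such splits over sample indices. It assigns indices to folds in round-robin order without shuffling or grouping by patient, which is a simplification assumed here; how the study assigned lesions to folds is not stated in the abstract.

```python
def k_fold_indices(n_samples, k):
    """Return k (train, test) index splits covering every sample once.

    Sample i is placed in fold i % k, so fold sizes differ by at most one.
    """
    if not 1 < k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    for train, test in k_fold_indices(10, 5):
        print("train:", train, "test:", test)
```

Metrics such as accuracy or AUC are then computed on each test fold and averaged across the k folds.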

