tRFTars: predicting the targets of tRNA-derived fragments

2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Qiong Xiao ◽  
Peng Gao ◽  
Xuanzhang Huang ◽  
Xiaowan Chen ◽  
Quan Chen ◽  
...  

Abstract Background: tRNA-derived fragments (tRFs) are small non-coding RNAs, 14–40 nucleotides long, that are derived from specific tRNA cleavage events and have key regulatory functions in many biological processes. Many studies have shown that tRFs associate with Argonaute (AGO) complexes and inhibit gene expression in the same manner as miRNAs. However, there are currently no tools for accurately predicting tRF target genes. Methods: We used tRF-mRNA pairs identified by crosslinking, ligation, and sequencing of hybrids (CLASH) and covalent ligation of endogenous AGO-bound RNAs (CLEAR)-CLIP to assess features that may participate in tRF targeting, including the sequence context of each site and tRF-mRNA interactions. We applied a genetic algorithm (GA) to select key features and a support vector machine (SVM) to construct tRF prediction models. Results: We first identified features that globally influenced tRF targeting. Among these features, the most significant were the minimum free folding energy (MFE), position 8 match, number of bases paired in the tRF-mRNA duplex, and length of the tRF, which were consistent with previous findings. The constructed model yielded an area under the receiver operating characteristic (ROC) curve (AUC) of 0.980 (0.977–0.983) in training and 0.847 (0.83–0.861) in testing. The model was applied to all sites with perfect Watson–Crick complementarity to the seed in the 3′ untranslated regions (3′-UTRs) of the human genome. Seven of the nine target/nontarget genes of tRFs confirmed by reporter assay were predicted. We also validated the predictions via quantitative real-time PCR (qRT-PCR). Thirteen potential target genes from the top of the predictions were significantly down-regulated at the mRNA level by overexpression of the tRFs (tRF-3001a, tRF-3003a or tRF-3009a). Conclusions: Predictions can be obtained online from tRFTars, freely available at http://trftars.cmuzhenninglab.org:3838/tar/, which is the first tool to predict targets of tRFs in humans and has a user-friendly interface.
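The candidate-site step described in the Results (sites with perfect Watson–Crick complementarity to the seed in a 3′-UTR) can be illustrated with a minimal sketch. This is not the tRFTars implementation: the function names are hypothetical, the seed is assumed to span tRF nucleotides 2–7 by analogy with miRNAs (the abstract does not define it), and the sequences in the usage note are arbitrary toy strings.

```python
# Minimal sketch, not the authors' code. Assumption: the seed is tRF
# nucleotides 2-7 (1-based), borrowed from the miRNA convention.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the Watson-Crick reverse complement of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(trf, utr, seed_start=2, seed_end=7):
    """Return 0-based UTR positions where the seed pairs perfectly.

    Both inputs may be given as RNA or DNA; T is converted to U.
    """
    trf = trf.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    seed = trf[seed_start - 1:seed_end]   # 1-based inclusive -> slice
    target = reverse_complement(seed)     # what the mRNA must contain
    sites = []
    pos = utr.find(target)
    while pos != -1:                      # collect overlapping hits too
        sites.append(pos)
        pos = utr.find(target, pos + 1)
    return sites
```

For example, `seed_sites("UGAGGUAGUAGGUUGU", "AAAUACCUCAAAUACCUCGG")` scans a toy UTR for the reverse complement of the assumed seed `GAGGUA`. In a pipeline like the one described, each hit would then be scored on further features (MFE, position 8 match, paired bases, tRF length) before classification.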

2021 ◽  
Author(s):  
Qiong Xiao ◽  
Peng Gao ◽  
Xuanzhang Huang ◽  
Xiaowan Chen ◽  
Quan Chen ◽  
...  

Abstract Background: tRNA-derived fragments (tRFs) are 14–40-nucleotide-long, small non-coding RNAs derived from specific tRNA cleavage events with key regulatory functions in many biological processes. Many studies have shown that tRFs are associated with Argonaute (AGO) complexes and inhibit gene expression in the same manner as miRNAs. However, there are currently no tools for accurately predicting tRF target genes. Methods: We used tRF-mRNA pairs identified by crosslinking, ligation, and sequencing of hybrids (CLASH) and covalent ligation of endogenous AGO-bound RNAs (CLEAR)-CLIP to assess features that may participate in tRF targeting, including the sequence context of each site and tRF-mRNA interactions. We applied genetic algorithm (GA) to select key features and support vector machine (SVM) to construct tRF prediction models. Results: We first identified features that globally influenced tRF targeting. Among these features, the most significant were the minimum free folding energy (MFE), position 8 match, number of bases paired in the tRF-mRNA duplex, and length of the tRF, which were consistent with previous findings. Our constructed model yielded an area under the receiver operating characteristic (ROC) curve (AUC) = 0.980 (0.977-0.983) in the training process and an AUC = 0.847 (0.83-0.861) in the test process. The model was applied to all the sites with perfect Watson-Crick complementarity to the seed in the 3' untranslated region (3'-UTR) of the human genome. Seven of nine target/nontarget genes of tRFs confirmed by reporter assay were predicted. We also validated the predictions via quantitative real-time PCR (qRT-PCR). Thirteen potential target genes from the top of the predictions were significantly down-regulated at the mRNA levels by overexpression of the tRFs (tRF-3001a, tRF-3003a or tRF-3009a). Conclusions: Predictions can be obtained online, tRFTars, freely available at http://trftars.cmuzhenninglab.org:3838/tar/, which is the first tool to predict targets of tRFs in humans with a user-friendly interface.


2020 ◽  
Author(s):  
Qiong Xiao ◽  
Peng Gao ◽  
Xuanzhang Huang ◽  
Xiaowan Chen ◽  
Quan Chen ◽  
...  

Abstract Background: tRNA-derived fragments (tRFs) are 14–40-nucleotide-long, small non-coding RNAs derived from specific tRNA cleavage events with key regulatory functions in many biological processes. Many studies have shown that tRFs are associated with Argonaute (AGO) complexes and inhibit gene expression in the same manner as miRNAs. However, there are currently no tools for accurately predicting tRF target genes. Methods: We used tRF-mRNA pairs identified by crosslinking, ligation, and sequencing of hybrids (CLASH) and covalent ligation of endogenous AGO-bound RNAs (CLEAR)-CLIP to assess features that may participate in tRF targeting, including the sequence context of each site and tRF-mRNA interactions. We applied a genetic algorithm (GA) to select key features and a support vector machine (SVM) to construct tRF prediction models. Results: We first identified features that globally influenced tRF targeting. Among these features, the most significant were the minimum free folding energy (MFE), position 8 match, number of bases paired in the tRF-mRNA duplex, and length of the tRF, which were consistent with previous findings. Our constructed model yielded an area under the receiver operating characteristic (ROC) curve (AUC) = 0.980 (0.977-0.983) in the training process and an AUC = 0.847 (0.83-0.861) in the test process. The model was applied to all the sites with perfect Watson-Crick complementarity to the seed in the 3' untranslated region (3'-UTR) of the human genome. Seven of nine target/nontarget genes of tRFs confirmed by reporter assay were predicted. Conclusions: Predictions can be obtained online from tRFTar, freely available at http://trftar.cmuzhenninglab.org:3838/tar/, which is the first tool to predict targets of tRFs in humans and has a user-friendly interface.


2020 ◽  
Author(s):  
Qiong Xiao ◽  
Peng Gao ◽  
Xuanzhang Huang ◽  
Xiaowan Chen ◽  
Quan Chen ◽  
...  

Abstract Background: tRNA-derived fragments (tRFs) are small non-coding RNAs, 14–40 nucleotides long, that arise from specific tRNA cleavage events and have key regulatory functions in many biological processes. Many studies have shown that tRFs are associated with Argonaute complexes and inhibit gene expression in the same manner as miRNAs. However, there are currently no tools to accurately predict tRF target genes. Methods: We used tRF-mRNA pairs identified by crosslinking, ligation, and sequencing of hybrids (CLASH) and covalent ligation of endogenous Argonaute-bound RNAs (CLEAR)-CLIP to assess features that may participate in tRF targeting, including the sequence context of each individual site and tRF-mRNA interactions. We applied a genetic algorithm (GA) to select key features and a support vector machine (SVM) to construct tRF prediction models. Results: We first identified features that globally influenced tRF targeting. Among them, the most significant were minimum free folding energy (MFE), position 8 match, number of bases paired in the tRF-mRNA duplex, and length of the tRF, which were consistent with previous findings. The model achieved an area under the receiver operating characteristic (ROC) curve (AUC) of 0.980 (0.977-0.983) in the training process and an AUC of 0.847 (0.83-0.861) in the test process. The model was applied to all the sites with perfect Watson-Crick complementarity to the seed in the 3'-UTRs of the human genome. Seven of nine target/non-target genes of tRFs confirmed by reporter assay were predicted. Conclusions: Predictions can be obtained online from tRFTar, freely available at http://trftar.cmuzhenninglab.org:3838/tar/, which is the first tool to predict targets of tRFs in humans and has a user-friendly interface.


2019 ◽  
Vol 21 (9) ◽  
pp. 662-669 ◽  
Author(s):  
Junnan Zhao ◽  
Lu Zhu ◽  
Weineng Zhou ◽  
Lingfeng Yin ◽  
Yuchen Wang ◽  
...  

Background: Thrombin is the central protease of the vertebrate blood coagulation cascade, which is closely related to cardiovascular diseases. The inhibitory constant Ki is the most significant property of thrombin inhibitors. Method: This study was carried out to predict Ki values of thrombin inhibitors from a large data set using machine learning methods. Because machine learning can find non-intuitive regularities in high-dimensional datasets, it can be used to build effective predictive models. A total of 6554 descriptors were collected for each compound, and an efficient descriptor selection method was chosen to find the appropriate descriptors. Four methods, multiple linear regression (MLR), K Nearest Neighbors (KNN), Gradient Boosting Regression Tree (GBRT) and Support Vector Machine (SVM), were implemented to build prediction models with the selected descriptors. Results: The SVM model was the best of these methods, with R² = 0.84 and MSE = 0.55 for the training set and R² = 0.83 and MSE = 0.56 for the test set. Several validation methods, such as the y-randomization test and applicability domain evaluation, were adopted to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be employed for rapid estimation of the inhibitory constant, which is helpful for designing novel thrombin inhibitors.
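The two regression metrics reported above, the coefficient of determination and the mean squared error, can be computed as in the following minimal sketch. The function names and the toy values in the usage note are illustrative assumptions, not taken from the study.

```python
# Minimal sketch of the two reported regression metrics.
def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant, so SS_tot is non-zero.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

A model that always predicts the mean of the observed values, e.g. `r_squared([1, 2, 3, 4], [2.5] * 4)`, scores zero, which is the baseline against which values such as 0.83 on a test set are read.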


2021 ◽  
Vol 10 (4) ◽  
pp. 199
Author(s):  
Francisco M. Bellas Aláez ◽  
Jesus M. Torres Palenzuela ◽  
Evangelos Spyrakos ◽  
Luis González Vilas

This work presents new prediction models based on recent developments in machine learning methods, such as Random Forest (RF) and AdaBoost, and compares them with more classical approaches, i.e., support vector machines (SVMs) and neural networks (NNs). The models predict Pseudo-nitzschia spp. blooms in the Galician Rias Baixas. This work builds on a previous study by the authors (doi.org/10.1016/j.pocean.2014.03.003) but uses an extended database (from 2002 to 2012) and new algorithms. Our results show that RF and AdaBoost provide better prediction results compared to SVMs and NNs, as they show improved performance metrics and a better balance between sensitivity and specificity. Classical machine learning approaches show higher sensitivities, but at a cost of lower specificity and higher percentages of false alarms (lower precision). These results seem to indicate a greater adaptation of new algorithms (RF and AdaBoost) to unbalanced datasets. Our models could be operationally implemented to establish a short-term prediction system.
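The trade-off described above between sensitivity, specificity and false alarms on an unbalanced dataset can be made concrete with a minimal sketch. The function name and the counts in the usage note are invented for illustration and are not the study's data.

```python
# Minimal sketch: class 1 = bloom, class 0 = no bloom (labels assumed).
def bloom_metrics(y_true, y_pred):
    """Return sensitivity, specificity and precision for binary labels.

    Assumes both classes occur and at least one positive is predicted,
    so that no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return {
        "sensitivity": tp / (tp + fn),   # blooms that were detected
        "specificity": tn / (tn + fp),   # non-blooms correctly cleared
        "precision": tp / (tp + fp),     # alarms that were real blooms
    }
```

With 10 blooms among 100 samples, a classifier that detects 9 blooms but raises 30 false alarms has high sensitivity yet a precision of only 9/39, which is the pattern the abstract attributes to the classical approaches.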


2013 ◽  
Vol 2013 ◽  
pp. 1-11 ◽  
Author(s):  
Ming Gu ◽  
Shengjie Fan ◽  
Gaigai Liu ◽  
Lu Guo ◽  
Xiaobo Ding ◽  
...  

Wax gourd is a popular vegetable in East Asia. In traditional Chinese medicine, wax gourd peel is used to prevent and treat metabolic diseases such as hyperlipidemia, hyperglycemia, obesity, and cardiovascular disease. However, there is no experimental evidence to support these applications. Here, we examined the effect of the extract of wax gourd peel (EWGP) on metabolic disorders in diet-induced C57BL/6 obese mice. In the preventive experiment, EWGP blocked body weight gain and lowered serum total cholesterol (TC), low-density lipoprotein cholesterol (LDL-c), liver triglyceride (TG) and TC contents, and fasting blood glucose in mice fed a high-fat diet. In the therapeutic study, we induced obesity in the mice and treated them with EWGP for two weeks. We found that EWGP treatment reduced serum and liver TG contents and fasting blood glucose and improved glucose tolerance in the mice. Reporter assay and gene expression analysis showed that EWGP could inhibit peroxisome proliferator-activated receptor γ (PPARγ) transactivities and could decrease mRNA levels of PPARγ and its target genes. We also found that HMG-CoA reductase (HMGCR) was downregulated in the mouse liver by EWGP. Our data suggest that EWGP ameliorates high-fat-diet-induced hyperlipidemia in C57BL/6 mice via the inhibition of PPARγ and HMGCR signaling.


Author(s):  
Cheng-Chien Lai ◽  
Wei-Hsin Huang ◽  
Betty Chia-Chen Chang ◽  
Lee-Ching Hwang

Predictors of success in smoking cessation have been studied, but a prediction model capable of providing a success rate for each patient attempting to quit smoking is still lacking. The aim of this study is to develop prediction models using machine learning algorithms to predict the outcome of smoking cessation. Data were acquired from patients who underwent a smoking cessation program at one medical center in northern Taiwan. A total of 4875 enrollments fulfilled our inclusion criteria. Models with artificial neural network (ANN), support vector machine (SVM), random forest (RF), logistic regression (LoR), k-nearest neighbor (KNN), classification and regression tree (CART), and naïve Bayes (NB) were trained to predict the final smoking status of the patients in a six-month period. Sensitivity, specificity, accuracy, and area under the receiver operating characteristic (ROC) curve (AUC or ROC value) were used to determine the performance of the models. We adopted the ANN model, which performed slightly better than the others, with a sensitivity of 0.704, a specificity of 0.567, an accuracy of 0.640, and an ROC value of 0.660 (95% confidence interval (CI): 0.617–0.702) in predicting smoking cessation outcome. A predictive model for smoking cessation was thus constructed. The model could aid in providing a predicted success rate for all smokers. It also has the potential to support personalized and precision medicine in smoking cessation treatment.
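The ROC value used above to compare models can be computed without drawing the curve, because the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below uses that pairwise definition; the function name and the example scores in the usage note are illustrative assumptions, and the quadratic loop is chosen for clarity rather than speed.

```python
# Minimal sketch: AUC as the fraction of (positive, negative) pairs
# ranked correctly, counting ties as half a win.
def auc(y_true, scores):
    """Area under the ROC curve for binary labels (1 = positive).

    Assumes at least one positive and one negative example.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

For instance, `auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` ranks three of the four positive-negative pairs correctly. A value near 0.5 means the scores carry little ranking information, which helps in reading a reported value of 0.660.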


2020 ◽  
Vol 10 (24) ◽  
pp. 9151
Author(s):  
Yun-Chia Liang ◽  
Yona Maimury ◽  
Angela Hsiang-Ling Chen ◽  
Josue Rodolfo Cuevas Juarez

Air, an essential natural resource, has been compromised in terms of quality by economic activities. Considerable research has been devoted to predicting instances of poor air quality, but most studies are limited by insufficient longitudinal data, making it difficult to account for seasonal and other factors. Several prediction models have been developed using an 11-year dataset collected by Taiwan’s Environmental Protection Administration (EPA). Machine learning methods, including adaptive boosting (AdaBoost), artificial neural network (ANN), random forest, stacking ensemble, and support vector machine (SVM), produce promising results for air quality index (AQI) level predictions. A series of experiments, using datasets for three different regions to obtain the best prediction performance from the stacking ensemble, AdaBoost, and random forest, found that the stacking ensemble delivers consistently superior performance for R² and RMSE, while AdaBoost provides the best results for MAE.
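That one model can lead on RMSE while another leads on MAE follows from how the two metrics weight errors: RMSE penalizes large errors more heavily. A minimal sketch with invented numbers (not the study's data) shows the effect; the function names are illustrative.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; large errors dominate the result."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


# Toy targets and two hypothetical models:
y_true = [0, 0, 0, 0]
model_a = [1, 1, 1, 1]   # uniformly small errors
model_b = [0, 0, 0, 3]   # mostly exact, one large miss
```

Here `model_b` has the lower MAE while `model_a` has the lower RMSE, so the ranking depends on whether occasional large misses or typical error size matters more for the application.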


Sensors ◽  
2019 ◽  
Vol 19 (2) ◽  
pp. 419 ◽  
Author(s):  
Dongdong Du ◽  
Jun Wang ◽  
Bo Wang ◽  
Luyi Zhu ◽  
Xuezhen Hong

Postharvest kiwifruit continues to ripen for a period until it reaches the optimal “eating ripe” stage. Without damaging the fruit, it is very difficult to identify the ripeness of postharvest kiwifruit by conventional means. In this study, an electronic nose (E-nose) with 10 metal oxide semiconductor (MOS) gas sensors was used to predict the ripeness of postharvest kiwifruit. Three different feature extraction methods (the max/min values, the difference values and the 70th s values) were employed to discriminate kiwifruit at different ripening times by linear discriminant analysis (LDA), and results showed that the 70th s values method had the best performance in discriminating kiwifruit at different ripening stages, obtaining a 100% original accuracy rate and a 99.4% cross-validation accuracy rate. Partial least squares regression (PLSR), support vector machine (SVM) and random forest (RF) were employed to build prediction models for overall ripeness, soluble solids content (SSC) and firmness. The regression results showed that the RF algorithm had the best performance in predicting the ripeness indexes of postharvest kiwifruit compared with PLSR and SVM, which illustrated that the E-nose data had high correlations with overall ripeness (training: R2 = 0.9928; testing: R2 = 0.9928), SSC (training: R2 = 0.9749; testing: R2 = 0.9143) and firmness (training: R2 = 0.9814; testing: R2 = 0.9290). This study demonstrated that E-nose could be a comprehensive approach to predict the ripeness of postharvest kiwifruit through aroma volatiles.
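The three feature extraction methods named above can be sketched for a single sensor's response curve. The abstract does not define them precisely, so this sketch makes loud assumptions: readings are sampled at 1 Hz starting at t = 0, the "difference value" is taken as maximum minus minimum, and the "70th s value" is the reading at t = 70 s. The function name and dictionary keys are hypothetical.

```python
# Minimal sketch under the assumptions stated in the lead-in.
def extract_features(curve, sample_rate_hz=1.0, t_seconds=70.0):
    """Summarize one sensor response curve (a list of readings)."""
    index = int(round(t_seconds * sample_rate_hz))
    if index >= len(curve):
        raise ValueError("curve is shorter than the requested time point")
    highest, lowest = max(curve), min(curve)
    return {
        "max": highest,
        "min": lowest,
        "difference": highest - lowest,   # assumed definition
        "value_at_t": curve[index],       # reading at t_seconds
    }
```

Applied to each of the 10 MOS sensors, one chosen feature per sensor would give a 10-dimensional vector per sample, which could then be passed to a classifier such as LDA or a regressor such as RF.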


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Fu-Qing Cui ◽  
Wei Zhang ◽  
Zhi-Yun Liu ◽  
Wei Wang ◽  
Jian-bing Chen ◽  
...  

A comprehensive understanding of how soil thermal conductivity varies is a prerequisite for the design and construction of engineering applications in permafrost regions. Compared with unfrozen soil, the specimen preparation and experimental procedures for testing the thermal conductivity of frozen soil are more complex and challenging. In this work, considering that the thermal conductivity of unfrozen soil essentially reflects the multiphase and porous structural characteristics of the soil, prediction models of frozen soil thermal conductivity were developed using nonlinear regression and Support Vector Regression (SVR) methods. The thermal conductivity of multiple types of soil samples from the Qinghai-Tibet Engineering Corridor (QTEC) was tested by the transient plane source (TPS) method. Correlations between the thermal conductivity of unfrozen and frozen soil were analyzed and identified. Based on the measured thermal conductivity of unfrozen soil, prediction models of frozen soil thermal conductivity are proposed for 7 typical soils in the QTEC. To further facilitate engineering applications, prediction models for two soil categories (coarse-grained and fine-grained soil) are also proposed. The results demonstrate that, compared with the unsatisfactory prediction accuracy obtained when water content and dry density are used as the fitting parameters, the ternary fitting model achieves higher prediction accuracy for the 7 types of frozen soil (more than 98% of the soil specimens have a relative error within 20%). The SVR model further improves the prediction accuracy: more than 98% of the soil specimens have a relative error within 15%. For the coarse-grained and fine-grained soil categories, the two models still have reliable prediction accuracy, and the coefficient of determination (R²) ranges from 0.8 to 0.91, which validates their applicability to soils with small sample sizes. This study provides feasible prediction models for frozen soil thermal conductivity and guidelines for the thermal design and freeze-thaw damage prevention of engineering structures in cold regions.
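The accuracy statements above ("more than 98% of the soil specimens within 20% relative error") correspond to a simple statistic: the fraction of specimens whose relative prediction error does not exceed a tolerance. A minimal sketch follows; the function name and the values in the usage note are invented for illustration.

```python
# Minimal sketch of the "share of specimens within X% relative error"
# statistic. Assumes every measured value is non-zero.
def fraction_within(measured, predicted, tolerance):
    """Fraction of samples with |predicted - measured| / |measured| <= tolerance."""
    hits = sum(
        1
        for m, p in zip(measured, predicted)
        if abs(p - m) / abs(m) <= tolerance
    )
    return hits / len(measured)
```

For example, with measured conductivities `[1.0, 2.0, 4.0, 5.0]` and predictions `[1.1, 2.5, 4.0, 5.5]`, `fraction_within(..., 0.20)` counts three of the four specimens, since one prediction is off by 25%.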

