Prediction of cell penetrating peptides and their uptake efficiency using random forest-based feature selections

Author(s):  
Peng Liu ◽  
Yijie Ding ◽  
Ying Rong ◽  
Dong Chen

Cell penetrating peptides (CPPs) are short peptides that can carry biomolecules of varying sizes across the cell membrane into the cytoplasm. Correctly identifying CPPs is the basis for studying their functions and mechanisms. Here, we propose a novel CPP predictor that predicts both CPPs and their uptake efficiency. In our method, five feature descriptors encode each sequence and are combined into a hybrid feature vector. A wrapper method built on a random forest is then employed, which combines feature selection with the prediction process to find the features that are crucial for identifying CPPs. Jackknife cross-validation results show that our predictor is comparable to state-of-the-art CPP predictors. Our method also reduces the feature dimension, which improves computational efficiency and avoids overfitting, allowing the predictor to be applied to large-scale CPP data.
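As a rough illustration of the evaluation protocol mentioned in the abstract above, the sketch below encodes peptide sequences as feature vectors and scores a classifier with jackknife (leave-one-out) cross validation. Several parts are assumptions and not taken from the paper: amino acid composition stands in for the five unnamed descriptors, a nearest-centroid rule stands in for the random forest, the feature-selection wrapper is omitted, and the toy sequences and labels are made up.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(sequence):
    """Amino acid composition: fraction of each standard residue.

    Assumed descriptor; the abstract does not name its five descriptors.
    """
    counts = Counter(sequence.upper())
    return [counts[a] / len(sequence) for a in AMINO_ACIDS]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def predict(train_x, train_y, x):
    """Nearest-centroid stand-in classifier (NOT a random forest)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [v for v, y in zip(train_x, train_y) if y == label]
        dist = squared_distance(centroid(members), x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and count hits."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)


# Hypothetical toy data: 1 = CPP-like (arginine/lysine rich), 0 = not.
toy = [
    ("RRRRRRRR", 1), ("KKRRKKRR", 1), ("RKKRRQRRR", 1),
    ("DEDEDEDE", 0), ("EEDDAEED", 0), ("DDEEGDEE", 0),
]
xs = [aac(seq) for seq, _ in toy]
ys = [label for _, label in toy]
accuracy = jackknife_accuracy(xs, ys)
```

With n samples the jackknife trains n models, each on n - 1 samples, which is one reason a smaller feature dimension helps computational cost on larger data sets.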

2018 ◽  
Vol 17 (8) ◽  
pp. 2715-2726 ◽  
Author(s):  
Balachandran Manavalan ◽  
Sathiyamoorthy Subramaniyam ◽  
Tae Hwan Shin ◽  
Myeong Ok Kim ◽  
Gwang Lee

2016 ◽  
Vol 10 (2) ◽  
pp. 87-95 ◽  
Author(s):  
Samad Mussa Farkhani ◽  
Ali Shirani ◽  
Samaneh Mohammadi ◽  
Parvin Zakeri‐Milani ◽  
Javid Shahbazi Mojarrad ◽  
...  

2018 ◽  
Vol 284 ◽  
pp. 84-102 ◽  
Author(s):  
S. Pescina ◽  
C. Ostacolo ◽  
I.M. Gomez-Monterrey ◽  
M. Sala ◽  
A. Bertamino ◽  
...  

2019 ◽  
Vol 11 (14) ◽  
pp. 1665 ◽  
Author(s):  
Tianle He ◽  
Chuanjie Xie ◽  
Qingsheng Liu ◽  
Shiying Guan ◽  
Gaohuan Liu

Machine learning comprises a group of powerful state-of-the-art techniques for land cover classification and cropland identification. In this paper, we propose and evaluate two models, based on random forest (RF) and attention-based long short-term memory (A-LSTM) networks, that learn directly from the raw surface reflectance of remote sensing (RS) images for large-scale winter wheat identification in the Huanghuaihai Region (North-Central China). We used a time series of Moderate Resolution Imaging Spectroradiometer (MODIS) images over one growing season and the corresponding winter wheat distribution map for the experiments. Each training sample was derived from the raw surface reflectance of MODIS time-series images. Both models achieved state-of-the-art performance in identifying winter wheat, with F1 scores of 0.72 for RF and 0.71 for A-LSTM. We also analyzed the impact of the pixel-mixing effect. Training with pure-mixed-pixel samples (the training set consists of pure and mixed cells and thus retains the original distribution of the data) was more precise than training with only pure-pixel samples (the entire pixel area belongs to one class). We also analyzed the variable importance along the temporal series: data acquired in March or April contributed more than data acquired at other times. Both models could predict winter wheat coverage in past years or in other regions with similar winter wheat growing seasons. The experiments in this paper showed the effectiveness and significance of our methods.
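The F1 scores quoted above are the harmonic mean of precision and recall for the positive class (here, winter wheat pixels). The following minimal sketch shows the computation; the function name and the toy label lists are illustrative assumptions, not data from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-pixel labels: 1 = winter wheat, 0 = other.
truth = [1, 1, 1, 0, 0]
predicted = [1, 1, 0, 1, 0]
score = f1_score(truth, predicted)
```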


Biochemistry ◽  
2006 ◽  
Vol 45 (5) ◽  
pp. 1408-1420 ◽  
Author(s):  
Stéphane Balayssac ◽  
Fabienne Burlina ◽  
Odile Convert ◽  
Gérard Bolbach ◽  
Gérard Chassaing ◽  
...  

2012 ◽  
Vol 102 (3) ◽  
pp. 487a
Author(s):  
Hanna A. Rydberg ◽  
Maria Matson ◽  
Helene L. Åmand ◽  
Elin K. Esbjörner ◽  
Bengt Nordén

2011 ◽  
Vol 2011 ◽  
pp. 1-9 ◽  
Author(s):  
Nada Basit ◽  
Harry Wechsler

Wet laboratory mutagenesis to determine enzyme activity changes is expensive and time-consuming. This paper expands on standard one-shot learning by proposing an incremental transductive method (T2bRF) for the prediction of enzyme mutant activity during mutagenesis, using Delaunay tessellation and 4-body statistical potentials for representation. Incremental learning is in tune with both eScience and actual experimentation, as it accounts for the cumulative annotation effects of enzyme mutant activity over time. The reported cross-validation results show that, overall, the proposed incremental transductive method, using random forest as the base classifier, yields better results than one-shot learning methods. T2bRF is shown to yield 90% on T4 and LAC (and 86% on HIV-1). This is significantly better than state-of-the-art competing methods, whose performance is 80% or less on the same datasets.


2017 ◽  
Vol 16 (5) ◽  
pp. 2044-2053 ◽  
Author(s):  
Leyi Wei ◽  
PengWei Xing ◽  
Ran Su ◽  
Gaotao Shi ◽  
Zhanshan Sam Ma ◽  
...  
