A computational model for predicting transmembrane regions of retroviruses

2017 ◽  
Vol 15 (03) ◽  
pp. 1750010 ◽  
Author(s):  
Ze Liu ◽  
Hongqiang Lv ◽  
Jiuqiang Han ◽  
Ruiling Liu

The transmembrane region (TR) is a conserved region of the transmembrane (TM) subunit in the envelope (env) glycoprotein of retroviruses. Evidence shows that the TR anchors the env glycoprotein to the lipid bilayer and that replacing the TR with a covalently linked lipid anchor abrogates fusion. However, general-purpose software does not achieve sufficient accuracy, because the TM subunit in env also contains several motifs composed largely of hydrophobic residues, such as the signal peptide, fusion peptide and immunosuppressive domain. In this paper, a support vector machine (SVM)-based model is proposed to identify TRs in retroviruses. First, physicochemical and evolutionary properties were extracted as the original features. Then, feature importance was analyzed using the minimum Redundancy Maximum Relevance (mRMR) feature selection criterion. The model achieved a sensitivity (Sn) of 0.955, specificity (Sp) of 0.998, accuracy (ACC) of 0.995 and Matthews correlation coefficient (MCC) of 0.954 using 10-fold cross-validation on the training dataset. These results suggest that the proposed model can be used to predict TRs in unannotated retroviruses; 11917, 3344, 2, 289 and 6 new putative TRs were found in HERV, HIV, HTLV, SIV and MLV, respectively.
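
As a rough illustration of the four reported measures, the sketch below computes Sn, Sp, ACC and MCC from a binary confusion matrix. The function name and the counts are hypothetical and are not taken from the paper.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, accuracy and MCC from confusion counts."""
    sn = tp / (tp + fn)                      # sensitivity (recall on positives)
    sp = tn / (tn + fp)                      # specificity (recall on negatives)
    acc = (tp + tn) / (tp + fp + tn + fn)    # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "ACC": acc, "MCC": mcc}

# Hypothetical counts chosen only to exercise the formulas.
m = binary_metrics(tp=90, fp=5, tn=895, fn=10)
```

With strongly imbalanced classes, as is typical when most residues lie outside the region of interest, MCC tends to be more informative than accuracy alone.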

2020 ◽  
Vol 13 (3) ◽  
pp. 531-535
Author(s):  
Vijayasherly Velayutham ◽  
Srimathi Chandrasekaran

Aim: To develop a machine-learning prediction model based on the Support Vector Machine (SVM). Background: Workload prediction in a cloud environment is one of the primary tasks in resource provisioning. Forecasting future workload requirements depends on the competency of the prediction technique, which can maximize resource usage in a cloud computing environment. Objective: To reduce the training time of the SVM model. Methods: First, K-Means clustering is applied to the training dataset to form ‘n’ clusters. Then, for every tuple in a cluster, the tuple’s class label is compared with its cluster label. If the two labels are identical, the tuple is correctly classified and would contribute little to the SVM training process, which formulates the separating hyperplane with the lowest generalization error. Otherwise, the tuple is added to the reduced training dataset. This selective addition of tuples for SVM training is carried out for all clusters. The support vectors are the few samples in the reduced training dataset that determine the optimal separating hyperplane. Results: On the Google Cluster Trace dataset, the proposed model reduced the training time and the Root Mean Square Error and marginally increased the R2 score compared with the traditional SVM. The model has also been tested on Los Alamos National Laboratory’s Mustang and Trinity cluster traces. Conclusion: CloudSim’s CPU utilization (VM and Cloudlet utilization) was measured and found to increase when the same set of tasks was run through the proposed model.
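
The selection step described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses a tiny one-dimensional k-means with caller-supplied initial centers, and it assumes that cluster ids line up with class ids (cluster 0 corresponds to class 0, and so on). All names and data are hypothetical.

```python
def kmeans_1d(values, centers, iters=20):
    """Tiny 1-D k-means (Lloyd's algorithm); returns one cluster id per value."""
    centers = list(centers)
    labels = []
    for _ in range(iters):
        # Assign each value to its nearest center.
        labels = [min(range(len(centers)), key=lambda c: abs(v - centers[c]))
                  for v in values]
        # Move each center to the mean of its members.
        for c in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels

def reduce_training_set(samples, class_labels, cluster_labels):
    """Keep only tuples whose class label differs from their cluster label."""
    return [(x, y) for x, y, c in zip(samples, class_labels, cluster_labels)
            if y != c]

xs = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1, 0.55, 0.6]   # toy feature values
ys = [0, 0, 0, 1, 1, 1, 1, 0]                     # toy class labels
clusters = kmeans_1d(xs, centers=[0.0, 1.0])
reduced = reduce_training_set(xs, ys, clusters)   # candidates for SVM training
```

Only the tuples that the clustering "gets wrong" survive, which is the intuition behind the reduced training time: points deep inside a homogeneous cluster are unlikely to become support vectors.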


Algorithms ◽  
2018 ◽  
Vol 11 (12) ◽  
pp. 193
Author(s):  
Yuchuang Wang ◽  
Guoyou Shi ◽  
Xiaotong Sun

Container ships must pass through multiple ports of call during a voyage. Therefore, forecasting container volume at the port of origin and sending this information to subsequent ports is crucial for container terminal management and container stowage personnel. Numerous factors influence container allocation to container ships for a voyage, and the degree of influence varies, engendering a complex nonlinearity. Therefore, this paper proposes a model based on gray relational analysis (GRA) and a mixed-kernel support vector machine (SVM) for predicting container allocation to a container ship for a voyage. First, the weights of the influencing factors are determined through GRA. Then, the weighted factors serve as the input of the SVM model, and the SVM model parameters are optimized through a genetic algorithm. Numerical simulations revealed that the proposed model could effectively predict the number of containers for a container ship voyage and that it exhibited strong generalization ability and high accuracy. Accordingly, this model provides a new method for predicting container volume for a voyage.
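
The GRA weighting step can be illustrated with the standard gray relational coefficient, (Δmin + ρΔmax) / (Δ + ρΔmax), averaged over the series to give a grade per factor. The sketch below assumes series that are already normalized, a distinguishing coefficient ρ = 0.5, and weights obtained by normalizing the grades; these choices, the function names and the data are assumptions, not details from the paper.

```python
def gray_relational_grades(reference, factors, rho=0.5):
    """Gray relational grade of each factor series against the reference series."""
    deltas = [[abs(r - f) for r, f in zip(reference, series)] for series in factors]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:                       # every factor equals the reference
        return [1.0] * len(factors)
    grades = []
    for row in deltas:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades

def to_weights(grades):
    """Normalize grades so that they sum to one."""
    total = sum(grades)
    return [g / total for g in grades]

reference = [1.0, 0.8, 0.6]              # toy normalized container volumes
factors = [[1.0, 0.8, 0.6],              # factor that tracks the reference exactly
           [0.0, 0.2, 0.4]]              # factor that moves against it
grades = gray_relational_grades(reference, factors)
weights = to_weights(grades)
```

A factor whose series follows the reference closely receives a grade near 1 and therefore a larger weight before being fed to the SVM.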


Electronics ◽  
2020 ◽  
Vol 9 (8) ◽  
pp. 1253
Author(s):  
Muhammad Afzal ◽  
Beom Joo Park ◽  
Maqbool Hussain ◽  
Sungyoung Lee

A major obstacle to supporting evidence-based clinical decision-making is accurately and efficiently recognizing appropriate and scientifically rigorous studies in the biomedical literature. We trained a multi-layer perceptron (MLP) model on a dataset with two textual features, title and abstract. The dataset, consisting of 7958 PubMed citations classified into two classes (scientific rigor and non-rigor), is used to train the proposed model. We compare our model with other promising machine learning models such as Support Vector Machine (SVM), Decision Tree, Random Forest, and Gradient Boosted Tree (GBT) approaches. Based on its higher cumulative score, deep learning was chosen and was tested on test datasets obtained by running a set of domain-specific queries. On the training dataset, the proposed deep learning model obtained significantly higher accuracy and AUC (97.3% and 0.993, respectively) than the competitors, but its recall of 95.1% was slightly lower than that of GBT. The trained model sustained its performance on the testing datasets. Unlike previous approaches, the proposed model does not require a human expert to create fresh annotated data; instead, we used studies cited in Cochrane reviews as a surrogate for quality studies in a clinical topic. We find that deep learning methods are beneficial for biomedical literature classification. Not only do such methods minimize the workload in feature engineering, but they also show better performance on large and noisy data.


2018 ◽  
Vol 2018 ◽  
pp. 1-12 ◽  
Author(s):  
Yang Sun ◽  
Xianda Feng ◽  
Lingqiang Yang

Tunnel squeezing is one of the major geological disasters that often occur during the construction of tunnels in weak rock masses subjected to high in situ stresses. It could cause shield jamming, budget overruns, and construction delays and could even lead to tunnel instability and casualties. Therefore, accurate prediction or identification of tunnel squeezing is extremely important in the design and construction of tunnels. This study presents a modified application of a multiclass support vector machine (SVM) to predict tunnel squeezing based on four parameters, that is, diameter (D), buried depth (H), support stiffness (K), and rock tunneling quality index (Q). We compiled a database from the literature, including 117 case histories obtained from different countries such as India, Nepal, and Bhutan, to train the multiclass SVM model. The proposed model was validated using 8-fold cross validation, and the average error percentage was approximately 11.87%. Compared with existing approaches, the proposed multiclass SVM model yields a better performance in predictive accuracy. More importantly, one could estimate the severity of potential squeezing problems based on the predicted squeezing categories/classes.
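
To make the validation protocol concrete, the following sketch splits 117 case indices into 8 folds. It only shows how the folds could be formed; the unshuffled, contiguous split and the function name are assumptions, and no SVM is trained here.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds

folds = k_fold_indices(117, 8)           # 117 case histories, 8 folds
fold_sizes = [len(test) for _, test in folds]
```

Each case is used for testing exactly once, and the reported error would be the average over the eight held-out folds.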


2014 ◽  
Vol 2014 ◽  
pp. 1-12 ◽  
Author(s):  
Qianqian Han ◽  
Bo Yan ◽  
Guobao Ning ◽  
B. Yu

This paper presents an improved SVM model to forecast the dry bulk freight index (BDI); such forecasts are a powerful tool for operators and investors to follow market trends and avoid price risk in the shipping industry. The BDI is influenced by many factors, especially random incidents in the dry bulk market, which make it difficult to forecast. Therefore, to eliminate the impact of these random incidents, a wavelet transform is adopted to denoise the BDI data series, and a combined model of wavelet transform and support vector machine is developed to forecast the BDI. BDI data from 2005 to 2012 are used to test the proposed model. The first 84 consecutive monthly BDI values are the inputs of the model, and the last 12 monthly BDI values are the outputs. The parameters of the model are optimized by a genetic algorithm, and the final model is confirmed through SVM training. The forecasting result of the proposed method is compared with those of three other forecasting methods. The result shows that the proposed method has higher accuracy and could be used to forecast the short-term trend of the BDI.
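
The denoise-then-split idea can be sketched as below. The abstract does not name the wavelet or the thresholding rule, so a one-level Haar transform with soft thresholding is assumed purely for illustration, the series is synthetic, and the 84/12 partition is read here as a simple chronological split.

```python
import math

def haar_denoise(series, threshold):
    """One-level Haar transform, soft-threshold the details, then invert."""
    if len(series) % 2:
        raise ValueError("series length must be even")
    s = math.sqrt(2.0)
    pairs = list(zip(series[0::2], series[1::2]))
    approx = [(a + b) / s for a, b in pairs]      # low-frequency part
    detail = [(a - b) / s for a, b in pairs]      # high-frequency part
    shrunk = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    out = []
    for a, d in zip(approx, shrunk):
        out.extend([(a + d) / s, (a - d) / s])    # inverse transform
    return out

# Synthetic stand-in for 96 monthly BDI values (2005 to 2012); not real data.
bdi = [1000.0 + 50.0 * math.sin(m / 6.0) + (25.0 if m % 2 else -25.0)
       for m in range(96)]
smooth = haar_denoise(bdi, threshold=100.0)
train, test = smooth[:84], smooth[84:]            # 84 months in, 12 months out
```

A threshold of zero reproduces the input, while a large threshold replaces each pair of months with its mean; a practical choice would sit between the two.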


2020 ◽  
Vol 11 (3) ◽  
pp. 38-56
Author(s):  
S. R. Mani Sekhar ◽  
Siddesh G. M. ◽  
Sunilkumar S. Manvi

Identification and analysis of proteins play a vital role in drug design and disease prediction. Several open-source applications have been developed for identifying essential proteins based on biological or topological features. These techniques infer the likelihood that a protein is essential by using network topology and feature selection, which can ignore some features to reduce complexity and, consequently, lowers accuracy. In the paper, the authors used the Selenium driver to scrape the dataset. Later, the authors integrated the chi-square method with a support vector machine for the prediction of essential proteins in baker's yeast. Here, chi-square is a test of dissimilarity used for altering the record, and afterward, the support vector machine is used to classify the test dataset. The results show that the proposed Chi-SVM model achieves an accuracy of 99.56%, whereas BC and CC achieved accuracies of 84.0% and 86.0%, respectively. Finally, the proposed model is validated using statistical performance measures such as PPA, NPA, SA, and STA.
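
How the chi-square step is wired into the Chi-SVM pipeline is not spelled out in the abstract. As a generic illustration only, the sketch below computes the Pearson chi-square statistic for a 2x2 table relating a binary feature to the essential/non-essential label; the function name and the counts are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

# Rows: feature present / absent. Columns: essential / non-essential.
associated = chi_square_2x2(30, 10, 10, 30)    # feature tracks the label
independent = chi_square_2x2(10, 10, 10, 10)   # feature carries no information
```

Features with a larger statistic are more strongly associated with the label and are the usual candidates to keep before training a classifier.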


2014 ◽  
Vol 628 ◽  
pp. 383-389 ◽  
Author(s):  
Ya Hui Peng ◽  
Kang Peng ◽  
Jian Zhou ◽  
Zhi Xiang Liu

Because rock burst hazard assessment systems are complex, this study established a support vector machine (SVM) model for predicting rock burst classification based on SVM theory and the actual characteristics of the project. The main factors of rock burst, such as coal seam, dip, buried depth, structural situation, change of pitch angle, change of coal thickness, gas concentration, roof management, pressure relief and shooting, were defined as the criterion indices for rock burst prediction in the proposed model. To determine reasonable and efficient SVM parameters, an appropriate fitness function for the genetic algorithm (GA) was first determined, and the SVM parameters were then optimized by a real-coded GA, which yielded the genetic algorithm and support vector machine (GSVM) model. A GSVM model was obtained by training on 23 sets of measured data; cross-validation was used to verify the stability of the GSVM model, and the misdiscrimination ratio was 0. Moreover, the proposed model was used to predict rock burst for 12 new samples; the correct rate of the predictions was 91.6667%, and the results are consistent with the actual situation. The results show that the genetic algorithm can speed up the SVM parameter search and that the proposed model is highly credible for rock burst risk classification and can be applied to practical engineering.
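
A real-coded GA for parameter search can be sketched as follows. This is a minimal illustration under stated assumptions: the fitness function is a smooth stand-in with a known optimum rather than a cross-validated SVM score, the two genes are taken to be log10(C) and log10(gamma), and the operators (elitism, arithmetic crossover, Gaussian mutation) are common choices, not necessarily those of the paper.

```python
import random

def real_coded_ga(fitness, bounds, pop_size=20, generations=40, seed=0):
    """Maximize `fitness` over real-valued genes constrained to `bounds`."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_pop = pop[:2]                            # elitism: keep the two best
        while len(next_pop) < pop_size:
            p1, p2 = rng.sample(pop[:pop_size // 2], 2)   # parents from top half
            alpha = rng.random()
            child = [alpha * x + (1 - alpha) * y for x, y in zip(p1, p2)]
            i = rng.randrange(len(child))             # mutate one gene
            lo, hi = bounds[i]
            child[i] = min(hi, max(lo, child[i] + rng.gauss(0, 0.1 * (hi - lo))))
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

def fitness(ind):
    """Stand-in objective; a real run would return a cross-validation score."""
    log_c, log_gamma = ind
    return -((log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2)

bounds = [(-2.0, 3.0), (-4.0, 1.0)]   # assumed search ranges for log10(C), log10(gamma)
best = real_coded_ga(fitness, bounds)
```

Because the two best individuals are carried over unchanged, the best fitness never decreases from one generation to the next.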


2016 ◽  
Vol 17 (1) ◽  
pp. 52-60 ◽  
Author(s):  
Yihui Fang ◽  
Xingwei Chen ◽  
Nian-Sheng Cheng

Estuary salinity predictions can help to improve water safety in coastal areas. Coupled genetic algorithm-support vector machine (GA-SVM) models, which adopt a GA to optimize the SVM parameters, have been successfully applied in some research fields. In light of previous research findings, an application of a GA-SVM model for tidal estuary salinity prediction is proposed in this paper. The corresponding model is developed to predict the salinity of the Min River Estuary (MRE). Based on an analysis of the time series of daily salinity and the results of simulation experiments, the high-tide level, runoff and previous salinity are considered the major factors that influence salinity variation. The prediction accuracy of the GA-SVM model is satisfactory, with a coefficient of determination (R2) of 0.85, a Nash–Sutcliffe efficiency of 0.84 and a root mean square error of 119 μS/cm. The proposed model performs significantly better than the traditional SVM model in terms of prediction accuracy and computing time. It can be concluded that the proposed model can successfully predict the salinity of the MRE based on the high-tide level, runoff and previous salinity.
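
The three reported scores can be computed as in the sketch below. Since R2 and the Nash–Sutcliffe efficiency are reported as different numbers, R2 is assumed here to be the squared Pearson correlation; that reading, the function names and the toy data are assumptions.

```python
import math

def rmse(obs, pred):
    """Root mean square error, in the units of the observations."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def nash_sutcliffe(obs, pred):
    """1 minus squared error relative to the variance of the observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst

def r_squared(obs, pred):
    """Squared Pearson correlation between observations and predictions."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)

obs = [1.0, 2.0, 3.0, 4.0]     # toy observed salinity
pred = [2.0, 3.0, 4.0, 5.0]    # toy predictions with a constant bias of +1
scores = (rmse(obs, pred), nash_sutcliffe(obs, pred), r_squared(obs, pred))
```

In this toy case the predictions are perfectly correlated with the observations, yet the constant bias pulls the Nash–Sutcliffe efficiency well below 1, which is why the two measures are usually reported together.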


2021 ◽  
Vol 11 ◽  
Author(s):  
Wen Chen ◽  
Tao Zhang ◽  
Lin Xu ◽  
Liang Zhao ◽  
Huan Liu ◽  
...  

Objectives: To investigate the value of contrast-enhanced computed tomography (CT)-based radiomics in discriminating high-grade from low-grade hepatocellular carcinoma (HCC) before surgery. Methods: This retrospective study, which was approved by the institutional review board, included 161 consecutive subjects with HCC from January 2013 to January 2018; the patients were divided into a training group (n = 112) and a test group (n = 49). The least absolute shrinkage and selection operator (LASSO) was used to select the most valuable features to build a support vector machine (SVM) model. The performance of the predictive model was evaluated using the area under the curve (AUC), accuracy, sensitivity, and specificity. Results: The SVM model showed an acceptable ability to differentiate high-grade from low-grade HCC, with an AUC of 0.904 in the training dataset and 0.937 in the test dataset, accuracy (92.2% versus 95.7%), sensitivity (82.5% versus 88.0%), and specificity (92.7% versus 95.8%), respectively. Conclusion: Machine learning-based radiomics shows better performance in differentiating low-grade from high-grade HCC, which may contribute to personalized treatment.
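
The AUC used for evaluation can be computed directly from classifier scores as the fraction of positive/negative pairs that are ranked correctly, with ties counting one half. The sketch below uses this pairwise (Mann-Whitney) formulation; the scores and labels are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.9, 0.8, 0.4, 0.3, 0.2]   # toy decision values
labels = [1, 1, 0, 1, 0]             # 1 = high-grade, 0 = low-grade (toy labels)
area = auc(scores, labels)
```

Unlike accuracy, sensitivity and specificity, this measure does not depend on a particular decision threshold.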


2013 ◽  
Vol 433-435 ◽  
pp. 545-549
Author(s):  
Zhi Jie Song ◽  
Zan Fu ◽  
Han Wang ◽  
Gui Bin Hou

Demand forecasting for port critical spare parts (CSP) is notoriously difficult because the parts are expensive and their demand is lumpy and intermittent with high variability. In this paper, influential factors that affect CSP consumption are proposed according to port CSP characteristics and historical data. Using these influential factors, a least squares support vector machine (LS-SVM) model optimized by particle swarm optimization (PSO) was developed to forecast the demand. The effectiveness of the model is demonstrated through a real case study, which shows that the proposed model can forecast the demand for port CSP more accurately and effectively reduce inventory backlog.
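
A basic PSO loop for tuning two hyperparameters might look like the sketch below. The cost function is a smooth stand-in rather than an LS-SVM validation error, the two dimensions are assumed to be the log10 of the regularization parameter and of the kernel width, and the inertia and acceleration constants are conventional values, not ones reported in the paper.

```python
import random

def pso_minimize(cost, bounds, n_particles=15, iterations=50, seed=0,
                 w=0.7, c1=1.5, c2=1.5):
    """Minimize `cost` with a global-best particle swarm inside `bounds`."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position seen by each particle
    gbest = min(pbest, key=cost)[:]             # best position seen by the swarm
    for _ in range(iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            if cost(pos[i]) < cost(pbest[i]):
                pbest[i] = pos[i][:]
                if cost(pbest[i]) < cost(gbest):
                    gbest = pbest[i][:]
    return gbest

def cost(p):
    """Stand-in objective; a real run would return a validation error."""
    return (p[0] - 1.0) ** 2 + p[1] ** 2

bounds = [(-2.0, 4.0), (-3.0, 3.0)]   # assumed log10 ranges for the two hyperparameters
best = pso_minimize(cost, bounds)
```

The swarm's best position is only ever replaced by a strictly better one, so the returned point is never worse than the best initial particle.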

