SNAREs-SAP: SNARE Proteins Identification With PSSM Profiles

Soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) proteins are a large family of transmembrane proteins located in organelles and vesicles. SNARE proteins initiate the vesicle fusion process and activate and fuse proteins during exocytosis; they are also vital for regulating the transport of membrane proteins and non-regulatory vesicles. An efficient method for identifying SNARE proteins is therefore of great significance. However, the identification accuracy of existing methods such as SNARE-CNN is not satisfactory. In our study, we developed a method based on a support vector machine (SVM) that can effectively recognize SNARE proteins. We used the position-specific scoring matrix (PSSM) method to extract features from SNARE protein sequences, used the support vector machine recursive feature elimination with correlation bias reduction (SVM-RFE-CBR) algorithm to rank the importance of the features, and then selected the optimal feature subset based on the ranking. We trained the model on the selected features with 10-fold cross-validation and tested its performance on an independent dataset. In independent tests, our method identified SNARE proteins with a sensitivity of 68%, specificity of 94%, accuracy of 92%, area under the curve (AUC) of 84%, and Matthews correlation coefficient (MCC) of 0.48. These results indicate that our method performs better than existing classification methods in identifying SNARE proteins.
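The evaluation indicators named in this abstract (sensitivity, specificity, accuracy, and MCC) can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard formulas; the function name and the counts in the example are hypothetical and are not taken from the paper's independent test set.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not the paper's data).
metrics = classification_metrics(tp=68, fn=32, tn=940, fp=60)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On an imbalanced test set such as this made-up one, accuracy is dominated by the majority class, which is one reason MCC is usually reported alongside it.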


Predicting Future Occurrence of Acute Hypotensive Episodes Using Noninvasive and Invasive Features

Military Medicine ◽

10.1093/milmed/usaa418 ◽

2021 ◽

Vol 186 (Supplement_1) ◽

pp. 445-451

Author(s):

Yifei Sun ◽

Navid Rashedi ◽

Vikrant Vaze ◽

Parikshit Shah ◽

Ryan Halter ◽

...

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Real World ◽

Short Term Memory ◽

Model Performance ◽

Learning Technologies ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor ◽

Continuous Map

Abstract. Introduction: Early prediction of an acute hypotensive episode (AHE) in critically ill patients has the potential to improve outcomes. In this study, we apply different machine learning algorithms to the MIMIC III Physionet dataset, which contains more than 60,000 real-world intensive care unit records, to test commonly used machine learning technologies and compare their performance. Materials and Methods: Five classification methods (K-nearest neighbor, logistic regression, support vector machine, random forest, and a deep learning method called long short-term memory) are applied to predict an AHE 30 minutes in advance. An analysis comparing model performance when including versus excluding invasive features was conducted. To further study the pattern of the underlying mean arterial pressure (MAP), we apply linear regression to predict the continuous MAP values over the next 60 minutes. Results: The support vector machine yields the best performance in terms of recall (84%). Including the invasive features in the classification improves performance significantly, with both recall and precision increasing by more than 20 percentage points. We were able to predict the MAP 60 minutes in the future with a root mean square error (a frequently used measure of the differences between the predicted values and the observed values) of 10 mmHg. After converting continuous MAP predictions into binary AHE predictions, we achieve 91% recall and 68% precision. In addition to predicting AHE, the MAP predictions provide clinically useful information regarding the timing and severity of the AHE occurrence. Conclusion: We were able to predict AHE with precision and recall above 80% 30 minutes in advance with the large real-world dataset. The regression model's predictions can provide a more fine-grained, interpretable signal to practitioners. Model performance is improved by the inclusion of invasive features in predicting AHE, when compared to predicting the AHE based on only the available, restricted set of noninvasive technologies. This demonstrates the importance of exploring more noninvasive technologies for AHE prediction.
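The step of converting continuous MAP predictions into binary AHE predictions, and then scoring them with precision and recall, can be sketched as follows. This is only an illustration: the abstract does not state the AHE definition used, so the 60 mmHg threshold, the function names, and the sample values are all assumptions.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def to_episode_flags(map_values, threshold_mmhg=60.0):
    """Flag a value as hypotensive when MAP falls below the threshold.

    The 60 mmHg cutoff is an assumed placeholder, not the paper's definition.
    """
    return [value < threshold_mmhg for value in map_values]


def precision_recall(predicted_flags, true_flags):
    """Precision and recall of predicted binary flags against true flags."""
    pairs = list(zip(predicted_flags, true_flags))
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up MAP values in mmHg, for illustration only.
observed_map = [72.0, 58.0, 55.0, 80.0, 59.0]
predicted_map = [70.0, 57.0, 62.0, 58.0, 56.0]

error = rmse(predicted_map, observed_map)
precision, recall = precision_recall(
    to_episode_flags(predicted_map), to_episode_flags(observed_map)
)
print(f"RMSE: {error:.2f} mmHg, precision: {precision:.2f}, recall: {recall:.2f}")
```

Keeping the regression output and thresholding it afterwards preserves the predicted MAP value itself, which is what carries the timing and severity information mentioned in the abstract.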


To Explore the Predictive Power of Visuomotor Network Dysfunctions in Mild Cognitive Impairment and Alzheimer’s Disease

Frontiers in Neuroscience ◽

10.3389/fnins.2021.654003 ◽

2021 ◽

Vol 15 ◽

Author(s):

Justine Staal ◽

Francesco Mattace-Raso ◽

Hennie A. M. Daniels ◽

Johannes van der Steen ◽

Johan J. M. Pel

Keyword(s):

Alzheimer's Disease ◽

Support Vector Machine ◽

Cognitive Impairment ◽

Mild Cognitive Impairment ◽

Predictive Power ◽

Area Under The Curve ◽

Classification Performance ◽

Support Vector ◽

Potential Biomarker

Background: Research into Alzheimer’s disease has shifted toward the identification of minimally invasive and less time-consuming modalities to define preclinical stages of Alzheimer’s disease. Method: Here, we propose visuomotor network dysfunctions as a potential biomarker in Alzheimer’s disease (AD) and its prodromal stage, mild cognitive impairment with underlying Alzheimer’s disease pathology. The functionality of this network was tested in terms of timing, accuracy, and speed with goal-directed eye-hand tasks. The predictive power was determined by comparing the classification performance of a zero-rule algorithm (baseline), a decision tree, a support vector machine, and a neural network using functional parameters to classify controls without cognitive disorders, patients with mild cognitive impairment, and Alzheimer’s disease patients. Results: Fair to good classification was achieved between controls and patients, between controls and patients with mild cognitive impairment, and between controls and Alzheimer’s disease patients with the support vector machine (77–82% accuracy, 57–93% sensitivity, 63–90% specificity, 0.74–0.78 area under the curve). Classification between patients with mild cognitive impairment and Alzheimer’s disease patients was poor, as no algorithm outperformed the baseline (63% accuracy, 0% sensitivity, 100% specificity, 0.50 area under the curve). Comparison with Existing Method(s): The classification performance found in the present study is comparable to that of the existing CSF and MRI biomarkers. Conclusion: The data suggest that visuomotor network dysfunctions have potential in biomarker research and that the proposed eye-hand tasks could add to existing tests to form a clear definition of the preclinical phenotype of AD.


Prediction of Sudden Cardiac Death Risk with a Support Vector Machine Based on Heart Rate Variability and Heartprint Indices

Sensors ◽

10.3390/s20195483 ◽

2020 ◽

Vol 20 (19) ◽

pp. 5483

Author(s):

Marisol Martinez-Alanis ◽

Erik Bojorges-Valdez ◽

Niels Wessel ◽

Claudia Lerma

Keyword(s):

Heart Rate ◽

Support Vector Machine ◽

Heart Rate Variability ◽

Sudden Cardiac Death ◽

Cardiac Death ◽

Area Under The Curve ◽

Premature Ventricular Complex ◽

Support Vector ◽

Auc Value ◽

Sudden Cardiac Death Risk

Most methods for sudden cardiac death (SCD) prediction require long-term (24 h) electrocardiogram recordings to measure heart rate variability (HRV) indices or premature ventricular complex indices (with the heartprint method). This work aimed to identify the best combinations of HRV and heartprint indices for predicting SCD based on short-term recordings (1000 heartbeats) through a support vector machine (SVM). Eleven HRV indices and five heartprint indices were measured in 135 pairs of recordings (one before an SCD episode and another without SCD as control). SVMs (defined with a radial basis function kernel with hyperparameter optimization) were trained with this dataset to identify the 13 best combinations of indices systematically. Through 10-fold cross-validation, the best area under the curve (AUC) value as a function of γ (gamma) and cost was identified. The predictive value of the identified combinations had AUCs between 0.80 and 0.86 and accuracies between 80 and 86%. Further SVM performance tests on a different dataset of 68 recordings (33 before SCD and 35 as control) showed AUC = 0.68 and accuracy = 67% for the best combination. The developed SVM may be useful for preventing imminent SCD through early warning based on electrocardiogram (ECG) or heart rate monitoring.


Hierarchical attention networks for information extraction from cancer pathology reports

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocx131 ◽

2017 ◽

Vol 25 (3) ◽

pp. 321-330 ◽

Cited By ~ 29

Author(s):

Shang Gao ◽

Michael T Young ◽

John X Qiu ◽

Hong-Jun Yoon ◽

James B Christian ◽

...

Keyword(s):

Neural Network ◽

Support Vector Machine ◽

Information Extraction ◽

Model Performance ◽

Gradient Boosting ◽

Support Vector ◽

Attention Networks ◽

Cancer Pathology ◽

Extreme Gradient Boosting ◽

Pathology Reports

Abstract. Objective: We explored how a deep learning (DL) approach based on hierarchical attention networks (HANs) can improve model performance for multiple information extraction tasks from unstructured cancer pathology reports compared to conventional methods that do not sufficiently capture syntactic and semantic contexts from free-text documents. Materials and Methods: Data for our analyses were obtained from 942 deidentified pathology reports collected by the National Cancer Institute Surveillance, Epidemiology, and End Results program. The HAN was implemented for 2 information extraction tasks: (1) primary site, matched to 12 International Classification of Diseases for Oncology topography codes (7 breast, 5 lung primary sites), and (2) histological grade classification, matched to G1–G4. Model performance metrics were compared to conventional machine learning (ML) approaches including naive Bayes, logistic regression, support vector machine, random forest, and extreme gradient boosting, and other DL models, including a recurrent neural network (RNN), a recurrent neural network with attention (RNN w/A), and a convolutional neural network. Results: Our results demonstrate that for both information tasks, HAN performed significantly better compared to the conventional ML and DL techniques. In particular, across the 2 tasks, the mean micro and macro F-scores for the HAN with pretraining were (0.852, 0.708), compared to naive Bayes (0.518, 0.213), logistic regression (0.682, 0.453), support vector machine (0.634, 0.434), random forest (0.698, 0.508), extreme gradient boosting (0.696, 0.522), RNN (0.505, 0.301), RNN w/A (0.637, 0.471), and convolutional neural network (0.714, 0.460). Conclusions: HAN-based DL models show promise in information abstraction tasks within unstructured clinical pathology reports.
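The abstract reports micro and macro F-scores side by side, and the gap between the two reflects how they average over classes. As a rough sketch of the standard definitions (not the authors' evaluation code), micro F pools the counts over all classes, while macro F averages the per-class scores so that rare classes weigh as much as common ones. The function name and the toy grade labels below are hypothetical.

```python
def f_scores(true_labels, predicted_labels):
    """Return (micro_f1, macro_f1) for single-label multiclass predictions."""
    labels = sorted(set(true_labels) | set(predicted_labels))
    pairs = list(zip(true_labels, predicted_labels))
    tp_total = fp_total = fn_total = 0
    per_class = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # Per-class F1 = 2TP / (2TP + FP + FN); 0 when the class never appears.
        per_class.append(2 * tp / denominator if denominator else 0.0)
        tp_total += tp
        fp_total += fp
        fn_total += fn
    pooled = 2 * tp_total + fp_total + fn_total
    micro = 2 * tp_total / pooled if pooled else 0.0
    macro = sum(per_class) / len(per_class) if per_class else 0.0
    return micro, macro


# Toy example: the rare class G2 is always missed.
true_grades = ["G1", "G1", "G1", "G1", "G2", "G3"]
predicted_grades = ["G1", "G1", "G1", "G1", "G1", "G3"]
micro_f1, macro_f1 = f_scores(true_grades, predicted_grades)
print(f"micro F1: {micro_f1:.3f}, macro F1: {macro_f1:.3f}")
```

In this toy case a single missed rare class pulls the macro score well below the micro score, the same direction of gap seen in the reported pairs.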


Classification of Tight Sandstone Reservoir Based on the Conventional Logging

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.633-634.526 ◽

2014 ◽

Vol 633-634 ◽

pp. 526-529 ◽

Cited By ~ 1

Author(s):

Xiao Ling Xiao ◽

Jia Li Cui ◽

Yu Peng Zhang ◽

Xiang Zhang ◽

Han Wu

Keyword(s):

Support Vector Machine ◽

Oil And Gas ◽

Identification Accuracy ◽

Support Vector ◽

Tight Sandstone ◽

Oil And Gas Exploration ◽

Unconventional Oil ◽

Unconventional Oil And Gas ◽

Sandstone Reservoir

With the increasing social demand for oil and gas resources, the exploration and development of unconventional oil and gas reservoirs will receive more and more attention. Tight sandstone reservoir classification is one of the important tasks in research on unconventional oil and gas exploration and development. Classifying tight sandstone reservoirs from conventional logging alone has limitations. This paper presents a method for classifying tight sandstone reservoirs based on a support vector machine, combining core data and flow units to establish the reservoir classification standard. Tight sandstone reservoirs in wells without coring are then classified from conventional logging data using the support vector machine model. The application results show that this method has high suitability and identification accuracy.


Identify Lysine Neddylation Sites Using Bi-profile Bayes Feature Extraction via the Chou’s 5-steps Rule and General Pseudo Components

Current Genomics ◽

10.2174/1389202921666191223154629 ◽

2020 ◽

Vol 20 (8) ◽

pp. 592-601

Author(s):

Zhe Ju ◽

Shi-Yun Wang

Keyword(s):

Support Vector Machine ◽

Feature Extraction ◽

Operating Characteristic ◽

Characteristic Curve ◽

Class Imbalance ◽

Support Vector ◽

Fuzzy Support Vector Machine ◽

Operating Characteristic Curve ◽

Matthews Correlation Coefficient ◽

User Friendly

Introduction: Neddylation is a highly dynamic and reversible post-translational modification that is involved in various biological processes, and its abnormality has been shown to be closely associated with many human diseases. Accurate identification of neddylation sites is essential for elucidating the regulatory mechanisms of protein neddylation. Objective: Because detecting lysine neddylation sites with traditional experimental methods is often expensive and time-consuming, computational methods for identifying neddylation sites are needed. Methods: In this study, a novel predictor named NeddPred is developed to identify lysine neddylation sites. Bi-profile Bayes feature extraction is used to encode neddylation sites, and a fuzzy support vector machine is used to overcome the problems of noise and class imbalance in the prediction. Results: In 10-fold cross-validation, NeddPred achieves a Matthews correlation coefficient of 0.7082 and an area under the receiver operating characteristic curve of 0.9769. Independent tests show that NeddPred significantly outperforms the existing lysine neddylation site predictor NeddyPreddy. Conclusion: NeddPred can therefore complement the existing tools for predicting neddylation sites. A user-friendly web server for NeddPred is accessible at 123.206.31.171/NeddPred/.


Eggshell Crack Detection and Egg Classification Using Resonance and Support Vector Machine Methods

Applied Engineering in Agriculture ◽

10.13031/aea.12749 ◽

2019 ◽

Vol 35 (1) ◽

pp. 23-30

Author(s):

Ching-Wei Cheng ◽

Pei-Hsuan Feng ◽

Jun-Hong Xie ◽

Yu-Kai Weng

Keyword(s):

Support Vector Machine ◽

Characteristic Frequency ◽

Crack Detection ◽

Detection Method ◽

Maximum Amplitude ◽

Identification Accuracy ◽

Support Vector ◽

Training Set ◽

Processed Products ◽

Resonance Detection

Abstract. Cracks in eggshells not only affect the egg preservation time but also reduce the success rate for the end-processed products. This study was based on the theory of resonant inspection (RI). The use of the support vector machine (SVM) method as a means of more accurate eggshell crack detection was evaluated. The results revealed that comparing the resonant frequency and amplitude by using a microphone as a sensor allowed non-cracked eggs to be distinguished from cracked eggs. The characteristic frequency of a non-cracked egg was between 4130 and 5500 Hz, and its amplitude was between 0.16 and 0.20 V. The spectrum of a cracked egg was fuzzy, with no obvious characteristic frequency, and the maximum amplitude was approximately 0.06 V. The identification accuracy was 99% and 98% for the SVM training set and testing set, respectively. These results prove that the resonance detection method is effective for identifying eggs with cracked shells. Keywords: Eggshells, Resonant inspection, Fast Fourier transform, Support vector machine.


Hazard identification and prediction system for aircraft electrical system based on SRA and SVM

Proceedings of the Institution of Mechanical Engineers Part G Journal of Aerospace Engineering ◽

10.1177/0954410019894121 ◽

2020 ◽

Vol 234 (4) ◽

pp. 1014-1026 ◽

Cited By ~ 1

Author(s):

Di Zhou ◽

Xiao Zhuang ◽

Hongfu Zuo ◽

Jing Cai ◽

Han Bao

Keyword(s):

Support Vector Machine ◽

Variable Selection ◽

Relative Error ◽

Hazard Identification ◽

Identification Accuracy ◽

Normal Operation ◽

Support Vector ◽

Prediction System ◽

Electrical System ◽

Bus Voltage

The aircraft electrical system provides power for the normal operation of the aircraft. Its normal operation is critical to ensure the safe flight of the aircraft. Therefore, it is very important to identify the hazards in the aircraft electrical system. In this paper, a hazard identification and prediction system which can intelligently identify potential hazards in aircraft electrical system is proposed. The proposed hazard identification and prediction system mainly includes three processes: variable selection, hazard identification, and hazard prediction. In the process of variable selection, the stepwise regression analysis is used to select 8 main parameters that have the major influence on the DC bus voltage value from 18 parameters. In the process of hazard identification, support vector machine is used to identify pre-existing hazards in electrical system based on the status of all components. The identification accuracy of the support vector machine is 92.3%. When the electrical system does not have unacceptable hazards, a prediction of the variation range of the DC bus voltage value in the aircraft electrical system is performed. The average prediction relative error of support vector machine is only 0.86%. Overall, the identification accuracy and average prediction relative error show that the proposed hazard identification and prediction system can accurately and effectively identify and predict the hazards in the aircraft electrical system.


Long-term Runoff Forecasting Models Based on the Teleconnection coupled with Machine Learning

10.5194/egusphere-egu2020-1369 ◽

2020 ◽

Author(s):

Teng Zhang ◽

Zhongjing Wang ◽

Zixiong Zhang

Keyword(s):

Support Vector Machine ◽

Water Resources ◽

Model Performance ◽

Support Vector ◽

Wet Season ◽

Model Based ◽

Runoff Forecasting ◽

Monthly Runoff ◽

Highly Correlated ◽

The Impact

High-precision runoff forecasting is important for the efficient utilization of water resources and for regional sustainable development, especially in arid areas. The monthly runoff at Changmabao (CMB) station shows an upward trend with an abrupt change point in 1998. The impact factor analysis shows that runoff is highly correlated with the current precipitation and temperature in the wet season, and with the previous runoff and previous global land temperature in the dry season. Three models are used to predict the runoff at CMB station: a time-series decomposition model, a model based on teleconnection coupled with a support vector machine, and a model based on teleconnection coupled with an artificial neural network. An indicator β is constructed from the correlation coefficient (R) and the mean relative deviation (rBias) to evaluate model performance more conveniently and intuitively. The results suggest that the model based on teleconnection coupled with the support vector machine performs best. This forecasting method could be applied to the management and dispatch of water resources in arid areas.


Anomaly Intrusion Detection Based on Support Vector Machine with Mexico Hat Wavelet Kernel Function

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.687-691.3897 ◽

2014 ◽

Vol 687-691 ◽

pp. 3897-3900 ◽

Cited By ~ 1

Author(s):

Ping An Wang ◽

Xu Sheng Gan ◽

Deng Kai Yao

Keyword(s):

Support Vector Machine ◽

Intrusion Detection ◽

Kernel Function ◽

Great Influence ◽

Model Performance ◽

Necessary Condition ◽

Support Vector ◽

Rbf Kernel ◽

Wavelet Kernel Function ◽

Anomaly Intrusion Detection

The selection of the kernel function in a Support Vector Machine (SVM) has a great influence on model performance. In this paper, the Mexico hat wavelet kernel is introduced as the kernel function of the SVM, and it is proved theoretically that the Mexico hat wavelet kernel satisfies the Mercer condition, which is the necessary condition for a kernel function of an SVM. Simulation of anomaly detection shows that the capability of the SVM based on the Mexico hat wavelet kernel is better than that of the SVM based on the RBF kernel, with a satisfactory result for anomaly intrusion detection.
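A wavelet kernel of this kind is commonly written in translation-invariant form as a product over input dimensions, K(x, y) = ∏ h((x_i − y_i)/a), where h(u) = (1 − u²)·exp(−u²/2) is the Mexican hat mother wavelet. The sketch below assumes that form; the abstract does not give the paper's exact formula, and the dilation parameter a (here `dilation`), the function names, and the sample points are illustrative choices.

```python
import math


def mexican_hat(u):
    """Mexican hat mother wavelet h(u) = (1 - u^2) * exp(-u^2 / 2)."""
    return (1.0 - u * u) * math.exp(-0.5 * u * u)


def mexican_hat_kernel(x, y, dilation=1.0):
    """Translation-invariant wavelet kernel: product of h over all dimensions.

    `dilation` is an assumed hyperparameter name for the wavelet scale a.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of features")
    value = 1.0
    for xi, yi in zip(x, y):
        value *= mexican_hat((xi - yi) / dilation)
    return value


def gram_matrix(samples, dilation=1.0):
    """Pairwise kernel values for a list of feature vectors."""
    return [[mexican_hat_kernel(a, b, dilation) for b in samples] for a in samples]


# Made-up feature vectors for illustration only.
samples = [[0.0, 0.0], [0.5, 0.0], [1.0, 2.0]]
gram = gram_matrix(samples, dilation=1.0)
for row in gram:
    print([round(value, 4) for value in row])
```

Unlike the RBF kernel, this kernel can take negative values for points that are moderately far apart, which is one way the two kernels differ in practice. A Gram matrix computed this way could be supplied to an SVM implementation that accepts precomputed kernels.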
