Systematic analysis and prediction of type IV secreted effector proteins by machine learning approaches

Type IV secretion systems exist in a number of bacterial pathogens and are used to secrete effector proteins directly into host cells, altering the host environment to make it hospitable for the bacteria. In recent years, several machine learning algorithms have been developed to predict effector proteins, potentially facilitating experimental verification. However, inconsistencies exist between their results. Previously, we analysed the disparate sets of predictive features used in these algorithms to determine an optimal set of 370 features for effector prediction. This work focuses on the best way to use these optimal features by designing three machine learning classifiers, comparing our results with those of others, and obtaining de novo results. We chose the pathogen Legionella pneumophila strain Philadelphia-1, a cause of Legionnaires’ disease, because it has many validated effector proteins and others have developed machine learning prediction tools for it. While all of our models give good results, indicating that our optimal features are quite robust, Model 1, which uses all 370 features with a support vector machine, has slightly better accuracy. Moreover, Model 1 predicted 760 effector proteins, more than any other study, 315 of which have been validated. Although the results of our three models agree well with those of other researchers, their models predicted only 126 and 311 candidate effectors.


A systematic analysis for maritime accidents causation in Chinese coastal waters using machine learning approaches

Ocean & Coastal Management ◽

10.1016/j.ocecoaman.2021.105859 ◽

2021 ◽

Vol 213 ◽

pp. 105859

Author(s):

Kezhong Liu ◽

Qing Yu ◽

Zhitao Yuan ◽

Zhisen Yang ◽

Yaqing Shu

Keyword(s):

Machine Learning ◽

Coastal Waters ◽

Learning Approaches ◽

Systematic Analysis


Latent Structures for Coreference Resolution

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00147 ◽

2015 ◽

Vol 3 ◽

pp. 405-418 ◽

Cited By ~ 12

Author(s):

Sebastian Martschat ◽

Michael Strube

Keyword(s):

Machine Learning ◽

Coreference Resolution ◽

Learning Approaches ◽

Systematic Analysis ◽

Latent Structures ◽

Pair Level

Machine learning approaches to coreference resolution vary greatly in the modeling of the problem: while early approaches operated on the mention pair level, current research focuses on ranking architectures and antecedent trees. We propose a unified representation of different approaches to coreference resolution in terms of the structure they operate on. We represent several coreference resolution approaches proposed in the literature in our framework and evaluate their performance. Finally, we conduct a systematic analysis of the output of these approaches, highlighting differences and similarities.


Supplemental Material for Psychometric and Machine Learning Approaches for Diagnostic Assessment and Tests of Individual Classification

Psychological Methods ◽

10.1037/met0000317.supp ◽

2020 ◽

Keyword(s):

Machine Learning ◽

Diagnostic Assessment ◽

Learning Approaches


Machine Learning Approaches for the Analysis of Non-Metallic Inclusion Data Sets

AISTech2019 Proceedings of the Iron and Steel Technology Conference ◽

10.33313/377/275 ◽

2019 ◽

Author(s):

M. Webler ◽

B. Abdulsalam

Keyword(s):

Machine Learning ◽

Data Sets ◽

Learning Approaches ◽

Metallic Inclusion


Multiple vehicles detection and tracking for intelligent transport systems using machine learning approaches

Transport and Communication Science Journal ◽

10.25073/tcsj.70.3.7 ◽

2019 ◽

Vol 70 (3) ◽

pp. 214-224

Author(s):

Bui Ngoc Dung ◽

Manh Dzung Lai ◽

Tran Vu Hieu ◽

Nguyen Binh T. H.

Keyword(s):

Machine Learning ◽

Gaussian Mixture ◽

Research Field ◽

Transport Systems ◽

Learning Approaches ◽

Subtraction Method ◽

Intelligent Transport Systems ◽

Intelligent Transport ◽

Detection And Tracking ◽

Multiple Vehicles

Video surveillance is an emerging research field within intelligent transport systems. This paper presents techniques that use machine learning and computer vision for vehicle detection and tracking. First, machine learning approaches using Haar-like features and the AdaBoost algorithm for vehicle detection are presented. Second, approaches are given for detecting vehicles using a background subtraction method based on a Gaussian mixture model, and for tracking vehicles using optical flow and multiple Kalman filters. The method has the advantage of distinguishing and tracking multiple vehicles individually. The experimental results demonstrate the high accuracy of the method.
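The "multiple Kalman filters" idea in the abstract above can be illustrated with a minimal sketch that keeps one independent filter per tracked vehicle. This is not the authors' implementation: it assumes a scalar random-walk motion model on a single image coordinate, and the class name `ScalarKalman`, the helper `track`, and the noise values `q` and `r` are hypothetical choices made for illustration only.

```python
from typing import Dict, List, Tuple


class ScalarKalman:
    """Scalar Kalman filter with a random-walk motion model (an assumption)."""

    def __init__(self, x0: float, p0: float = 1.0, q: float = 0.01, r: float = 1.0):
        self.x = x0  # state estimate (e.g. a vehicle's x-coordinate in pixels)
        self.p = p0  # variance of the estimate
        self.q = q   # process noise variance (hypothetical value)
        self.r = r   # measurement noise variance (hypothetical value)

    def predict(self) -> float:
        # Random walk: the state is carried over, only uncertainty grows.
        self.p += self.q
        return self.x

    def update(self, z: float) -> float:
        # Blend prediction and measurement, weighted by the Kalman gain.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= (1.0 - k)
        return self.x


def track(detections: List[Tuple[str, float]]) -> Dict[str, float]:
    """Run one filter per vehicle id over a stream of (id, position) detections."""
    filters: Dict[str, ScalarKalman] = {}
    for vehicle_id, z in detections:
        if vehicle_id not in filters:
            # First sighting: start a new filter at the measured position.
            filters[vehicle_id] = ScalarKalman(x0=z)
            continue
        kf = filters[vehicle_id]
        kf.predict()
        kf.update(z)
    return {vid: kf.x for vid, kf in filters.items()}
```

Because each vehicle owns its own filter state, noisy detections of one vehicle do not disturb the estimate of another. A realistic tracker would also need a data-association step to decide which detection belongs to which track, which this sketch sidesteps by assuming the ids are given.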


Mol2vec: Unsupervised Machine Learning Approach with Chemical Intuition

10.26434/chemrxiv.5513581.v1 ◽

2017 ◽

Author(s):

Sabrina Jaeger ◽

Simone Fulle ◽

Samo Turk

Keyword(s):

Machine Learning ◽

Language Processing ◽

Supervised Machine Learning ◽

Learning Approach ◽

Learning Approaches ◽

Unsupervised Machine Learning ◽

Feature Representations ◽

Machine Learning Approach ◽

The Individual ◽

Vector Representations

Inspired by natural language processing techniques, we here introduce Mol2vec, an unsupervised machine learning approach to learn vector representations of molecular substructures. Similarly to the Word2vec models, where vectors of closely related words are in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can finally be encoded as vectors by summing up the vectors of the individual substructures and, for instance, fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pre-trained once, yields dense vector representations, and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as the reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment independent and can thus also be easily used for proteins with low sequence similarities.
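The encoding step described in the abstract above, summing substructure vectors to obtain a compound vector, can be sketched as follows. This is a toy illustration, not the Mol2vec package's API: the function names, the substructure identifiers, and the two-dimensional embedding values are all hypothetical, and skipping substructures that have no embedding is an assumption made here for simplicity.

```python
import math
from typing import Dict, List, Sequence


def embed_compound(
    substructures: Sequence[str],
    embeddings: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Encode a compound as the sum of its substructure vectors."""
    total = [0.0] * dim
    for sub in substructures:
        vec = embeddings.get(sub)
        if vec is None:
            # Assumption: substructures without an embedding are skipped.
            continue
        for i, value in enumerate(vec):
            total[i] += value
    return total


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


# Hypothetical pre-trained embeddings keyed by made-up substructure ids.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "sub_a": [1.0, 2.0],
    "sub_b": [0.5, -1.0],
    "sub_c": [-1.0, 0.0],
}
```

For example, `embed_compound(["sub_a", "sub_b"], TOY_EMBEDDINGS, dim=2)` adds the two vectors component by component. The resulting fixed-length vector could then serve as the input features of a supervised model, and `cosine` gives one way to compare the directions of two compound vectors.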


DETECTION OF ANOMALY BASED APPLICATION LAYER DDoS ATTACKS USING MACHINE LEARNING APPROACHES

i-manager's Journal on Computer Science ◽

10.26634/jcom.4.2.8120 ◽

2016 ◽

Vol 4 (2) ◽

pp. 6

Author(s):

VANI NIDHI M.S.P.S. ◽

PRASAD K. MUNIVARA

Keyword(s):

Machine Learning ◽

Learning Approaches ◽

Ddos Attacks ◽

Application Layer


Predictors of remission from body dysmorphic disorder after internet-delivered cognitive behavior therapy: a machine learning approach

10.31234/osf.io/eqcdx ◽

2019 ◽

Author(s):

Oskar Flygare ◽

Jesper Enander ◽

Erik Andersson ◽

Brjánn Ljótsson ◽

Volen Z Ivanov ◽

...

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forests ◽

Clinical Utility ◽

Body Dysmorphic Disorder ◽

Prediction Models ◽

Behavioral Therapy ◽

Learning Approach ◽

Learning Approaches ◽

Machine Learning Approach

**Background:** Previous attempts to identify predictors of treatment outcomes in body dysmorphic disorder (BDD) have yielded inconsistent findings. One way to increase precision and clinical utility could be to use machine learning methods, which can incorporate multiple non-linear associations in prediction models. **Methods:** This study used a random forests machine learning approach to test whether it is possible to reliably predict remission from BDD in a sample of 88 individuals who had received internet-delivered cognitive behavioral therapy for BDD. The random forest models were compared to traditional logistic regression analyses. **Results:** Random forests correctly identified 78% of participants as remitters or non-remitters at post-treatment. The accuracy of prediction was lower at subsequent follow-ups (68%, 66% and 61% correctly classified at 3-, 12- and 24-month follow-ups, respectively). Depressive symptoms, treatment credibility, working alliance, and initial severity of BDD were among the most important predictors at the beginning of treatment. By contrast, the logistic regression models did not identify consistent and strong predictors of remission from BDD. **Conclusions:** The results provide initial support for the clinical utility of machine learning approaches in the prediction of outcomes of patients with BDD. **Trial registration:** ClinicalTrials.gov ID: NCT02010619.


Identification of interface residues involved in protein-protein and protein-DNA interactions from sequence using machine learning approaches

10.31274/rtd-180813-2240 ◽

2005 ◽

Author(s):

Changhui Yan

Keyword(s):

Machine Learning ◽

Learning Approaches ◽

Dna Interactions ◽

Protein Dna Interactions ◽

Interface Residues
