ApoplastP: prediction of effectors and plant proteins in the apoplast using machine learning

2017
Author(s): Jana Sperschneider, Peter N. Dodds, Karam B. Singh, Jennifer M. Taylor

Abstract The plant apoplast is integral to intercellular signalling, transport and plant-pathogen interactions. Plant pathogens deliver effectors both into the apoplast and inside host cells, but no computational method currently exists to discriminate between these localizations. We present ApoplastP, the first method for predicting if an effector or plant protein localizes to the apoplast. ApoplastP uncovers features for apoplastic localization common to both effectors and plant proteins, namely an enrichment in small amino acids and cysteines as well as depletion in glutamic acid. ApoplastP predicts apoplastic localization in effectors with sensitivity of 75% and false positive rate of 5%, improving accuracy of cysteine-rich classifiers by over 13%. ApoplastP does not depend on the presence of a signal peptide and correctly predicts the localization of unconventionally secreted plant and effector proteins. The secretomes of fungal saprophytes, necrotrophic pathogens and extracellular pathogens are enriched for predicted apoplastic proteins. Rust pathogen secretomes have the lowest percentage of apoplastic proteins, but these are highly enriched for predicted effectors. ApoplastP pioneers apoplastic localization prediction using machine learning. It will facilitate functional studies and will be valuable for predicting if an effector localizes to the apoplast or if it enters plant cells. ApoplastP is available at http://apoplastp.csiro.au.
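The sequence features named in the abstract (small amino acids, cysteines, glutamic acid) can be illustrated with a minimal composition sketch. This is not ApoplastP's implementation or feature set: the function name, the output keys and the choice of A, G, S and T as the "small" residues are assumptions made purely for illustration.

```python
from collections import Counter

# Assumption: an illustrative "small residue" set; ApoplastP's own
# definition is not given in the abstract and may differ.
SMALL_RESIDUES = frozenset("AGST")


def composition_features(sequence: str) -> dict[str, float]:
    """Return the fraction of small residues, cysteine and glutamic acid."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    length = len(seq)
    return {
        "small_fraction": sum(counts[r] for r in SMALL_RESIDUES) / length,
        "cysteine_fraction": counts["C"] / length,
        "glutamate_fraction": counts["E"] / length,
    }


if __name__ == "__main__":
    # Toy sequence, not a real protein.
    print(composition_features("MKCSGACTEE"))
```

Fractions like these could serve as inputs to a classifier; which features and model ApoplastP actually uses is described in the paper itself.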

2021, Vol 7 (2), pp. 86
Author(s): Bilal Ökmen, Daniela Schwammbach, Guus Bakkeren, Ulla Neumann, Gunther Doehlemann

Obligate biotrophic fungal pathogens, such as Blumeria graminis and Puccinia graminis, are amongst the most devastating plant pathogens, causing dramatic yield losses in many economically important crops worldwide. However, a lack of reliable tools for efficient genetic transformation has hampered studies into the molecular basis of their virulence or pathogenicity. In this study, we present the Ustilago hordei–barley pathosystem as a model to characterize effectors from different plant pathogenic fungi. We generate U. hordei solopathogenic strains, which form infectious filaments without the presence of a compatible mating partner. Solopathogenic strains are suitable as a heterologous expression system for fungal virulence factors. A highly efficient CRISPR/Cas9 gene editing system is made available for U. hordei. In addition, U. hordei infection structures during barley colonization are analyzed using transmission electron microscopy, showing that U. hordei forms intracellular infection structures sharing high similarity to haustoria formed by obligate rust and powdery mildew fungi. Thus, U. hordei has high potential as a fungal expression platform for functional studies of heterologous effector proteins in barley.


2010, Vol 37 (10), pp. 913
Author(s): Pamela H. P. Gan, Maryam Rafiqi, Adrienne R. Hardham, Peter N. Dodds

Plant pathogenic biotrophic fungi are able to grow within living plant tissue due to the action of secreted pathogen proteins known as effectors that alter the response of plant cells to pathogens. The discovery and identification of these proteins has greatly expanded with the sequencing and annotation of fungal pathogen genomes. Studies to characterise effector function have revealed that a subset of these secreted pathogen proteins interact with plant proteins within the host cytoplasm. This review focuses on the effectors of intracellular biotrophic and hemibiotrophic fungal plant pathogens and summarises advances in understanding the roles of these proteins in disease and in elucidating the mechanism of fungal effector uptake into host cells.


2019
Author(s): Rayees Rahman, Arad Kodesh, Stephen Z Levine, Sven Sandin, Abraham Reichenberg, ...

Abstract
Importance: Current approaches for early identification of individuals at high risk for autism spectrum disorder (ASD) in the general population are limited, and most ASD patients are not identified until after the age of 4. This is despite substantial evidence suggesting that early diagnosis and intervention improves developmental course and outcome.
Objective: To develop a machine learning (ML) method for predicting the diagnosis of ASD in offspring in a general population sample, using parental electronic medical records (EMR) available before childbirth.
Design: Prognostic study of EMR data within a single Israeli health maintenance organization, for the parents of 1,397 ASD children (ICD-9/10) and 94,741 non-ASD children born between January 1st, 1997 and December 31st, 2008. The complete EMR record of the parents was used to develop various ML models to predict the risk of having a child with ASD.
Main outcomes and measures: Routinely available parental sociodemographic information, medical histories and prescribed medications data until offspring’s birth were used to generate features to train various machine learning algorithms, including multivariate logistic regression, artificial neural networks, and random forest. Prediction performance was evaluated with 10-fold cross validation, by computing C statistics, sensitivity, specificity, accuracy, false positive rate, and precision (positive predictive value, PPV).
Results: All ML models tested had similar performance, achieving an average C statistic of 0.70, sensitivity of 28.63%, specificity of 98.62%, accuracy of 96.05%, false positive rate of 1.37%, and positive predictive value of 45.85% for predicting ASD in this dataset.
Conclusion and relevance: ML algorithms combined with EMR capture early life ASD risk. Such approaches may be able to enhance the ability for accurate and efficient early detection of ASD in large populations of children.
Key points
Question: Can autism risk in children be predicted using the pre-birth electronic medical record (EMR) of the parents?
Findings: In this population-based study that included 1,397 children with autism spectrum disorder (ASD) and 94,741 non-ASD children, we developed a machine learning classifier for predicting the likelihood of childhood diagnosis of ASD with an average C statistic of 0.70, sensitivity of 28.63%, specificity of 98.62%, accuracy of 96.05%, false positive rate of 1.37%, and positive predictive value of 45.85%.
Meaning: The results presented serve as a proof-of-principle of the potential utility of EMR for the identification of a large proportion of future children at a high-risk of ASD.
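The threshold-based measures listed in this abstract (sensitivity, specificity, accuracy, false positive rate, PPV) all derive from the four cells of a confusion matrix. The sketch below shows the standard definitions; the class and function names are hypothetical, and the counts in the example are invented for illustration rather than taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def screening_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute standard binary-classification measures from counts."""
    positives = c.tp + c.fn           # all truly positive cases
    negatives = c.tn + c.fp           # all truly negative cases
    predicted_positive = c.tp + c.fp  # all cases flagged by the model
    if positives == 0 or negatives == 0 or predicted_positive == 0:
        raise ValueError("metrics undefined when a denominator is zero")
    return {
        "sensitivity": c.tp / positives,
        "specificity": c.tn / negatives,
        "false_positive_rate": c.fp / negatives,
        "ppv": c.tp / predicted_positive,
        "accuracy": (c.tp + c.tn) / (positives + negatives),
    }


if __name__ == "__main__":
    # Invented counts: 100 positive and 1,000 negative cases.
    for name, value in screening_metrics(
        ConfusionCounts(tp=30, fp=20, tn=980, fn=70)
    ).items():
        print(f"{name}: {value:.3f}")
```

With imbalanced classes such as these, accuracy is dominated by specificity, which is why a model can combine high accuracy with low sensitivity.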


Electronics, 2021, Vol 10 (22), pp. 2857
Author(s): Laura Vigoya, Diego Fernandez, Victor Carneiro, Francisco Nóvoa

With advancements in engineering and science, the application of smart systems is increasing, driving faster growth of IoT network traffic. The limited power and computing capacity of IoT devices also raise concerns about security vulnerabilities. Machine learning-based techniques have recently gained credibility through successful application to the detection of network anomalies, including in IoT networks. However, machine learning techniques cannot work without representative data. Given the scarcity of IoT datasets, the DAD dataset emerged as an instrument for characterizing the behavior of dedicated IoT-MQTT networks. This paper aims to validate the DAD dataset by applying Logistic Regression, Naive Bayes, Random Forest, AdaBoost, and Support Vector Machine to detect traffic anomalies in IoT. To obtain the best results, techniques for handling unbalanced data, feature selection, and grid search for hyperparameter optimization have been used. The experimental results show that models trained on the proposed dataset achieve a high detection rate in all experiments, with the best mean accuracy of 0.99 for the tree-based models and a low false-positive rate, ensuring effective anomaly detection.


2019
Author(s): Karine de Guillen, Cécile Lorrain, Pascale Tsan, Philippe Barthe, Benjamin Petre, ...

ABSTRACT Rust fungi are plant pathogens that secrete an arsenal of effector proteins interfering with plant functions and promoting parasitic infection. Effectors are often species-specific, evolve rapidly, and display low sequence similarities with known proteins or domains. How rust fungal effectors function in host cells remains elusive, and biochemical and structural approaches have been scarcely used to tackle this question. In this study, we used a strategy based on recombinant protein production in Escherichia coli to study eleven candidate effectors of the leaf rust fungus Melampsora larici-populina. We successfully purified and solved the three-dimensional structure of two proteins, MLP124266 and MLP124017, using NMR spectroscopy. Although both proteins show no sequence similarity with known proteins, they exhibit structural similarities to knottin and nuclear transport factor 2-like proteins, respectively. Altogether, our findings show that sequence-unrelated effectors can adopt folds similar to known proteins, and encourage the use of biochemical and structural approaches to functionally characterize rust effector candidates.


2012, pp. 830-850
Author(s): Abhilash Alexander Miranda, Olivier Caelen, Gianluca Bontempi

This chapter presents a comprehensive scheme for automated detection of colorectal polyps in computed tomography colonography (CTC) with particular emphasis on robust learning algorithms that differentiate polyps from non-polyp shapes. The authors’ automated CTC scheme introduces two orientation-independent features which encode the shape characteristics that aid in classification of polyps and non-polyps with high accuracy, low false positive rate, and low computational cost, making the scheme suitable for colorectal cancer screening initiatives. Experiments using state-of-the-art machine learning algorithms, viz. lazy learning, support vector machines, and naïve Bayes classifiers, reveal the robustness of the two features in detecting polyps at 100% sensitivity for polyps with diameter greater than 10 mm while attaining low total false positive rates of 3.05, 3.47 and 0.71 per CTC dataset, respectively, at specificities above 99% when tested on 58 CTC datasets. The results were validated using colonoscopy reports provided by expert radiologists.


2020, Vol 63 (1)
Author(s): Rayees Rahman, Arad Kodesh, Stephen Z. Levine, Sven Sandin, Abraham Reichenberg, ...

Abstract Background. Current approaches for early identification of individuals at high risk for autism spectrum disorder (ASD) in the general population are limited, and most ASD patients are not identified until after the age of 4. This is despite substantial evidence suggesting that early diagnosis and intervention improves developmental course and outcome. The aim of the current study was to test the ability of machine learning (ML) models applied to electronic medical records (EMRs) to predict ASD early in life, in a general population sample. Methods. We used EMR data from a single Israeli Health Maintenance Organization, including EMR information for parents of 1,397 ASD children (ICD-9/10) and 94,741 non-ASD children born between January 1st, 1997 and December 31st, 2008. Routinely available parental sociodemographic information, parental medical histories, and prescribed medications data were used to generate features to train various ML algorithms, including multivariate logistic regression, artificial neural networks, and random forest. Prediction performance was evaluated with 10-fold cross-validation by computing the area under the receiver operating characteristic curve (AUC; C-statistic), sensitivity, specificity, accuracy, false positive rate, and precision (positive predictive value [PPV]). Results. All ML models tested had similar performance. The average performance across all models had C-statistic of 0.709, sensitivity of 29.93%, specificity of 98.18%, accuracy of 95.62%, false positive rate of 1.81%, and PPV of 43.35% for predicting ASD in this dataset. Conclusions. We conclude that ML algorithms combined with EMR capture early life ASD risk as well as reveal previously unknown features to be associated with ASD-risk. Such approaches may be able to enhance the ability for accurate and efficient early detection of ASD in large populations of children.


2015, Vol 28 (6), pp. 689-700
Author(s): Benjamin Petre, Diane G. O. Saunders, Jan Sklenar, Cécile Lorrain, Joe Win, ...

Rust fungi are devastating crop pathogens that deliver effector proteins into infected tissues to modulate plant functions and promote parasitic growth. The genome of the poplar leaf rust fungus Melampsora larici-populina revealed a large catalog of secreted proteins, some of which have been considered candidate effectors. Unraveling how these proteins function in host cells is a key to understanding pathogenicity mechanisms and developing resistant plants. In this study, we used an effectoromics pipeline to select, clone, and express 20 candidate effectors in Nicotiana benthamiana leaf cells to determine their subcellular localization and identify the plant proteins they interact with. Confocal microscopy revealed that six candidate effectors target the nucleus, nucleoli, chloroplasts, mitochondria, and discrete cellular bodies. We also used coimmunoprecipitation (coIP) and mass spectrometry to identify 606 N. benthamiana proteins that associate with the candidate effectors. Five candidate effectors specifically associated with a small set of plant proteins that may represent biologically relevant interactors. We confirmed the interaction between the candidate effector MLP124017 and TOPLESS-related protein 4 from poplar by in planta coIP. Altogether, our data enable us to validate effector proteins from M. larici-populina and reveal that these proteins may target multiple compartments and processes in plant cells. It also shows that N. benthamiana can be a powerful heterologous system to study effectors of obligate biotrophic pathogens.


2009, Vol 53 (7), pp. 2949-2954
Author(s): Isabel Cuesta, Concha Bielza, Pedro Larrañaga, Manuel Cuenca-Estrella, Fernando Laguna, ...

ABSTRACT European Committee on Antimicrobial Susceptibility Testing (EUCAST) breakpoints classify Candida strains with a fluconazole MIC ≤ 2 mg/liter as susceptible, those with a fluconazole MIC of 4 mg/liter as representing intermediate susceptibility, and those with a fluconazole MIC > 4 mg/liter as resistant. Machine learning models are supported by complex statistical analyses assessing whether the results have statistical relevance. The aim of this work was to use supervised classification algorithms to analyze the clinical data used to produce EUCAST fluconazole breakpoints. Five supervised classifiers (J48, Classification and Regression Trees [CART], OneR, Naïve Bayes, and Simple Logistic) were used to analyze two cohorts of patients with oropharyngeal candidosis and candidemia. The target variable was the outcome of the infections, and the predictor variables consisted of values for the MIC or the proportion between the dose administered and the MIC of the isolate (dose/MIC). Statistical power was assessed by determining values for sensitivity and specificity, the false-positive rate, the area under the receiver operating characteristic (ROC) curve, and the Matthews correlation coefficient (MCC). CART obtained the best statistical power for a MIC > 4 mg/liter for detecting failures (sensitivity, 87%; false-positive rate, 8%; area under the ROC curve, 0.89; MCC index, 0.80). For dose/MIC determinations, the target was >75, with a sensitivity of 91%, a false-positive rate of 10%, an area under the ROC curve of 0.90, and an MCC index of 0.80. Other classifiers gave similar breakpoints with lower statistical power. EUCAST fluconazole breakpoints have been validated by means of machine learning methods. These computer tools must be incorporated in the process for developing breakpoints to avoid researcher bias, thus enhancing the statistical power of the model.
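For reference, the Matthews correlation coefficient reported in this abstract is conventionally defined from the confusion-matrix counts. The symbols TP, TN, FP and FN (true/false positives/negatives) are standard notation and do not come from the abstract itself.

```latex
\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}
                    {\sqrt{(TP + FP)\,(TP + FN)\,(TN + FP)\,(TN + FN)}}
```

The coefficient ranges from -1 to 1, with 0 corresponding to chance-level prediction and 1 to perfect agreement.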


2021
Author(s): Anna Goldenberg, Bret Nestor, Jaryd Hunter, Raghu Kainkaryam, Erik Drysdale, ...

Abstract Commercial wearable devices are surfacing as an appealing mechanism to detect COVID-19 and potentially other public health threats, due to their widespread use. To assess the validity of wearable devices as population health screening tools, it is essential to evaluate predictive methodologies based on wearable devices by mimicking their real-world deployment. Several points must be addressed to transition from statistically significant differences between infected and uninfected cohorts to COVID-19 inferences on individuals. We demonstrate the strengths and shortcomings of existing approaches on a cohort of 32,198 individuals who experience influenza-like illness (ILI), 204 of whom report testing positive for COVID-19. We show that, despite commonly made design mistakes resulting in overestimation of performance, when properly designed, wearables can be effectively used as a part of the detection pipeline. For example, knowing the week of year, combined with naive randomised test set generation, leads to substantial overestimation of COVID-19 classification performance at 0.73 AUROC. However, an average AUROC of only 0.55 +/- 0.02 would be attainable in a simulation of real-world deployment, due to the shifting prevalence of COVID-19 and non-COVID-19 ILI to trigger further testing. In this work we show how to train a machine learning model to differentiate ILI days from healthy days, followed by a survey to differentiate COVID-19 from influenza and unspecified ILI based on symptoms. In a forthcoming week, models can expect a sensitivity of 0.50 (0-0.74, 95% CI), while utilising the wearable device to reduce the burden of surveys by 35%. The corresponding false positive rate is 0.22 (0.02-0.47, 95% CI). In the future, serious consideration must be given to the design, evaluation, and reporting of wearable device interventions if they are to be relied upon as part of frequent COVID-19 or other public health threat testing infrastructures.
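The contrast this abstract draws between naive randomised test sets and deployment-like evaluation can be sketched as two splitting strategies. This is an illustrative sketch, not the authors' pipeline: the record layout (a dict with a `day` field), the function names and the cutoff date are all assumptions.

```python
import random
from datetime import date, timedelta


def random_split(records, test_fraction=0.2, seed=0):
    """Naive split: test days are interleaved with training days."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def temporal_split(records, cutoff):
    """Deployment-like split: train strictly before the cutoff, test after."""
    train = [r for r in records if r["day"] < cutoff]
    test = [r for r in records if r["day"] >= cutoff]
    return train, test


if __name__ == "__main__":
    # Invented daily records; real data would carry wearable features.
    start = date(2020, 3, 1)
    records = [{"day": start + timedelta(days=i), "label": i % 2}
               for i in range(10)]
    train, test = temporal_split(records, cutoff=date(2020, 3, 8))
    print(len(train), "training days;", len(test), "held-out future days")
```

Under the temporal split the model never sees data from the period it is tested on, so calendar-related features such as week of year cannot leak information about the test period in the way a shuffled split allows.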

