Brain tumor diagnostic model and dietary effect based on extracellular vesicle microbiome data in serum

2020 ◽  
Vol 52 (9) ◽  
pp. 1602-1613
Author(s):  
Jinho Yang ◽  
Hyo Eun Moon ◽  
Hyung Woo Park ◽  
Andrea McDowell ◽  
Tae-Seop Shin ◽  
...  

Abstract The human microbiome has been recently associated with human health and disease. Brain tumors (BTs) are a particularly difficult condition to directly link to the microbiome, as microorganisms cannot generally cross the blood–brain barrier (BBB). However, some nanosized extracellular vesicles (EVs) released from microorganisms can cross the BBB and enter the brain. Therefore, we conducted metagenomic analysis of microbial EVs in both serum (152 BT patients and 198 healthy controls (HC)) and brain tissue (5 BT patients and 5 HC) samples based on the V3–V4 regions of 16S rDNA. We then developed diagnostic models through logistic regression and machine learning algorithms using serum EV metagenomic data to assess the ability of various dietary supplements to reduce BT risk in vivo. Models incorporating the stepwise method and the linear discriminant analysis effect size (LEfSe) method yielded 12 and 29 significant genera as potential biomarkers, respectively. Models using the selected biomarkers yielded areas under the curves (AUCs) >0.93, and the model using machine learning resulted in an AUC of 0.99. In addition, Dialister and [Eubacterium] rectale were significantly lower in both blood and tissue samples of BT patients than in those of HCs. In vivo tests showed that BT risk was decreased through the addition of sorghum, brown rice oil, and garlic but conversely increased by the addition of bellflower and pear. In conclusion, serum EV metagenomics shows promise as a rich data source for highly accurate detection of BT risk, and several foods have potential for mitigating BT risk.
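
The AUC values quoted above summarize how well a model's scores rank patients above controls. As a minimal sketch of how such a value can be computed, the function below uses the rank-sum identity: AUC equals the fraction of (positive, negative) pairs in which the positive sample gets the higher score, with ties counted as half. The function name and the toy labels and scores are hypothetical illustrations, not the authors' pipeline or data.

```python
def auc_from_scores(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not taken from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = auc_from_scores(toy_labels, toy_scores)
```

On the toy data, eight of the nine positive/negative pairs are ordered correctly, so the sketch returns 8/9. The pairwise loop is quadratic in sample count, which is fine for illustration but not for large cohorts.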

2020 ◽  
Vol 8 (Suppl 3) ◽  
pp. A62-A62
Author(s):  
Dattatreya Mellacheruvu ◽  
Rachel Pyke ◽  
Charles Abbott ◽  
Nick Phillips ◽  
Sejal Desai ◽  
...  

Background: Accurately identified neoantigens can be effective therapeutic agents in both adjuvant and neoadjuvant settings. A key challenge for neoantigen discovery has been the availability of accurate prediction models for MHC peptide presentation. We have shown previously that our proprietary model based on (i) large-scale, in-house mono-allelic data, (ii) custom features that model antigen processing, and (iii) advanced machine learning algorithms has strong performance. We have extended our work by systematically integrating large quantities of high-quality, publicly available data, implementing new modelling algorithms, and rigorously testing our models. These extensions lead to substantial improvements in performance and generalizability. Our algorithm, named Systematic HLA Epitope Ranking Pan Algorithm (SHERPA™), is integrated into the ImmunoID NeXT Platform®, our immuno-genomics and transcriptomics platform specifically designed to enable the development of immunotherapies. Methods: In-house immunopeptidomic data was generated using stably transfected HLA-null K562 cell lines that express a single HLA allele of interest, followed by immunoprecipitation using W6/32 antibody and LC-MS/MS. Public immunopeptidomics data was downloaded from repositories such as MassIVE and processed uniformly using in-house pipelines to generate peptide lists filtered at 1% false discovery rate. Other metrics (features) were either extracted from source data or generated internally by re-processing samples utilizing the ImmunoID NeXT Platform. Results: We have generated large-scale and high-quality immunopeptidomics data by using approximately 60 mono-allelic cell lines that unambiguously assign peptides to their presenting alleles to create our primary models. Briefly, our primary ‘binding’ algorithm models MHC-peptide binding using peptide and binding pockets while our primary ‘presentation’ model uses additional features to model antigen processing and presentation. Both primary models have significantly higher precision across all recall values in multiple test data sets, including mono-allelic cell lines and multi-allelic tissue samples. To further improve the performance of our model, we expanded the diversity of our training set using high-quality, publicly available mono-allelic immunopeptidomics data. Furthermore, multi-allelic data was integrated by resolving peptide-to-allele mappings using our primary models. We then trained a new model using the expanded training data and a new composite machine learning architecture. The resulting secondary model further improves performance and generalizability across several tissue samples. Conclusions: Improving technologies for neoantigen discovery is critical for many therapeutic applications, including personalized neoantigen vaccines and neoantigen-based biomarkers for immunotherapies. Our new and improved algorithm (SHERPA) has significantly higher performance compared to a state-of-the-art public algorithm and furthers this objective.


2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 264-265
Author(s):  
Duy Ngoc Do ◽  
Guoyu Hu ◽  
Younes Miar

Abstract American mink (Neovison vison) is the major source of fur for the fur industry worldwide, and Aleutian disease (AD) is causing severe financial losses to the mink industry. Different methods have been used to diagnose AD in mink, but a combination of several methods may be the most appropriate approach for the selection of AD-resilient mink. The iodine agglutination test (IAT) and counterimmunoelectrophoresis (CIEP) are commonly employed in the test-and-remove strategy, while enzyme-linked immunosorbent assay (ELISA) and packed-cell volume (PCV) methods are complementary. However, using multiple methods is expensive, which hinders the correct use of AD tests in selection. This research assessed AD classification based on machine learning algorithms. A total of 1,830 individuals were tested for AD using these tests in an AD-positive mink farm (Canadian Centre for Fur Animal Research, NS, Canada). The accuracy of classification for CIEP was evaluated based on sex information and IAT, ELISA, and PCV test results, using seven machine learning classification algorithms (Random Forest, Artificial Neural Networks, C50Tree, Naive Bayes, Generalized Linear Models, Boost, and Linear Discriminant Analysis) implemented with the Caret package in R. The accuracy of prediction varied among the methods. Overall, Random Forest was the best-performing algorithm for the current dataset, with an accuracy of 0.89 in the training data and 0.94 in the testing data. Our work demonstrated the utility and relative ease of using machine learning algorithms to predict CIEP results, thereby reducing the cost of AD tests. However, further work is required to include production and reproduction information in the models and to extend phenotypic collection in order to increase the accuracy of current methods.


Author(s):  
S. R. Mani Sekhar ◽  
G. M. Siddesh

Machine learning is an important area of computer science. It helps to provide optimized solutions to real-world problems by using past knowledge or data from previous experience. Many types of machine learning algorithms exist. This chapter provides an overview of selected machine learning algorithms: linear regression, linear discriminant analysis, support vector machine, naive Bayes classifier, neural networks, and decision trees. Each method is illustrated in detail with an example and R code, which helps readers develop their own solutions to the given problems.


2021 ◽  
Vol 11 ◽  
Author(s):  
Qi Wan ◽  
Jiaxuan Zhou ◽  
Xiaoying Xia ◽  
Jianfeng Hu ◽  
Peng Wang ◽  
...  

Objective: To evaluate the performance of 2D and 3D radiomics features with different machine learning approaches to classify SPLs based on magnetic resonance (MR) T2-weighted imaging (T2WI). Material and Methods: A total of 132 patients with pathologically confirmed SPLs were examined and randomly divided into training (n = 92) and test datasets (n = 40). A total of 1692 3D and 1231 2D radiomics features per patient were extracted. Both radiomics features and clinical data were evaluated. A total of 1260 classification models, comprising 3 normalization methods, 2 dimension reduction algorithms, 3 feature selection methods, and 10 classifiers with 7 different feature numbers (confined to 3–9), were compared. Ten-fold cross-validation on the training dataset was applied to choose the candidate final model. The area under the receiver operating characteristic curve (AUC), precision-recall plot, and Matthews Correlation Coefficient (MCC) were used to evaluate the performance of the machine learning approaches. Results: The 3D features were significantly superior to 2D features, yielding many more machine learning combinations with AUC greater than 0.7 in both validation and test groups (129 vs. 11). The feature selection methods Analysis of Variance (ANOVA) and Recursive Feature Elimination (RFE) and the classifiers Logistic Regression (LR), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), and Gaussian Process (GP) had relatively better performance. The best performance of 3D radiomics features in the test dataset (AUC = 0.824, AUC-PR = 0.927, MCC = 0.514) was higher than that of 2D features (AUC = 0.740, AUC-PR = 0.846, MCC = 0.404). The joint 3D and 2D features (AUC = 0.813, AUC-PR = 0.926, MCC = 0.563) showed results similar to those of 3D features. Incorporating clinical features with 3D and 2D radiomics features slightly improved the AUC to 0.836 (AUC-PR = 0.918, MCC = 0.620) and 0.780 (AUC-PR = 0.900, MCC = 0.574), respectively. Conclusions: After algorithm optimization, 2D feature-based radiomics models yield favorable results in differentiating malignant and benign SPLs, but 3D features are still preferred because of the availability of more machine learning algorithmic combinations with better performance. The feature selection methods ANOVA and RFE and the classifiers LR, LDA, SVM, and GP are more likely to demonstrate better diagnostic performance for 3D features in the current study.
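
The figure of 1260 classification models follows from multiplying the options at each pipeline stage: 3 × 2 × 3 × 10 × 7 = 1260. The sketch below enumerates such a grid to make the arithmetic concrete. The stage labels (`norm_1`, `clf_1`, and so on) are hypothetical placeholders, since the abstract names only some of the methods.

```python
from itertools import product

# Placeholder labels for each pipeline stage; only the counts come from the abstract.
normalizers = [f"norm_{i}" for i in range(1, 4)]      # 3 normalization methods
reducers = [f"reduce_{i}" for i in range(1, 3)]       # 2 dimension reduction algorithms
selectors = [f"select_{i}" for i in range(1, 4)]      # 3 feature selection methods
classifiers = [f"clf_{i}" for i in range(1, 11)]      # 10 classifiers
feature_counts = list(range(3, 10))                   # 3..9 inclusive, i.e. 7 values

# Every combination of one choice per stage is one candidate model.
pipelines = list(product(normalizers, reducers, selectors, classifiers, feature_counts))
n_pipelines = len(pipelines)
```

Each tuple in `pipelines` is one candidate configuration that would then be scored by cross-validation; here `n_pipelines` is 1260, matching the count reported in the abstract.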


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e17544-e17544
Author(s):  
Wanja Nikolai Kassuhn ◽  
Oliver Klein ◽  
Silvia Darb-Esfahani ◽  
Hedwig Lammert ◽  
Sylwia Handzik ◽  
...  

e17544 Background: High-grade serous ovarian cancer (HGSOC) can be separated by gene expression profiling into four molecular subtypes that correlate clearly with clinical outcome. However, these gene signatures have not been implemented in clinical practice to stratify patients for targeted therapy. This is mainly due to a lack of easy, cost-effective and reproducible methods, as well as the high heterogeneity of HGSOC. Hence, we aimed to examine the potential of unsupervised matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI-IMS) to stratify patients who might benefit from targeted therapeutic strategies. Methods: Molecular subtyping of paraffin-embedded tissue samples from 279 HGSOC patients was performed by NanoString analysis (ground truth labeling). Next, we applied MALDI-IMS, a novel technology for identifying distinct mass profiles, to the same paraffin-embedded tissue sections and paired it with machine learning algorithms to identify HGSOC subtypes by proteomic signature. Finally, we devised a novel strategy to annotate spectra of stromal origin. Results: We elucidated a MALDI-derived proteomic signature (135 peptides) able to classify HGSOC subtypes. Random forest classifiers achieved an area under the curve (AUC) of 0.983. Furthermore, we demonstrated that the exclusion of stroma-associated spectra provides tangible improvements to classification quality (AUC = 0.988). False discovery rates (FDR) were reduced from 10.2% to 8.0%. Finally, novel MALDI-based stroma annotation achieved near-perfect classifications (AUC = 0.999, FDR < 1.0%). Conclusions: Here, we present a concept integrating MALDI-IMS with machine learning algorithms to classify patients according to distinct molecular subtypes of HGSOC. This has great potential for assigning patients to targeted therapies.


2021 ◽  
Author(s):  
Dian Kesumapramudya Nurputra ◽  
Ahmad Kusumaatmadja ◽  
Mohamad Saifudin Hakim ◽  
Shidiq Nur Hidayat ◽  
Trisna Julian ◽  
...  

Abstract Despite its high accuracy in detecting the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the reverse transcription-quantitative polymerase chain reaction (RT-qPCR) approach has several limitations (e.g., the lengthy invasive procedure, reagent availability, and the requirement for specialized laboratories, equipment, and trained staff). We developed and employed a low-cost, noninvasive method to rapidly sniff out coronavirus disease 2019 (COVID-19) based on a portable electronic nose (GeNose C19) integrating a metal oxide semiconductor gas sensor array, optimized feature extraction, and machine learning models. This approach was evaluated in profiling tests involving a total of 615 breath samples (i.e., 333 positive and 282 negative for COVID-19, as confirmed by RT-qPCR) obtained from 83 patients in two hospitals located in the Special Region of Yogyakarta, Indonesia. Four different machine learning algorithms (i.e., linear discriminant analysis (LDA), support vector machine (SVM), stacked multilayer perceptron (MLP), and deep neural network (DNN)) were used to identify the top-performing pattern recognition methods and achieved high detection accuracy (88–95%), sensitivity (86–94%), and specificity (88–95%) on the testing datasets. Our results suggest that GeNose C19 can be considered a highly promising breathalyzer for fast COVID-19 screening.
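
Accuracy, sensitivity, and specificity, as reported above, are all derived from the four cells of a confusion matrix. The sketch below shows the standard definitions. The function name and the example counts are hypothetical and are not the study's actual confusion matrix.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: reference-positive samples classified positive/negative.
    tn/fp: reference-negative samples classified negative/positive.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return {
        "sensitivity": tp / (tp + fn),              # true positive rate
        "specificity": tn / (tn + fp),              # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts for illustration only.
example = screening_metrics(tp=90, fn=10, tn=85, fp=15)
```

With these made-up counts the sketch gives a sensitivity of 0.90, a specificity of 0.85, and an accuracy of 0.875. Accuracy depends on the ratio of positive to negative samples in the test set, which is one reason sensitivity and specificity are usually reported alongside it.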


Author(s):  
Mustafa N Shakir ◽  
Brittany N Dugger

Abstract Alzheimer disease (AD) is a neurodegenerative disorder characterized pathologically by the presence of neurofibrillary tangles and amyloid beta (Aβ) plaques in the brain. The disease was first described in 1906 by Alois Alzheimer, and since then, many technological advancements have aided in unlocking the secrets of this devastating disease. Such advancements include improved microscopy and staining techniques, refined diagnostic criteria for the disease, and increased appreciation of disease heterogeneity, both in the neuroanatomic location of abnormalities and in the overlap with other brain diseases such as Lewy body disease and vascular dementia. Despite numerous advancements, there is still much to achieve, as there is no cure for AD and postmortem histological analysis is still the gold standard for appreciating AD neuropathologic changes. Recent technological advances such as in vivo biomarkers and machine learning algorithms permit great strides in disease understanding and pave the way for potential new therapies and precision medicine approaches. Here, we review the history of human AD neuropathology research, including notable advancements in understanding common co-pathologies in the setting of AD and in microscopy and staining methods. We also discuss future approaches, with a specific focus on deep phenotyping using machine learning.


Author(s):  
Jose Liñares-Blanco ◽  
Carlos Fernandez-Lozano ◽  
Jose A. Seoane ◽  
Guillermo Lopez-Campos

In recent years, the microbiota has become an increasingly relevant factor in the understanding and potential treatment of diseases. In this work, based on the data reported by the largest microbiome study in the world, a Machine Learning (ML) classification model has been developed that predicts the country of origin (United Kingdom vs United States) from metagenomic data. The data were used to train a glmnet algorithm and a Random Forest algorithm. Both algorithms obtained similar results (0.698 and 0.672 in AUC, respectively). Furthermore, by applying a multivariate feature selection algorithm, eleven metagenomic genera highly correlated with the country of origin were obtained. An in-depth study of the variables used in each model is presented in this work.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Arvin Hansrajh ◽  
Timothy T. Adeliyi ◽  
Jeanette Wing

The exponential growth in fake news and its inherent threat to democracy, public trust, and justice have escalated the necessity for fake news detection and mitigation. Detecting fake news is a complex challenge, as it is intentionally written to mislead and hoodwink. Humans are not good at identifying fake news. The detection of fake news by humans is reported to be at a rate of 54%, and an additional 4% is reported in the literature as being speculative. The significance of fighting fake news is exemplified during the present pandemic. Consequently, social networks are ramping up the usage of detection tools and educating the public in recognising fake news. In the literature, it was observed that several machine learning algorithms have been applied to the detection of fake news with limited and mixed success. However, several advanced machine learning models are not being applied, although recent studies are demonstrating the efficacy of the ensemble machine learning approach; hence, the purpose of this study is to assist in the automated detection of fake news. An ensemble approach is adopted to help resolve the identified gap. This study proposes a blended machine learning ensemble model developed from logistic regression, support vector machine, linear discriminant analysis, stochastic gradient descent, and ridge regression, which is then used on a publicly available dataset to predict whether a news report is true or not. The proposed model is compared with popular classical machine learning models, and performance metrics such as AUC, ROC, recall, accuracy, precision, and f1-score are used to measure its performance. The results show that the proposed model outperformed the other popular classical machine learning models.
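
The abstract does not describe how the base models are blended, so the following is only a simplified sketch of the general idea: combine the probability estimates of several base classifiers into one score and threshold it. It uses a weighted average, which is a stand-in for whatever meta-learner the authors actually trained. The function names, the weights, and the example probabilities are all hypothetical.

```python
def blend_probabilities(per_model_probs, weights=None):
    """Weighted average of per-model probabilities for each item.

    per_model_probs: list of lists, one inner list of probabilities per base model.
    weights: optional list of non-negative weights, one per model (default: equal).
    """
    n_models = len(per_model_probs)
    if n_models == 0:
        raise ValueError("need at least one base model")
    if weights is None:
        weights = [1.0] * n_models
    if len(weights) != n_models:
        raise ValueError("one weight per base model is required")
    n_items = len(per_model_probs[0])
    if any(len(p) != n_items for p in per_model_probs):
        raise ValueError("all models must score the same items")

    total = sum(weights)
    return [
        sum(w * probs[i] for w, probs in zip(weights, per_model_probs)) / total
        for i in range(n_items)
    ]


def classify(blended, threshold=0.5):
    """Label an item 1 (e.g. 'fake') when the blended probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in blended]


# Hypothetical outputs of three base models for three news items.
model_a = [0.9, 0.2, 0.6]
model_b = [0.8, 0.4, 0.4]
model_c = [0.7, 0.3, 0.2]
blended = blend_probabilities([model_a, model_b, model_c])
labels = classify(blended)
```

In a fuller blending setup, the combining step is itself a model fitted on held-out predictions of the base learners rather than a fixed average; the averaging here just keeps the example self-contained.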


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1622 ◽  
Author(s):  
Jung-Yeon Kim ◽  
Geunsu Park ◽  
Seong-A Lee ◽  
Yunyoung Nam

Spasticity is a frequently observed symptom in patients with neurological impairments. Spastic movements of their upper and lower limbs are periodically measured to evaluate functional outcomes of physical rehabilitation, and they are quantified by clinical outcome measures such as the modified Ashworth scale (MAS). This study proposes a method to determine the severity of elbow spasticity by analyzing the acceleration and rotation attributes collected from the elbow on the patient's affected side and applying machine-learning algorithms to classify the degree of spastic movement; this approach is comparable to assigning an MAS score. We collected inertial data from participants using a wearable device incorporating inertial measurement units during a passive stretch test. Machine-learning algorithms, including decision tree, random forests (RFs), support vector machine, linear discriminant analysis, and multilayer perceptrons, were evaluated in combination with two segmentation techniques and feature sets. An RF performed well, achieving up to 95.4% accuracy. This work not only demonstrates how wearable technology and machine learning can be used to generate a clinically meaningful index but also offers rehabilitation patients an opportunity to monitor the degree of spasticity, even in nonhealthcare institutions where the help of clinical professionals is unavailable.

