458. A Machine Learning Approach Identifies Distinct Early-Symptom Cluster Phenotypes Which Correlate with Severe SARS-CoV-2 Outcomes

2021 ◽  
Vol 8 (Supplement_1) ◽  
pp. S331-S332
Author(s):  
Nusrat J Epsi ◽  
John H Powers ◽  
David A Lindholm ◽  
Alison Helfrich ◽  
...  

Abstract Background: The novel coronavirus disease 2019 (COVID-19) pandemic remains a global challenge, and accurate COVID-19 prognosis remains an important aspect of clinical management. While many prognostic systems have been proposed, most are derived from analyses of individual symptoms or biomarkers. Here, we take a machine learning approach to first identify discrete clusters of early-stage symptoms which may delineate groups with distinct symptom phenotypes. We then sought to identify whether these groups correlate with subsequent disease severity.
Methods: The Epidemiology, Immunology, and Clinical Characteristics of Emerging Infectious Diseases with Pandemic Potential (EPICC) study is a longitudinal cohort study with data and biospecimens collected from nine military treatment facilities over 1 year of follow-up. Demographic and clinical characteristics were measured with interviews and electronic medical record review. Early symptoms by organ domain were measured by FLU-PRO-plus surveys collected for 14 days post-enrollment, with surveys completed a median of 14.5 days (interquartile range [IQR] = 13) after symptom onset. Using these FLU-PRO-plus responses, we applied principal component analysis followed by the unsupervised machine learning algorithm k-means to identify groups with distinct clusters of symptoms. We then fit multivariate logistic regression models to determine how these early-symptom clusters correlated with hospitalization risk after controlling for age, sex, race, and obesity.
Results: Using SARS-CoV-2 positive participants (n = 1137) from the EPICC cohort (Figure 1), we transformed reported symptoms into domains and identified three groups of participants with distinct clusters of symptoms. Logistic regression demonstrated that cluster-2 was associated with approximately three-fold increased odds of hospitalization [3.01 (95% CI: 2-4.52); P < 0.001], which remained significant after controlling for other factors [2.97 (95% CI: 1.88-4.69); P < 0.001].
Figure 1: (A) Baseline characteristics of SARS-CoV-2 positive participants. (B) Heatmap comparing FLU-PRO response in each participant. (C) Principal component analysis followed by k-means clustering identified three groups of participants. (D) Crude and adjusted association of identified cluster with hospitalization.
Conclusion: Our findings have identified three distinct groups with early-symptom phenotypes. With further validation of the clusters’ significance, this tool could be used to improve COVID-19 prognosis in a precision medicine framework and may assist in patient triaging and clinical decision-making.
Disclosures: David A. Lindholm, MD, American Board of Internal Medicine (Individual(s) Involved: Self): Member of Auxiliary R&D Infectious Disease Item-Writer Task Force. No financial support received. No exam questions will be disclosed (Other Financial or Material Support). Ryan C. Maves, MD, EMD Serono (Advisor or Review Panel member); Heron Therapeutics (Advisor or Review Panel member). Simon Pollett, MBBS, AstraZeneca (Other Financial or Material Support, HJF, in support of USU IDCRP, funded under a CRADA to augment the conduct of an unrelated Phase III COVID-19 vaccine trial sponsored by AstraZeneca as part of USG response (unrelated work)).
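
The crude odds ratio and 95% confidence interval reported for cluster-2 are the kind of quantity that can be computed from a 2x2 table of cluster membership against hospitalization. The sketch below shows one common way to do this (the Wald interval on the log-odds scale). The cell counts are hypothetical, chosen only for illustration; the abstract does not report them, and the function name `odds_ratio_wald_ci` is an assumption. The adjusted estimate comes from a regression model with covariates, which this sketch does not attempt to reproduce.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald confidence interval from a 2x2 table.

    a: in cluster, hospitalized        b: in cluster, not hospitalized
    c: not in cluster, hospitalized    d: not in cluster, not hospitalized
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT taken from the EPICC cohort.
or_, lo, hi = odds_ratio_wald_ci(a=40, b=160, c=20, d=240)
print(f"OR = {or_:.2f} (95% CI: {lo:.2f}-{hi:.2f})")
```

An interval that excludes 1, as in the reported results, corresponds to a statistically significant association at the chosen level.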

2019 ◽  
Author(s):  
Oskar Flygare ◽  
Jesper Enander ◽  
Erik Andersson ◽  
Brjánn Ljótsson ◽  
Volen Z Ivanov ◽  
...  

**Background:** Previous attempts to identify predictors of treatment outcomes in body dysmorphic disorder (BDD) have yielded inconsistent findings. One way to increase precision and clinical utility could be to use machine learning methods, which can incorporate multiple non-linear associations in prediction models. **Methods:** This study used a random forests machine learning approach to test whether it is possible to reliably predict remission from BDD in a sample of 88 individuals who had received internet-delivered cognitive behavioral therapy for BDD. The random forest models were compared to traditional logistic regression analyses. **Results:** Random forests correctly identified 78% of participants as remitters or non-remitters at post-treatment. The accuracy of prediction was lower in subsequent follow-ups (68%, 66% and 61% correctly classified at 3-, 12- and 24-month follow-ups, respectively). Depressive symptoms, treatment credibility, working alliance, and initial severity of BDD were among the most important predictors at the beginning of treatment. By contrast, the logistic regression models did not identify consistent and strong predictors of remission from BDD. **Conclusions:** The results provide initial support for the clinical utility of machine learning approaches in the prediction of outcomes of patients with BDD. **Trial registration:** ClinicalTrials.gov ID: NCT02010619.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
J. A. Camilleri ◽  
S. B. Eickhoff ◽  
S. Weis ◽  
J. Chen ◽  
J. Amunts ◽  
...  

Abstract: While a replicability crisis has shaken psychological sciences, the replicability of multivariate approaches for psychometric data factorization has received little attention. In particular, Exploratory Factor Analysis (EFA) is frequently promoted as the gold standard in psychological sciences. However, the application of EFA to executive functioning, a core concept in psychology and cognitive neuroscience, has led to divergent conceptual models. This heterogeneity severely limits the generalizability and replicability of findings. To tackle this issue, in this study, we propose to capitalize on a machine learning approach, OPNMF (Orthonormal Projective Non-Negative Factorization), and leverage internal cross-validation to promote generalizability to an independent dataset. We examined its application to the scores of 334 adults on the Delis–Kaplan Executive Function System (D-KEFS), comparing it to standard EFA and Principal Component Analysis (PCA). We further evaluated the replicability of the derived factorization across specific gender and age subsamples. Overall, OPNMF and PCA both converge towards a two-factor model as the best data-fit model. The derived factorization suggests a division between low-level and high-level executive functioning measures, a model further supported in subsamples. In contrast, EFA highlighted a five-factor model which reflects the segregation of the D-KEFS battery into its main tasks while still clustering higher-level tasks together. However, this model was poorly supported in the subsamples. Thus, the parsimonious two-factor model revealed by OPNMF encompasses the more complex factorization yielded by EFA while enjoying higher generalizability. Hence, OPNMF provides a conceptually meaningful, technically robust, and generalizable factorization for psychometric tools.


AITI ◽  
2020 ◽  
Vol 17 (1) ◽  
pp. 42-55
Author(s):  
Radius Tanone ◽  
Arnold B Emmanuel

Bank XYZ is a bank in Kupang City, East Nusa Tenggara Province, that operates several ATMs placed at merchant locations. These ATMs are used by both customers and non-customers to conduct transactions. Because of where they are placed, some machines are not used optimally, which wastes machine resources and leads to a condition called Not Operational Transaction (NOP). Given data consisting of several numeric independent variables, the task is to classify the dependent variable, NOP. A machine learning approach using the logistic regression method is applied to this classification. The research steps comprise collecting data, analyzing it with machine learning in Python, and writing the report. The approach yields a prediction value of 0.507 for the classification. This means that in the future Bank XYZ can classify NOP conditions based on the behavior of customers and non-customers transacting at its ATMs.
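
To make the classification step concrete, the following is a minimal sketch of binary logistic regression fitted by batch gradient descent, using only the Python standard library. It is not the authors' code: the abstract names neither the Python library used nor the input variables, so the two features, the toy data, and the function names (`fit_logistic`, `predict_proba`) are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log-loss w.r.t. the linear score
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability of the positive class (here: NOP)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical, pre-scaled features per ATM: [transaction volume, idle time].
X = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.1], [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]
y = [0, 0, 0, 1, 1, 1]  # 1 = NOP, 0 = operational (made-up labels)

w, b = fit_logistic(X, y)
preds = [1 if predict_proba(w, b, x) >= 0.5 else 0 for x in X]
print(preds)
```

The 0.5 threshold is a conventional default; in practice it would be tuned to the relative cost of missing an NOP condition versus raising a false alarm.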


Catalysts ◽  
2020 ◽  
Vol 10 (3) ◽  
pp. 291 ◽  
Author(s):  
Anamya Ajjolli Nagaraja ◽  
Philippe Charton ◽  
Xavier F. Cadet ◽  
Nicolas Fontaine ◽  
Mathieu Delsaut ◽  
...  

The metabolic engineering of pathways has been used extensively to produce molecules of interest on an industrial scale. Methods like gene regulation or substrate channeling helped to improve the desired product yield. Cell-free systems are used to overcome the weaknesses of engineered strains. One of the challenges in a cell-free system is selecting the optimized enzyme concentration for optimal yield. Here, a machine learning approach is used to select the enzyme concentration for the upper part of glycolysis. The artificial neural network approach (ANN) is known to be inefficient in extrapolating predictions outside the box: high predicted values will bump into a sort of “glass ceiling”. In order to explore this “glass ceiling” space, we developed a new methodology named glass ceiling ANN (GC-ANN). Principal component analysis (PCA) and data classification methods are used to derive a rule for a high flux, and ANN to predict the flux through the pathway using the input data of 121 balances of four enzymes in the upper part of glycolysis. The outcomes of this study are i. in silico selection of optimum enzyme concentrations for a maximum flux through the pathway and ii. experimental in vitro validation of the “out-of-the-box” fluxes predicted using this new approach. Surprisingly, flux improvements of up to 63% were obtained. Gratifyingly, these improvements are coupled with a cost decrease of up to 25% for the assay.


2020 ◽  
Author(s):  
Jan Wolff ◽  
Alexander Gary ◽  
Daniela Jung ◽  
Claus Normann ◽  
Klaus Kaier ◽  
...  

Abstract Background: A common problem in machine learning applications is availability of data at the point of decision making. The aim of the present study was to use routine data readily available at admission to predict aspects relevant to the organization of psychiatric hospital care. A further aim was to compare the results of a machine learning approach with those obtained through a traditional method and those obtained through a naive baseline classifier. Methods: The study included consecutively discharged patients between 1st of January 2017 and 31st of December 2018 from nine psychiatric hospitals in Hesse, Germany. We compared the predictive performance achieved by stochastic gradient boosting (GBM) with multiple logistic regression and a naive baseline classifier. We tested the performance of our final models on unseen patients from another calendar year and from different hospitals. Results: The study included 45,388 inpatient episodes. The models’ performance, as measured by the area under the Receiver Operating Characteristic curve, varied strongly between the predicted outcomes, with relatively high performance in the prediction of coercive treatment (area under the curve: 0.83) and 1:1 observations (0.80) and relatively poor performance in the prediction of short length of stay (0.69) and non-response to treatment (0.65). The GBM performed slightly better than logistic regression. Both approaches were substantially better than a naive prediction based solely on basic diagnostic grouping. Conclusion: The present study has shown that administrative routine data can be used to predict aspects relevant to the organisation of psychiatric hospital care. Future research should investigate the predictive performance that is necessary to provide effective assistance in clinical practice for the benefit of both staff and patients.
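
The performance metric used above, the area under the ROC curve, can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition. The labels and scores are made up for illustration, and the function name `roc_auc` is an assumption; the pairwise loop is quadratic, so a rank-based implementation would be more appropriate for a dataset of the size described.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 outcomes
    scores: iterable of predicted scores (higher means more likely positive)
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up example: four episodes, two with the outcome.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)

# A classifier that assigns every case the same score cannot rank
# cases at all, which gives the chance-level value.
constant_auc = roc_auc(labels, [0.5, 0.5, 0.5, 0.5])
print(auc, constant_auc)
```

On this reading, an AUC of 0.83 means a correct ranking in roughly 83% of positive-negative pairs, while 0.65 is only modestly better than the chance level of 0.5.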

