Developing a machine learning environmental allergy prediction model from real world data through a novel decentralized mobile study platform.

Author(s):  
Chethan Sarabu ◽  
Sandra Steyaert ◽  
Nirav Shah

Environmental allergies cause significant morbidity across a wide range of demographic groups. This morbidity could be mitigated through individualized predictive models capable of guiding personalized preventive measures. We developed a predictive model by integrating smartphone sensor data with symptom diaries maintained by patients. The machine learning model was found to be highly predictive, with an accuracy of 0.801. Such models based on real-world data can guide clinical care for patients and providers, reduce the economic burden of uncontrolled allergies, and set the stage for subsequent research pursuing allergy prediction and prevention. Moreover, this study offers proof of principle regarding the feasibility of building clinically useful predictive models from 'messy,' participant-derived real-world data.

2020 ◽  
Author(s):  
Javier Mar ◽  
Ania Gorostiza ◽  
Oliver Ibarrondo ◽  
Carlos Cernuda ◽  
Arantzazu Arrospide ◽  
...  

Background: Neuropsychiatric symptoms (NPS) are the leading cause of the social burden of dementia, but their role is underestimated. The objective of the study was to validate predictive models to separately identify psychotic and depressive symptoms in patients diagnosed with dementia using clinical databases representing the whole population (real-world data).

Methods: First, we searched the electronic health records of 4,003 patients with dementia to identify NPS. Second, machine learning (random forest) algorithms were applied to build, in the training sample (N=3,003), separate predictive models for psychotic and depressive symptoms. To evaluate the classification ability of the models, the following statistics were calculated for each model: the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy, no-information rate and Kappa index. Third, calibration and discrimination were assessed in the validation sample (N=1,000) to assess the performance of the models. A calibration curve was drawn by plotting the predicted probabilities for groups on the x-axis and the mean observed values on the y-axis.

Results: Neuropsychiatric symptoms were noted in the electronic health record of 58% of patients. The AUC reached 0.80 for the psychotic symptoms model and 0.74 for the depressive symptoms model. The Kappa index and accuracy also showed better discrimination in the psychotic model. Calibration plots indicated that both types of model had less predictive accuracy when the probability of neuropsychiatric symptoms was < 25%. The most important variables in the psychotic symptom model were use of risperidone, level of sedation, use of quetiapine and haloperidol, and the number of antipsychotics prescribed. In the depressive symptom model, the most important variables were the number of antidepressants prescribed, use of escitalopram, level of sedation and age.

Conclusions: More than half of the sample had NPS as identified by the presence of key terms in the electronic health record. Although NPS are not encoded, they are treated with antipsychotics and antidepressants, which allows valid predictive models to be developed by combining machine learning tools and real-world data. Given their good performance, the predictive models can be used to estimate the prevalence of NPS in population databases.
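The evaluation statistics named in the methods of this abstract (sensitivity, specificity, accuracy, no-information rate, Kappa index) and the grouped calibration curve can be illustrated with a short sketch. This is not the authors' code: the function names `classification_stats` and `calibration_points`, the equal-width binning, and the toy labels are all assumptions made for illustration, and the sketch assumes both classes are present in the data.

```python
from typing import Dict, List, Tuple


def classification_stats(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Confusion-matrix statistics for binary labels (1 = symptom present).

    Hypothetical helper; assumes both classes occur in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # No-information rate: accuracy obtained by always predicting the majority class.
    nir = max(tp + fn, tn + fp) / n
    # Cohen's kappa: observed agreement corrected for agreement expected by chance.
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "no_information_rate": nir,
        "kappa": kappa,
    }


def calibration_points(
    y_true: List[int], y_prob: List[float], n_bins: int = 4
) -> List[Tuple[float, float]]:
    """Group predictions into equal-width probability bins.

    Returns one (mean predicted probability, mean observed outcome) pair per
    non-empty bin: the x and y coordinates of a grouped calibration curve.
    """
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for t, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, t))
    points = []
    for b in bins:
        if b:
            mean_pred = sum(p for p, _ in b) / len(b)
            mean_obs = sum(t for _, t in b) / len(b)
            points.append((mean_pred, mean_obs))
    return points


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    preds = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    print(classification_stats(truth, preds))
    print(calibration_points([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9], n_bins=2))
```

A well-calibrated model yields points close to the diagonal, where mean predicted probability equals mean observed frequency. Comparing accuracy with the no-information rate shows whether a model does better than always predicting the majority class.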


2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Tinofirei Museba ◽  
Fulufhelo Nelwamondo ◽  
Khmaies Ouahada

Beyond applying machine learning predictive models to static tasks, a significant corpus of research applies machine learning predictive models to streaming environments that incur concept drift. With the prevalence of streaming real-world applications that are associated with changes in the underlying data distribution, the need for applications capable of adapting to evolving, time-varying dynamic environments can hardly be overstated. Dynamic environments are nonstationary, and the target variables to be predicted by the learning algorithm often evolve with time, a phenomenon known as concept drift. Most work on handling concept drift focuses on updating the prediction model so that it can recover from concept drift, while little effort has been dedicated to formulating a learning system that is capable of learning different types of drifting concepts at any time with minimum overhead. This work proposes a novel and evolving data stream classifier called Adaptive Diversified Ensemble Selection Classifier (ADES) that significantly optimizes adaptation to different types of concept drift at any time and improves convergence to new concepts by exploiting different amounts of ensemble diversity. The ADES algorithm generates diverse base classifiers, thereby optimizing the margin distribution to exploit ensemble diversity and formulate an ensemble classifier that generalizes well to unseen instances and provides fast recovery from different types of concept drift. Empirical experiments conducted on both artificial and real-world data streams demonstrate that ADES can adapt to different types of drift at any given time. The prediction performance of ADES is compared with that of three other ensemble classifiers designed to handle concept drift, using both artificial and real-world data streams. The comparative evaluation demonstrated the ability of ADES to handle different types of concept drift. The experimental results, including statistical test results, indicate that ADES performs comparably to other algorithms designed to handle concept drift and demonstrate its significance and effectiveness.
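To make the notion of concept drift in this abstract concrete, the following minimal sketch compares a learner that stops updating with one that keeps learning from a sliding window of recent examples. It is not the ADES algorithm and uses no ensemble. The synthetic stream, the `WindowedThreshold` learner, the window size, and the drift position are all hypothetical choices made for illustration.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


def make_stream(n: int = 200, drift_at: int = 100) -> List[Tuple[float, int]]:
    """Synthetic stream with one abrupt drift (illustrative assumption).

    The label is 1 when x > 0.3 before `drift_at`, and 1 when x > 0.7 afterwards.
    """
    stream = []
    for i in range(n):
        x = (i % 10) / 10.0  # deterministic feature values 0.0 .. 0.9
        threshold = 0.3 if i < drift_at else 0.7
        stream.append((x, int(x > threshold)))
    return stream


class WindowedThreshold:
    """Toy learner: places a decision threshold using only recent examples."""

    def __init__(self, window: int = 20) -> None:
        self.recent: Deque[Tuple[float, int]] = deque(maxlen=window)
        self.threshold = 0.5

    def predict(self, x: float) -> int:
        return int(x > self.threshold)

    def update(self, x: float, y: int) -> None:
        self.recent.append((x, y))
        negatives = [xi for xi, yi in self.recent if yi == 0]
        positives = [xi for xi, yi in self.recent if yi == 1]
        if negatives and positives:
            # Midpoint between the largest negative and the smallest positive.
            self.threshold = (max(negatives) + min(positives)) / 2


def prequential_run(
    stream: List[Tuple[float, int]],
    model: WindowedThreshold,
    freeze_after: Optional[int] = None,
) -> List[bool]:
    """Test-then-train evaluation: predict each example first, then learn from it.

    If `freeze_after` is given, the model stops learning from that index on.
    """
    correct = []
    for i, (x, y) in enumerate(stream):
        correct.append(model.predict(x) == y)
        if freeze_after is None or i < freeze_after:
            model.update(x, y)
    return correct


def accuracy(flags: List[bool]) -> float:
    return sum(flags) / len(flags)


if __name__ == "__main__":
    stream = make_stream()
    frozen = prequential_run(stream, WindowedThreshold(), freeze_after=100)
    adaptive = prequential_run(stream, WindowedThreshold())
    print("frozen, post-drift accuracy:  ", accuracy(frozen[100:]))
    print("adaptive, post-drift accuracy:", accuracy(adaptive[100:]))
```

The frozen learner keeps misclassifying feature values that lie between the old and new decision boundaries, while the windowed learner recovers once enough post-drift examples have replaced the pre-drift ones in its window. Approaches such as the one described in the abstract aim to shorten that recovery period across different drift types.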


Cancers ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 875
Author(s):  
Kerri Beckmann ◽  
Hans Garmo ◽  
Ingela Franck Lissbrant ◽  
Pär Stattin

Real-world data (RWD), that is, data from sources other than controlled clinical trials, play an increasingly important role in medical research. The development of quality clinical registers, increasing access to administrative data sources, growing computing power and data linkage capacities have contributed to greater availability of RWD. Evidence derived from RWD increases our understanding of prostate cancer (PCa) aetiology, natural history and effective management. While randomised controlled trials offer the best level of evidence for establishing the efficacy of medical interventions and making causal inferences, studies using RWD offer complementary evidence about the effectiveness, long-term outcomes and safety of interventions in real-world settings. RWD provide the only means of addressing questions about risk factors and exposures that cannot be “controlled”, or when assessing rare outcomes. This review provides examples of the value of RWD for generating evidence about PCa, focusing on studies using data from a quality clinical register, namely the National Prostate Cancer Register (NPCR) Sweden, with longitudinal data on advanced PCa in Patient-overview Prostate Cancer (PPC) and data linkages to other sources in Prostate Cancer data Base Sweden (PCBaSe).


2020 ◽  
Vol 13 (11) ◽  
pp. 371
Author(s):  
Maximilian J. Hochmair ◽  
Hannah Fabikan ◽  
Oliver Illini ◽  
Christoph Weinlinger ◽  
Ulrike Setinek ◽  
...  

In clinical practice, patients with anaplastic lymphoma kinase (ALK)-rearrangement–positive non–small-cell lung cancer commonly receive sequential treatment with ALK tyrosine kinase inhibitors. The third-generation agent lorlatinib has been shown to inhibit a wide range of ALK resistance mutations and thus offers potential benefit in later lines, although real-world data are lacking. This multicenter study retrospectively investigated later-line, real-world use of lorlatinib in patients with advanced ALK- or ROS1-positive lung cancer. Fifty-one patients registered in a compassionate use program in Austria, who received second- or later-line lorlatinib between January 2016 and May 2020, were included in this retrospective real-world data analysis. Median follow-up was 25.3 months. Median duration of lorlatinib treatment was 4.4 months for ALK-positive and 12.2 months for ROS1-positive patients. ALK-positive patients showed a response rate of 43.2%, while 85.7% of the ROS1-positive patients were considered responders. Median overall survival from lorlatinib initiation was 10.2 and 20.0 months for the ALK- and ROS1-positive groups, respectively. In the ALK-positive group, lorlatinib showed efficacy after both brigatinib and alectinib. Lorlatinib treatment was well tolerated. Later-line lorlatinib treatment can induce sustained responses in patients with advanced ALK- and ROS1-positive lung cancer.

