A quantum-enhanced precision medicine application to support data-driven clinical decisions for the personalized treatment of advanced knee osteoarthritis: development and preliminary validation of precisionKNEE QNN

Author(s):  
Stefano Olgiati ◽  
Nima Heidari ◽  
Davide Meloni ◽  
Federico Pirovano ◽  
Ali Noorani ◽  
...  

Background: Quantum computing (QC) and quantum machine learning (QML) are promising experimental technologies that can improve precision medicine applications by reducing the computational complexity of algorithms driven by big, unstructured, real-world data. The clinical problem in knee osteoarthritis is that, although some novel therapies are safe and effective, the response is variable, and defining the characteristics of an individual who will respond remains a challenge. In this paper we tested a quantum neural network (QNN) application to support precision, data-driven clinical decisions for selecting personalized treatments for advanced knee osteoarthritis.

Methods: Following patient consent and Research Ethics Committee approval, we collected clinico-demographic data before and after treatment from 170 patients eligible for knee arthroplasty (Kellgren-Lawrence grade ≥ 3, OKS ≤ 27, age ≥ 64, and idiopathic aetiology of arthritis) treated over a 2-year period with a single injection of microfragmented fat. Gender classes were balanced (76 M, 94 F) to mitigate gender bias. A patient with an improvement of ≥ 7 OKS points was considered a Responder. We trained our QNN classifier on a randomly selected training subset of 113 patients (73 R, 40 NR) to classify responders versus non-responders in pain and function at 1 year. Outliers were withheld from the training dataset but not from the validation set.

Results: We tested our QNN classifier on a randomly selected test subset of 57 patients (34 R, 23 NR), including outliers. The No Information Rate was 0.59. Our application correctly classified 28 of 34 Responders and 6 of 23 non-Responders (sensitivity = 0.82, specificity = 0.26, F1 statistic = 0.71). The positive (LR+) and negative (LR-) likelihood ratios were 1.11 and 0.68, respectively. The Diagnostic Odds Ratio (DOR) was equal to 2.

Conclusions: Preliminary results on a small validation dataset show that quantum machine learning applied to data-driven clinical decisions for the personalized treatment of advanced knee osteoarthritis is a promising technology for reducing computational complexity and improving prognostic performance. Our results require further research validation with larger, real-world unstructured datasets, and clinical validation with an AI clinical trial to test model efficacy, safety, clinical significance, and relevance at a public health level.
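For clarity (an editorial sketch, not the authors' code), the reported diagnostic statistics can be recomputed from the test-set confusion-matrix cells implied by the abstract; the cell counts below are inferred from "28 of 34 Responders" and "6 of 23 non-Responders":

```python
# Confusion-matrix cells inferred from the abstract (assumptions, not source data).
tp, fn = 28, 34 - 28   # responders classified correctly / incorrectly
tn, fp = 6, 23 - 6     # non-responders classified correctly / incorrectly

sensitivity = tp / (tp + fn)              # 28/34 ~ 0.82
specificity = tn / (tn + fp)              # 6/23  ~ 0.26
f1 = 2 * tp / (2 * tp + fp + fn)          # ~ 0.71
lr_pos = sensitivity / (1 - specificity)  # ~ 1.11
lr_neg = (1 - sensitivity) / specificity  # ~ 0.68
nir = (tp + fn) / (tp + fn + tn + fp)     # No Information Rate ~ 0.59

print(f"Sens={sensitivity:.2f} Spec={specificity:.2f} F1={f1:.2f} "
      f"LR+={lr_pos:.2f} LR-={lr_neg:.2f} NIR={nir:.2f}")
```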

2021 ◽  
Vol 29 ◽  
pp. S397-S398
Author(s):  
S. Kim ◽  
M.R. Kosorok ◽  
L. Arbeeva ◽  
T. Schwartz ◽  
Y.M. Golightly ◽  
...  

2022 ◽  
Vol 54 (9) ◽  
pp. 1-36
Author(s):  
Dylan Chou ◽  
Meng Jiang

Data-driven network intrusion detection (NID) suffers from class imbalance: attack classes form a small minority compared with normal traffic. In addition, many datasets are collected in simulated environments rather than real-world networks. These challenges undermine the performance of intrusion detection machine learning models because the models are fit to unrepresentative “sandbox” datasets. This survey presents a taxonomy of eight main challenges and explores common datasets from 1999 to 2020. Trends in these challenges over the past decade are analyzed, and future directions are proposed: expanding NID into cloud-based environments, devising scalable models for large network data, and creating labeled datasets collected in real-world networks.
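To illustrate the class-imbalance challenge highlighted above, here is a minimal, hedged scikit-learn sketch; the synthetic features, label names, and class proportions are placeholders, not drawn from any of the surveyed datasets:

```python
# Illustrative only: weighting minority attack classes in a baseline NID model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))                      # synthetic flow features
y = rng.choice(["normal", "dos", "probe"], size=1000,
               p=[0.95, 0.04, 0.01])                 # heavily imbalanced labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" reweights each class inversely to its frequency,
# so rare attack classes contribute as much to the fit as normal traffic.
clf = RandomForestClassifier(class_weight="balanced", random_state=0)
clf.fit(X_tr, y_tr)
print(clf.score(X_te, y_te))
```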


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Hojjat Salehinejad ◽  
Jumpei Kitamura ◽  
Noah Ditkofsky ◽  
Amy Lin ◽  
Aditya Bharatha ◽  
...  

Machine learning (ML) holds great promise in transforming healthcare. While published studies have shown the utility of ML models in interpreting medical imaging examinations, these are often evaluated under laboratory settings. The importance of real-world evaluation is best illustrated by case studies that have documented successes and failures in the translation of these models into clinical environments. A key prerequisite for the clinical adoption of these technologies is demonstrating generalizable ML model performance under real-world circumstances. The purpose of this study was to demonstrate that ML model generalizability is achievable in medical imaging, with the detection of intracranial hemorrhage (ICH) on non-contrast computed tomography (CT) scans serving as the use case. An ML model was trained using 21,784 scans from the RSNA Intracranial Hemorrhage CT dataset, while generalizability was evaluated using an external validation dataset obtained from our busy trauma and neurosurgical center. This real-world external validation dataset consisted of every unenhanced head CT scan (n = 5965) performed in our emergency department in 2019, without exclusion. The model demonstrated an AUC of 98.4%, sensitivity of 98.8%, and specificity of 98.0% on the test dataset. On external validation, the model demonstrated an AUC of 95.4%, sensitivity of 91.3%, and specificity of 94.1%. Evaluating the ML model on a real-world external validation dataset that is temporally and geographically distinct from the training dataset indicates that ML generalizability is achievable in medical imaging applications.
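As an illustration only (not the study's pipeline), an external-validation report of this kind can be computed with scikit-learn; the labels, scores, and threshold below are made-up placeholders:

```python
# Sketch: evaluating a trained classifier on an external validation set
# with AUC, sensitivity, and specificity at a fixed operating threshold.
import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix

def external_validation_report(y_true, y_score, threshold=0.5):
    """y_true: 0/1 ICH labels; y_score: model probabilities on the external set."""
    auc = roc_auc_score(y_true, y_score)
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {"AUC": auc,
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp)}

# Tiny made-up example; in the study these would come from the
# consecutive emergency-department head CTs.
print(external_validation_report([0, 0, 1, 1, 1], [0.1, 0.4, 0.35, 0.8, 0.9]))
```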


Author(s):  
Benedikt Knüsel ◽  
Christoph Baumberger ◽  
Marius Zumwald ◽  
David N. Bresch ◽  
Reto Knutti

Due to ever larger volumes of environmental data, environmental scientists can increasingly use machine learning to construct data-driven models of phenomena. Data-driven environmental models can provide useful information to society, but this requires that their uncertainties be understood. However, new conceptual tools are needed for this, because existing approaches assess the uncertainty of environmental models in terms of specific locations, such as model structure and parameter values. These locations are not informative for an assessment of the predictive uncertainty of data-driven models. Rather than the model structure or model parameters, we argue that it is the behavior of a data-driven model that should be subject to an assessment of uncertainty.

In this paper, we present a novel framework that can be used to assess the uncertainty of data-driven environmental models. The framework uses argument analysis and focuses on epistemic uncertainty, i.e., uncertainty that is related to a lack of knowledge. It proceeds in three steps. The first step consists in reconstructing the justification of the assumption that the model used is fit for the predictive task at hand. Arguments for this justification may, for example, refer to sensitivity analyses and model performance on a validation dataset. In the second step, this justification is evaluated to identify how conclusively the fitness-for-purpose assumption is justified. In the third step, the epistemic uncertainty is assessed based on the evaluation of the arguments. Epistemic uncertainty emerges from insufficient justification of the fitness-for-purpose assumption, i.e., if the model is less than maximally fit for purpose. This lack of justification translates into predictive uncertainty, or first-order uncertainty. Uncertainty also emerges if it is unclear how well the fitness-for-purpose assumption is justified. We refer to this as “second-order uncertainty”: the uncertainty researchers face when assessing first-order uncertainty.

We illustrate how the framework is applied by discussing a case study from environmental science in which data-driven models are used to make long-term projections of soil selenium concentrations. We highlight that in many applications, the lack of system understanding and the lack of transparency of machine learning can introduce a substantial level of second-order uncertainty. We close by sketching how the framework can inform uncertainty quantification.


Author(s):  
Parisa Kordjamshidi ◽  
Dan Roth ◽  
Kristian Kersting

Data-driven approaches are becoming the dominant problem-solving technique in many areas of research and industry. Unfortunately, current technologies do not make such techniques easy to use, either for application experts who are not fluent in machine learning or for machine learning experts who aim to test ideas on real-world data and need to evaluate them as part of an end-to-end system. We review key efforts made by various AI communities to provide languages for high-level abstractions over the learning and reasoning techniques needed to design complex AI systems. We classify the existing frameworks based on the type of techniques as well as the data and knowledge representations they use, provide a comparative study of the way they address the challenges of programming real-world applications, and highlight some shortcomings and future directions.


2022 ◽  
Vol 16 (1) ◽  
pp. e0010056
Author(s):  
Emmanuelle Sylvestre ◽  
Clarisse Joachim ◽  
Elsa Cécilia-Joseph ◽  
Guillaume Bouzillé ◽  
Boris Campillo-Gimenez ◽  
...  

Background: Traditionally, dengue surveillance is based on case reporting to a central health agency. However, the delay between a case and its notification can limit the responsiveness of the system. Machine learning methods have been developed to reduce reporting delays and to predict outbreaks based on non-traditional and non-clinical data sources. The aim of this systematic review was to identify studies that used real-world data, Big Data and/or machine learning methods to monitor and predict dengue-related outcomes.

Methodology/Principal findings: We performed a search in PubMed, Scopus, Web of Science and grey literature between January 1, 2000 and August 31, 2020. The review (ID: CRD42020172472) focused on data-driven studies; reviews, randomized controlled trials and descriptive studies were not included. Among the 119 studies included, 67% were published between 2016 and 2020, and 39% used at least one novel data stream. The aim of the included studies was to predict a dengue-related outcome (55%), assess the validity of data sources for dengue surveillance (23%), or both (22%). Most studies (60%) used a machine learning approach. Studies on dengue prediction compared different prediction models or identified significant predictors among several covariates in a model. The most significant predictors were rainfall (43%), temperature (41%), and humidity (25%). The two models with the highest performance were Neural Networks and Decision Trees (52%), followed by Support Vector Machines (17%). We cannot rule out a selection bias in our study because of two main limitations: we did not include preprints and could not obtain the opinion of other international experts.

Conclusions/Significance: Combining real-world data and Big Data with machine learning methods is a promising approach to improve dengue prediction and monitoring. Future studies should focus on how to better integrate all available data sources and methods to improve the response to dengue and its management by stakeholders.
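Purely as an illustration of the kind of model the review describes (not code from any included study), a decision-tree predictor using the most common predictors, rainfall, temperature, and humidity, might look as follows; the data frame is synthetic:

```python
# Illustrative sketch: a decision-tree regressor for weekly dengue case counts
# from climate covariates. Values are invented for demonstration only.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

df = pd.DataFrame({
    "rainfall_mm":  [120, 80, 200, 30, 150, 90, 60, 180],
    "temp_c":       [28, 27, 30, 25, 29, 26, 24, 31],
    "humidity_pct": [85, 78, 90, 60, 88, 75, 65, 92],
    "cases":        [40, 22, 75, 5, 55, 25, 10, 70],
})

X, y = df[["rainfall_mm", "temp_c", "humidity_pct"]], df["cases"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

model = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_tr, y_tr)
print(model.predict(X_te))
```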


2019 ◽  
Vol 63 (7) ◽  
pp. 1109-1123 ◽  
Author(s):  
Iqbal H Sarker ◽  
Alan Colman ◽  
Jun Han ◽  
A S M Kayes ◽  
Paul Watters

The electronic calendar is nowadays a valuable resource for managing our daily appointments or schedules, also known as events, ranging from the professional to the highly personal. Researchers have studied various types of calendar events to predict smartphone user behavior for incoming mobile communications. However, these studies typically do not take into account behavioral variations between individuals. In the real world, smartphone users can differ widely from each other in how they respond to incoming communications during their scheduled events. Moreover, an individual user may respond to incoming communications differently in different contexts, depending on what type of event is scheduled in her personal calendar. Thus, a static calendar-based behavioral model for individual smartphone users does not necessarily reflect their behavior towards incoming communications. In this paper, we present a machine learning based, context-aware model that is personalized and dynamically identifies an individual's dominant behavior for their scheduled events using logged time-series smartphone data; we name it 'CalBehav' for short. The experimental results, based on real datasets from calendar and phone logs, show that this data-driven personalized model is more effective for intelligently managing incoming mobile communications than existing calendar-based approaches.
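A minimal sketch of the underlying idea, deriving a per-event-type dominant response from logged data; the event types, responses, and tiny log are illustrative placeholders, not the CalBehav implementation:

```python
# Sketch: learn the dominant response per calendar event type from logs,
# then use it to decide how to handle a new incoming call.
from collections import Counter, defaultdict

# (calendar event type, observed response) pairs from phone/calendar logs.
log = [("meeting", "decline"), ("meeting", "decline"), ("meeting", "answer"),
       ("gym", "ignore"), ("gym", "ignore"), ("dinner", "answer")]

by_event = defaultdict(Counter)
for event_type, response in log:
    by_event[event_type][response] += 1

def dominant_behavior(event_type, default="answer"):
    """Most frequent logged response for this event type, else a default."""
    counts = by_event.get(event_type)
    return counts.most_common(1)[0][0] if counts else default

print(dominant_behavior("meeting"))   # 'decline'
print(dominant_behavior("holiday"))   # falls back to 'answer'
```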


2021 ◽  
Author(s):  
Jose Romero Hung

ACE-GCN is a fast, resource-conservative and energy-efficient FPGA accelerator for graph convolutional embedding with data-driven qualities, intended for low-power in-place deployment. Our accelerator exploits the inherent qualities of the power-law distribution exhibited by real-world graphs, such as structural similarity, replication, and feature exchangeability. Contrary to other hardware implementations of GCN, in which dataset sparsity becomes an issue and is bypassed with multiple optimization techniques, our architecture is designed to take advantage of this very same situation. We implement an innovative hardware architecture, supported by our "implicit processing by association" concept. The computational relief and consequent acceleration effect come from the possibility of replacing rather complex convolutional operations with faster LUT-based comparators and automatic convolutional result estimations. We are able to transfer computational complexity into storage capacity, under controllable design parameters. The ACE-GCN accelerator core operation consists of orderly parading a set of vector-based sub-graph structures named "types", linked to pre-calculated embeddings, to incoming "sub-graphs-in-observance" (denominated SIO in our work), for either graph embedding assumption or unavoidable convolutional processing, the decision depending on the level of similarity obtained from a Jaccard feature-based coefficient. Results demonstrate that our accelerator achieves a competitive amount of acceleration: depending on dataset and resource target, between 100× and 1600× over the PyG baseline, coming close to AWB-GCN by 40% to 70% on smaller datasets and even surpassing AWB-GCN for larger ones, with controllable accuracy-loss levels. We further demonstrate the parallelism potential of our approach by analyzing the effect of storage capacity on the gradual relieving
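A conceptual software sketch of the Jaccard-driven reuse-or-convolve decision described above; the feature sets, embeddings, and threshold are assumptions for illustration, and the real design is an FPGA datapath rather than Python:

```python
# Sketch of "implicit processing by association": an incoming sub-graph reuses
# a stored type's pre-computed embedding when a Jaccard feature coefficient
# exceeds a threshold; otherwise the full convolution must be run.
def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if (a | b) else 1.0

# Stored "types": (feature set, pre-computed embedding) -- placeholder values.
types = [({"f1", "f2", "f3"}, [0.1, 0.7]),
         ({"f4", "f5"},       [0.9, 0.2])]

def embed_sio(sio_features: set, threshold: float = 0.8):
    """Return a reused embedding if a similar type exists, else None
    (signalling that the expensive convolution is unavoidable)."""
    best_emb, best_sim = max(((emb, jaccard(sio_features, feats))
                              for feats, emb in types),
                             key=lambda t: t[1])
    return best_emb if best_sim >= threshold else None

print(embed_sio({"f1", "f2", "f3"}))   # reused embedding
print(embed_sio({"f6", "f7"}))         # None -> run convolution
```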

