Machine Learning in Clinical Trials

Author(s):  
Shazia Rashid ◽  
Neha Kathuria
2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e13588-e13588
Author(s):  
Laura Sachse ◽  
Smriti Dasari ◽  
Marc Ackermann ◽  
Emily Patnaude ◽  
Stephanie OLeary ◽  
...  

e13588 Background: Pre-screening for clinical trials is becoming more challenging as inclusion/exclusion criteria become increasingly complex. Oncology precision medicine provides an exciting opportunity to simplify this process and quickly match patients with trials by leveraging machine learning technology. The Tempus TIME Trial site network matches patients to relevant, open, and recruiting clinical trials, personalized to each patient’s clinical and molecular biology. Methods: Tempus screens patients at sites within the TIME Trial Network to find high-fidelity matches to clinical trials. The patient records include documentation submitted alongside next-generation sequencing (NGS) orders as well as electronic medical records (EMR) ingested through EMR Integrations. While Tempus-sequenced patients were automatically matched to trials using a Tempus-built matching application, EMR records were run through a natural language processing (NLP) data abstraction model to identify patients with an actionable gene of interest. Structured data were filtered to patients who have no recorded deceased date and have an encounter date within a predefined time period. Tempus abstractors manually validated the resulting unstructured records to ensure each patient was matched to a TIME Trial at a site capable of running the trial. For all high-level patient matches, a Tempus Clinical Navigator manually evaluated other clinical criteria to confirm trial matches and communicated with the site about trial options. Results: Patient matching was accelerated by implementing NLP gene and report detection (which isolated 17% of records) and manual screening. As a result, Tempus facilitated efficient screening of over 190,000 patients using proprietary NLP technology, matching 332 patients to 21 unique interventional clinical trials since program launch. Tempus continues to optimize its NLP models to increase high-fidelity trial matching at scale.
Conclusions: The TIME Trial Network is an evolving, dynamic program that efficiently matches patients with clinical trial sites using both EMR and Tempus sequencing data. Here, we show how machine learning technology can be utilized to efficiently identify and recruit patients to clinical trials, thereby personalizing trial enrollment for each patient. [Table: see text]
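
The structured-data filter described in the Methods above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Tempus's implementation: the `PatientRecord` fields, the `prescreen` function, and the sample records are all hypothetical, and the boolean `has_actionable_gene` merely stands in for the output of the NLP abstraction model.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PatientRecord:
    # All field names are hypothetical; the abstract does not describe a schema.
    patient_id: str
    deceased_date: Optional[date]   # None means no deceased date is recorded
    last_encounter: date
    has_actionable_gene: bool       # stand-in for the NLP model's output


def prescreen(records: List[PatientRecord],
              window_start: date,
              window_end: date) -> List[PatientRecord]:
    """Keep patients with no deceased date, an encounter inside the
    window, and an NLP-flagged actionable gene."""
    return [
        r for r in records
        if r.deceased_date is None
        and window_start <= r.last_encounter <= window_end
        and r.has_actionable_gene
    ]


# Invented sample data, for illustration only.
records = [
    PatientRecord("p1", None, date(2020, 11, 2), True),
    PatientRecord("p2", date(2020, 6, 1), date(2020, 5, 20), True),   # deceased
    PatientRecord("p3", None, date(2018, 1, 15), True),               # stale encounter
    PatientRecord("p4", None, date(2020, 9, 9), False),               # no gene flag
]
candidates = prescreen(records, date(2020, 1, 1), date(2020, 12, 31))
print([r.patient_id for r in candidates])  # ['p1']
```

In the workflow the abstract describes, records passing such a filter would still go to manual validation by abstractors and a Clinical Navigator; the sketch covers only the automated narrowing step.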


Author(s):  
Diego Alejandro Dri ◽  
Maurizio Massella ◽  
Donatella Gramaglia ◽  
Carlotta Marianecci ◽  
Sandra Petraglia

Machine Learning, a fast-growing technology, is an application of Artificial Intelligence that has significantly contributed to drug discovery and clinical development. In the last few years, the number of clinical applications based on Machine Learning has grown steadily. Moreover, it is now also impacting National Competent Authorities during the assessment of recently submitted Clinical Trials that are designed or managed with, or that generate data from, Machine Learning or Artificial Intelligence technologies. We review current information available on the regulatory approach to Clinical Trials and Machine Learning. We also provide inputs for further reasoning and potential indications, including six actionable proposals for regulators to proactively drive the upcoming evolution of Clinical Trials within a strong regulatory framework, focusing on patient safety, health protection, and fostering immediate access to effective treatments.


2019 ◽  
Author(s):  
William A Figgett ◽  
Katherine Monaghan ◽  
Milica Ng ◽  
Monther Alhamdoosh ◽  
Eugene Maraskovsky ◽  
...  

Objective: Systemic lupus erythematosus (SLE) is a heterogeneous autoimmune disease that is difficult to treat. There is currently no optimal stratification of patients with SLE, and thus responses to available treatments are unpredictable. Here, we developed a new stratification scheme for patients with SLE, based on their whole-blood transcriptomes. Methods: We applied machine learning approaches to RNA-sequencing (RNA-seq) datasets to stratify patients with SLE into four distinct clusters based on their gene expression profiles. A meta-analysis of two recently published whole-blood RNA-seq datasets was carried out, and an additional similar dataset of 30 patients with SLE and 29 healthy donors was contributed by this study; 141 patients with SLE and 51 healthy donors were analysed in total. Results: Examination of SLE clusters, as opposed to unstratified SLE patients, revealed underappreciated differences in the pattern of expression of disease-related genes relative to clinical presentation. Moreover, gene signatures correlated with flare activity were successfully identified. Conclusion: Given that disease heterogeneity has confounded research studies and clinical trials, our approach addresses current unmet medical needs and provides a greater understanding of SLE heterogeneity in humans. Stratification of patients based on gene expression signatures may be a valuable strategy to harness disease heterogeneity and identify patient populations that may be at an increased risk of disease symptoms. Further, this approach can be used to understand the variability in responsiveness to therapeutics, thereby improving the design of clinical trials and advancing personalised therapy.


Author(s):  
Charles M. Pérez-Espinoza ◽  
Nuvia Beltran-Robayo ◽  
Teresa Samaniego-Cobos ◽  
Abel Alarcón-Salvatierra ◽  
Ana Rodriguez-Mendez ◽  
...  

2019 ◽  
pp. 1-11 ◽  
Author(s):  
Kien Wei Siah ◽  
Sean Khozin ◽  
Chi Heem Wong ◽  
Andrew W. Lo

PURPOSE The prediction of clinical outcomes for patients with cancer is central to precision medicine and the design of clinical trials. We developed and validated machine-learning models for three important clinical end points in patients with advanced non–small-cell lung cancer (NSCLC)—objective response (OR), progression-free survival (PFS), and overall survival (OS)—using routinely collected patient and disease variables. METHODS We aggregated patient-level data from 17 randomized clinical trials recently submitted to the US Food and Drug Administration evaluating molecularly targeted therapy and immunotherapy in patients with advanced NSCLC. To our knowledge, this is one of the largest studies of NSCLC to consider biomarker and inhibitor therapy as candidate predictive variables. We developed a stochastic tumor growth model to predict tumor response and explored the performance of a range of machine-learning algorithms and survival models. Models were evaluated on out-of-sample data using the standard area under the receiver operating characteristic curve and concordance index (C-index) performance metrics. RESULTS Our models achieved promising out-of-sample predictive performances of 0.79 area under the receiver operating characteristic curve (95% CI, 0.77 to 0.81), 0.67 C-index (95% CI, 0.66 to 0.69), and 0.73 C-index (95% CI, 0.72 to 0.74) for OR, PFS, and OS, respectively. The calibration plots for PFS and OS suggested good agreement between actual and predicted survival probabilities. In addition, the Kaplan-Meier survival curves showed that the difference in survival between the low- and high-risk groups was significant (log-rank test P < .001) for both PFS and OS. CONCLUSION Biomarker status was the strongest predictor of OR, PFS, and OS in patients with advanced NSCLC treated with immune checkpoint inhibitors and targeted therapies. However, single biomarkers have limited predictive value, especially for programmed death-ligand 1 immunotherapy. 
To advance beyond the results achieved in this study, more comprehensive data on composite multiomic signatures are required.
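
For readers unfamiliar with the concordance index reported above for PFS and OS, the following is a minimal sketch of Harrell's C-index for right-censored data. It is an illustration only: the function name and the toy data are assumptions, and the abstract does not state which C-index estimator the authors used.

```python
from typing import Sequence


def concordance_index(times: Sequence[float],
                      events: Sequence[int],
                      risk_scores: Sequence[float]) -> float:
    """Harrell's C: among comparable pairs, the fraction in which the
    patient with the shorter observed survival has the higher risk score.
    Tied risk scores count as half-concordant."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair (i, j) is comparable only if patient i had an observed
        # event (not censored) strictly before patient j's recorded time.
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented): months to event or censoring, event flag, model risk.
times = [5, 10, 12, 20]
events = [1, 0, 1, 1]            # the second patient is censored
risk = [0.9, 0.4, 0.1, 0.6]
print(concordance_index(times, events, risk))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported 0.67 (PFS) and 0.73 (OS).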


Author(s):  
O. Uspenskaya-Cadoz ◽  
C. Alamuri ◽  
L. Wang ◽  
M. Yang ◽  
S. Khinda ◽  
...  

Background: Recruiting patients for clinical trials of potential therapies for Alzheimer’s disease (AD) remains a major challenge, with demand for trial participants at an all-time high. The AD treatment R&D pipeline includes around 112 agents. In the United States alone, 150 clinical trials are seeking 70,000 participants. Most people with early cognitive impairment consult primary care providers, who may lack time, diagnostic skills and awareness of local clinical trials. Machine learning and predictive analytics offer promise to boost enrollment by predicting which patients have prodromal AD, and which will go on to develop AD. Objectives: The authors set out to develop a machine learning predictive model that identifies prodromal AD patients in the general population, to aid early AD detection by primary care physicians and timely referral to expert sites for biomarker confirmation of diagnosis and clinical trial enrollment. Design: The authors used a classification machine learning algorithm to extract patterns within healthcare claims and prescription data three years prior to AD diagnosis/AD drug initiation. Setting: The study focused on subjects included within proprietary IQVIA US data assets (claims and prescription databases). Patient information was extracted from January 2010 to July 2018, for cohorts aged between 50 and 85 years. Participants: A total of 88,298,289 subjects aged between 50 and 85 years were identified. For the positive cohort, 667,288 subjects were identified who had 24 months of medical history and at least one record with AD or AD treatment. For the negative cohort, 3,670,254 patients were selected who had a similar length of medical history and who were matched to positive cohort subjects based on the prevalence rate. The scoring cohort was selected based on availability of 2-5 years of recent medical data and included 72,670,283 subjects between the ages of 50 and 85 years. Intervention (if any): None.
Measurements: A list of clinically relevant and interpretable predictors was generated and extracted from the data sets for each subject, including pharmacological treatments (NDC/product), office/specialist visits (specialty), tests and procedures (HCPCS and CPT), and diagnosis (ICD). The positive cohort was defined as patients who have AD diagnosis/AD treatment with a 3-year offset as an estimate for prodromal AD diagnosis. Supervised ML techniques were used to develop algorithms to predict the occurrence of prodromal AD cases. The sample dataset was divided randomly into a training dataset and a test dataset. The classification models were trained and executed in the PySpark framework. Training and evaluation of LogisticRegression, DecisionTreeClassifier, RandomForestClassifier, and GBTClassifier were executed using PySpark’s mllib module. The area under the precision-recall curve (AUCPR) was used to compare the results of the various models. Results: The AUCPRs were 0.426, 0.157, 0.436, and 0.440 for LogisticRegression, DecisionTreeClassifier, RandomForestClassifier, and GBTClassifier, respectively, meaning that GBTClassifier (Gradient Boosted Tree) outperformed the other three classifiers. The GBT model identified 222,721 subjects in the prodromal AD stage with 80% precision. Some 76% of identified prodromal AD patients were in the primary care setting. Conclusions: Applying the developed predictive model to 72,670,283 U.S. residents, 222,721 prodromal AD patients were identified, the majority of whom were in the primary care setting. This could drive major advances in AD research by enabling more accurate and earlier prodromal AD diagnosis at the primary care physician level, which would facilitate timely referral to expert sites for in-depth assessment and potential enrolment in clinical trials.
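
The model comparison by AUCPR described above can be illustrated with a small, dependency-free sketch. It uses average precision, a step-wise estimate of the area under the precision-recall curve, as a stand-in; the study itself ran in PySpark, whose exact AUCPR computation may differ. The function name, model names, labels, and scores below are all invented for illustration.

```python
from typing import Dict, Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.

    Ranks examples by descending score and averages the precision
    observed at each true positive. Tied scores are not treated
    specially in this sketch."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            ap += hits / rank   # precision at this recall step
    return ap / total_pos


# Invented test-set labels and scores from two hypothetical models.
labels = [1, 0, 1, 0, 0]
model_scores: Dict[str, Sequence[float]] = {
    "model_a": [0.9, 0.8, 0.7, 0.3, 0.1],
    "model_b": [0.2, 0.9, 0.4, 0.8, 0.1],
}
results = {name: average_precision(labels, s) for name, s in model_scores.items()}
best = max(results, key=results.get)
print({name: round(v, 3) for name, v in results.items()})  # {'model_a': 0.833, 'model_b': 0.417}
print(best)  # model_a
```

Precision-recall area is a common choice when positives are rare, as with prodromal AD cases in a general population, because it is not inflated by the large number of true negatives.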

