Parsing Clinical Trial Eligibility Criteria for Cohort Query by a Multi-Input Multi-Output Sequence Labeling Model

Author(s):  
Shubo Tian ◽  
Pengfei Yin ◽  
Hansi Zhang ◽  
Arslan Erdengasileng ◽  
Jiang Bian ◽  
...  

To enable electronic screening of eligible patients for clinical trials, free-text clinical trial eligibility criteria should be translated to a computable format. Natural language processing (NLP) techniques have the potential to automate this process. In this study, we explored a supervised multi-input multi-output (MIMO) sequence labeling model to parse eligibility criteria into combinations of fact and condition tuples. Our experiments on a small manually annotated training dataset showed that the MIMO framework with a BERT-based encoder using all the input sequences achieved an overall lenient-level AUROC of 0.61. Although the performance is suboptimal, representing eligibility criteria as logical and semantically clear tuples can potentially make subsequent translation of these tuples into database queries more reliable.

2021 ◽  
Vol 12 (04) ◽  
pp. 816-825
Author(s):  
Yingcheng Sun ◽  
Alex Butler ◽  
Ibrahim Diallo ◽  
Jae Hyun Kim ◽  
Casey Ta ◽  
...  

Background Clinical trials are the gold standard for generating robust medical evidence, but clinical trial results often raise generalizability concerns, which can be attributed to the lack of population representativeness. Electronic health record (EHR) data are useful for estimating the population representativeness of clinical trial study populations. Objectives This research aims to estimate the population representativeness of clinical trials systematically using EHR data during the early design stage. Methods We present an end-to-end analytical framework for transforming free-text clinical trial eligibility criteria into executable database queries conformant with the Observational Medical Outcomes Partnership Common Data Model and for systematically quantifying the population representativeness for each clinical trial. Results Using this framework, we calculated the population representativeness of 782 novel coronavirus disease 2019 (COVID-19) trials and 3,827 type 2 diabetes mellitus (T2DM) trials in the United States. Owing to overly restrictive eligibility criteria, 85.7% of the COVID-19 trials and 30.1% of the T2DM trials had poor population representativeness. Conclusion This research demonstrates the potential of using EHR data to assess the population representativeness of clinical trials, providing data-driven metrics to inform the selection and optimization of eligibility criteria.


2017 ◽  
Vol 1 (S1) ◽  
pp. 12-12
Author(s):  
Jianyin Shao ◽  
Ram Gouripeddi ◽  
Julio C. Facelli

OBJECTIVES/SPECIFIC AIMS: This poster presents a detailed characterization of the distribution of semantic concepts used in the text describing eligibility criteria of clinical trials reported to ClinicalTrials.gov and patient notes from MIMIC-III. The final goal of this study is to find a minimal set of semantic concepts that can describe clinical trials and patients for efficient computational matching of clinical trial descriptions to potential participants at large scale. METHODS/STUDY POPULATION: We downloaded the free text describing the eligibility criteria of all clinical trials reported to ClinicalTrials.gov as of July 28, 2015 (~195,000 trials) and ~2,000,000 clinical notes from MIMIC-III. Using MetaMap 2014, we extracted UMLS concepts (CUIs) from the collected text. We calculated how frequently each semantic concept appears in the texts describing the clinical trial eligibility criteria and in the patient notes. RESULTS/ANTICIPATED RESULTS: The results show a classical power-law distribution, $Y = 210X^{-2.043}$ ($R^2 = 0.9599$) for clinical trial eligibility criteria and $Y = 513X^{-2.684}$ ($R^2 = 0.9477$) for MIMIC patient notes, where Y is the number of documents in which a concept appears and X is the rank of the concept when concepts are ordered from most to least frequent. From this distribution, it can be seen that, of the over 100,000 concepts in UMLS, there are only ~60,000 and ~50,000 concepts that appear in fewer than 10 clinical trial eligibility descriptions and MIMIC-III patient clinical notes, respectively. This indicates that it would be possible to describe clinical trials and patient notes with a relatively small number of concepts, making the search space for matching patients to clinical trials a relatively small sub-space of the overall UMLS search space.
DISCUSSION/SIGNIFICANCE OF IMPACT: Our results showing that the concepts used to describe clinical trial eligibility criteria and patient clinical notes follow a power distribution can lead to tractable computational approaches to automatically match patients to clinical trials at large scale by considerably reducing the search space. While automatic patient matching is not the panacea for improving clinical trial recruitment, better low cost computational preselection processes can allow the limited human resources assigned to patient recruitment to be redirected to the most promising targets for recruitment.
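
A rank-frequency power law of the form Y = a·X^b, as reported above, is commonly estimated by ordinary least squares on the log-transformed values. The sketch below illustrates that generic approach only; the abstract does not say how the authors fitted their curves or in which space R² was computed. The function name `fit_power_law` and the synthetic data are assumptions made for illustration, and the coefficients 210 and −2.043 are reused solely to generate toy input.

```python
import math


def fit_power_law(ranks, counts):
    """Fit counts ~ a * rank**b by least squares on log(rank) vs log(count).

    Returns (a, b, r_squared), with r_squared measured in log-log space.
    """
    xs = [math.log(r) for r in ranks]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope = power-law exponent
    log_a = mean_y - b * mean_x        # intercept = log of the coefficient
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.exp(log_a), b, 1.0 - ss_res / ss_tot


# Synthetic, noise-free data generated from a known power law (not study data).
ranks = list(range(1, 101))
counts = [210.0 * r ** -2.043 for r in ranks]
a, b, r2 = fit_power_law(ranks, counts)
```

On real concept counts the fit would be noisy, and the exponent can be sensitive to how the low-frequency tail is handled.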


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 1555-1555
Author(s):  
Eric J. Clayton ◽  
Imon Banerjee ◽  
Patrick J. Ward ◽  
Maggie D Howell ◽  
Beth Lohmueller ◽  
...  

Background: Screening every patient for clinical trials is time-consuming, costly and inefficient. Developing an automated method for identifying patients who have potential disease progression, at the point where the practice first receives their radiology reports, but prior to the patient’s office visit, would greatly increase the efficiency of clinical trial operations and likely result in more patients being offered trial opportunities. Methods: Using Natural Language Processing (NLP) methodology, we developed a text parsing algorithm to automatically extract information about potential new disease or disease progression from multi-institutional, free-text radiology reports (CT, PET, bone scan, MRI or x-ray). We combined semantic dictionary mapping and machine learning techniques to normalize the linguistic and formatting variations in the text, training the XGBoost model specifically to achieve the high precision and accuracy needed to satisfy clinical trial screening requirements. In order to be comprehensive, we enhanced the model vocabulary using a multi-institutional dataset which includes reports from two academic institutions. Results: A dataset of 732 de-identified radiology reports was curated (two MDs agreed on potential new disease/disease progression vs stable), and the model was repeatedly retrained for each fold, where the folds were randomly selected. The final model achieved consistent precision (>0.87) and accuracy (>0.87). See the table for a summary of the results, by radiology report type. We are continuing work on the model to validate accuracy and precision using a new and unique set of reports. Conclusions: NLP systems can be used to identify patients who potentially have suffered new disease or disease progression and reduce the human effort in screening for clinical trials. Efforts are ongoing to integrate the NLP process into existing EHR reporting.
New imaging reports sent via interface to the EHR will be extracted daily using a database query and will be provided via secure electronic transport to the NLP system. Patients with higher likelihood of disease progression will be automatically identified, and their reports routed to the clinical trials office for clinical trial screening parallel to physician EHR mailbox reporting. The over-arching goal of the project is to increase clinical trial enrollment.
Table: 5-fold cross-validation performance of the NLP model in terms of accuracy, precision and recall, averaged across all the folds. [Table: see text]
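
The evaluation described above (randomly selected folds, retraining per fold, metrics averaged across folds) follows the general pattern of k-fold cross-validation. The sketch below shows that pattern only. The helper names `k_fold_indices` and `cross_validate`, the seed, and the placeholder scoring callback are all hypothetical; the XGBoost model and the text features are not reproduced here.

```python
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Shuffle item indices and deal them into k folds of near-equal size."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_items, k, train_and_score, seed=0):
    """Average a score over k folds.

    train_and_score(train_idx, test_idx) is a caller-supplied function that
    retrains a model on train_idx and returns one metric value on test_idx.
    """
    folds = k_fold_indices(n_items, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the held-out one forms the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return sum(scores) / len(scores)


# Placeholder scorer: returns the held-out share of the data, just to show
# the plumbing. A real scorer would fit a classifier and compute precision.
folds = k_fold_indices(732, k=5)
mean_share = cross_validate(732, 5, lambda tr, te: len(te) / 732)
```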


10.2196/17832 ◽  
2020 ◽  
Vol 8 (7) ◽  
pp. e17832
Author(s):  
Kun Zeng ◽  
Zhiwei Pan ◽  
Yibin Xu ◽  
Yingying Qu

Background Eligibility criteria are the main strategy for screening appropriate participants for clinical trials. Automatic analysis of clinical trial eligibility criteria by digital screening, leveraging natural language processing techniques, can improve recruitment efficiency and reduce the costs involved in promoting clinical research. Objective We aimed to create a natural language processing model to automatically classify clinical trial eligibility criteria. Methods We proposed a classifier for short text eligibility criteria based on ensemble learning, where a set of pretrained models was integrated. The pretrained models included state-of-the-art deep learning methods for training and classification, including Bidirectional Encoder Representations from Transformers (BERT), XLNet, and A Robustly Optimized BERT Pretraining Approach (RoBERTa). The classification results by the integrated models were combined as new features for training a Light Gradient Boosting Machine (LightGBM) model for eligibility criteria classification. Results Our proposed method obtained an accuracy of 0.846, a precision of 0.803, and a recall of 0.817 on a standard data set from a shared task of an international conference. The macro F1 value was 0.807, outperforming the state-of-the-art baseline methods on the shared task. Conclusions We designed a model for screening short text classification criteria for clinical trials based on multimodel ensemble learning. Through experiments, we concluded that performance was improved significantly with a model ensemble compared to a single model. The introduction of focal loss could reduce the impact of class imbalance to achieve better performance.
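
The abstract credits focal loss with reducing the impact of class imbalance but does not give the formulation used. As a point of reference, the widely used definition is FL(p_t) = −α_t (1 − p_t)^γ log(p_t), where p_t is the predicted probability of the true class. The sketch below implements that standard form for a single example; the function name, the default γ = 2, and the optional per-class weights are assumptions, not details from the paper.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Focal loss for one example.

    probs  : predicted class probabilities (should sum to 1)
    target : index of the true class
    gamma  : focusing parameter; gamma = 0 gives plain cross-entropy
    alpha  : optional per-class weights (None means weight 1 for every class)
    """
    p_t = max(probs[target], 1e-12)    # clamp to avoid log(0)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct example is down-weighted far more than a hard one.
easy = focal_loss([0.9, 0.1], target=0)
hard = focal_loss([0.3, 0.7], target=0)
plain_ce = focal_loss([0.9, 0.1], target=0, gamma=0.0)
```

Because well-classified examples contribute little, training gradients are dominated by hard and often minority-class examples, which is the usual argument for using this loss on imbalanced data.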


Author(s):  
Yilu Fang ◽  
Jae Hyun Kim ◽  
Betina Ross Idnay ◽  
Rebeca Aragon Garcia ◽  
Carmen E. Castillo ◽  
...  

Clinical trial eligibility criteria are important for selecting the right participants for clinical trials. However, they are often complex and not computable. This paper presents the participatory design of a human-computer collaboration method for criteria simplification that includes natural language processing followed by user-centered eligibility criteria simplification. A case study on the ARCADIA trial shows how criteria were simplified for structured database querying by clinical researchers and identifies rules for criteria simplification and concept normalization.


2018 ◽  
Vol 25 (4) ◽  
Author(s):  
K. Al-Baimani ◽  
H. Jonker ◽  
T. Zhang ◽  
G.D. Goss ◽  
S.A. Laurie ◽  
...  

Background Advanced non-small-cell lung cancer (NSCLC) represents a major health issue globally. Systemic treatment decisions are informed by clinical trials, which, over years, have improved the survival of patients with advanced NSCLC. The applicability of clinical trial results to the broad lung cancer population is unclear because strict eligibility criteria in trials generally select for optimal patients. Methods We performed a retrospective chart review of all consecutive patients with advanced NSCLC seen in outpatient consultation at our academic institution between September 2009 and September 2012, collecting data about patient demographics and cancer characteristics, treatment, and survival from hospital and pharmacy records. Two sets of arbitrary trial eligibility criteria were applied to the cohort. Scenario A stipulated Eastern Cooperative Oncology Group performance status (ECOG PS) 0–1, no brain metastasis, creatinine less than 120 μmol/L, and no second malignancy. Less-strict scenario B stipulated ECOG PS 0–2 and creatinine less than 120 μmol/L. We then used the two scenarios to analyze treatment and survival of patients by trial eligibility status. Results The 528 included patients had a median age of 67 years, with 55% being men and 58% having adenocarcinoma. Of those 528 patients, 291 received at least 1 line of palliative systemic therapy. Using the scenario A eligibility criteria, 73% were trial-ineligible. However, 46% of “ineligible” patients actually received therapy and experienced survival similar to that of the “eligible” treated patients (10.2 months vs. 11.6 months, p = 0.10). Using the scenario B criteria, only 35% were ineligible, but again, the survival of treated patients was similar in the ineligible and eligible groups (10.1 months vs. 10.9 months, p = 0.57). Conclusions Current trial eligibility criteria are often strict and limit the enrolment of patients in clinical trials.
Our results suggest that, depending on the chosen drug, its toxicities and tolerability, eligibility criteria could be carefully reviewed and relaxed.
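
The two eligibility scenarios described above are simple conjunctions of thresholds, so they can be expressed directly as predicates over a patient record. The sketch below encodes the stated criteria (ECOG PS, brain metastasis, creatinine below 120 μmol/L, second malignancy). The `Patient` class, its field names, and the four-patient toy cohort are hypothetical and are not drawn from the study data.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    ecog_ps: int                 # ECOG performance status, 0-4
    brain_metastasis: bool
    creatinine_umol_l: float     # serum creatinine in micromol/L
    second_malignancy: bool


def eligible_scenario_a(p):
    """Scenario A: ECOG PS 0-1, no brain metastasis,
    creatinine < 120 umol/L, no second malignancy."""
    return (p.ecog_ps <= 1
            and not p.brain_metastasis
            and p.creatinine_umol_l < 120
            and not p.second_malignancy)


def eligible_scenario_b(p):
    """Scenario B (less strict): ECOG PS 0-2 and creatinine < 120 umol/L."""
    return p.ecog_ps <= 2 and p.creatinine_umol_l < 120


def ineligible_fraction(cohort, rule):
    """Share of the cohort that fails the given eligibility rule."""
    return sum(1 for p in cohort if not rule(p)) / len(cohort)


# Toy cohort for illustration only.
cohort = [
    Patient(0, False, 80.0, False),    # eligible under A and B
    Patient(2, False, 90.0, False),    # fails A on ECOG PS, passes B
    Patient(1, True, 100.0, False),    # fails A on brain metastasis, passes B
    Patient(3, False, 130.0, False),   # fails both
]
frac_a = ineligible_fraction(cohort, eligible_scenario_a)
frac_b = ineligible_fraction(cohort, eligible_scenario_b)
```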


Author(s):  
Mohammadreza Mobinizadeh ◽  
Morteza Arab-Zozani

Context: Coronavirus disease 2019 (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) appeared for the first time in December 2019 in Wuhan, China. Due to the lack of unified and integrated evidence for Favipiravir, this study was conducted to rapidly review the existing evidence to help evidence-based decision-making on the therapeutic potential of this drug in the treatment of COVID-19 patients. Evidence Acquisition: This study is a rapid Health Technology Assessment (HTA). By searching pertinent databases, the research team collected relevant articles and tried to create a policy guide through a thematic approach. This rapid review was done in four steps: (1) Searching for evidence through databases; (2) screening the evidence considering eligibility criteria; (3) data extraction; and (4) analyzing the data through thematic analysis. Results: After applying the inclusion criteria, four studies were finally found, including three review studies and a clinical trial that was temporarily removed by its publisher from the journal’s website. After searching the sources mentioned in the articles, two ongoing clinical trials were found in China. Also, by searching the clinical trial website, www.clinicaltrials.gov, five clinical trials were found in the search. The result of the search in the clinical trial registration system in Iran showed a study that is in the process of patient recruitment. A limited number of other articles were found, mostly in the form of reflections from physicians or researchers and letters to editors who have predicted the drug’s performance on SARS-CoV-2, which needs further clinical study to be approved. Conclusions: With the available evidence, it is not possible to make a definite conclusion about the safety and efficacy of Favipiravir in the treatment of patients with COVID-19.


2020 ◽  
pp. 50-59
Author(s):  
J. Thaddeus Beck ◽  
Melissa Rammage ◽  
Gretchen P. Jackson ◽  
Anita M. Preininger ◽  
Irene Dankwa-Mullan ◽  
...  

PURPOSE Less than 5% of patients with cancer enroll in clinical trials, and 1 in 5 trials are stopped for poor accrual. We evaluated an automated clinical trial matching system that uses natural language processing to extract patient and trial characteristics from unstructured sources and machine learning to match patients to clinical trials. PATIENTS AND METHODS Medical records from 997 patients with breast cancer were assessed for trial eligibility at Highlands Oncology Group between May and August 2016. System and manual attribute extraction and eligibility determinations were compared using the percentage of agreement for 239 patients and 4 trials. Sensitivity and specificity of system-generated eligibility determinations were measured, and the time required for manual review and system-assisted eligibility determinations were compared. RESULTS Agreement between system and manual attribute extraction ranged from 64.3% to 94.0%. Agreement between system and manual eligibility determinations was 81%-96%. System eligibility determinations demonstrated specificities between 76% and 99%, with sensitivities between 91% and 95% for 3 trials and 46.7% for the 4th. Manual eligibility screening of 90 patients for 3 trials took 110 minutes; system-assisted eligibility determinations of the same patients for the same trials required 24 minutes. CONCLUSION In this study, the clinical trial matching system displayed a promising performance in screening patients with breast cancer for trial eligibility. System-assisted trial eligibility determinations were substantially faster than manual review, and the system reliably excluded ineligible patients for all trials and identified eligible patients for most trials.
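
Percentage agreement, sensitivity, and specificity, as used above to compare system and manual eligibility determinations, all derive from the same four confusion counts. The sketch below shows the standard definitions, treating manual review as the reference standard. The function name and the toy labels are assumptions for illustration, and the code assumes both eligible and ineligible patients are present in the reference labels.

```python
def screening_metrics(system, manual):
    """Compare system eligibility calls against manual review (the reference).

    Both arguments are equal-length lists of booleans (True = eligible).
    """
    pairs = list(zip(system, manual))
    tp = sum(1 for s, m in pairs if s and m)
    tn = sum(1 for s, m in pairs if not s and not m)
    fp = sum(1 for s, m in pairs if s and not m)
    fn = sum(1 for s, m in pairs if not s and m)
    return {
        "sensitivity": tp / (tp + fn),        # eligible patients found
        "specificity": tn / (tn + fp),        # ineligible patients excluded
        "agreement": (tp + tn) / len(pairs),  # overall percent agreement
    }


# Toy labels for illustration only.
manual = [True, True, True, False, False, False, False, False]
system = [True, True, False, False, False, False, False, True]
metrics = screening_metrics(system, manual)
```

High specificity with lower sensitivity, as reported for the fourth trial, would mean the system seldom flags ineligible patients but misses some eligible ones.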


2006 ◽  
Vol 24 (18_suppl) ◽  
pp. 6056-6056
Author(s):  
J. K. Keller ◽  
J. Bowman ◽  
J. A. Lee ◽  
M. A. Mathiason ◽  
K. A. Frisby ◽  
...  

Background: Less than 5% of newly diagnosed cancer patients are accrued into clinical trials. In the community setting, the lack of appropriate clinical trials is a major barrier. Our prospective study in 2004 determined that 58% of newly diagnosed adult cancer patients at our community-based cancer center did not have a clinical trial available appropriate for their disease stage. Among those with clinical trials, 23% were subsequently found to be ineligible (Go RS, et al. Cancer 2006, in press). However, the availability of clinical trials may vary from year to year. Methods: A retrospective study was conducted to determine what clinical trials were available for newly diagnosed adult cancer patients at our institution from June 1999 to July 2004. The study also investigated the proportions of newly diagnosed patients who had a clinical trial available appropriate for type and stage of disease and of patients accrued. Results: Over the 5-year period, 207 (82, 87, 99, 102, 117, years 1–5, respectively) trials were available. Most (50.7%) trials were for the following cancers: breast (15.5%), lung (13.5%), head and neck (7.7%), colorectal (7.2%) and lymphoma (6.8%). ECOG (53%), RTOG (26%), and CTSU (9%) provided the majority of the trials. A total of 5,776 new adult cancer patients were seen during this period. Overall, 60% of the patients had a trial available appropriate for type and stage of their cancer, but only 103 (3%) were enrolled. There was a significant upward trend in the proportions of patients with available trials over the years (60.2%, 55.9%, 59.2%, 60.7%, 63.9%, years 1–5, respectively; Mantel-Haenszel P=.008). The proportion of patients with a trial available was highest for prostate (97.3%), lung (90.9%), and breast (73.9%), and lowest for melanoma (17.1%), renal (11.6%), and bladder (7.2%). The majority of patients accrued to trials had the following cancers: breast (32%), lung (17%), lymphoma (9%), colon (7%), and prostate (5%).
Conclusions: Nearly half of the newly diagnosed adult patients at our center had no trials available appropriate for type and stage of their cancers. It is likely that if strict clinical trial eligibility criteria were applied, approximately 2/3 of our patients would not be eligible for a clinical trial. No significant financial relationships to disclose.

