cohort identification
Recently Published Documents


TOTAL DOCUMENTS

69
(FIVE YEARS 34)

H-INDEX

8
(FIVE YEARS 3)

2022 ◽  
Vol 3 (2) ◽  
pp. 1-28
Author(s):  
Besat Kassaie ◽  
Elizabeth L. Irving ◽  
Frank Wm. Tompa

The standard approach to expert-in-the-loop machine learning is active learning, in which an expert is repeatedly asked to annotate one or more records and the machine finds a classifier that respects all annotations made up to that point. We propose an alternative approach, IQRef, in which the expert iteratively designs a classifier and the machine helps him or her to determine how well it is performing and, importantly, when to stop, by reporting statistics on a fixed, hold-out sample of annotated records. We justify our approach based on prior work giving a theoretical model of how to re-use hold-out data. We compare the two approaches in the context of identifying a cohort of EHRs and examine their strengths and weaknesses through a case study arising from an optometric research problem. We conclude that the two approaches are complementary, and we recommend that they be employed in conjunction to address the problem of cohort identification in health research.
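
The hold-out reporting step described in this abstract can be illustrated with a minimal sketch. It is not IQRef's implementation: the function name `evaluate_on_holdout`, the record layout, the keyword rule, and the toy annotations are all hypothetical, and the statistics shown (precision, recall, F1) are an assumption about what such a report might contain.

```python
from dataclasses import dataclass


@dataclass
class HoldoutStats:
    precision: float
    recall: float
    f1: float


def evaluate_on_holdout(classifier, holdout):
    """Score an expert-written classifier on a fixed, annotated sample.

    `classifier` maps a record to True/False; `holdout` is a list of
    (record, label) pairs. Both are hypothetical stand-ins.
    """
    tp = fp = fn = 0
    for record, label in holdout:
        predicted = classifier(record)
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return HoldoutStats(precision, recall, f1)


# Toy hold-out sample (invented for illustration only).
holdout = [
    ({"note": "myopia progression"}, True),
    ({"note": "routine exam"}, False),
    ({"note": "myopia, stable"}, True),
    ({"note": "family history of myopia"}, False),
    ({"note": "high minus prescription"}, True),
]

# One iteration of an expert-designed rule: a simple keyword match.
stats = evaluate_on_holdout(lambda r: "myopia" in r["note"], holdout)
print(stats)
```

In this sketch the expert would revise the rule and re-run the evaluation on the same sample, stopping once the reported statistics stop improving.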


Medicine ◽  
2021 ◽  
Vol 100 (51) ◽  
pp. e28354
Author(s):  
H. Nina Kim ◽  
Ayushi Gupta ◽  
Kristine Lan ◽  
Jenell Stewart ◽  
Shireesha Dhanireddy ◽  
...  

2021 ◽  
Author(s):  
Emily R. Pfaff ◽  
Robert Bradford ◽  
Marshall Clark ◽  
James P. Balhoff ◽  
Rujin Wang ◽  
...  

Abstract Background Computable phenotypes are increasingly important tools for patient cohort identification. As part of a study of risk of chronic opioid use after surgery, we used a Resource Description Framework (RDF) triplestore as our computable phenotyping platform, hypothesizing that the unique affordances of triplestores may aid in making complex computable phenotypes more interoperable and reproducible than traditional relational database queries. To identify and model risk for new chronic opioid users post-surgery, we loaded several heterogeneous data sources into a Blazegraph triplestore: (1) electronic health record data; (2) claims data; (3) American Community Survey data; and (4) Centers for Disease Control Social Vulnerability Index, opioid prescription rate, and drug poisoning rate data. We then ran a series of queries to execute each of the rules in our “new chronic opioid user” phenotype definition to ultimately arrive at our qualifying cohort. Results Of the 4,163 patients in the denominator, our computable phenotype identified 248 patients as new chronic opioid users after their index surgical procedure. After validation against charts, 228 of the 248 were revealed to be true positive cases, giving our phenotype a PPV of 0.92. Conclusion We successfully used the triplestore to execute the new chronic opioid user phenotype logic, and in doing so noted some advantages of the triplestore in terms of schemalessness, interoperability, and reproducibility. Future work will use the triplestore to create the planned risk model and leverage the additional links with ontologies and ontological reasoning.
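
The positive predictive value reported in this abstract follows directly from the two counts it gives. A small sketch of the arithmetic (the function name is hypothetical; only the counts 228 and 248 come from the abstract):

```python
def positive_predictive_value(true_positives: int, flagged: int) -> float:
    """PPV = chart-confirmed cases / all cases flagged by the phenotype."""
    if flagged <= 0:
        raise ValueError("flagged must be a positive count")
    if not 0 <= true_positives <= flagged:
        raise ValueError("true_positives must lie between 0 and flagged")
    return true_positives / flagged


# Counts from the abstract: 248 flagged, 228 confirmed on chart review.
ppv = positive_predictive_value(228, 248)
print(round(ppv, 2))  # 0.92
```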


2021 ◽  
Author(s):  
Emilie Guillochon ◽  
Jérémy Fraering ◽  
Valentin Joste ◽  
Claire Kamaliddin ◽  
Bertin Vianou ◽  
...  

The host and parasitic factors leading to cerebral malaria (CM) are not yet fully elucidated, and the transcriptome profile of CM Plasmodium falciparum isolates remains largely unknown. Based on RNA-seq data from 15 CM and 15 uncomplicated malaria (UM) children from Benin, we identified an increased ring-stage signature in CM parasites. Reduced circulating time may result from a higher adherence ability of CM isolates and, consistent with this hypothesis, we measured an overexpression of var genes in CM. Expression of var gene domains was more restricted in CM isolates than in UM isolates, reflecting specific binding to receptors in the capillary endothelium of the host brain. However, the ICAM-1 binding motif was found to be expressed in both CM and UM, questioning its role in PfEMP1 adhesion to the ICAM-1 receptor. The increased circulation time of UM isolates may also be modulated by a more efficient immune response against surface proteins of infected erythrocytes, which we could not demonstrate in our cohort. Identification of deregulated genes involved in adhesion, excluding variant surface antigens, also supports the hypothesis of an increased adhesion capacity in CM. Finally, numerous upregulated genes involved in the entry-into-host pathway were found, reflecting a greater erythrocyte invasion capacity of CM parasites.


JAMIA Open ◽  
2021 ◽  
Vol 4 (4) ◽  
Author(s):  
Pascal S Brandt ◽  
Jennifer A Pacheco ◽  
Luke V Rasmussen

Abstract Objective The objective of this study is to create a repository of computable, technology-agnostic phenotype definitions for the purposes of analysis and automatic cohort identification. Materials and Methods We selected phenotype definitions from PheKB and excluded definitions that did not use structured data or were not used in published research. We translated these definitions into the Clinical Quality Language (CQL) and Fast Healthcare Interoperability Resources (FHIR) and validated them using code review and automated tests. Results A total of 33 phenotype definitions met our inclusion criteria. We developed 40 CQL libraries, 231 value sets, and 347 test cases. To support these test cases, a total of 1624 FHIR resources were created as test data. Discussion and Conclusion Although a number of challenges were encountered while translating the phenotypes into structured form, such as requiring specialized knowledge, or imprecise, ambiguous, and conflicting language, we have created a repository and a development environment that can be used for future research on computable phenotypes.


BMC Neurology ◽  
2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Lester Y. Leung ◽  
Sunyang Fu ◽  
Patrick H. Luetmer ◽  
David F. Kallmes ◽  
Neel Madan ◽  
...  

Abstract Background There are numerous barriers to identifying patients with silent brain infarcts (SBIs) and white matter disease (WMD) in routine clinical care. A natural language processing (NLP) algorithm may identify patients from neuroimaging reports, but it is unclear if these reports contain reliable information on these findings. Methods Four radiology residents reviewed 1000 neuroimaging reports (RI) of patients aged > 50 years without clinical histories of stroke, TIA, or dementia for the presence, acuity, and location of SBIs, and the presence and severity of WMD. Four neuroradiologists directly reviewed a subsample of 182 images (DR). An NLP algorithm was developed to identify findings in reports. We assessed interrater reliability for DR and RI, and agreement between these two and with NLP. Results For DR, interrater reliability was moderate for the presence of SBIs (k = 0.58, 95% CI 0.46–0.69) and WMD (k = 0.49, 95% CI 0.35–0.63), and moderate to substantial for characteristics of SBI and WMD. Agreement between DR and RI was substantial for the presence of SBIs and WMD, and fair to substantial for characteristics of SBIs and WMD. Agreement between NLP and DR was substantial for the presence of SBIs (k = 0.64, 95% CI 0.53–0.76) and moderate (k = 0.52, 95% CI 0.39–0.65) for the presence of WMD. Conclusions Neuroimaging reports in routine care capture the presence of SBIs and WMD. An NLP algorithm can identify these findings (comparable to direct imaging review) and can likely be used for cohort identification.
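
The agreement statistic k reported here is a kappa coefficient. The abstract does not say which variant was used for its four-rater comparisons, so the following is only a sketch of the simplest case, Cohen's kappa for two raters, on invented labels (1 = finding present, 0 = absent). The function name and data are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in labels) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Toy labels, invented for illustration (not data from the study).
direct_review = [1, 1, 0, 0, 1, 0, 1, 0]
nlp_output = [1, 0, 0, 0, 1, 0, 1, 0]
kappa = cohens_kappa(direct_review, nlp_output)
print(kappa)
```

Kappa discounts the agreement expected by chance, which is why it is preferred over raw percent agreement when one label is much more common than the other.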


2021 ◽  
Author(s):  
Faisal Rahman ◽  
Noam Finkelstein ◽  
Anton Alyakin ◽  
Nisha Gilotra ◽  
Jeff Trost ◽  
...  

Abstract Objective: Despite technological and treatment advancements over the past two decades, cardiogenic shock (CS) mortality has remained between 40-60%. A number of factors can lead to delayed diagnosis of CS, including gradual onset and nonspecific symptoms. Our objective was to develop an algorithm that can continuously monitor heart failure patients and partition them into cohorts at high and low risk of CS. Methods: We retrospectively studied 24,461 patients hospitalized with acute decompensated heart failure, 265 of whom developed CS, in the Johns Hopkins Healthcare system. Our cohort identification approach is based on logistic regression, and makes use of vital signs, lab values, and medication administrations recorded during the normal course of care. Results: Our algorithm identified patients at high risk of CS. Patients in the high-risk cohort had 10.2 times (95% confidence interval 6.1-17.2) higher prevalence of CS than those in the low-risk cohort. Patients who experienced cardiogenic shock while in the high-risk cohort were first deemed high-risk a median of 1.7 days (interquartile range 0.8 to 4.6) before cardiogenic shock diagnosis was made by their clinical team. Conclusions: This risk model was able to identify patients at higher risk of CS in a time frame that allowed a change in clinical care. Future studies need to evaluate whether identifying a high-risk cohort for CS affects outcomes.
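
The prevalence comparison between the high- and low-risk cohorts can be sketched as follows. This is not the authors' model: the risk scores, outcomes, and threshold are invented, and the function `prevalence_ratio` is a hypothetical helper. It assumes a risk score per patient (for example, the output of a logistic regression) has already been computed.

```python
def prevalence_ratio(scores, outcomes, threshold):
    """Split patients at `threshold` and compare outcome prevalence.

    `scores` are per-patient risk scores; `outcomes` are 1 if the
    patient developed the condition, else 0. All inputs are hypothetical.
    """
    high = [o for s, o in zip(scores, outcomes) if s >= threshold]
    low = [o for s, o in zip(scores, outcomes) if s < threshold]
    if not high or not low:
        raise ValueError("threshold leaves one cohort empty")
    prev_high = sum(high) / len(high)
    prev_low = sum(low) / len(low)
    if prev_low == 0:
        raise ValueError("no events in the low-risk cohort; ratio undefined")
    return prev_high / prev_low


# Toy data, invented for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1]
outcomes = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
ratio = prevalence_ratio(scores, outcomes, threshold=0.5)
print(ratio)
```

A confidence interval, as reported in the abstract, would require an additional step such as a log-ratio standard error or bootstrap resampling, which this sketch omits.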


2021 ◽  
Author(s):  
Ran Sun ◽  
Imon Banerjee ◽  
Shengtian Sang ◽  
Jennifer Joseph ◽  
Jennifer Schneider ◽  
...  

Key Points
· About one-third of patients with type 1 diabetes were found to use continuous glucose monitoring (CGM) and/or continuous subcutaneous insulin infusion (CSII) in routine clinical care.
· Disparities exist in CGM and CSII adoption, with device use more common in patients of higher socioeconomic status.
· Mining clinical narratives with natural language processing techniques can be applied successfully for medical device surveillance and cohort identification for observational studies.
· CGM use in conjunction with CSII after type 1 diabetes diagnosis is more effective than other therapy regimens and may translate to improved long-term glycemic control.

