Framework for identifying drug repurposing candidates from observational healthcare data

JAMIA Open ◽  
2020 ◽  
Author(s):  
Michal Ozery-Flato ◽  
Yaara Goldschmidt ◽  
Oded Shaham ◽  
Sivan Ravid ◽  
Chen Yanover

Abstract Objective Observational medical databases, such as electronic health records and insurance claims, track the healthcare trajectory of millions of individuals. These databases provide real-world longitudinal information on large cohorts of patients and their medication prescription history. We present an easy-to-customize framework that systematically analyzes such databases to identify new indications for on-market prescription drugs. Materials and Methods Our framework provides an interface for defining study design parameters and extracting patient cohorts, disease-related outcomes, and potential confounders in observational databases. It then applies causal inference methodology to emulate hundreds of randomized controlled trials (RCTs) for prescribed drugs, while adjusting for confounding and selection biases. After correcting for multiple testing, it outputs the estimated effects and their statistical significance in each database. Results We demonstrate the utility of the framework in a case study of Parkinson’s disease (PD) and evaluate the effect of 259 drugs on various PD progression measures in two observational medical databases, covering more than 150 million patients. The results of these emulated trials reveal remarkable agreement between the two databases for the most promising candidates. Discussion Estimating drug effects from observational data is challenging due to data biases and noise. To tackle this challenge, we integrate causal inference methodology with domain knowledge and compare the estimated effects in two separate databases. Conclusion Our framework enables systematic search for drug repurposing candidates by emulating RCTs using observational data. The high level of agreement between separate databases strongly supports the identified effects.
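The abstract does not say which causal estimator the framework uses. As a rough illustration of what "adjusting for confounding" can mean when emulating a trial, the sketch below applies inverse probability weighting (an assumed technique, not confirmed by the source) to a single binary confounder. The `Stratum` class, the function names, and all counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """Counts for one level of a binary confounder (hypothetical data)."""
    treated_n: int
    treated_events: int
    control_n: int
    control_events: int


def crude_risk_difference(strata):
    """Unadjusted risk difference: pool all strata, ignore the confounder."""
    t_n = sum(s.treated_n for s in strata)
    t_e = sum(s.treated_events for s in strata)
    c_n = sum(s.control_n for s in strata)
    c_e = sum(s.control_events for s in strata)
    return t_e / t_n - c_e / c_n


def ipw_risk_difference(strata):
    """Risk difference after inverse probability weighting.

    The propensity score in each stratum is the observed fraction treated.
    Treated patients get weight 1/e, controls get weight 1/(1 - e).
    """
    t_events = t_weight = c_events = c_weight = 0.0
    for s in strata:
        e = s.treated_n / (s.treated_n + s.control_n)  # propensity score
        t_events += s.treated_events / e
        t_weight += s.treated_n / e
        c_events += s.control_events / (1 - e)
        c_weight += s.control_n / (1 - e)
    return t_events / t_weight - c_events / c_weight


# Hypothetical cohort: sicker patients (second stratum) are treated more often.
STRATA = [
    Stratum(treated_n=20, treated_events=4, control_n=80, control_events=24),
    Stratum(treated_n=60, treated_events=30, control_n=40, control_events=24),
]

if __name__ == "__main__":
    print(f"crude risk difference: {crude_risk_difference(STRATA):+.3f}")
    print(f"IPW risk difference:   {ipw_risk_difference(STRATA):+.3f}")
```

In this made-up cohort the drug lowers risk by 10 percentage points within each stratum, yet the crude comparison suggests a slight increase because treated patients are sicker on average. Weighting recovers the within-stratum effect.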

Author(s):  
Michal Ozery-Flato ◽  
Yaara Goldschmidt ◽  
Oded Shaham ◽  
Sivan Ravid ◽  
Chen Yanover

Abstract Objective Observational medical databases, such as electronic health records and insurance claims, track the healthcare trajectory of millions of individuals. These databases provide real-world longitudinal information on large cohorts of patients and their medication prescription history. We present an easy-to-customize framework that systematically analyzes such databases to identify new indications for on-market prescription drugs. Materials and Methods Our framework provides an interface for defining study design parameters and extracting patient cohorts, disease-related outcomes, and potential confounders in two observational databases, covering more than 150 million patients. It then applies causal inference methodology to emulate hundreds of randomized controlled trials (RCTs) for prescribed drugs, while adjusting for confounding and selection biases. After correcting for multiple testing, it outputs the estimated effects and their statistical significance in each database. Results We demonstrate the utility of the framework in a case study of Parkinson’s disease (PD) and evaluate the effect of 259 drugs on various PD progression measures. The results of these emulated trials reveal a remarkable agreement between the two observational medical databases for the most promising candidates. Discussion Estimating drug effects from observational data is challenging due to data biases and noise. To tackle this challenge, we integrate causal inference methodology with domain knowledge and compare the estimated effects in two separate databases. Conclusion Our framework enables a systematic search for drug repurposing candidates by emulating RCTs that use observational data. The high level of agreement in the results obtained in two separate databases provides an internal validation of identified effects.


2021 ◽  
Author(s):  
Maaz Siddiqui ◽  
John P Piserchio ◽  
Misha Patel ◽  
Jino Park ◽  
Michelle Foster ◽  
...  

Background: Much of the blame for the increasing death toll from drug overdoses has justifiably been attributed to the current opioid epidemic in the United States. However, nearly 80% of overdoses related to opioids involve another drug substance or alcohol. The objective of this study was to elucidate overrepresentation of drugs in polypharmacy arrests by identifying drugs that were more likely to be found in conjunction with other substances, using the drug arrest data provided by the Maine Diversion Alert Program (DAP). Methods: Single drug arrest and multiple drug arrest totals reported to the DAP were examined. Drugs involved in the arrests were classified by Drug Enforcement Administration Schedule (I-V or non-controlled prescription) and categorized into five drug families: hallucinogens, opioids, sedatives, stimulants, and miscellaneous. Multiple drug arrest totals were compared to single drug arrest totals to create a Multiple-to-Single Ratio (MSR) specific to each drug family and each drug. Chi-square approximations without Yates correction and two-tailed P values were used to determine statistical significance through GraphPad 2x2 contingency tables. Results: Over three-fifths (63.8%) of all arrests involved a single drug. Opioids accounted for over half (53.5%) of single arrests, followed by stimulants (27.7%) and hallucinogens (7.7%). Similarly, nearly two-fifths (39.6%) of multiple arrests involved opioids, followed by stimulants (30.8%) and miscellaneous (13.0%). Miscellaneous family drugs were recorded with the highest Multiple-to-Single Ratio (1.51), followed by sedatives (1.09), stimulants (0.63), opioids (0.42), and hallucinogens (0.35). Carisoprodol (8.80), amitriptyline (6.34), and quetiapine (4.69) had the highest MSR values and therefore were the three most overrepresented drugs in polysubstance arrests. Conclusion: The abuse of opioids, both alone and in conjunction with another drug, deserves continued surveillance in public health. In addition, common prescription drugs with lesser-known misuse potential, especially carisoprodol, amitriptyline, and quetiapine, require more attention by medical providers for their ability to enhance the effects of other drugs or to compensate for undesired drug effects.
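The two calculations named in the Methods can be sketched in a few lines. This is a minimal illustration, assuming the MSR is simply the multiple-drug arrest count divided by the single-drug arrest count (the abstract does not give the exact formula). The function names and all counts are hypothetical and are not taken from the DAP data.

```python
import math


def multiple_to_single_ratio(multiple_arrests, single_arrests):
    """Assumed definition: multiple-drug arrests divided by single-drug arrests."""
    return multiple_arrests / single_arrests


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]] without Yates correction.

    Returns the statistic and the two-tailed P value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: one drug versus all other drugs.
drug_multiple, drug_single = 30, 20
other_multiple, other_single = 300, 700

msr = multiple_to_single_ratio(drug_multiple, drug_single)
chi2, p_value = chi_square_2x2(drug_multiple, drug_single,
                               other_multiple, other_single)

if __name__ == "__main__":
    print(f"MSR = {msr:.2f}, chi-square = {chi2:.2f}, P = {p_value:.2g}")
```

An MSR above 1 means the drug appears more often in multiple-drug arrests than in single-drug arrests. The chi-square test then asks whether that imbalance differs from the pattern seen for all other drugs.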


2021 ◽  
Vol 15 (5) ◽  
pp. 1-46
Author(s):  
Liuyi Yao ◽  
Zhixuan Chu ◽  
Sheng Li ◽  
Yaliang Li ◽  
Jing Gao ◽  
...  

Causal inference has been a critical research topic across many domains, such as statistics, computer science, education, public policy, and economics, for decades. Nowadays, estimating causal effects from observational data has become an appealing research direction owing to the large amount of available data and the low budget requirement, compared with randomized controlled trials. With the rapid development of machine learning, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well-known causal inference frameworks. The methods are divided into two categories depending on whether they require all three assumptions of the potential outcome framework or not. For each category, both the traditional statistical methods and the recent machine learning enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including applications in advertising, recommendation, medicine, and so on. Moreover, the commonly used benchmark datasets as well as the open-source codes are also summarized, which helps researchers and practitioners explore, evaluate, and apply the causal inference methods.


Stroke ◽  
2021 ◽  
Vol 52 (Suppl_1) ◽  
Author(s):  
Sarah E Wetzel-Strong ◽  
Shantel M Weinsheimer ◽  
Jeffrey Nelson ◽  
Ludmila Pawlikowska ◽  
Dewi Clark ◽  
...  

Objective: Circulating plasma protein profiling may aid in the identification of cerebrovascular disease signatures. This study aimed to identify circulating angiogenic and inflammatory proteins that may serve as biomarkers to differentiate sporadic brain arteriovenous malformation (bAVM) patients from other conditions with brain AVMs, including hereditary hemorrhagic telangiectasia (HHT) patients. Methods: The Quantibody Human Angiogenesis Array 1000 (Raybiotech) is an ELISA multiplex panel that was used to assess the levels of 60 proteins related to angiogenesis and inflammation in heparin plasma samples from 13 sporadic unruptured bAVM patients (69% male, mean age 51 years) and 37 patients with HHT (40% male, mean age 47 years, n=19 (51%) with bAVM). The Quantibody Q-Analyzer tool was used to calculate biomarker concentrations based on the standard curve for each marker, and log-transformed marker levels were evaluated for associations between disease states using a multivariable interval regression model adjusted for age, sex, ethnicity, and collection site. Statistical significance was based on Bonferroni correction for multiple testing of 60 biomarkers (P<8.3×10⁻⁴). Results: Circulating levels of two plasma proteins differed significantly between sporadic bAVM and HHT patients: PDGF-BB (P=2.6×10⁻⁴, PI=3.37, 95% CI: 1.76-6.46) and CCL5 (P=6.0×10⁻⁶, PI=3.50, 95% CI: 2.04-6.03). When considering markers with a nominal p-value of less than 0.01, MMP1 and angiostatin levels also differed between patients with sporadic bAVM and HHT. Markers with nominal p-values less than 0.05 when comparing sporadic brain AVM and HHT patients also included angiostatin, IL2, VEGF, GRO, CXCL16, ITAC, and TGFB3. Among HHT patients, the circulating levels of UPAR and IL6 were elevated in patients with documented bAVMs when considering markers with nominal p-values less than 0.05. Conclusions: This study identified differential expression of two promising plasma biomarkers that differentiate sporadic bAVMs from patients with HHT. Furthermore, this study allowed us to evaluate markers that are associated with the presence of bAVMs in HHT patients, which may offer insight into mechanisms underlying bAVM pathophysiology.
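The Bonferroni threshold quoted in the Methods can be reproduced with a short calculation. The sketch below assumes a family-wise significance level of 0.05, which the abstract does not state but which matches the reported cutoff for 60 tests. The function name is illustrative, and the two P values are the ones reported in the Results.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cutoff under Bonferroni correction."""
    return alpha / n_tests


# Assumed family-wise alpha of 0.05, spread across 60 biomarkers.
threshold = bonferroni_threshold(0.05, 60)

# P values reported in the abstract for the two significant proteins.
reported_p = {"PDGF-BB": 2.6e-4, "CCL5": 6.0e-6}
significant = {name: p < threshold for name, p in reported_p.items()}

if __name__ == "__main__":
    print(f"threshold = {threshold:.1e}")
    for name, passed in significant.items():
        print(f"{name}: significant after correction = {passed}")
```

Both reported proteins fall below the corrected cutoff, whereas the markers with nominal p-values between the cutoff and 0.05 do not.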


2021 ◽  
Author(s):  
Oscar Wiljam Savolainen

Abstract It is of great interest in neuroscience to determine what frequency bands in the brain contain common information. However, to date, a comprehensive statistical approach to this question has been lacking. As such, this work presents a novel statistical significance test for correlated power across frequency bands in non-stationary time series. The test accounts for biases that often go untreated in time-frequency analysis, i.e. intra-frequency autocorrelation, inter-frequency non-dyadicity, and multiple testing under dependency. It is used to test all of the inter-frequency correlations between 0.2 and 8500 Hz in continuous intracortical extracellular neural recordings, using a very large, publicly available dataset. The results show that neural processes have signatures across a very broad range of frequency bands. In particular, LFP frequency bands as low as 20 Hz were found to almost always be significantly correlated to kHz frequency ranges. This test also has applications in a broad range of fields, e.g. biological signal processing, economics, finance, and climatology. It is useful whenever one wants to robustly determine whether short-term components in a signal are related to long-term trends, or what frequencies contain common information.


2020 ◽  
Author(s):  
Linchen He ◽  
Yuru Ren ◽  
Han Chen ◽  
Daphne Guinn ◽  
Deepak Parashar ◽  
...  

PURPOSE Molecular oncology determines biomarker-defined niche indications. Basket trials pool histologic indications sharing molecular pathophysiology, potentially improving development efficiency. Currently basket trials have been confirmatory only for exceptional therapies. Our previous randomized basket design may be generally suitable in the resource-intensive confirmatory phase, maintains high power, and provides nearly k-fold increased efficiency for k indications, but controls false positives for the pooled result only. Since false positive control by indications (FWER) may sometimes be required, we now simulate a variant of this basket design controlling FWER at 0.025k, the total FWER of k separate randomized trials. METHODS The previous design eliminated indications at an interim analysis, conducting a final pooled analysis of remaining indications. To control FWER, we rechecked individual indications at a prospectively defined level of statistical significance after any positive pooled result. We simulated this modified design under numerous scenarios varying design parameters. Only designs controlling FWER and minimizing estimation bias were allowable. RESULTS Sequential analyses (interim, pooled, and post-individual tests) result in cumulative power losses. Optimal performance results when k = 3,4. We report efficiency (expected # true positives/expected sample size) relative to k parallel studies, at 90% power (“uncorrected”) or at the power achieved in the basket trial (“corrected”, because conventional designs could also increase efficiency by sacrificing power). Efficiency and power (percentage active indications identified) improve with higher percentage of initial indications active. Up to 92% uncorrected and 38% corrected efficiency improvement is possible, with power ≈ 60%. CONCLUSIONS Even under FWER control, randomized confirmatory basket trials substantially improve development efficiency. Initial indication selection is critical. The design is particularly attractive when enrollment challenges preclude full powering of individual indications.


2017 ◽  
Vol 2017 ◽  
pp. 1-5 ◽  
Author(s):  
Lijun Wu ◽  
Liwang Gao ◽  
Xiaoyuan Zhao ◽  
Meixian Zhang ◽  
Jianxin Wu ◽  
...  

Purpose. Genome-wide association studies have found two obesity-related single-nucleotide polymorphisms (SNPs), rs17782313 near the melanocortin-4 receptor (MC4R) gene and rs6265 near the brain-derived neurotrophic factor (BDNF) gene, but the associations of both SNPs with other obesity-related traits are not fully described, especially in children. The aim of the present study is to investigate the associations between the SNPs and adiponectin that has a regulatory role in glucose and lipid metabolism. Methods. We examined the associations of the SNPs with adiponectin in Beijing Child and Adolescent Metabolic Syndrome (BCAMS) study. A total of 3503 children participated in the study. Results. The SNP rs6265 was significantly associated with adiponectin under an additive model (P=0.02 and 0.024, resp.) after adjustment for age, gender, and BMI or obesity statuses. The SNP rs17782313 was significantly associated with low adiponectin under a recessive model. No statistical significance was found between the two SNPs and low adiponectin after correction for multiple testing. Conclusion. We demonstrate for the first time that the SNP rs17782313 near MC4R and the SNP rs6265 near BDNF are associated with adiponectin in Chinese children. These novel findings provide important evidence that adiponectin possibly mediates MC4R and BDNF involved in obesity.


2020 ◽  
Vol 13 (5) ◽  
pp. 531-540
Author(s):  
John N. Brewin ◽  
Alexander E. Smith ◽  
Riley Cook ◽  
Sanjay Tewari ◽  
Julie Brent ◽  
...  

Background: Ischemic stroke is a devastating complication affecting children with sickle cell anemia. Genetic factors are likely to be important in determining the risk of stroke but are poorly defined. Methods: We have studied a cohort of 19 children who had an overt ischemic stroke before 4 years of age. We predicted genetic determinants of stroke would be more prominent in this group. We performed whole exome sequencing on this cohort and applied 2 hypotheses to our variant filtering. First, we looked for strong, potentially mono- or oligogenic variants for ischemic stroke, and second, we considered that more common polygenic variants would be enriched in our cohort. Candidate variants emerging from both strategies were validated in a cohort of 283 patients with sickle cell anemia and known pediatric cerebrovascular outcomes. We used principal component analysis in this cohort to control for relatedness and population substructure. Results: Our primary finding was that the Apolipoprotein E genotypes ε2/ε4 and ε4/ε4, defined by the interplay of rs7412 and rs429358, were associated with increased stroke risk, with an odds ratio of 4.35 (95% CI, 1.85–10.0; P=0.0011) for ischemic stroke in the validation cohort. We also found that rs2297518 in NOS (NO synthase) 2 (odds ratio, 2.25 [95% CI, 1.21–4.19]; P=0.014) and rs2230123 in signal transducer and activator of transcription (odds ratio, 2.60 [95% CI, 1.30–5.20]; P=0.009) both had increased odds ratios for ischemic stroke, although these two variants were below the threshold for statistical significance after correction for multiple testing. Conclusions: These data identify new loci for future functional investigations into cerebrovascular disease in sickle cell anemia. Based on African population reference allele frequencies, the Apolipoprotein E genotypes would be present in about 10% of children with sickle cell anemia and represent a genetic risk factor that is potentially modifiable by both dietary and pharmaceutical manipulation of its dyslipidemic effects.

