Automatable Distributed Regression Analysis of Vertically Partitioned Data Facilitated by PopMedNet: Feasibility and Enhancement Study

10.2196/21459 ◽  
2021 ◽  
Vol 9 (4) ◽  
pp. e21459
Author(s):  
Qoua Her ◽  
Thomas Kent ◽  
Yuji Samizo ◽  
Aleksandra Slavkovic ◽  
Yury Vilk ◽  
...  

Background: In clinical research, important variables may be collected from multiple data sources. Physical pooling of patient-level data from multiple sources often raises several challenges, including proper protection of patient privacy and proprietary interests. We previously developed an SAS-based package to perform distributed regression (a suite of privacy-protecting methods that perform multivariable-adjusted regression analysis using only summary-level information) with horizontally partitioned data, a setting where distinct cohorts of patients are available from different data sources. We integrated the package with PopMedNet, an open-source file transfer software, to facilitate secure file transfer between the analysis center and the data-contributing sites. The feasibility of using PopMedNet to facilitate distributed regression analysis (DRA) with vertically partitioned data, a setting where the data attributes from a cohort of patients are available from different data sources, was unknown. Objective: The objective of the study was to describe the feasibility of using PopMedNet and enhancements to PopMedNet to facilitate automatable vertical DRA (vDRA) in real-world settings. Methods: We gathered the statistical and informatic requirements of using PopMedNet to facilitate automatable vDRA. We enhanced PopMedNet based on these requirements to improve its technical capability to support vDRA. Results: PopMedNet can enable automatable vDRA. We identified and implemented two enhancements to PopMedNet that improved its technical capability to perform automatable vDRA in real-world settings. The first was the ability to simultaneously upload and download multiple files, and the second was the ability to directly transfer summary-level information between the data-contributing sites without a third-party analysis center. Conclusions: PopMedNet can be used to facilitate automatable vDRA to protect patient privacy and support clinical research in real-world settings.
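To make the idea of regression from summary-level information concrete, the following is a minimal sketch for the simpler horizontally partitioned, ordinary least squares case. It is not the authors' SAS package and not a vDRA protocol (vertical partitioning needs a different exchange of summaries); the function names `site_summaries`, `solve`, and `pooled_ols` and the toy data are assumptions made for illustration. Each site shares only the matrices X'X and X'y, and the analysis center adds them up and solves the normal equations.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Vector = List[float]


def site_summaries(X: Sequence[Sequence[float]],
                   y: Sequence[float]) -> Tuple[Matrix, Vector]:
    """Run locally at one site: return X'X and X'y, never the patient rows."""
    p = len(X[0])
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for row, target in zip(X, y):
        for i in range(p):
            xty[i] += row[i] * target
            for j in range(p):
                xtx[i][j] += row[i] * row[j]
    return xtx, xty


def solve(A: Matrix, b: Vector) -> Vector:
    """Solve A * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (M[i][n] - known) / M[i][i]
    return beta


def pooled_ols(summaries: Sequence[Tuple[Matrix, Vector]]) -> Vector:
    """Run at the analysis center: sum the site summaries, then solve."""
    p = len(summaries[0][1])
    xtx = [[sum(s[0][i][j] for s in summaries) for j in range(p)]
           for i in range(p)]
    xty = [sum(s[1][i] for s in summaries) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical toy data: y = 1 + 2x exactly, split across two sites.
# The first column of each row is the intercept term.
site_a = ([[1.0, 0.0], [1.0, 1.0]], [1.0, 3.0])
site_b = ([[1.0, 2.0], [1.0, 3.0]], [5.0, 7.0])
beta = pooled_ols([site_summaries(*site_a), site_summaries(*site_b)])
```

Because X'X and X'y are sums over rows, adding the per-site matrices gives the same coefficients as fitting the pooled data, which is why only summary-level files need to move between sites and the analysis center in this setting.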


Author(s):  
Magdalena Opazo Breton ◽  
John Britton ◽  
Yue Huang ◽  
Ilze Bogdanovica

Price of tobacco products has traditionally been relevant both for the industry, to respond to policy changes, and for governments, as an effective tobacco control measure. However, monitoring prices across a wide range of brands and brand variants requires access to expensive commercial sales databases. This study aims to investigate the comparability of average tobacco prices from two commercial sources and an in-house monitoring database which provides daily data in real time at minimal cost. We used descriptive and regression analysis to compare the monthly average numbers of brands, brand variants, products and prices of cigarettes and hand-rolling tobacco using commercial data from Nielsen Scantrack and Kantar Worldpanel, and an online price database (OPD) created in Nottingham, for the period from May 2013 to February 2017. There were marked differences in the number of products tracked in the three data sources. Nielsen was the most comprehensive and Kantar Worldpanel the least. Though average prices were very similar between the three datasets, Nottingham OPD prices were the highest and Kantar Worldpanel the lowest. However, regression analysis demonstrated that after adjustment for differences in product range, price differences between the datasets were very small. After allowing for differences in product range these data sources offer representative prices for application in price research. Online price tracking offers an inexpensive and near real-time alternative to the commercial datasets.


2014 ◽  
Vol 9 (1) ◽  
pp. 12-24
Author(s):  
Michael Comerford

The plethora of new data sources, combined with a growing interest in increased access to previously unpublished data, poses a set of ethical challenges regarding individual privacy. This paper sets out one aspect of those challenges: the need to anonymise data in such a form that protects the privacy of individuals while providing sufficient data utility for data users. This issue is discussed using a case study of Scottish Government’s administrative data, in which disclosure risk is examined and data utility is assessed using a potential ‘real-world’ analysis.


JURNAL PUNDI ◽  
2018 ◽  
Vol 2 (3) ◽  
Author(s):  
Elsa Meirina

Budget slack is a behavior that individuals and organizations exhibit when preparing budgets, especially when the budget is used as a basis for performance measurement. Budgetary slack makes performance look better in the eyes of superiors when the budget goals are achieved. Budgetary slack is often used to cope with uncertainty in predicting the future; because resources are allocated based on projected budget costs, slack provides flexibility. Budgetary participation, asymmetric information, and budget emphasis are factors that influence the occurrence of budget slack. The study was conducted at the Badan Perencanaan dan Pembangunan Daerah (BAPPEDA) of West Sumatra with a total of 50 respondents consisting of heads of departments, heads of fields, sub-sections, and other employees related to the budget. The data were collected through questionnaires and analyzed using multiple regression analysis. The results show that asymmetric information affects the occurrence of budget slack. Information held by subordinates or superiors is the basis for setting the markup level on the budget.


Author(s):  
Laura North

Introduction: The Dementias Platform UK (DPUK) Cohort Explorer is an interactive, online visualisation tool that allows users to explore data for a number of DPUK cohorts. Over 30 variables across cohorts have been harmonised, including information on demographics, lifestyle, cognition, health, and genetic biomarkers. Objectives and Approach: The tool has been developed to complement existing DPUK cohort metadata by providing a visual representation of participant numbers and field-level information for a selection of cohorts. This enables users to determine a cohort's eligibility before applying for access to its data, and aids in shaping potential hypotheses. Developed using Microsoft PowerBI, the Explorer hosts a subset of each cohort's baseline, harmonised data, allowing a user to interrogate the visualisations of the uploaded data in a secure manner on the DPUK Data Portal website. Visualisations are linked so that participant numbers and distributions can be explored interactively. Results: This approach allows the user to explore the harmonised data across a number of cohorts simultaneously whilst setting and adjusting filters that match the user's search criteria. This provides a better understanding of the real-world data and enables the user to determine the feasibility of each cohort for potential studies, whilst facilitating meaningful comparisons across cohorts. The tool currently visualises five DPUK cohorts with a total of 82,391 participants; however, it is being incrementally developed, with more cohorts being added continually. Conclusion / Implications: By combining an easy-to-use, interactive dashboard with harmonised sets of real-world data, the tool allows the user to explore, interrogate and better understand field-level information in a secure manner with zero data transfer. This provides more insight for the user when applying for access to a cohort dataset using the DPUK Data Portal and may help the user to make more informed decisions and/or hypotheses.


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e17543-e17543
Author(s):  
Xiaoxiang Chen ◽  
Jing Ni ◽  
Xia Xu ◽  
Wenwen Guo ◽  
Xianzhong Cheng ◽  
...  

e17543 Background: Homologous recombination deficiency (HRD) is the first phenotypically defined predictive biomarker for poly (ADP-ribose) polymerase inhibitors (PARPi) in ovarian cancer. However, the proportion of HRD-positive patients in the real world and the relationship between HRD status and PARPi treatment in Chinese ovarian cancer patients remain unknown. Methods: A total of sixty-four ovarian cancer patients who received a PARPi (Olaparib or Niraparib) were enrolled from August 2018 to January 2021 in Jiangsu Institute of Cancer Hospital. The HRD score, defined as the sum of loss of heterozygosity (LOH), telomeric allelic imbalance (TAI) and large-scale state transitions (LST) events, was calculated using tumor DNA-based next generation sequencing (NGS) assays. HRD-positive status was defined by either a BRCA1/2 pathogenic or likely pathogenic mutation or an HRD score ≥42. Progression-free survival (PFS) was compared by HRD status with a log-rank test and summarized using Kaplan-Meier methodology. Univariate and multiple Cox regression analyses were conducted to investigate all possible clinical factors. Results: 71.9% (46/64) of patients were HRD positive and the remaining 28.1% (18/64) were HRD negative; the HRD-positive proportion was higher than that reported in Western countries. PFS among HRD-positive patients was significantly longer than among HRD-negative patients (median PFS 8.9 m vs 3.6 m, hazard ratio [HR]: 0.22, p < 0.001). Among them, 23 patients who were BRCA wild type but HRD positive had longer PFS than those who were BRCA wild type and HRD negative (median PFS 9.2 m vs 3.6 m, HR: 0.20, p < 0.001). Univariate Cox regression analysis found that HRD status, previous treatment lines, and secondary cytoreductive surgery (SCS) were significantly associated with PFS after PARPi treatment. After multiple regression correction, HRD status (HR: 0.39, 95% CI: [0.20-0.76], p = 0.006), ECOG score (HR: 2.53, 95% CI: [1.24-5.17], p = 0.011) and SCS (HR: 2.21, 95% CI: [1.09-4.48], p = 0.028) were the independent factors. In the ECOG = 0 subgroup (N = 36), HRD-positive patients had significantly longer PFS than HRD-negative patients (median PFS 10.3 m vs 5.8 m, HR: 0.14, p < 0.001). In the subgroup of patients without SCS, PFS was also longer in HRD-positive patients than in HRD-negative patients (median PFS 10.2 m vs 5.7 m, HR: 0.29, p = 0.003). Conclusions: These are the first real-world data on HRD status in ovarian cancer patients from China, and they demonstrate that HRD is a valid biomarker for PARP inhibitors in Chinese ovarian cancer patients.
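The HRD-positive definition in the Methods can be written as a short rule. The sketch below is illustrative only: the class `TumorProfile`, its field names, and the example counts are hypothetical, while the summation of LOH, TAI and LST events and the cutoff of 42 come from the abstract.

```python
from dataclasses import dataclass

HRD_SCORE_CUTOFF = 42  # threshold stated in the abstract


@dataclass
class TumorProfile:
    """Hypothetical container for one patient's NGS-derived values."""
    loh: int  # loss of heterozygosity events
    tai: int  # telomeric allelic imbalance events
    lst: int  # large-scale state transition events
    brca_pathogenic: bool  # BRCA1/2 pathogenic or likely pathogenic mutation


def hrd_score(profile: TumorProfile) -> int:
    """HRD score is the sum of the three event counts."""
    return profile.loh + profile.tai + profile.lst


def is_hrd_positive(profile: TumorProfile) -> bool:
    """Positive if a BRCA1/2 mutation is present OR the score meets the cutoff."""
    return profile.brca_pathogenic or hrd_score(profile) >= HRD_SCORE_CUTOFF


# Made-up example values, not taken from the study.
at_cutoff = TumorProfile(loh=15, tai=14, lst=13, brca_pathogenic=False)
below_cutoff = TumorProfile(loh=10, tai=10, lst=10, brca_pathogenic=False)
brca_only = TumorProfile(loh=0, tai=0, lst=0, brca_pathogenic=True)
```

Under this rule a patient can be HRD positive while BRCA wild type, which is the group of 23 patients the Results single out.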

