Clinical Annotations for Prostate Cancer Research: Defining Data Elements, Creating a Reproducible Analytical Pipeline, and Assessing Data Quality

Author(s): Niamh M. Keegan, Samantha E. Vasselman, Ethan S. Barnett, Barbara Nweji, Emily A. Carbone, ...

Background: Routine data from clinical charts are indispensable for retrospective and prospective observational studies and clinical trials, yet their reproducibility is often not assessed.

Objective: To develop a prostate cancer-specific database with a defined source hierarchy for clinical annotations in conjunction with molecular profiling, and to evaluate data reproducibility.

Design, setting, and participants: For men with prostate cancer and clinical-grade paired tumor-normal sequencing, we performed team-based retrospective data collection from the electronic medical record at a comprehensive cancer center. We developed an open-source R package for data processing. We assessed reproducibility using blinded repeat annotation by a reference medical oncologist.

Outcome measurements and statistical analysis: We evaluated completeness of data elements, reproducibility of team-based annotation compared with the reference, and the impact of measurement error on bias in survival analyses.

Results and limitations: Data elements on demographics, diagnosis and staging, disease state at the time of procuring a genomically characterized sample, and clinical outcomes were piloted and then abstracted for 2,261 patients (with 2,631 samples). Completeness of data elements was generally high. In comparison with the repeat annotation by a medical oncologist blinded to the database (100 patients/samples), reproducibility of annotations was high to very high; T stage, metastasis date, and presence and date of castration resistance had lower reproducibility. The impact of measurement error on estimates for strong prognostic factors was modest.

Conclusions: With a prostate cancer-specific data dictionary and quality control measures, manual clinical annotations by a multidisciplinary team can be scalable and reproducible. The data dictionary and the R package for reproducible data processing are freely available to increase data quality in clinical prostate cancer research.
