Background and research aim: Lung cancer is a research priority in the UK, and early diagnosis can improve patients' survival outcomes. The DART-QResearch project is part of a larger academic-industrial collaborative initiative that uses big data and artificial intelligence to improve outcomes for patients with thoracic diseases. The DART-QResearch project has two general research aims: (1) to understand the natural history of lung cancer, and (2) to develop, validate, and evaluate risk prediction models to select patients at high risk of lung cancer for screening.
Methods: This population-based cohort study uses the QResearch database (version 45) and includes patients aged 25 to 84 years without a diagnosis of lung cancer at cohort entry (study period: 1 January 2005 to 31 December 2020). The team conducted a literature review, with additional clinical input, to inform the selection of variables for extraction from the QResearch database. Depending on the research objective, we will use descriptive statistics, multi-level modelling, multiple imputation for missing data, fractional polynomials to explore non-linear relationships between continuous variables and the outcome, and Cox regression for the prediction model. We will update our QCancer (lung, 10-year risk) algorithm and compare it with two other mainstream models (LLP and PLCOM2012) for lung cancer screening using the same dataset. We will evaluate the discrimination, calibration, and clinical usefulness of the prediction models and recommend the best one for lung cancer screening in the English primary care population.
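As an illustration of what evaluating discrimination can involve for a time-to-event model, the sketch below computes Harrell's concordance index (C-index) on a small, entirely hypothetical dataset. The protocol does not state which discrimination statistic will be used, so the choice of Harrell's C, the function name `harrell_c`, and the toy values are assumptions made for illustration only; they are not taken from QResearch data or from the QCancer, LLP, or PLCOM2012 models.

```python
from typing import Sequence


def harrell_c(times: Sequence[float],
              events: Sequence[int],
              risks: Sequence[float]) -> float:
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  : follow-up time for each patient
    events : 1 if lung cancer was diagnosed at that time, 0 if censored
    risks  : predicted risk score (higher means higher predicted risk)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only if the patient with the shorter
        # follow-up time actually had the event (was not censored).
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0      # earlier event had higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5      # tied predictions count half
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Hypothetical toy cohort: four patients, one censored at time 6.
times = [2.0, 4.0, 6.0, 8.0]
events = [1, 1, 0, 1]
risks = [0.9, 0.3, 0.7, 0.1]

c_index = harrell_c(times, events, risks)
print(f"C-index: {c_index:.2f}")
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking of patients by risk. Calibration and clinical usefulness need separate measures, which this sketch does not cover.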
Discussion: The DART-QResearch project addresses both symptomatic and asymptomatic patients in the lung cancer care pathway. A better understanding of the patterns, trajectories, and phenotypes of symptomatic presentation may help GPs consider lung cancer earlier. Screening asymptomatic patients at high risk is another route to earlier diagnosis of lung cancer. The strengths of this study include the use of large-scale, representative, population-based clinical data, robust methodology, and a transparent research process. This project has great potential to contribute to the national cancer strategic plan and to yield substantial public and societal benefits through earlier diagnosis of lung cancer.