A prediction model based on DNA methylation biomarkers and radiological characteristics for identifying malignant from benign pulmonary nodules

BMC Cancer ◽  
2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Wenqun Xing ◽  
Haibo Sun ◽  
Chi Yan ◽  
Chengzhi Zhao ◽  
Dongqing Wang ◽  
...  

Abstract Background: Lung cancer remains the leading cause of cancer deaths across the world. Early detection of lung cancer by low-dose computed tomography (LDCT) can reduce the mortality rate. However, making a definitive preoperative diagnosis of malignant pulmonary nodules (PNs) found by LDCT is a clinical challenge. This study aimed to develop a prediction model based on DNA methylation biomarkers and radiological characteristics for identifying malignant pulmonary nodules from benign PNs. Methods: We assessed three DNA methylation biomarkers (PTGER4, RASSF1A, and SHOX2) and clinically relevant variables in a training cohort of 110 individuals with PNs. Four machine-learning-based prediction models were established and compared, including the K-nearest neighbors (KNN), random forest (RF), support vector machine (SVM), and logistic regression (LR) algorithms. Variables of the best-performing algorithm (LR) were selected through stepwise use of Akaike’s information criterion (AIC). The constructed prediction model was compared with the methylation biomarkers and the Mayo Clinic model using the non-parametric approach of DeLong et al. with the area under a receiver operating characteristic curve (AUC) analysis. Results: A prediction model was finally constructed based on three DNA methylation biomarkers and one radiological characteristic for identifying malignant from benign PNs. The developed prediction model achieved an AUC value of 0.951 in diagnosing malignant PNs, significantly higher than the three DNA methylation biomarkers (0.912, 95% CI: 0.843–0.958, p = 0.013) or the Mayo Clinic model (0.823, 95% CI: 0.739–0.890, p = 0.001). Validation of the prediction model in the testing cohort of 100 subjects with PNs confirmed the diagnostic value. Conclusion: We have shown that integrating DNA methylation biomarkers and radiological characteristics could more accurately identify lung cancer in subjects with CT-found PNs. The prediction model developed in our study may provide clinical utility in combination with LDCT to improve the overall diagnosis of lung cancer.
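The AUC values compared in this abstract can be read as the probability that a randomly chosen malignant case receives a higher model score than a randomly chosen benign case. The following is a minimal sketch of that pairwise definition, assuming binary labels (1 = malignant, 0 = benign); the function name and the toy scores are hypothetical and are not taken from the study, and the DeLong comparison itself is not implemented here.

```python
def auc_from_scores(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half a correct ranking.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): three malignant and three benign nodules.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc_from_scores(toy_scores, toy_labels)  # 8 of 9 pairs are ordered correctly
```

The pairwise loop is quadratic in cohort size, which is fine for cohorts of a few hundred subjects; a rank-based formulation gives the same value more efficiently.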

2020 ◽  
Author(s):  
Wenqun Xing ◽  
Haibo Sun ◽  
Chi Yan ◽  
Chengzhi Zhao ◽  
Dongqing Wang ◽  
...  

Abstract Background: Lung cancer remains the leading cause of cancer deaths across the world. Early detection of lung cancer by low-dose computed tomography (LDCT) can reduce the mortality rate. However, making a definitive preoperative diagnosis of malignant pulmonary nodules (PNs) found by LDCT is a clinical challenge. This study aimed to develop a prediction model based on DNA methylation biomarkers and radiological characteristics for identifying malignant pulmonary nodules from benign PNs. Methods: We assessed three DNA methylation biomarkers (PTGER4, RASSF1A, and SHOX2) in a training cohort of 110 individuals with PNs. Using univariate and multivariate logistic regression analysis, we developed a prediction model based on the three DNA methylation biomarkers and one radiological characteristic for identifying malignant from benign PNs. The performance of the prediction model was compared with that of the methylation biomarkers and the Mayo Clinic model using the non-parametric approach of DeLong et al. with the area under a receiver operating characteristic curve (AUC) analysis. Results: The developed prediction model achieved a sensitivity of 87.3% and a specificity of 95.7% with an AUC value of 0.951 in diagnosing malignant PNs, significantly higher than the three DNA methylation biomarkers (84.1% sensitivity and 89.4% specificity, p = 0.013) or clinical/radiological characteristics (76.2% sensitivity and 87.2% specificity, p = 0.001) alone. Validation of the prediction model in the testing cohort of 100 subjects with PNs confirmed the diagnostic value. Conclusion: We have shown that integrating DNA methylation biomarkers and radiological characteristics could more accurately identify lung cancer in subjects with CT-found PNs. The prediction model developed in our study may provide clinical utility in combination with LDCT to improve the overall diagnosis of lung cancer.
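Sensitivity and specificity, as reported in this abstract, are both ratios taken from a confusion matrix at one fixed decision threshold. The sketch below shows the arithmetic under the assumption of binary labels (1 = malignant, 0 = benign); the function name and the toy predictions are hypothetical and do not reproduce the study's cohort.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of malignant nodules called malignant
    specificity = tn / (tn + fp)  # share of benign nodules called benign
    return sensitivity, specificity


# Toy data (hypothetical): 10 malignant cases with 9 detected,
# 20 benign cases with 18 correctly ruled out.
toy_true = [1] * 10 + [0] * 20
toy_pred = [1] * 9 + [0] * 1 + [0] * 18 + [1] * 2
toy_sens, toy_spec = sensitivity_specificity(toy_true, toy_pred)
```

Unlike the AUC, both values move when the decision threshold moves, so a sensitivity/specificity pair describes one operating point rather than the whole model.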


2017 ◽  
Vol 10 (1) ◽  
pp. 40-45 ◽  
Author(s):  
Jie Ma ◽  
Maria A. Guarnera ◽  
Wenxian Zhou ◽  
HongBin Fang ◽  
Feng Jiang

2020 ◽  
Vol 40 (2) ◽  
Author(s):  
You Li ◽  
Hui Hu ◽  
Ziwei Wu ◽  
Ge Yan ◽  
Tangwei Wu ◽  
...  

Abstract Objectives: Post-imaging mathematical prediction models are established by combining demographic and imaging characteristics with a pulmonary nodule risk score, and they help guide the treatment of pulmonary nodules. Many studies have established predictive models for pulmonary nodules in different populations. However, the predictive factors contained in each model differ significantly. We hypothesized that applying different models to local research groups would make a difference in predicting benign and malignant lung nodules, in distinguishing between early and late lung cancers, and in distinguishing between adenocarcinoma and squamous cell carcinoma. In the present study, we compared four widely used and well-known mathematical prediction models. Materials and methods: We performed a retrospective study of 496 patients, seen from January 2017 to October 2019, whose nodules were diagnosed pathologically. We evaluated the models’ performance using the computed tomography results of 425 patients with malignant nodules and 71 patients with benign nodules. We used the calibration curve and the area under the receiver operating characteristic curve (AUC) to assess each model’s predictive performance. Results: In distinguishing benign from malignant nodules, the Peking University People’s Hospital model performed best (AUC = 0.63), as it did in differentiating between early and late lung cancers (AUC = 0.67) and in identifying lung adenocarcinoma (AUC = 0.61). In the identification of lung squamous cell carcinoma, the Veterans Affairs model performed best (AUC = 0.69). Conclusions: Geographic disparities are an extremely important influencing factor, and which clinical features a mathematical prediction model contains is key to its precision and accuracy.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yohei Hirano ◽  
Yutaka Kondo ◽  
Toru Hifumi ◽  
Shoji Yokobori ◽  
Jun Kanda ◽  
...  

Abstract In this study, we aimed to develop and validate a machine learning-based mortality prediction model for hospitalized heat-related illness patients. After 2393 hospitalized patients were extracted from a multicenter heat-related illness registry in Japan, subjects were divided into the training set for development (n = 1516, data from 2014, 2017–2019) and the test set (n = 877, data from 2020) for validation. Twenty-four variables, including characteristics of patients, vital signs, and laboratory test data at hospital arrival, were used as predictor features for machine learning. The outcome was death during hospital stay. In validation, the developed machine learning models (logistic regression, support vector machine, random forest, XGBoost) demonstrated favorable performance for outcome prediction, with significantly increased values of the area under the precision-recall curve (AUPR) of 0.415 [95% confidence interval (CI) 0.336–0.494], 0.395 [CI 0.318–0.472], 0.426 [CI 0.346–0.506], and 0.528 [CI 0.442–0.614], respectively, compared to that of the conventional acute physiology and chronic health evaluation (APACHE)-II score of 0.287 [CI 0.222–0.351] as a reference standard. The area under the receiver operating characteristic curve (AUROC) values were also high (over 0.92) in all models, although there were no statistical differences compared to APACHE-II. This is the first demonstration of the potential of machine learning-based mortality prediction models for heat-related illnesses.
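The AUPR reported here summarizes how precision trades off against recall as the decision threshold is lowered. One common estimator is average precision: the mean of the precision values at the rank of each true positive. The sketch below assumes that estimator and binary labels (1 = died, 0 = survived); the abstract does not say which estimator the authors used, and the function name and toy data are hypothetical.

```python
def average_precision(scores, labels):
    """Average precision: mean precision at the rank of each positive case.

    Cases are ranked by descending score; tied scores keep their input order.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive case")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank  # precision among the top `rank` cases
    return precision_sum / total_pos


# Toy data (hypothetical): two deaths among five patients.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [1, 0, 1, 0, 0]
toy_ap = average_precision(toy_scores, toy_labels)  # (1/1 + 2/3) / 2
```

For a rare outcome, the AUPR of an uninformative model sits near the outcome prevalence rather than near 0.5, which is one reason it can separate models whose AUROC values look alike.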


2018 ◽  
Vol 12 (1) ◽  
pp. 469-481
Author(s):  
Michala Skovlund Sørensen ◽  
Elizabeth C. Silvius ◽  
Saniya Khullar ◽  
Klaus Hindsø ◽  
Jonathan A. Forsberg ◽  
...  

Background: Predicting survival for patients with metastatic bone disease in the extremities (MBDex) is important for ensuring the implant will outlive the patient. Hitherto, prediction models for these patients have been constructed using subjective assessments, mostly lacking biochemical variables. Objectives: To develop a prediction model for survival after surgery due to MBDex using biochemical variables and externally validate the model. Methods: We created Bayesian Belief Network models to estimate likelihood of survival 1, 3, 6, and 12 months after surgery using 140 patients. We validated the models using the data of 130 other patients and calculated the area under the receiver operating characteristic (ROC) curve. Variables included: hemoglobin, neutrophil count, C-reactive protein, alkaline phosphatase, primary cancer, Karnofsky score, ASA score, visceral metastases, bone metastases, days from diagnosis of primary cancer to index surgery for MBDex, ischemic heart disease, diabetes, fracture/impending fracture, and age. Results: Survival probabilities were influenced by all biochemical variables. Validation showed an area under the ROC curve for the 1-, 3-, 6-, and 12-month models of 68% (C.I.: 55%-80%), 69% (C.I.: 60%-78%), 81% (C.I.: 74%-87%), and 84% (C.I.: 77%-90%), respectively. Conclusion: Biochemical markers can be incorporated into a prediction model for survival in patients having surgery for MBDex, allowing surgeons to offer more objective and individualized treatment options.


2021 ◽  
Vol 17 (9) ◽  
pp. 727-735
Author(s):  
Jiamei Long ◽  
Jia Yang ◽  
Jing Peng ◽  
Leiqing Pan ◽  
Kang Tu

Abstract Moisture content and carotenoid content are important indicators for evaluating the drying process of carrot slices. There is growing interest in developing non-destructive methods as effective analytical tools for quality assurance of drying carrot slices. In this study, the characteristic wavelengths of moisture and carotenoid content in carrot slices during hot air drying were extracted based on hyperspectral imaging technology. A multispectral imaging system was then built, and the wavelengths of its filters were determined according to the characteristic wavelengths. Twelve filters were selected in this study. Based on the successive projection algorithm (SPA), the optimal wavelengths of moisture and carotenoid content were further determined, and prediction models of both were established based on the system. The results showed that a support vector machine (SVM) prediction model for moisture content was established based on seven optimal wavelengths, with a coefficient of determination of the prediction set (R²p) of 0.991 and a residual predictive deviation (RPD) of 10.318. Based on eight optimal wavelengths, an SVM prediction model for carotenoid content was also established, with an R²p of 0.968 and an RPD of 5.337. The prediction performance is close to or even better than that based on hyperspectral imaging. The study confirmed the feasibility of using the multispectral imaging equipment to measure the moisture and carotenoid content of carrot slices during drying based on selected wavelengths, laying a foundation for the further preparation of a portable multispectral detector for the quality of dried products.
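The two regression metrics quoted in this abstract can be computed directly from reference and predicted values. The sketch below assumes the common chemometrics definitions: the coefficient of determination as one minus the ratio of residual to total sum of squares, and RPD as the sample standard deviation of the reference values divided by the root mean square error of prediction. The abstract does not state the exact formulas the authors used, and the function name and toy values are hypothetical.

```python
import math
import statistics


def r2_and_rpd(y_ref, y_pred):
    """Return (R^2, RPD) for a prediction set.

    R^2 = 1 - SS_res / SS_tot
    RPD = SD(y_ref) / RMSEP, with SD using the n - 1 denominator (an assumption).
    """
    if len(y_ref) != len(y_pred) or len(y_ref) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_ref = statistics.fmean(y_ref)
    ss_res = sum((r - p) ** 2 for r, p in zip(y_ref, y_pred))
    ss_tot = sum((r - mean_ref) ** 2 for r in y_ref)
    rmsep = math.sqrt(ss_res / len(y_ref))
    r2 = 1.0 - ss_res / ss_tot
    rpd = statistics.stdev(y_ref) / rmsep
    return r2, rpd


# Toy data (hypothetical reference and predicted values).
toy_ref = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_pred = [1.1, 1.9, 3.2, 3.8, 5.0]
toy_r2, toy_rpd = r2_and_rpd(toy_ref, toy_pred)
```

Because RPD scales the prediction error by the spread of the reference values, it allows models for quantities with different units, such as moisture and carotenoid content, to be compared on one scale.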


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Lei Li ◽  
Zhujia Ye ◽  
Sai Yang ◽  
Hao Yang ◽  
Jing Jin ◽  
...  

Abstract Background: Lung cancer is the leading cause of cancer-related mortality. The alteration of DNA methylation plays a major role in the development of lung cancer. Methylation biomarkers have become a possible approach to lung cancer diagnosis. Results: We identified eleven lung cancer-specific methylation markers (CDO1, GSHR, HOXA11, HOXB4-1, HOXB4-2, HOXB4-3, HOXB4-4, LHX9, MIR196A1, PTGER4-1, and PTGER4-2), which could differentiate benign and malignant pulmonary nodules. The methylation levels of these markers are significantly higher in malignant tissues. In bronchoalveolar lavage fluid (BALF) samples, the methylation signals maintain the same differential trend as in tissues. An optimal 5-marker model for pulmonary nodule diagnosis (malignant vs. benign) was developed from all possible combinations of the eleven markers. In the test set (57 tissue and 71 BALF samples), the area under the curve (AUC) value reaches 0.93, and the overall sensitivity is 82% at a specificity of 91%. In an independent validation set (111 BALF samples), the AUC is 0.82 with a specificity of 82% and a sensitivity of 70%. Conclusions: This model can differentiate pulmonary adenocarcinoma and squamous carcinoma from benign diseases, especially infection, inflammation, and tuberculosis. The model’s performance is not affected by gender, age, smoking history, or the solid components of nodules.
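Selecting a panel "from all possible combinations" of eleven markers is an exhaustive subset search, and its size is small enough to enumerate directly. The sketch below lists the candidate panels using the marker names given in the abstract. The helper `best_panel` and its `score_fn` argument are hypothetical placeholders for whatever performance measure is used to rank panels (for example a cross-validated AUC); the abstract does not describe the authors' scoring procedure.

```python
from itertools import combinations
from math import comb

MARKERS = ["CDO1", "GSHR", "HOXA11", "HOXB4-1", "HOXB4-2", "HOXB4-3",
           "HOXB4-4", "LHX9", "MIR196A1", "PTGER4-1", "PTGER4-2"]


def best_panel(markers, size, score_fn):
    """Return the marker subset of the given size with the highest score.

    score_fn is a caller-supplied (hypothetical) function mapping a tuple of
    marker names to a number; higher is better.
    """
    return max(combinations(markers, size), key=score_fn)


# Every 5-marker panel that can be drawn from the eleven markers.
five_marker_panels = list(combinations(MARKERS, 5))

# Number of non-empty panels of any size: the sum of C(11, k) for k = 1..11.
all_panel_count = sum(comb(len(MARKERS), k) for k in range(1, len(MARKERS) + 1))
```

With eleven markers there are 462 five-marker panels and 2047 non-empty panels in total, so an exhaustive search is cheap; the main risk is overfitting the selection to the data used for scoring, which is why an independent validation set matters.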


2020 ◽  
Author(s):  
Zhanyou Xu ◽  
Andreomar Kurek ◽  
Steven B. Cannon ◽  
Williams D. Beavis

Abstract Selection of markers linked to alleles at quantitative trait loci (QTL) for tolerance to Iron Deficiency Chlorosis (IDC) has not been successful. Genomic selection has been advocated for continuous numeric traits such as yield and plant height. For ordinal data types such as IDC, genomic prediction models have not been systematically compared. The objectives of the research reported in this manuscript were to compare the most commonly used genomic prediction method, ridge regression, and its equivalent logistic ridge regression method with algorithmic modeling methods including random forest, gradient boosting, support vector machine, K-nearest neighbors, Naïve Bayes, and artificial neural network, using the usual comparator metric of prediction accuracy. In addition, we compared the methods using metrics of greater importance for decisions about selecting and culling lines for use in variety development and genetic improvement projects. These metrics include specificity, sensitivity, precision, decision accuracy, and area under the receiver operating characteristic curve. We found that Support Vector Machine provided the best specificity for culling IDC susceptible lines, while Random Forest GP models provided the best combined set of decision metrics for retaining IDC tolerant and culling IDC susceptible lines.

