A multiparameter model for prediction of pathological nodal status in clinically early stage oral cancer using machine learning.

2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e18025-e18025
Author(s):  
Indranil Mallick ◽  
Saheli Saha ◽  
Sanjoy Chatterjee ◽  
Paromita Roy

e18025 Background: The current approach to neck treatment in clinical T1-2 oral cancers is to offer elective nodal dissection to all patients, despite the fact that the majority of patients are pathologically node negative. This is due to the poor predictive ability of clinico-radiological assessment and the poorer survival of patients in whom neck dissection is omitted on that basis. A robust prediction model for pathological nodal status may allow individualized decisions for neck dissection. Our aim was to develop a multiparameter prediction model to identify pathological node-negative status using machine learning. Methods: We identified 497 patients with cT1-2 oral cancer from a single institutional database from 2011-2018 who underwent primary resection and neck dissection. We compared the sensitivity, positive predictive value and accuracy of prediction of pathologically negative neck from clinico-radiological staging alone vs. a model created from multiple parameters including clinical features (clinico-radiological nodal status, age, sex, subsite of primary lesion) and pathological features of the resected primary tumor (maximum dimension, depth of invasion, lymphovascular invasion, perineural invasion, grade and margins of resection). The multiparameter model was built from a training dataset of the first 400 patients using an ensemble of logistic regression, random forests and support vector machines. A cohort of 97 patients was used for independent validation. Results: In this cohort 232 (47%) were clinico-radiologically node negative, while 307 (62%) were pathologically node negative. The sensitivity, positive predictive value and accuracy of the clinico-radiologically assigned nodal status were 56%, 74% and 61%, while those of the multiparameter machine learning model were 87%, 89% and 89%, respectively. The area under the curve (AUC) of the clinico-radiological prediction was 0.62 whereas that of the multiparameter predictive model was 0.91.
In the validation dataset, 58/62 pathologically node negative patients were predicted correctly by the model. The accuracy of the model on the external validation dataset was 82%. Conclusions: The performance of the multiparameter predictive model was considerably superior to clinico-radiological neck staging for prediction of pathological node negative neck. This performance was reproduced on an independent validation dataset. The model could be considered for prospective clinical evaluation of individualized neck dissection.
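The cohort-level figures for clinico-radiological staging can be cross-checked from the reported margins. The following is a minimal sketch, not the authors' code. It treats "pathologically node negative" as the positive class and assumes that the reported 56% sensitivity corresponds to 172 of 307 patients, a count inferred here rather than stated in the abstract. The function name `metrics_from_margins` is hypothetical.

```python
def metrics_from_margins(total, predicted_pos, actual_pos, true_pos):
    """Rebuild a 2x2 table from its margins and one cell, then derive metrics."""
    false_pos = predicted_pos - true_pos   # called node negative, actually node positive
    false_neg = actual_pos - true_pos      # called node positive, actually node negative
    true_neg = total - true_pos - false_pos - false_neg
    return {
        "sensitivity": true_pos / actual_pos,
        "ppv": true_pos / predicted_pos,
        "accuracy": (true_pos + true_neg) / total,
    }


# 497 patients, 232 clinically node negative, 307 pathologically node negative.
# true_pos=172 is an assumption inferred from the reported 56% sensitivity.
clinical = metrics_from_margins(total=497, predicted_pos=232, actual_pos=307, true_pos=172)
print({name: round(100 * value, 1) for name, value in clinical.items()})

# Validation set: 58 of 62 pathologically node-negative patients predicted correctly.
validation_sensitivity = 58 / 62
print(round(100 * validation_sensitivity, 1))
```

Under that single assumption, the rebuilt table rounds to the reported 56%, 74% and 61%, and the validation figure of 58/62 corresponds to a sensitivity of roughly 93.5%.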

2019 ◽  
Author(s):  
Rayees Rahman ◽  
Arad Kodesh ◽  
Stephen Z Levine ◽  
Sven Sandin ◽  
Abraham Reichenberg ◽  
...  

Abstract Importance: Current approaches for early identification of individuals at high risk for autism spectrum disorder (ASD) in the general population are limited, and most children with ASD are not identified until after the age of 4. This is despite substantial evidence suggesting that early diagnosis and intervention improve developmental course and outcome. Objective: Develop a machine learning (ML) method predicting the diagnosis of ASD in offspring in a general population sample, using parental electronic medical records (EMR) available before childbirth. Design: Prognostic study of EMR data within a single Israeli health maintenance organization, for the parents of 1,397 ASD children (ICD-9/10) and 94,741 non-ASD children born between January 1, 1997 and December 31, 2008. The complete EMR record of the parents was used to develop various ML models to predict the risk of having a child with ASD. Main outcomes and measures: Routinely available parental sociodemographic information, medical histories and prescribed medications data until the offspring's birth were used to generate features to train various machine learning algorithms, including multivariate logistic regression, artificial neural networks, and random forest. Prediction performance was evaluated with 10-fold cross validation, by computing C statistics, sensitivity, specificity, accuracy, false positive rate, and precision (positive predictive value, PPV). Results: All ML models tested had similar performance, achieving an average C statistic of 0.70, sensitivity of 28.63%, specificity of 98.62%, accuracy of 96.05%, false positive rate of 1.37%, and positive predictive value of 45.85% for predicting ASD in this dataset. Conclusion and relevance: ML algorithms combined with EMR capture early life ASD risk.
Such approaches may be able to enhance the ability for accurate and efficient early detection of ASD in large populations of children. Key points: Question: Can autism risk in children be predicted using the pre-birth electronic medical record (EMR) of the parents? Findings: In this population-based study that included 1,397 children with autism spectrum disorder (ASD) and 94,741 non-ASD children, we developed a machine learning classifier for predicting the likelihood of childhood diagnosis of ASD with an average C statistic of 0.70, sensitivity of 28.63%, specificity of 98.62%, accuracy of 96.05%, false positive rate of 1.37%, and positive predictive value of 45.85%. Meaning: The results presented serve as a proof-of-principle of the potential utility of EMR for the identification of a large proportion of future children at a high risk of ASD.


2021 ◽  
pp. 20210525
Author(s):  
Daisuke Kawahara ◽  
Yuji Murakami ◽  
Shigeyuki Tani ◽  
Yasushi Nagata

Objective: To propose a prediction model for the degree of differentiation in patients with locally advanced esophageal cancer from planning CT images, using radiomics analysis with machine learning. Methods: Data of 104 patients with esophageal cancer who underwent chemoradiotherapy followed by surgery at Hiroshima University Hospital from 2003 to 2016 were analyzed. The treatment outcomes of these tumors were known prior to the study. The data were split into 3 sets: 57/16 tumors for training/validation and 31 tumors for model testing. The degree of differentiation of squamous cell carcinoma was classified into two groups. The first group (Group I) comprised poorly differentiated (POR) patients. The second group (Group II) comprised well and moderately differentiated patients. Radiomics features were extracted from the tumor and from the regions around the tumor. A total of 3480 radiomics features per patient image were extracted from the radiotherapy planning CT scan. Models were built with the least absolute shrinkage and selection operator (LASSO) logistic regression applied to the set of candidate predictors. The radiomics features were used as the input data for machine learning. To build predictive models with radiomics features, neural network classifiers were used. Precision, accuracy and sensitivity, derived from confusion matrices, and the area under the receiver operating characteristic curve (AUC) were evaluated. Results: By the LASSO analysis of the training data, we found 13 radiomics features from CT images for the classification. The accuracy of the prediction model was highest when using only CT radiomics features. The accuracy, specificity, and sensitivity of the predictive model were 85.4%, 88.6%, and 80.0%, and the AUC was 0.92. Conclusion: The proposed predictive model showed high accuracy for the classification of the degree of differentiation of esophageal cancer.
Because of its good predictive ability, the method may help reduce the need for pathological examination by biopsy and may help predict local control. Advances in knowledge: For esophageal cancer, the degree of differentiation is an important index reflecting aggressiveness. The current study proposed a prediction model for the degree of differentiation using radiomics analysis.


BMC Cancer ◽  
2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Yin-Chen Hsu ◽  
Yuan-Hsiung Tsai ◽  
Hsu-Huei Weng ◽  
Li-Sheng Hsu ◽  
Ying-Huang Tsai ◽  
...  

Abstract Background: This study proposes a prediction model for the automatic assessment of lung cancer risk based on an artificial neural network (ANN) with a data-driven approach to the low-dose computed tomography (LDCT) standardized structure report. Methods: This comparative validation study analysed a prospective cohort from Chiayi Chang Gung Memorial Hospital, Taiwan. In total, 836 asymptomatic patients who had undergone LDCT scans between February 2017 and August 2018 were included, comprising 27 lung cancer cases and 809 controls. A derivation cohort of 602 participants (19 lung cancer cases and 583 controls) was collected to construct the ANN prediction model. A comparative validation of the ANN and Lung-RADS was conducted with a prospective cohort of 234 participants (8 lung cancer cases and 226 controls). The areas under the curves (AUCs) of the receiver operating characteristic (ROC) curves were used to compare the prediction models. Results: At the cut-off of category 3, the Lung-RADS had a sensitivity of 12.5%, specificity of 96.0%, positive predictive value of 10.0%, and negative predictive value of 96.9%. At its optimal cut-off value, the ANN had a sensitivity of 75.0%, specificity of 85.0%, positive predictive value of 15.0%, and negative predictive value of 99.0%. The area under the ROC curve was 0.764 for the Lung-RADS and 0.873 for the ANN (P = 0.01). The two most important predictors used by the ANN for predicting lung cancer were the documented sizes of partially solid nodules and ground-glass nodules. Conclusions: Compared to the Lung-RADS, the ANN provided better sensitivity for the detection of lung cancer in an Asian population. In addition, the ANN provided a more refined discriminative ability than the Lung-RADS for lung cancer risk stratification with population-specific demographic characteristics.
When lung nodules are detected and documented in a standardized structured report, ANNs may better provide important insights for lung cancer prediction than conventional rule-based criteria.
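The low positive predictive values reported above follow from the low prevalence of lung cancer in the validation cohort (8 of 234). The sketch below applies Bayes' rule to the reported sensitivity and specificity of each method. The function name `predictive_values` is hypothetical, and the calculation is an illustration rather than the authors' analysis.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


prevalence = 8 / 234  # validation cohort: 8 lung cancer cases among 234 participants

ann_ppv, ann_npv = predictive_values(0.75, 0.85, prevalence)
lungrads_ppv, lungrads_npv = predictive_values(0.125, 0.96, prevalence)

print(f"ANN:       PPV {100 * ann_ppv:.1f}%, NPV {100 * ann_npv:.1f}%")
print(f"Lung-RADS: PPV {100 * lungrads_ppv:.1f}%, NPV {100 * lungrads_npv:.1f}%")
```

With a prevalence of about 3.4%, even a specificity of 85% produces many more false positives than true positives, which is why a sensitivity of 75% still yields a PPV near 15%.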


2013 ◽  
Vol 31 (15_suppl) ◽  
pp. 2597-2597
Author(s):  
Annemieke J.M. Nieuweboer ◽  
Anne-Joy M. de Graan ◽  
Laure Elens ◽  
Marcel Smid ◽  
John W. M. Martens ◽  
...  

2597 Background: Paclitaxel (PTX) is a commonly used cytotoxic agent. It is metabolized by P450 cytochrome iso-enzymes CYP3A4 and CYP2C8 and has high interindividual variability in pharmacokinetics (PK) and toxicity. Here, we present a genetic prediction model to identify patients with low PTX clearance (CL) using the new Drug-Metabolizing Enzyme and Transporter (DMET; Affymetrix) platform, capable of detecting 1,936 genetic variants (SNPs) in 225 genes. Methods: In a PK study, 270 Caucasian cancer patients were treated with PTX. PK parameters were determined using a limited sampling strategy. HPLC or LC-MS/MS was used to determine PTX plasma concentrations and non-linear mixed effects modelling (NONMEM) was used to estimate individual unbound CL from previously developed PK population models. Subsequently, the cohort of patients was randomly split into a training and validation set. In all patients, the presence of SNPs in metabolic enzymes and transporters was determined using the DMET platform. Selected SNPs were subsequently validated in the validation set. Results: Baseline characteristics were comparable in both sets. The mean CL of the total cohort was 488 ± 149 L/h and the threshold for low CL was set at 339 L/h (1 SD below the total mean CL). Fourteen SNPs were selected to be included in the prediction model and validated in the validation set. For none of these 14 SNPs does evidence exist for a biologically plausible link to taxane metabolism. The developed prediction model had a sensitivity of 95% to identify low PTX CL and a positive predictive value of 22%, and it remained significantly associated with low CL after multivariate analysis correcting for age, gender and Hb levels at start of therapy (P=0.024). Conclusions: This is the first considerably-sized application of the DMET platform to explain PK variability of a widely used anti-cancer drug.
Although this validated prediction model for PTX CL had a high sensitivity, its positive predictive value is too low to be of direct clinical use. Likely, genetic variability in DMET genes alone does not sufficiently explain PTX CL, as for example environmental factors may also influence PTX metabolism.


2015 ◽  
Vol 33 (15_suppl) ◽  
pp. LBA3-LBA3 ◽  
Author(s):  
Anil D'Cruz ◽  
Mitali Dandekar ◽  
Richa Vaish ◽  
Supreeta Arya ◽  
Gouri Pantvaidya ◽  
...  

2020 ◽  
Vol 38 (4_suppl) ◽  
pp. 456-456
Author(s):  
Yuji Murakami ◽  
Yasushi Nagata ◽  
Daisuke Kawahara

456 Background: The pathologic complete response (PCR) rate by neoadjuvant chemoradiotherapy (NCRT) for resectable locally advanced esophageal squamous cell carcinoma (ESCC) is about 40%. If we could predict a PCR from pre-treatment image data, it might be possible to select patients who can be cured by organ-preserving CRT. The purpose of this study is to construct a predictive model for PCR by NCRT in patients with locally advanced ESCC using radiomics and machine learning. Methods: We used data of 98 ESCC patients who underwent NCRT and surgery from 2003 to 2016. Firstly, we fused the radiotherapy treatment planning CT images and PET images scanned before treatment. Then, using target delineations on planning CT images, we created eight kinds of target regions on PET images. Secondly, we generated a total of 6968 features per patient using the PET image data within these target regions, preprocessed by radiomics techniques. Among them, we extracted the optimal features for machine learning using the least absolute shrinkage and selection operator (LASSO) logistic regression. Thirdly, artificial neural networks were used as the machine-learning method to create a predictive model. The extracted radiomics features were used as input values, and the information of ‘PCR’ or ‘not PCR’ was used as output values. We used data of 58 randomly selected patients for training and constructed a predictive model. Then we used data of 15 patients to validate the models and created the optimal model. Finally, we evaluated the predictive model using the test data of 25 patients. Results: By the LASSO analysis, 32 radiomics features were extracted for machine-learning classification. This predictive model correctly predicted pathological findings after NCRT in 24 of 25 test cases. The accuracy, specificity and sensitivity in the prediction of PCR after NCRT by this predictive model were 96.0%, 93.8%, and 100%, respectively.
Conclusions: A prediction model based on PET images using radiomics and machine-learning could predict pathological findings after NCRT for resectable locally advanced ESCC.
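The abstract reports test-set percentages but not the underlying confusion matrix. As a rough illustration, the sketch below enumerates every 2x2 table for 25 test patients and keeps those whose metrics round to the reported 96.0% accuracy, 93.8% specificity and 100% sensitivity. The function `consistent_tables` and its rounding tolerance are assumptions made for this example and are not part of the original study.

```python
def consistent_tables(n, accuracy_pct, specificity_pct, sensitivity_pct, tol=0.05):
    """List (tp, fn, tn, fp) tables of size n whose metrics match the reported percentages."""

    def close(value, reported):
        # Allow for rounding to one decimal place, plus floating-point slack.
        return abs(100 * value - reported) <= tol + 1e-9

    matches = []
    for n_pos in range(1, n):              # require at least one patient in each class
        n_neg = n - n_pos
        for tp in range(n_pos + 1):
            for tn in range(n_neg + 1):
                if (close(tp / n_pos, sensitivity_pct)
                        and close(tn / n_neg, specificity_pct)
                        and close((tp + tn) / n, accuracy_pct)):
                    matches.append((tp, n_pos - tp, tn, n_neg - tn))
    return matches


tables = consistent_tables(n=25, accuracy_pct=96.0, specificity_pct=93.8, sensitivity_pct=100.0)
print(tables)
```

Under these assumptions the search returns a single table, with 9 PCR cases all predicted correctly and 15 of 16 non-PCR cases predicted correctly, which agrees with the reported 24 of 25 correct predictions.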


Author(s):  
Michael J. Lopez ◽  
Gregory J. Matthews

Abstract: Computing and machine learning advancements have led to the creation of many cutting-edge predictive algorithms, some of which have been demonstrated to provide more accurate forecasts than traditional statistical tools. In this manuscript, we provide evidence that the combination of modest statistical methods with informative data can meet or exceed the accuracy of more complex models when it comes to predicting the NCAA men’s basketball tournament. First, we describe a prediction model that merges the point spreads set by Las Vegas sportsbooks with possession-based team efficiency metrics by using logistic regressions. The set of probabilities generated from this model most accurately predicted the 2014 tournament, relative to approximately 400 competing submissions, as judged by the log loss function. Next, we attempt to quantify the degree to which luck played a role in the success of this model by simulating tournament outcomes under different sets of true underlying game probabilities. We estimate that under the most optimistic of game probability scenarios, our entry had roughly a 12% chance of outscoring all competing submissions and just less than a 50% chance of finishing with one of the ten best scores.
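The log loss function used to judge the submissions can be sketched as follows. This is a generic implementation of binary log loss, not the competition's scoring code, and the game outcomes and probabilities shown are hypothetical values chosen for illustration.

```python
import math


def log_loss(outcomes, probabilities, eps=1e-15):
    """Mean negative log-likelihood of binary outcomes given predicted win probabilities."""
    total = 0.0
    for y, p in zip(outcomes, probabilities):
        p = min(max(p, eps), 1 - eps)  # clip so that log(0) never occurs
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(outcomes)


# Hypothetical games (1 = the first-listed team won); not tournament data.
outcomes = [1, 0, 1, 1]
informed = [0.9, 0.1, 0.8, 0.3]
coin_flip = [0.5, 0.5, 0.5, 0.5]

print(log_loss(outcomes, informed))
print(log_loss(outcomes, coin_flip))
```

Lower is better. A forecast of 0.5 for every game scores ln 2 (about 0.693), and confident wrong predictions are penalised heavily, which is why well-calibrated probabilities matter more than bold ones under this metric.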


2021 ◽  
Author(s):  
Xurui Jin ◽  
Yiyang Sun ◽  
Tinglong Zhu ◽  
Yu Leng ◽  
Shuyi Guan ◽  
...  

Abstract Background and aim: Mortality risk stratification is vital for targeted intervention. This study aimed to build a prediction model of all-cause mortality among elderly people living in China using different methods, including regression models and machine learning models, and to compare the performance of the machine learning models with that of the regression models in predicting mortality. Additionally, this study aimed to rank the predictors of mortality within the different models and to compare the predictive value of different groups of predictors using the best-performing model. Method: We used data from a sub-study of the Chinese Longitudinal Healthy Longevity Survey (CLHLS) - Healthy Ageing and Biomarkers Cohort Study (HABCS). The baseline survey of HABCS was conducted in 2008, covered similar domains to those investigated by CLHLS, and shared its sampling strategy. The follow-up of HABCS was conducted every 2-3 years until 2018. The analysis sample included 2,448 participants from HABCS. We used a total of 117 predictors to build the prediction model for survival using the HABCS cohort, including 61 questionnaire, 41 biomarker and 15 genetic predictors. Four models were built (XG-Boost, random survival forest [RSF], Cox regression with all variables and Cox-backward). We used the C-index and the integrated Brier score (Brier score for the two-year mortality prediction model) to evaluate the performance of those models. Results: The XG-Boost and RSF models showed slightly better predictive performance than the Cox and Cox-backward models, based on the C-index and integrated Brier score, in predicting survival. Age, activity of daily living and Mini-Mental State Examination score were identified as the top 3 predictors in the XG-Boost and RSF models.
Biomarker and questionnaire predictors have a similar predictive value, while genetic predictors have no additive predictive value when combined with questionnaire or biomarker predictors. Conclusion: This work shows that machine learning techniques can be a useful tool for prediction, and their performance slightly exceeded that of the regression models in predicting survival.
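The C-index used to compare the models measures how often the model assigns a higher risk to the participant who dies earlier. The sketch below is a simplified version of Harrell's C-index for right-censored data, ignoring tied survival times. It is an illustration only, not the study's evaluation code, and the follow-up times, event flags and risk scores are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index: fraction of comparable pairs ranked correctly."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored participant cannot be the earlier death in a pair
        for j in range(n):
            if times[i] < times[j]:  # i died before j was last observed
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up data: years observed, death indicator, predicted risk.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.6, 0.1]

c_index = concordance_index(times, events, risk_scores)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a slightly higher C-index for XG-Boost and RSF than for the Cox models indicates a modest gain in discrimination.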


2020 ◽  
Author(s):  
Yin-Chen Hsu ◽  
Yuan-Hsiung Tsai ◽  
Hsu-Huei Weng ◽  
Li-Sheng Hsu ◽  
Ying-Huang Tsai ◽  
...  

Abstract Background: This study proposes a prediction model for the automatic assessment of lung cancer risk based on an artificial neural network (ANN) with a data-driven approach to the low-dose computed tomography (LDCT) standardized structure report. Methods: This comparative validation study analysed a prospective cohort from Chiayi Chang Gung Memorial Hospital, Taiwan. In total, 836 asymptomatic patients who had undergone LDCT scans between February 2017 and August 2018 were included, comprising 27 lung cancer cases and 809 controls. A derivation cohort of 602 participants (19 lung cancer cases and 583 controls) was collected to construct the ANN prediction model. A comparative validation of the ANN and Lung-RADS was conducted with a prospective cohort of 234 participants (8 lung cancer cases and 226 controls). The areas under the curves (AUCs) of the receiver operating characteristic (ROC) curves were used to compare the prediction models. Results: At the cut-off of category 3, the Lung-RADS had a sensitivity of 12.5%, specificity of 96.0%, positive predictive value of 10.0%, and negative predictive value of 96.9%. At its optimal cut-off value, the ANN had a sensitivity of 75.0%, specificity of 85.0%, positive predictive value of 15.0%, and negative predictive value of 99.0%. The area under the ROC curve was 0.764 for the Lung-RADS and 0.873 for the ANN (P=0.01). The heatmap plot demonstrates the leading items, i.e., solid nodules, partially solid nodules, and ground-glass nodules, as the significant predictors of malignant outcomes. Conclusions: Compared to the Lung-RADS, the ANN provided better sensitivity for the detection of lung cancer in an Asian population. In addition, the ANN provided a more refined discriminative ability than the Lung-RADS for lung cancer risk stratification with population-specific demographic characteristics.
When lung nodules are detected and documented in a standardized structured report, ANNs may better provide important insights for lung cancer prediction than conventional rule-based criteria. Trial registration: Not applicable.

