A multiparameter model for prediction of pathological nodal status in clinically early stage oral cancer using machine learning.

e18025 Background: The current approach to neck treatment in clinical T1-2 oral cancers is to offer elective nodal dissection to all patients, despite the fact that the majority of patients are pathologically node negative. This is due to the poor predictive ability of clinico-radiological assessment and the subsequently poorer survival in those in whom neck dissection is omitted on that basis. A robust prediction model for pathological nodal status may allow individualized decisions for neck dissection. Our aim was to develop a multiparameter prediction model to identify pathological node-negative status using machine learning. Methods: We identified 497 patients with cT1-2 oral cancer from a single institutional database (2011-2018) who underwent primary resection and neck dissection. We compared the sensitivity, positive predictive value and accuracy of prediction of a pathologically negative neck from clinico-radiological staging alone vs. a model created from multiple parameters, including clinical features (clinico-radiological nodal status, age, sex, subsite of primary lesion) and pathological features of the resected primary tumor (maximum dimension, depth of invasion, lymphovascular invasion, perineural invasion, grade and margins of resection). The multiparameter model was built from a training dataset of the first 400 patients using an ensemble of logistic regression, random forests and support vector machines. A cohort of 97 patients was used for independent validation. Results: In this cohort, 232 (47%) were clinico-radiologically node negative, while 307 (62%) were pathologically node negative. The sensitivity, positive predictive value and accuracy of the clinico-radiologically assigned nodal status were 56%, 74% and 61%, while those of the multiparameter machine learning model were 87%, 89% and 89%, respectively. The area under the curve (AUC) of the clinico-radiological prediction was 0.62, whereas that of the multiparameter predictive model was 0.91. In the validation dataset, 58/62 pathologically node negative patients were predicted correctly by the model. The accuracy of the model on the external validation dataset was 82%. Conclusions: The performance of the multiparameter predictive model was considerably superior to clinico-radiological neck staging for prediction of a pathologically node negative neck. This performance could be validated on an independent dataset. The model could be considered for prospective clinical evaluation of individualized neck dissection.
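
The abstract names an ensemble of three classifiers but does not say how their outputs were combined. One common option is soft voting, where each model's predicted probability is averaged and then thresholded. The sketch below assumes that scheme; the function name `soft_vote`, the 0.5 threshold and all probabilities are hypothetical and are not taken from the study.

```python
from statistics import mean


def soft_vote(probabilities_per_model, threshold=0.5):
    """Average each patient's predicted probability across models, then threshold.

    probabilities_per_model: one list of probabilities per model, all in the
    same patient order. Returns (averaged probabilities, 0/1 labels).
    """
    n_patients = len(probabilities_per_model[0])
    if any(len(p) != n_patients for p in probabilities_per_model):
        raise ValueError("every model must score the same patients")
    averaged = [mean(scores) for scores in zip(*probabilities_per_model)]
    labels = [1 if p >= threshold else 0 for p in averaged]
    return averaged, labels


# Hypothetical probabilities of a pathologically node-negative neck for four
# patients from three models (illustrative numbers, not study data).
logistic_regression = [0.90, 0.40, 0.70, 0.20]
random_forest = [0.80, 0.60, 0.50, 0.10]
support_vector_machine = [0.70, 0.20, 0.60, 0.30]

averaged, labels = soft_vote(
    [logistic_regression, random_forest, support_vector_machine]
)
print(labels)
```

Hard (majority) voting or a stacked meta-model would be equally plausible readings of "ensemble"; soft voting is shown only because it is the shortest to write down.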


Elective versus Therapeutic Neck Dissection in Node-Negative Oral Cancer

50 Landmark Papers every Oral & Maxillofacial Surgeon Should Know ◽

10.1201/9780429288036-11 ◽

2020 ◽

pp. 59-64

Author(s):

David A Mitchell

Keyword(s):

Oral Cancer ◽

Neck Dissection ◽

Node Negative


Identification of newborns at risk for autism using electronic medical records and machine learning

10.1101/19008367 ◽

2019 ◽

Author(s):

Rayees Rahman ◽

Arad Kodesh ◽

Stephen Z Levine ◽

Sven Sandin ◽

Abraham Reichenberg ◽

...

Keyword(s):

Machine Learning ◽

Autism Spectrum Disorder ◽

Positive Predictive Value ◽

Electronic Medical Records ◽

Predictive Value ◽

False Positive ◽

Medical Records ◽

False Positive Rate ◽

Autism Spectrum ◽

Positive Rate

Abstract. Importance: Current approaches for early identification of individuals at high risk for autism spectrum disorder (ASD) in the general population are limited, and most ASD patients are not identified until after the age of 4. This is despite substantial evidence suggesting that early diagnosis and intervention improve developmental course and outcome. Objective: Develop a machine learning (ML) method predicting the diagnosis of ASD in offspring in a general population sample, using parental electronic medical records (EMR) available before childbirth. Design: Prognostic study of EMR data within a single Israeli health maintenance organization, for the parents of 1,397 ASD children (ICD-9/10) and 94,741 non-ASD children born between January 1st, 1997 and December 31st, 2008. The complete EMR record of the parents was used to develop various ML models to predict the risk of having a child with ASD. Main outcomes and measures: Routinely available parental sociodemographic information, medical histories and prescribed medications data until the offspring's birth were used to generate features to train various machine learning algorithms, including multivariate logistic regression, artificial neural networks, and random forest. Prediction performance was evaluated with 10-fold cross validation, by computing C statistics, sensitivity, specificity, accuracy, false positive rate, and precision (positive predictive value, PPV). Results: All ML models tested had similar performance, achieving an average C statistic of 0.70, sensitivity of 28.63%, specificity of 98.62%, accuracy of 96.05%, false positive rate of 1.37%, and positive predictive value of 45.85% for predicting ASD in this dataset. Conclusion and relevance: ML algorithms combined with EMR capture early life ASD risk. Such approaches may be able to enhance the ability for accurate and efficient early detection of ASD in large populations of children. Key points. Question: Can autism risk in children be predicted using the pre-birth electronic medical record (EMR) of the parents? Findings: In this population-based study that included 1,397 children with autism spectrum disorder (ASD) and 94,741 non-ASD children, we developed a machine learning classifier for predicting the likelihood of childhood diagnosis of ASD with an average C statistic of 0.70, sensitivity of 28.63%, specificity of 98.62%, accuracy of 96.05%, false positive rate of 1.37%, and positive predictive value of 45.85%. Meaning: The results presented serve as a proof-of-principle of the potential utility of EMR for the identification of a large proportion of future children at a high risk of ASD.
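
The 10-fold cross validation mentioned above can be sketched as an index splitter: shuffle the sample indices, deal them into ten folds, and hold out each fold once. This is a minimal illustration only; the function name `k_fold_indices`, the seed and the sample size of 25 are hypothetical, and the abstract does not say whether folds were stratified by outcome (which is often preferred when cases are as rare as here).

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # deal indices into k folds
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# Toy usage with 25 samples (hypothetical size, not the study cohort).
splits = list(k_fold_indices(25, k=10))
for train_idx, test_idx in splits:
    # A model would be fitted on train_idx and scored on test_idx here;
    # the per-fold metrics are then averaged.
    pass
```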


A prediction model for degree of differentiation for resectable locally advanced esophageal squamous cell carcinoma based on CT images using radiomics and machine-learning

British Journal of Radiology ◽

10.1259/bjr.20210525 ◽

2021 ◽

pp. 20210525

Author(s):

Daisuke Kawahara ◽

Yuji Murakami ◽

Shigeyuki Tani ◽

Yasushi Nagata

Keyword(s):

Machine Learning ◽

Squamous Cell Carcinoma ◽

Esophageal Cancer ◽

Prediction Model ◽

Cell Carcinoma ◽

Predictive Model ◽

Squamous Cell ◽

Locally Advanced ◽

Degree Of Differentiation ◽

Planning Ct

Objective: To propose a prediction model for the degree of differentiation in patients with locally advanced esophageal cancer from the planning CT image, using radiomics analysis with machine learning. Methods: Data of 104 patients with esophageal cancer who underwent chemoradiotherapy followed by surgery at the Hiroshima University hospital from 2003 to 2016 were analyzed. The treatment outcomes of these tumors were known prior to the study. The data were split into 3 sets: 57/16 tumors for training/validation and 31 tumors for model testing. The degree of differentiation of squamous cell carcinoma was classified into two groups. The first group (Group I) comprised poorly differentiated (POR) patients. The second group (Group II) comprised well and moderately differentiated patients. Radiomics features were extracted from the tumor and from the regions around the tumor. A total of 3480 radiomics features per patient image were extracted from the radiotherapy planning CT scan. Models were built with the least absolute shrinkage and selection operator (LASSO) logistic regression and applied to the set of candidate predictors. The radiomics features were used as the input data for machine learning. To build predictive models with radiomics features, neural network classifiers were used. The precision, accuracy and sensitivity, derived from confusion matrices, and the area under the curve (AUC) of the receiver operating characteristic curve were evaluated. Results: By the LASSO analysis of the training data, we found 13 radiomics features from CT images for the classification. The accuracy of the prediction model was highest when using only CT radiomics features. The accuracy, specificity, and sensitivity of the predictive model were 85.4%, 88.6%, and 80.0%, and the AUC was 0.92. Conclusion: The proposed predictive model showed high accuracy for the classification of the degree of differentiation of esophageal cancer. Because of its good predictive ability, the method may contribute to reducing pathological examination by biopsy and to predicting local control. Advances in knowledge: For esophageal cancer, the degree of differentiation is an important index reflecting aggressiveness. The current study proposed a prediction model for the degree of differentiation using radiomics analysis.


Artificial neural networks improve LDCT lung cancer screening: a comparative validation study

BMC Cancer ◽

10.1186/s12885-020-07465-1 ◽

2020 ◽

Vol 20 (1) ◽

Author(s):

Yin-Chen Hsu ◽

Yuan-Hsiung Tsai ◽

Hsu-Huei Weng ◽

Li-Sheng Hsu ◽

Ying-Huang Tsai ◽

...

Keyword(s):

Lung Cancer ◽

Cancer Risk ◽

Positive Predictive Value ◽

Prediction Model ◽

Negative Predictive Value ◽

Validation Study ◽

Predictive Value ◽

Prospective Cohort ◽

Lung Cancer Risk ◽

Comparative Validation

Abstract. Background: This study proposes a prediction model for the automatic assessment of lung cancer risk based on an artificial neural network (ANN) with a data-driven approach to the low-dose computed tomography (LDCT) standardized structure report. Methods: This comparative validation study analysed a prospective cohort from Chiayi Chang Gung Memorial Hospital, Taiwan. In total, 836 asymptomatic patients who had undergone LDCT scans between February 2017 and August 2018 were included, comprising 27 lung cancer cases and 809 controls. A derivation cohort of 602 participants (19 lung cancer cases and 583 controls) was collected to construct the ANN prediction model. A comparative validation of the ANN and Lung-RADS was conducted with a prospective cohort of 234 participants (8 lung cancer cases and 226 controls). The areas under the curves (AUCs) of the receiver operating characteristic (ROC) curves were used to compare the prediction models. Results: At the cut-off of category 3, the Lung-RADS had a sensitivity of 12.5%, specificity of 96.0%, positive predictive value of 10.0%, and negative predictive value of 96.9%. At its optimal cut-off value, the ANN had a sensitivity of 75.0%, specificity of 85.0%, positive predictive value of 15.0%, and negative predictive value of 99.0%. The area under the ROC curve was 0.764 for the Lung-RADS and 0.873 for the ANN (P = 0.01). The two most important predictors used by the ANN for predicting lung cancer were the documented sizes of partially solid nodules and ground-glass nodules. Conclusions: Compared to the Lung-RADS, the ANN provided better sensitivity for the detection of lung cancer in an Asian population. In addition, the ANN provided a more refined discriminative ability than the Lung-RADS for lung cancer risk stratification with population-specific demographic characteristics. When lung nodules are detected and documented in a standardized structured report, ANNs may better provide important insights for lung cancer prediction than conventional rule-based criteria.
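
The four reported rates follow directly from a 2x2 confusion matrix. The sketch below shows the arithmetic. The counts are an assumption: they are back-calculated here from the reported percentages and the validation cohort size (8 cases, 226 controls) and are not stated in the abstract. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # detected cases / all cases
        "specificity": tn / (tn + fp),  # cleared controls / all controls
        "ppv": tp / (tp + fp),          # true cases / all positive calls
        "npv": tn / (tn + fn),          # true controls / all negative calls
    }


# Counts inferred from the reported rates (assumption, not reported data):
# 8 cancer cases and 226 controls in the validation cohort.
lung_rads = diagnostic_metrics(tp=1, fp=9, fn=7, tn=217)
ann = diagnostic_metrics(tp=6, fp=34, fn=2, tn=192)

for name, metrics in (("Lung-RADS", lung_rads), ("ANN", ann)):
    summary = ", ".join(f"{k} {v * 100:.1f}%" for k, v in metrics.items())
    print(f"{name}: {summary}")
```

With only 8 cancer cases, each additional detected case moves sensitivity by 12.5 percentage points, which is worth keeping in mind when comparing the two sensitivities.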


A pharmacogenetic model predicting low paclitaxel clearance based on the DMET platform.

Journal of Clinical Oncology ◽

10.1200/jco.2013.31.15_suppl.2597 ◽

2013 ◽

Vol 31 (15_suppl) ◽

pp. 2597-2597

Author(s):

Annemieke J.M. Nieuweboer ◽

Anne-Joy M. de Graan ◽

Laure Elens ◽

Marcel Smid ◽

John W. M. Martens ◽

...

Keyword(s):

Positive Predictive Value ◽

Prediction Model ◽

Predictive Value ◽

Plasma Concentrations ◽

Sampling Strategy ◽

High Sensitivity ◽

Cancer Drug ◽

Drug Metabolizing Enzyme ◽

Validation Set ◽

Metabolizing Enzyme

2597 Background: Paclitaxel (PTX) is a commonly used cytotoxic agent. It is metabolized by P450 cytochrome iso-enzymes CYP3A4 and CYP2C8 and has high interindividual variability in pharmacokinetics (PK) and toxicity. Here, we present a genetic prediction model to identify patients with low PTX clearance (CL) using the new Drug-Metabolizing Enzyme and Transporter (DMET; Affymetrix) platform, capable of detecting 1,936 genetic variants (SNPs) in 225 genes. Methods: In a PK study, 270 Caucasian cancer patients were treated with PTX. PK parameters were determined using a limited sampling strategy. HPLC or LC-MS/MS was used to determine PTX plasma concentrations, and non-linear mixed effects modelling (NONMEM) was used to estimate individual unbound CL from previously developed PK population models. Subsequently, the cohort of patients was randomly split into a training and a validation set. In all patients, the presence of SNPs in metabolic enzymes and transporters was determined using the DMET platform. Selected SNPs were subsequently validated in the validation set. Results: Baseline characteristics were comparable in both sets. The mean CL of the total cohort was 488 ± 149 L/h, and the threshold for low CL was set at 339 L/h (1 SD below the total mean CL). Fourteen SNPs were selected for inclusion in the prediction model and validated in the validation set. For none of these 14 SNPs does evidence of a biologically plausible link to taxane metabolism exist. The developed prediction model had a sensitivity of 95% to identify low PTX CL and a positive predictive value of 22%, and remained significantly associated with low CL after multivariate analysis correcting for age, gender and Hb levels at start of therapy (P=0.024). Conclusions: This is the first considerably-sized application of the DMET platform to explain PK variability of a widely used anti-cancer drug. Although this validated prediction model for PTX CL had a high sensitivity, its positive predictive value is too low to be of direct clinical use. Likely, genetic variability in DMET genes alone does not sufficiently explain PTX CL, as, for example, environmental factors may also influence PTX metabolism.
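
A high sensitivity can coexist with a low positive predictive value when the outcome is uncommon and specificity is modest, as Bayes' rule makes explicit. The sketch below illustrates that relationship. Only the 95% sensitivity comes from the abstract; the specificity and prevalence values are hypothetical inputs chosen for illustration, since the abstract reports neither, and the function name is hypothetical as well.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV from Bayes' rule: P(condition | positive test)."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


SENSITIVITY = 0.95  # reported in the abstract
PREVALENCE = 0.15   # hypothetical share of patients with low clearance

# Hypothetical specificities, to show how strongly PPV depends on them.
for specificity in (0.50, 0.70, 0.90):
    ppv = positive_predictive_value(SENSITIVITY, specificity, PREVALENCE)
    print(f"specificity {specificity:.2f} -> PPV {ppv:.2f}")
```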


Elective versus therapeutic neck dissection in the clinically node negative early oral cancer: A randomised control trial (RCT).

Journal of Clinical Oncology ◽

10.1200/jco.2015.33.15_suppl.lba3 ◽

2015 ◽

Vol 33 (15_suppl) ◽

pp. LBA3-LBA3 ◽

Cited By ~ 8

Author(s):

Anil D'Cruz ◽

Mitali Dandekar ◽

Richa Vaish ◽

Supreeta Arya ◽

Gouri Pantvaidya ◽

...

Keyword(s):

Oral Cancer ◽

Neck Dissection ◽

Randomised Control Trial ◽

Node Negative


A prediction model for pathological findings after neoadjuvant chemoradiotherapy for resectable locally advanced esophageal cancer based on PET images using radiomics and machine-learning.

Journal of Clinical Oncology ◽

10.1200/jco.2020.38.4_suppl.456 ◽

2020 ◽

Vol 38 (4_suppl) ◽

pp. 456-456

Author(s):

Yuji Murakami ◽

Yasushi Nagata ◽

Daisuke Kawahara

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Predictive Model ◽

Test Data ◽

Locally Advanced ◽

Neoadjuvant Chemoradiotherapy ◽

Image Data ◽

Ct Images ◽

Pathological Findings ◽

Planning Ct

456 Background: The pathologic complete response (PCR) rate after neoadjuvant chemoradiotherapy (NCRT) for resectable locally advanced esophageal squamous cell carcinoma (ESCC) is about 40%. If we could predict a PCR from pre-treatment image data, it might be possible to select patients who can be cured by organ-preserving CRT. The purpose of this study is to construct a predictive model for PCR after NCRT in patients with locally advanced ESCC using radiomics and machine-learning. Methods: We used data of 98 ESCC patients who underwent NCRT and surgery from 2003 to 2016. Firstly, we fused the radiotherapy treatment planning CT images and the PET images scanned before treatment. Then, using target delineations on planning CT images, we created eight kinds of target regions on PET images. Secondly, we generated a total of 6968 features per patient using the PET image data within these target regions, preprocessed by radiomics techniques. Among them, we extracted the optimal features for machine-learning using the least absolute shrinkage and selection operator (LASSO) logistic regression. Thirdly, artificial neural networks were used as the machine-learning method to create a predictive model. The extracted radiomics features were used as input values, and the information of 'PCR' or 'not PCR' was used as output values. We used data of 58 randomly selected patients for training and constructed a predictive model. Then we used data of 15 patients to validate the models and created the optimal model. Finally, we evaluated the predictive model using the test data of 25 patients. Results: By the LASSO analysis, 32 radiomics features were extracted for machine-learning classification. This predictive model correctly predicted pathological findings after NCRT in 24 of 25 test cases. The accuracy, specificity and sensitivity in the prediction of PCR after NCRT by this predictive model were 96.0%, 93.8%, and 100%, respectively. Conclusions: A prediction model based on PET images using radiomics and machine-learning could predict pathological findings after NCRT for resectable locally advanced ESCC.


Building an NCAA men’s basketball predictive model and quantifying its success

Journal of Quantitative Analysis in Sports ◽

10.1515/jqas-2014-0058 ◽

2015 ◽

Vol 11 (1) ◽

Cited By ~ 4

Author(s):

Michael J. Lopez ◽

Gregory J. Matthews

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Predictive Model ◽

Loss Function ◽

Las Vegas ◽

Statistical Tools ◽

Men's Basketball ◽

Predictive Algorithms ◽

Complex Models ◽

Logistic Regressions

Abstract: Computing and machine learning advancements have led to the creation of many cutting-edge predictive algorithms, some of which have been demonstrated to provide more accurate forecasts than traditional statistical tools. In this manuscript, we provide evidence that the combination of modest statistical methods with informative data can meet or exceed the accuracy of more complex models when it comes to predicting the NCAA men’s basketball tournament. First, we describe a prediction model that merges the point spreads set by Las Vegas sportsbooks with possession based team efficiency metrics by using logistic regressions. The set of probabilities generated from this model most accurately predicted the 2014 tournament, relative to approximately 400 competing submissions, as judged by the log loss function. Next, we attempt to quantify the degree to which luck played a role in the success of this model by simulating tournament outcomes under different sets of true underlying game probabilities. We estimate that under the most optimistic of game probability scenarios, our entry had roughly a 12% chance of outscoring all competing submissions and just less than a 50% chance of finishing with one of the ten best scores.
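
The log loss function used to rank submissions is the mean negative log-likelihood of the observed outcomes under the submitted probabilities, so lower is better. The sketch below is a minimal stdlib version; the clipping constant `eps` and the three example games are hypothetical and are not taken from the paper.

```python
import math


def log_loss(outcomes, probabilities, eps=1e-15):
    """Mean negative log-likelihood of binary outcomes (1 = first team won)."""
    total = 0.0
    for y, p in zip(outcomes, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(outcomes)


# Three hypothetical games: the first and third were won by the listed team.
outcomes = [1, 0, 1]
coin_flip = log_loss(outcomes, [0.5, 0.5, 0.5])  # uninformative baseline
informed = log_loss(outcomes, [0.9, 0.2, 0.6])   # hypothetical forecasts
print(coin_flip > informed)
```

An uninformative forecast of 0.5 for every game always scores ln 2, which gives a natural baseline: any submission scoring above it did worse than guessing.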


Machine Learning and Prediction of All-Cause Mortality among Chinese Older Adults

10.1101/2021.04.21.21255843 ◽

2021 ◽

Author(s):

Xurui Jin ◽

Yiyang Sun ◽

Tinglong Zhu ◽

Yu Leng ◽

Shuyi Guan ◽

...

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Regression Model ◽

Predictive Value ◽

Cox Regression ◽

Brier Score ◽

Baseline Survey ◽

Learning Models ◽

All Cause Mortality ◽

Machine Learning Models

Abstract. Background and aim: Mortality risk stratification is vital for targeted intervention. This study aimed to build a prediction model of all-cause mortality among Chinese older adults using different methods, including regression models and machine learning models, and to compare the performance of the machine learning models with that of the regression models in predicting mortality. Additionally, this study aimed to rank the predictors of mortality within the different models and to compare the predictive value of different groups of predictors using the best-performing model. Method: We used data from a sub-study of the Chinese Longitudinal Healthy Longevity Survey (CLHLS), the Healthy Ageing and Biomarkers Cohort Study (HABCS). The baseline survey of HABCS was conducted in 2008, covered domains similar to those investigated by CLHLS, and shared its sampling strategy. The follow-up of HABCS was conducted every 2-3 years until 2018. The analysis sample included 2,448 participants from HABCS. We used a total of 117 predictors to build the prediction model for survival using the HABCS cohort, including 61 questionnaire, 41 biomarker and 15 genetics predictors. Four models were built (XG-Boost, random survival forest [RSF], Cox regression with all variables and Cox-backward). We used the C-index and the integrated Brier score (Brier score for the two-year mortality prediction model) to evaluate the performance of those models. Results: The XG-Boost and RSF models showed slightly better predictive performance than the Cox and Cox-backward models, based on the C-index and integrated Brier score, in predicting survival. Age, activity of daily living and Mini-Mental State Examination score were identified as the top 3 predictors in the XG-Boost and RSF models. Biomarker and questionnaire predictors had a similar predictive value, while genetic predictors had no additive predictive value when combined with questionnaire or biomarker predictors. Conclusion: This work shows that machine learning techniques can be a useful tool for prediction, and that their performance slightly exceeded that of the regression model in predicting survival.
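
The C-index used to compare the models measures how often, among comparable pairs of participants, the one who died earlier was assigned the higher predicted risk. The sketch below is a simplified version of Harrell's C-index that ignores tied survival times; the function name and the toy follow-up data are hypothetical, and the study's own implementation is not described in the abstract.

```python
def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index (tied survival times are not handled).

    times: follow-up time per subject; events: 1 if death observed, 0 if
    censored; risk_scores: higher means higher predicted risk of death.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i died before subject j's time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): years of follow-up, death indicator, predicted risk.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
well_ranked = concordance_index(times, events, [0.9, 0.7, 0.5, 0.1])
badly_ranked = concordance_index(times, events, [0.1, 0.5, 0.7, 0.9])
print(well_ranked, badly_ranked)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why differences of a few hundredths between models are usually described as slight.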


Artificial neural networks improve LDCT lung cancer screening: A comparative validation study

10.21203/rs.3.rs-24642/v2 ◽

2020 ◽

Author(s):

Yin-Chen Hsu ◽

Yuan-Hsiung Tsai ◽

Hsu-Huei Weng ◽

Li-Sheng Hsu ◽

Ying-Huang Tsai ◽

...

Keyword(s):

Lung Cancer ◽

Cancer Risk ◽

Positive Predictive Value ◽

Prediction Model ◽

Negative Predictive Value ◽

Validation Study ◽

Predictive Value ◽

Prospective Cohort ◽

Lung Cancer Risk ◽

Comparative Validation

Abstract. Background: This study proposes a prediction model for the automatic assessment of lung cancer risk based on an artificial neural network (ANN) with a data-driven approach to the low-dose computed tomography (LDCT) standardized structure report. Methods: This comparative validation study analysed a prospective cohort from Chiayi Chang Gung Memorial Hospital, Taiwan. In total, 836 asymptomatic patients who had undergone LDCT scans between February 2017 and August 2018 were included, comprising 27 lung cancer cases and 809 controls. A derivation cohort of 602 participants (19 lung cancer cases and 583 controls) was collected to construct the ANN prediction model. A comparative validation of the ANN and Lung-RADS was conducted with a prospective cohort of 234 participants (8 lung cancer cases and 226 controls). The areas under the curves (AUCs) of the receiver operating characteristic (ROC) curves were used to compare the prediction models. Results: At the cut-off of category 3, the Lung-RADS had a sensitivity of 12.5%, specificity of 96.0%, positive predictive value of 10.0%, and negative predictive value of 96.9%. At its optimal cut-off value, the ANN had a sensitivity of 75.0%, specificity of 85.0%, positive predictive value of 15.0%, and negative predictive value of 99.0%. The area under the ROC curve was 0.764 for the Lung-RADS and 0.873 for the ANN (P=0.01). The heatmap plot demonstrates the leading items, i.e., solid nodules, partially solid nodules, and ground-glass nodules, as the significant predictors of malignant outcomes. Conclusions: Compared to the Lung-RADS, the ANN provided better sensitivity for the detection of lung cancer in an Asian population. In addition, the ANN provided a more refined discriminative ability than the Lung-RADS for lung cancer risk stratification with population-specific demographic characteristics. When lung nodules are detected and documented in a standardized structured report, ANNs may better provide important insights for lung cancer prediction than conventional rule-based criteria. Trial registration: Not applicable.
