Establishing Machine Learning Models to Predict Curative Resection in Early Gastric Cancer with Undifferentiated Histology: Development and Usability Study

Background: Undifferentiated-type early gastric cancer (U-EGC) is included among the expanded indications for endoscopic submucosal dissection (ESD); however, the rate of curative resection remains unsatisfactory. Endoscopists predict the probability of curative resection by considering the size and shape of the lesion and whether ulcers are present. The location of the lesion, which indicates the likely technical difficulty, is also considered. Objective: The aim of this study was to establish machine learning (ML) models to better predict the possibility of curative resection in U-EGC prior to ESD. Methods: A nationwide cohort of 2703 U-EGCs treated by ESD or surgery was used for the training and internal validation cohorts. Separately, an independent data set from the Korean ESD registry (n=275) and an Asan Medical Center data set (n=127), both treated by ESD, were used for external validation. Eighteen ML classifiers were selected to establish prediction models of curative resection with the following variables: age; sex; location, size, and shape of the lesion; and whether ulcers were present. Results: Among the 18 models, the extreme gradient boosting classifier showed the best performance (internal validation accuracy 93.4%, 95% CI 90.4%-96.4%; precision 92.6%, 95% CI 89.5%-95.7%; recall 99.0%, 95% CI 97.8%-99.9%; and F1 score 95.7%, 95% CI 93.3%-98.1%). External validation showed substantial accuracy (first external validation 81.5%, 95% CI 76.9%-86.1%; second external validation 89.8%, 95% CI 84.5%-95.1%). Lesion size was the most important feature in each explainable artificial intelligence analysis. Conclusions: We established an ML model capable of accurately predicting the curative resection of U-EGC before ESD by considering the morphological and ecological characteristics of the lesions.
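The reported figures in this abstract can be sanity-checked with a few lines of arithmetic. The sketch below recomputes the F1 score from the reported precision and recall, and applies a normal-approximation (Wald) interval to the two external validation accuracies. The abstract does not say how the confidence intervals were calculated, so the Wald formula is an assumption, as is the pairing of the first accuracy with the n=275 registry set and the second with the n=127 set. The function names are illustrative only.

```python
import math


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def wald_ci(p, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion p observed on n cases.

    Assumption: the abstract does not name its interval method.
    """
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Reported internal validation precision and recall.
f1 = f1_score(0.926, 0.990)
print(f"F1 from reported precision and recall: {f1:.1%}")

# Reported external validation accuracies with the assumed cohort sizes.
external = [("first external set", 0.815, 275), ("second external set", 0.898, 127)]
intervals = {}
for label, accuracy, n in external:
    low, high = wald_ci(accuracy, n)
    intervals[label] = (low, high)
    print(f"{label} (n={n}): {accuracy:.1%}, 95% CI {low:.1%}-{high:.1%}")
```

Under these assumptions the recomputed values agree with the abstract to one decimal place, which suggests the reported intervals are consistent with a simple normal approximation on the stated cohort sizes.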


Establishing Machine Learning Models to Predict Curative Resection in Early Gastric Cancer with Undifferentiated Histology: Development and Usability Study (Preprint)

10.2196/preprints.25053 ◽

2020 ◽

Author(s):

Chang Seok Bang ◽

Ji Yong Ahn ◽

Jie-Hyun Kim ◽

Young-Il Kim ◽

Il Ju Choi ◽

...

Keyword(s):

Machine Learning ◽

Gastric Cancer ◽

Early Gastric Cancer ◽

Curative Resection ◽

External Validation ◽

Data Set ◽

Internal Validation ◽

Size And Shape ◽

Extreme Gradient Boosting ◽

Undifferentiated Histology



External validation of nomogram for the prediction of recurrence after curative resection in early gastric cancer

Annals of Oncology ◽

10.1093/annonc/mdr118 ◽

2012 ◽

Vol 23 (2) ◽

pp. 361-367 ◽

Cited By ~ 25

Author(s):

J.H. Kim ◽

H.S. Kim ◽

W.Y. Seo ◽

C.M. Nam ◽

K.Y. Kim ◽

...

Keyword(s):

Gastric Cancer ◽

Early Gastric Cancer ◽

Curative Resection ◽

External Validation


Prediction of Masked Hypertension and Masked Uncontrolled Hypertension Using Machine Learning

Frontiers in Cardiovascular Medicine ◽

10.3389/fcvm.2021.778306 ◽

2021 ◽

Vol 8 ◽

Author(s):

Ming-Hui Hung ◽

Ling-Chieh Shih ◽

Yu-Ching Wang ◽

Hsin-Bang Leu ◽

Po-Hsun Huang ◽

...

Keyword(s):

Machine Learning ◽

Clinical Characteristics ◽

Prediction Models ◽

External Validation ◽

Uncontrolled Hypertension ◽

Gradient Boosting ◽

Masked Hypertension ◽

Internal Validation ◽

Hypertensive Patients ◽

Extreme Gradient Boosting

Objective: This study aimed to develop machine learning-based prediction models to predict masked hypertension and masked uncontrolled hypertension using the clinical characteristics of patients at a single outpatient visit. Methods: Data were derived from two cohorts in Taiwan. The first cohort included 970 hypertensive patients recruited from six medical centers between 2004 and 2005, who were split into a training set (n = 679), a validation set (n = 146), and a test set (n = 145) for model development and internal validation. The second cohort included 416 hypertensive patients recruited from a single medical center between 2012 and 2020 and was used for external validation. We used 33 clinical characteristics as candidate variables to develop models based on logistic regression (LR), random forest (RF), eXtreme Gradient Boosting (XGBoost), and artificial neural network (ANN). Results: The four models featured high sensitivity and high negative predictive value (NPV) in internal validation (sensitivity = 0.914–1.000; NPV = 0.853–1.000) and external validation (sensitivity = 0.950–1.000; NPV = 0.875–1.000). The RF, XGBoost, and ANN models showed much higher area under the receiver operating characteristic curve (AUC) (0.799–0.851 in internal validation, 0.672–0.837 in external validation) than the LR model. Among the models, the RF model, composed of 6 predictor variables, had the best overall performance in both internal and external validation (AUC = 0.851 and 0.837; sensitivity = 1.000 and 1.000; specificity = 0.609 and 0.580; NPV = 1.000 and 1.000; accuracy = 0.766 and 0.721, respectively). Conclusion: An effective machine learning-based predictive model that requires data from a single clinic visit may help to identify masked hypertension and masked uncontrolled hypertension.
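The metrics quoted above all derive from a single 2x2 confusion matrix, and a small sketch makes the trade-off visible. The counts below are hypothetical: they are not reported in the abstract and were chosen only because, on a 145-case set, they happen to reproduce the RF model's internal validation figures (sensitivity 1.000, specificity 0.609, NPV 1.000, accuracy 0.766). The function name is also an illustration.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute screening metrics from the four cells of a confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),   # share of true cases that are flagged
        "specificity": tn / (tn + fp),   # share of non-cases correctly cleared
        "npv": tn / (tn + fn),           # share of negative calls that are right
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts (not from the paper), 145 cases in total.
metrics = binary_metrics(tp=58, fp=34, tn=53, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With no false negatives, sensitivity and NPV both reach 1.000 even though more than a third of the non-cases are flagged, which is the profile of a rule-out screening tool rather than a confirmatory test.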


Long-term outcomes of endoscopic resection followed by additional surgery after non-curative resection in undifferentiated-type early gastric cancer: a nationwide multi-center study

Surgical Endoscopy ◽

10.1007/s00464-021-08464-4 ◽

2021 ◽

Author(s):

Jie-Hyun Kim ◽

Young-Il Kim ◽

Ji Yong Ahn ◽

Woon Geon Shin ◽

Hyo-Joon Yang ◽

...

Keyword(s):

Gastric Cancer ◽

Early Gastric Cancer ◽

Endoscopic Resection ◽

Curative Resection ◽

Long Term Outcomes ◽

Additional Surgery ◽

Undifferentiated Type ◽

Multi Center Study ◽

Center Study


Exploiting Rules to Enhance Machine Learning in Extracting Information From Multi-Institutional Prostate Pathology Reports

JCO Clinical Cancer Informatics ◽

10.1200/cci.20.00028 ◽

2020 ◽

pp. 865-874

Author(s):

Enrico Santus ◽

Tal Schuster ◽

Amir M. Tahmasebi ◽

Clara Li ◽

Adam Yala ◽

...

Keyword(s):

Machine Learning ◽

Hybrid Systems ◽

High Performance ◽

Feature Model ◽

Training Data ◽

Gradient Boosting ◽

Support Vector ◽

Data Set ◽

Extreme Gradient Boosting ◽

Pathology Reports

PURPOSE Literature on clinical note mining has highlighted the superiority of machine learning (ML) over hand-crafted rules. Nevertheless, most studies assume the availability of large training sets, which is rarely the case. For this reason, in the clinical setting, rules are still common. We suggest 2 methods to leverage the knowledge encoded in pre-existing rules to inform ML decisions and obtain high performance, even with scarce annotations. METHODS We collected 501 prostate pathology reports from 6 American hospitals. Reports were split into 2,711 core segments, annotated with 20 attributes describing the histology, grade, extension, and location of tumors. The data set was split by institutions to generate a cross-institutional evaluation setting. We assessed 4 systems, namely a rule-based approach, an ML model, and 2 hybrid systems integrating the previous methods: a Rule as Feature model and a Classifier Confidence model. Several ML algorithms were tested, including logistic regression (LR), support vector machine (SVM), and eXtreme gradient boosting (XGB). RESULTS When training on data from a single institution, LR lags behind the rules by 3.5% (F1 score: 92.2% v 95.7%). Hybrid models, instead, obtain competitive results, with Classifier Confidence outperforming the rules by +0.5% (96.2%). When a larger amount of data from multiple institutions is used, LR improves by +1.5% over the rules (97.2%), whereas hybrid systems obtain +2.2% for Rule as Feature (97.7%) and +2.6% for Classifier Confidence (98.3%). Replacing LR with SVM or XGB yielded similar performance gains. CONCLUSION We developed methods to use pre-existing handcrafted rules to inform ML algorithms. These hybrid systems obtain better performance than either rules or ML models alone, even when training data are limited.
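The abstract names the two hybrid systems but does not describe how they are built. The sketch below shows one plausible reading of each, purely as an assumption: Rule as Feature appends the rule's output to the ML model's input vector, and Classifier Confidence keeps the ML prediction only when its confidence clears a threshold and otherwise falls back to the rule. The function names, the 0.8 threshold, and the example values are all hypothetical and are not taken from the paper.

```python
def rule_as_feature(features, rule_label):
    """Return the feature vector extended with the rule's output (assumed design)."""
    return list(features) + [float(rule_label)]


def classifier_confidence(ml_label, ml_confidence, rule_label, threshold=0.8):
    """Trust the ML label when it is confident, else fall back to the rule (assumed design)."""
    return ml_label if ml_confidence >= threshold else rule_label


# Hypothetical example: a segment with three numeric features and a rule that fired (1).
extended = rule_as_feature([0.2, 0.0, 1.5], rule_label=1)
print(extended)

# A low-confidence ML prediction defers to the rule; a high-confidence one does not.
print(classifier_confidence(ml_label=0, ml_confidence=0.55, rule_label=1))
print(classifier_confidence(ml_label=0, ml_confidence=0.95, rule_label=1))
```

Either design lets a model trained on few annotations lean on the existing rules, which matches the abstract's claim that the hybrids hold up when training data are limited.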


Tu1523 METACHRONOUS CANCER AFTER ENDOSCOPIC SUBMUCOSAL DISSECTION FOR EARLY GASTRIC CANCER WITH UNDIFFERENTIATED HISTOLOGY: A RETROSPECTIVE ANALYSIS

Gastrointestinal Endoscopy ◽

10.1016/j.gie.2020.03.3695 ◽

2020 ◽

Vol 91 (6) ◽

pp. AB601-AB602

Author(s):

Jung-Wook Kim ◽

Seunggyun Baek ◽

Byeungeun Ryu ◽

Yoo Min Park ◽

Chi Hyuk Oh ◽

...

Keyword(s):

Gastric Cancer ◽

Early Gastric Cancer ◽

Endoscopic Submucosal Dissection ◽

Retrospective Analysis ◽

Metachronous Cancer ◽

Submucosal Dissection ◽

Undifferentiated Histology


Machine learning-based approach for disease severity classification of carpal tunnel syndrome

Scientific Reports ◽

10.1038/s41598-021-97043-7 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Dougho Park ◽

Byung Hee Kim ◽

Sang-Eok Lee ◽

Dong Young Kim ◽

Mansu Kim ◽

...

Keyword(s):

Machine Learning ◽

Carpal Tunnel Syndrome ◽

Carpal Tunnel ◽

Rating Scale ◽

External Validation ◽

Gradient Boosting ◽

Severity Classification ◽

Extreme Gradient Boosting ◽

Tunnel Syndrome ◽

Multi Class Classification

Identifying the severity of carpal tunnel syndrome (CTS) is essential to providing appropriate therapeutic interventions. We developed and validated machine-learning (ML) models for classifying CTS severity. Here, 1037 CTS hands with 11 variables each were retrospectively analyzed. CTS was confirmed using electrodiagnosis, and its severity was classified into three grades: mild, moderate, and severe. The dataset was randomly split into a training (70%) and test (30%) set. A total of 507 mild, 276 moderate, and 254 severe CTS hands were included. Extreme gradient boosting (XGB) showed the highest external validation accuracy in the multi-class classification at 76.6% (95% confidence interval [CI] 71.2–81.5). XGB also had an optimal model training accuracy of 76.1%. Random forest (RF) and k-nearest neighbors had the second-highest external validation accuracy of 75.6% (95% CI 70.0–80.5). For the RF and XGB models, the numeric rating scale of pain was the most important variable, and body mass index was the second most important. The one-versus-rest classification yielded improved external validation accuracies for each severity grade compared with the multi-class classification (mild, 83.6%; moderate, 78.8%; severe, 90.9%). The CTS severity classification based on the ML model was validated and is readily applicable to aiding clinical evaluations.
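A short sketch helps explain why one-versus-rest accuracy tends to come out higher than multi-class accuracy when both are scored on the same set of predictions. In the one-versus-rest view for a given grade, a confusion between the other two grades still counts as correct. The labels below are hypothetical toy data, not from the study, and the function names are illustrative; the study may also have trained separate binary models, which this sketch does not attempt to reproduce.

```python
def multiclass_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted grade matches exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def one_vs_rest_accuracy(y_true, y_pred, positive):
    """Fraction of samples where 'is it the positive grade?' is answered correctly."""
    agree = sum((t == positive) == (p == positive) for t, p in zip(y_true, y_pred))
    return agree / len(y_true)


# Hypothetical toy labels: two hands per grade, two of six misclassified.
y_true = ["mild", "mild", "moderate", "moderate", "severe", "severe"]
y_pred = ["mild", "moderate", "moderate", "severe", "severe", "severe"]

overall = multiclass_accuracy(y_true, y_pred)
per_grade = {g: one_vs_rest_accuracy(y_true, y_pred, g) for g in ("mild", "moderate", "severe")}

print(f"multi-class accuracy: {overall:.3f}")
for grade, acc in per_grade.items():
    print(f"{grade} vs rest: {acc:.3f}")
```

In this toy case every one-versus-rest accuracy is at least as high as the multi-class accuracy, so the two kinds of figures are not directly comparable as measures of model quality.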


Tu1487 DERIVATION AND EXTERNAL VALIDATION OF A PREDICTION MODEL (BEST-J SCORE) OF BLEEDING AFTER ENDOSCOPIC SUBMUCOSAL DISSECTION FOR EARLY GASTRIC CANCER

Gastrointestinal Endoscopy ◽

10.1016/j.gie.2020.03.3659 ◽

2020 ◽

Vol 91 (6) ◽

pp. AB585-AB586

Author(s):

Masakuni Kobayashi ◽

Waku Hatta ◽

Yosuke Tsuji ◽

Toshiyuki Yoshio ◽

Naomi Kakushima ◽

...

Keyword(s):

Gastric Cancer ◽

Early Gastric Cancer ◽

Prediction Model ◽

Endoscopic Submucosal Dissection ◽

External Validation ◽

Submucosal Dissection


Histology as a Prognostic Factor in Early Gastric Cancer

Tumori Journal ◽

10.1177/030089169207800307 ◽

1992 ◽

Vol 78 (3) ◽

pp. 181-184

Author(s):

Massimo Ferrari ◽

Enrico Ghislandi ◽

Giuseppe Landonio ◽

Margherita Majno ◽

Tiziano Porretta ◽

...

Keyword(s):

Risk Factors ◽

Gastric Cancer ◽

Lymph Node ◽

Prognostic Factor ◽

Early Gastric Cancer ◽

Lymph Node Involvement ◽

Depth Of Invasion ◽

Undifferentiated Histology ◽

Node Involvement

Of 431 patients with gastric cancer observed in our Institution, 23 (5.3%) had early gastric cancer (EGC). Macroscopic presentation, histology, depth of invasion, and lymph node involvement were evaluated in all cases. All patients underwent surgery, and an intensive follow-up was performed. Five of the 23 patients progressed, and the risk factors were examined. Histology seemed to be the main prognostic factor in our study, since the intestinal type of EGC was associated with a significantly better prognosis. Total gastrectomy is indicated in the proximal localization of EGC, and should perhaps also be performed in cases presenting undifferentiated histology.


Prediction model of bleeding after endoscopic submucosal dissection for early gastric cancer: BEST-J score

Gut ◽

10.1136/gutjnl-2019-319926 ◽

2020 ◽

pp. gutjnl-2019-319926 ◽

Cited By ~ 1

Author(s):

Waku Hatta ◽

Yosuke Tsuji ◽

Toshiyuki Yoshio ◽

Naomi Kakushima ◽

Shu Hoteya ◽

...

Keyword(s):

Gastric Cancer ◽

High Risk ◽

Early Gastric Cancer ◽

Prediction Model ◽

Endoscopic Submucosal Dissection ◽

Clinical Decision Making ◽

Validation Cohort ◽

External Validation ◽

Derivation Cohort ◽

Submucosal Dissection

Objective: Bleeding after endoscopic submucosal dissection (ESD) for early gastric cancer (EGC) is a frequent adverse event after ESD. We aimed to develop and externally validate a clinically useful prediction model (BEST-J score: Bleeding after ESD Trend from Japan) for bleeding after ESD for EGC. Design: This retrospective study enrolled patients who underwent ESD for EGC. Patients in the derivation cohort (n=8291) were recruited from 25 institutions, and patients in the external validation cohort (n=2029) were recruited from eight institutions in other areas. In the derivation cohort, weighted points were assigned to predictors of bleeding determined in the multivariate logistic regression analysis and a prediction model was established. External validation of the model was conducted to analyse discrimination and calibration. Results: A prediction model comprised 10 variables (warfarin, direct oral anticoagulant, chronic kidney disease with haemodialysis, P2Y12 receptor antagonist, aspirin, cilostazol, tumour size >30 mm, lower-third in tumour location, presence of multiple tumours and interruption of each kind of antithrombotic agents). The rates of bleeding after ESD at low-risk (0 to 1 points), intermediate-risk (2 points), high-risk (3 to 4 points) and very high-risk (≥5 points) were 2.8%, 6.1%, 11.4% and 29.7%, respectively. In the external validation cohort, the model showed moderately good discrimination, with a c-statistic of 0.70 (95% CI, 0.64 to 0.76), and good calibration (calibration-in-the-large, 0.05; calibration slope, 1.01). Conclusions: In this nationwide multicentre study, we derived and externally validated a prediction model for bleeding after ESD. This model may be a good clinical decision-making support tool for ESD in patients with EGC.
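The risk strata described in this abstract amount to a simple lookup on the total score. The sketch below maps a point total to the stratum and the bleeding rate the abstract reports for it. The per-variable point weights are not given in the abstract, so the function takes an already-summed integer total; the function name and that input convention are assumptions for illustration.

```python
def best_j_risk(points):
    """Map a total BEST-J point score to (risk stratum, reported bleeding rate in %).

    Assumes `points` is a non-negative integer total; the per-variable weights
    are not stated in the abstract and are not modelled here.
    """
    if points < 0:
        raise ValueError("points must be non-negative")
    if points <= 1:
        return "low", 2.8
    if points == 2:
        return "intermediate", 6.1
    if points <= 4:
        return "high", 11.4
    return "very high", 29.7


for total in (0, 2, 4, 5):
    stratum, rate = best_j_risk(total)
    print(f"{total} points: {stratum} risk, reported bleeding rate {rate}%")
```

The roughly tenfold spread between the lowest and highest strata (2.8% versus 29.7%) is what makes the score potentially useful for triage, even with a moderate c-statistic of 0.70.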
