Application of machine learning in the diagnosis of gastric cancer based on noninvasive characteristics

PLoS ONE ◽  
2020 ◽  
Vol 15 (12) ◽  
pp. e0244869
Author(s):  
Shuang-Li Zhu ◽  
Jie Dong ◽  
Chenjing Zhang ◽  
Yao-Bo Huang ◽  
Wensheng Pan

Background The diagnosis of gastric cancer mainly relies on endoscopy, which is invasive and costly. Aims To construct a predictive model for the diagnosis of gastric cancer with high accuracy based on noninvasive characteristics. Methods A retrospective study of 709 patients at Zhejiang Provincial People's Hospital was conducted. Variables of age, gender, blood cell count, liver function, kidney function, blood lipids, tumor markers and pathological results were analyzed. We used gradient boosting decision tree (GBDT), a type of machine learning method, to construct a predictive model for the diagnosis of gastric cancer and evaluated the accuracy of the model. Results Of the 709 patients, 398 were diagnosed with gastric cancer; 311 were healthy people or diagnosed with benign gastric disease. Multivariate analysis showed that gender, age, neutrophil-lymphocyte ratio, hemoglobin, albumin, carcinoembryonic antigen (CEA), carbohydrate antigen 125 (CA125) and carbohydrate antigen 199 (CA199) were independent characteristics associated with gastric cancer. We constructed a predictive model using GBDT, and the area under the receiver operating characteristic curve (AUC) of the model was 91%. For the test dataset, sensitivity was 87.0% and specificity 84.1% at the optimal threshold value of 0.56. The overall accuracy was 83.0%. Positive and negative predictive values were 83.0% and 87.8%, respectively. Conclusion We constructed a predictive model to diagnose gastric cancer with high sensitivity and specificity. The model is noninvasive and may reduce medical costs.
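As an illustration of the kind of pipeline this abstract describes, here is a minimal sketch of training a gradient boosting classifier and reporting test-set AUC with scikit-learn. The features are synthetic stand-ins for the study's noninvasive variables, not the authors' data or code.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 709
# Synthetic stand-ins for seven predictors (e.g., age, NLR, hemoglobin,
# albumin, CEA, CA125, CA199); purely illustrative values.
X = rng.normal(size=(n, 7))
y = (X[:, 4] + 0.5 * X[:, 0] + rng.normal(size=n) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"test AUC: {auc:.2f}")
```

Thresholding `predict_proba` at a chosen cutoff (the abstract reports 0.56 as optimal) then yields the sensitivity/specificity trade-off.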

Author(s):  
Kazutaka Uchida ◽  
Junichi Kouno ◽  
Shinichi Yoshimura ◽  
Norito Kinjo ◽  
Fumihiro Sakakibara ◽  
...  

Abstract In conjunction with recent advancements in machine learning (ML), such technologies have been applied in various fields owing to their high predictive performance. We aimed to develop a prehospital stroke scale with ML. We conducted a multi-center retrospective and prospective cohort study. The training cohort comprised eight centers in Japan from June 2015 to March 2018, and the test cohort comprised 13 centers from April 2019 to March 2020. We used three different ML algorithms (logistic regression, random forests, XGBoost) to develop the models. The main outcomes were large vessel occlusion (LVO), intracranial hemorrhage (ICH), subarachnoid hemorrhage (SAH), and cerebral infarction (CI) other than LVO. The predictive abilities were validated in the test cohort with accuracy, positive predictive value, sensitivity, specificity, area under the receiver operating characteristic curve (AUC), and F score. The training cohort included 3178 patients with 337 LVO, 487 ICH, 131 SAH, and 676 CI cases, and the test cohort included 3127 patients with 183 LVO, 372 ICH, 90 SAH, and 577 CI cases. The overall accuracies were 0.65, and the positive predictive values, sensitivities, specificities, AUCs, and F scores were stable in the test cohort. The classification abilities were also fair for all ML models. The AUCs for LVO of logistic regression, random forests, and XGBoost were 0.89, 0.89, and 0.88, respectively, in the test cohort, and these values were higher than those of previously reported prediction models for LVO. The ML models developed to predict the probability and type of stroke at the prehospital stage had superior predictive abilities.
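The multi-model, multi-class comparison described here can be sketched as follows on synthetic four-class data (mirroring the four outcome categories), using one-vs-rest AUC. Scikit-learn's GradientBoostingClassifier stands in for XGBoost; this is an assumption-laden illustration, not the authors' code.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Four synthetic outcome classes standing in for LVO, ICH, SAH, other CI
X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),  # XGBoost stand-in
}
aucs = {}
for name, m in models.items():
    m.fit(X_tr, y_tr)
    # One-vs-rest AUC averaged over the four outcome classes
    aucs[name] = roc_auc_score(y_te, m.predict_proba(X_te), multi_class="ovr")
    print(f"{name}: AUC {aucs[name]:.2f}")
```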


2021 ◽  
Vol 8 ◽  
Author(s):  
Ruixia Cui ◽  
Wenbo Hua ◽  
Kai Qu ◽  
Heran Yang ◽  
Yingmu Tong ◽  
...  

Sepsis-associated coagulation dysfunction greatly increases the mortality of sepsis. Irregular clinical time-series data remain a major challenge for AI medical applications. To enable early detection and management of sepsis-induced coagulopathy (SIC) and sepsis-associated disseminated intravascular coagulation (DIC), we developed an interpretable real-time sequential warning model for real-world irregular data. Eight machine learning models, including novel algorithms, were devised to detect SIC and sepsis-associated DIC 8n (1 ≤ n ≤ 6) hours prior to onset. Models were developed on Xi'an Jiaotong University Medical College (XJTUMC) data and verified on Beth Israel Deaconess Medical Center (BIDMC) data. A total of 12,154 SIC and 7,878 International Society on Thrombosis and Haemostasis (ISTH) overt-DIC labels were annotated according to the SIC and ISTH overt-DIC scoring systems in the training set. The area under the receiver operating characteristic curve (AUROC) was used as the model evaluation metric. The eXtreme Gradient Boosting (XGBoost) model predicted SIC and sepsis-associated DIC events up to 48 h earlier with AUROCs of 0.929 and 0.910, respectively, and even reached 0.973 and 0.955 at 8 h earlier, achieving the highest performance to date. The novel ODE-RNN model achieved continuous prediction at arbitrary time points, with AUROCs of 0.962 and 0.936 for SIC and DIC predicted 8 h earlier, respectively. In conclusion, our model can predict sepsis-associated SIC and DIC onset up to 48 h in advance, which helps maximize the time window for early management by physicians.
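A much-simplified sketch of the evaluation scheme, scoring AUROC at several prediction horizons (8, 24, 48 h), is shown below on synthetic data in which the label signal is deliberately weakened as the horizon grows. It illustrates only the horizon-wise evaluation, not the authors' XGBoost or ODE-RNN models or their irregular time-series handling.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
aucs_by_horizon = {}
for hours_ahead in (8, 24, 48):
    n = 800
    X = rng.normal(size=(n, 6))
    # Longer horizon, noisier label: a crude proxy for harder prediction
    noise = rng.normal(scale=hours_ahead / 8.0, size=n)
    y = (X[:, 0] + X[:, 1] + noise > 0).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)
    m = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
    aucs_by_horizon[hours_ahead] = roc_auc_score(
        y_te, m.predict_proba(X_te)[:, 1])
    print(f"{hours_ahead} h ahead: AUROC {aucs_by_horizon[hours_ahead]:.2f}")
```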


2020 ◽  
Vol 66 (11) ◽  
pp. 1396-1404 ◽  
Author(s):  
He S Yang ◽  
Yu Hou ◽  
Ljiljana V Vasovic ◽  
Peter A D Steel ◽  
Amy Chadburn ◽  
...  

Abstract Background Accurate diagnostic strategies to rapidly identify SARS-CoV-2 positive individuals for management of patient care and protection of health care personnel are urgently needed. The predominant diagnostic test is viral RNA detection by RT-PCR from nasopharyngeal swab specimens; however, the results are not promptly obtainable in all patient care locations. Routine laboratory testing, in contrast, is readily available with a turn-around time (TAT) usually within 1-2 hours. Method We developed a machine learning model incorporating patient demographic features (age, sex, race) with 27 routine laboratory tests to predict an individual's SARS-CoV-2 infection status. Laboratory testing results obtained within 2 days before the release of the SARS-CoV-2 RT-PCR result were used to train a gradient boosting decision tree (GBDT) model on 3,356 SARS-CoV-2 RT-PCR tested patients (1,402 positive and 1,954 negative) evaluated at a metropolitan hospital. Results The model achieved an area under the receiver operating characteristic curve (AUC) of 0.854 (95% CI: 0.829-0.878). Application of this model to an independent patient dataset from a separate hospital resulted in a comparable AUC (0.838), validating its generalization. Moreover, our model predicted initial SARS-CoV-2 RT-PCR positivity in 66% of individuals whose RT-PCR result changed from negative to positive within 2 days. Conclusion This model employing routine laboratory test results offers opportunities for early and rapid identification of high-risk SARS-CoV-2 infected patients before their RT-PCR results are available. It may play an important role in assisting the identification of SARS-CoV-2 infected patients in areas where RT-PCR testing is not accessible due to financial or supply constraints.
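The external-validation step reported here (train at one hospital, test at another) can be sketched as follows. Both "sites" are synthetic, with a mild covariate shift added to the second; the 30 features stand in for the demographics plus 27 routine labs. This is not the study's data or model.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)

def make_site(n, shift=0.0):
    # 30 synthetic features standing in for demographics + 27 routine labs;
    # `shift` simulates a covariate shift between hospitals.
    X = rng.normal(loc=shift, scale=1.0, size=(n, 30))
    y = (X[:, :5].sum(axis=1) + rng.normal(size=n) > 5 * shift).astype(int)
    return X, y

X_a, y_a = make_site(3356)         # training hospital
X_b, y_b = make_site(1000, 0.2)    # independent hospital

model = GradientBoostingClassifier(random_state=0).fit(X_a, y_a)
auc_ext = roc_auc_score(y_b, model.predict_proba(X_b)[:, 1])
print(f"external-validation AUC: {auc_ext:.2f}")
```

A comparable AUC at the held-out site, as the abstract reports (0.854 internal vs. 0.838 external), is the usual evidence of generalization.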


2018 ◽  
Vol 129 (4) ◽  
pp. 675-688 ◽  
Author(s):  
Samir Kendale ◽  
Prathamesh Kulkarni ◽  
Andrew D. Rosenberg ◽  
Jing Wang

Abstract Background Hypotension is a risk factor for adverse perioperative outcomes. Machine-learning methods allow large amounts of data to be used for the development of robust predictive analytics. The authors hypothesized that machine-learning methods can predict the risk of postinduction hypotension. Methods Data were extracted from the electronic health record of a single quaternary care center from November 2015 to May 2016 for patients over age 12 who underwent general anesthesia, without procedure exclusions. Multiple supervised machine-learning classification techniques were attempted, with postinduction hypotension (mean arterial pressure less than 55 mmHg within 10 min of induction by any measurement) as the primary outcome, and preoperative medications, medical comorbidities, induction medications, and intraoperative vital signs as features. Discrimination was assessed using cross-validated area under the receiver operating characteristic curve. The best performing model was tuned and final performance assessed using split-set validation. Results Out of 13,323 cases, 1,185 (8.9%) experienced postinduction hypotension. Area under the receiver operating characteristic curve using logistic regression was 0.71 (95% CI, 0.70 to 0.72), support vector machines was 0.63 (95% CI, 0.58 to 0.60), naive Bayes was 0.69 (95% CI, 0.67 to 0.69), k-nearest neighbor was 0.64 (95% CI, 0.63 to 0.65), linear discriminant analysis was 0.72 (95% CI, 0.71 to 0.73), random forest was 0.74 (95% CI, 0.73 to 0.75), neural nets 0.71 (95% CI, 0.69 to 0.71), and gradient boosting machine 0.76 (95% CI, 0.75 to 0.77). Test-set area for the gradient boosting machine was 0.74 (95% CI, 0.72 to 0.77). Conclusions The success of this technique in predicting postinduction hypotension demonstrates the feasibility of machine-learning models for predictive analytics in anesthesiology, with performance dependent on model selection and appropriate tuning.
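The cross-validated AUC comparison across model families can be sketched as below, on synthetic data with roughly the 9% outcome prevalence the abstract reports. The model set is a subset of the one studied, and none of this is the authors' code.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Imbalanced outcome (~9% positives), mirroring postinduction hypotension rates
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.91],
                           random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "LDA": LinearDiscriminantAnalysis(),
    "random forest": RandomForestClassifier(random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}
cv_auc = {}
for name, m in models.items():
    # Mean AUC over 5 cross-validation folds
    cv_auc[name] = cross_val_score(m, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: cross-validated AUC {cv_auc[name]:.2f}")
```

The best performer by cross-validated AUC would then be tuned and re-scored once on a held-out split, as in the abstract's split-set validation.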


2021 ◽  
Author(s):  
Jin Zheng ◽  
Weibin Shi

Abstract Background Peritoneal metastasis is a critical route of metastasis in gastric cancer, and patients with it tend to have a poor prognosis. Laparoscopy or laparotomy is still the major approach to diagnosing peritoneal metastasis at present. This study aimed to explore the factors affecting peritoneal metastasis of gastric cancer and to establish a nomogram to predict it preoperatively. Methods 1002 gastric cancer patients who underwent surgery without distant organ metastasis were included in the study. The nomogram was built with variables selected by univariate logistic regression and LASSO, and evaluated with internal and external validation ROC curves. Results Three factors, carbohydrate antigen 125, carbohydrate antigen 242, and serosal invasion status of the primary tumor, were enrolled in the nomogram. The AUC value was 0.922 (95% CI 0.897-0.947) in internal validation and 0.934 (95% CI 0.852-1.000) in external validation. Conclusions This study developed a nomogram with risk factors easily accessible before surgery in patients with gastric cancer, which predicts the probability of peritoneal metastasis well and would help clinicians make appropriate therapy strategies.
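The LASSO-based variable selection step that precedes nomogram construction can be sketched as follows: an L1-penalized logistic regression shrinks most coefficients to zero, and an unpenalized model is then refit on the surviving variables. The data are synthetic and the penalty strength is an illustrative assumption, not the study's choice.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic cohort: 15 candidate variables, only a few truly informative
X, y = make_classification(n_samples=1000, n_features=15, n_informative=3,
                           random_state=0)

# L1 (LASSO-style) penalty drives uninformative coefficients to exactly zero
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
selected = np.flatnonzero(lasso.coef_[0])
print(f"variables retained: {selected.tolist()}")

# Refit an unpenalized model on the selected variables only; its
# coefficients would supply the point scales of a nomogram
final = LogisticRegression(max_iter=1000).fit(X[:, selected], y)
```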


2019 ◽  
Vol 58 (01) ◽  
pp. 031-041 ◽  
Author(s):  
Sara Rabhi ◽  
Jérémie Jakubowicz ◽  
Marie-Helene Metzger

Objective The objective of this article was to compare the performance of health care-associated infection (HAI) detection between deep learning and conventional machine learning (ML) methods in French medical reports. Methods The corpus consisted of different types of medical reports (discharge summaries, surgery reports, consultation reports, etc.). A total of 1,531 medical text documents were extracted and deidentified in three French university hospitals. Each of them was labeled as presence (1) or absence (0) of HAI. We started by normalizing the records using a list of preprocessing techniques. We calculated an overall performance metric, the F1 score, to compare a deep learning method (convolutional neural network [CNN]) with the most popular conventional ML models (Bernoulli and multinomial naïve Bayes, k-nearest neighbors, logistic regression, random forests, extra-trees, gradient boosting, support vector machines). We applied Bayesian hyperparameter optimization to each model based on its HAI identification performance. We included the text representation as an additional hyperparameter for each model, using four different representations (bag of words, term frequency-inverse document frequency, word2vec, and GloVe). Results The CNN outperformed all other conventional ML algorithms for HAI classification. The best F1 score of 97.7% ± 3.6% and best area under the curve score of 99.8% ± 0.41% were achieved when the CNN was applied directly to the processed clinical notes without a pretrained word2vec embedding. Through receiver operating characteristic curve analysis, we achieved a good balance between false notifications (specificity of 0.937) and system detection capability (sensitivity of 0.962) using Youden's index as the reference. Conclusions The main drawback of CNNs is their opacity. To address this issue, we investigated the CNN's inner-layer activation values to visualize the most meaningful phrases in a document. This method could be used to build a phrase-based medical assistant algorithm to help infection control practitioners select relevant medical records. Our study demonstrated that the deep learning approach outperforms other classification algorithms for automatically identifying HAIs in medical reports.
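The Youden's index threshold selection mentioned above is a small, concrete computation: pick the ROC operating point maximizing J = sensitivity + specificity - 1. A minimal sketch with toy scores (not the study's outputs):

```python
import numpy as np
from sklearn.metrics import roc_curve

# Toy ground-truth labels and classifier scores, purely illustrative
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 0, 1])
y_score = np.array([0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9, 0.35, 0.65])

fpr, tpr, thresholds = roc_curve(y_true, y_score)
j = tpr - fpr            # Youden's J = sensitivity - (1 - specificity)
best = thresholds[np.argmax(j)]
print(f"Youden-optimal threshold: {best}")
```

Classifying at this threshold gives the sensitivity/specificity pair the abstract reports (0.962/0.937 in the study's case).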


Circulation ◽  
2020 ◽  
Vol 142 (Suppl_3) ◽  
Author(s):  
Stephanie O Frisch ◽  
Zeineb Bouzid ◽  
Jessica Zègre-Hemsey ◽  
Clifton W CALLAWAY ◽  
Holli A Devon ◽  
...  

Introduction: Overcrowded emergency departments (ED) and undifferentiated patients make the provision of care and resources challenging. We examined whether machine learning algorithms could identify ED patients’ disposition (hospitalization and critical care admission) using readily available objective triage data among patients with symptoms suggestive of acute coronary syndrome (ACS). Methods: This was a retrospective observational cohort study of adult patients who were triaged at the ED for a suspected coronary event. A total of 162 input variables (k) were extracted from the electronic health record: demographics (k=3), mode of transportation (k=1), past medical/surgical history (k=57), first ED vital signs (k=7), home medications (k=31), symptomology (k=40), and the computer-generated automatic interpretation of the 12-lead electrocardiogram (k=23). The primary outcomes were hospitalization and critical care admission (i.e., admission to an intensive or step-down care unit). We used 10-fold stratified cross-validation to evaluate the performance of five machine learning algorithms to predict the study outcomes: logistic regression, naïve Bayes, random forest, gradient boosting, and artificial neural network classifiers. We determined the best model by comparing the area under the receiver operating characteristic curve (AUC) of all models. Results: Included were 1201 patients (age 64±14 years; 39% female; 10% Black) with a total of 956 hospitalizations and 169 critical care admissions. The best performing machine learning classifier for the outcome of hospitalization was the gradient boosting machine with an AUC of 0.85 (95% CI, 0.82–0.89), 89% sensitivity, and F-score of 0.83; the random forest classifier performed best for the outcome of critical care admission with an AUC of 0.73 (95% CI, 0.70–0.77), 76% sensitivity, and F-score of 0.56. Conclusion: Predictive machine learning algorithms demonstrated excellent and good discriminative power for predicting hospitalization and critical care admission, respectively. Administrators and clinicians could benefit from machine learning approaches that predict hospitalization and critical care admission, to optimize and allocate scarce ED and hospital resources and provide optimal care.
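The 10-fold stratified cross-validation reporting AUC, sensitivity (recall), and F-score together can be sketched as below. The data are synthetic with a class balance loosely resembling the critical care outcome; this is not the study's dataset or feature set.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

# Synthetic imbalanced cohort (~14% positives)
X, y = make_classification(n_samples=1200, n_features=20, weights=[0.86],
                           random_state=0)

# Stratified folds preserve the outcome rate in every fold
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_validate(GradientBoostingClassifier(random_state=0), X, y,
                        cv=cv, scoring=("roc_auc", "recall", "f1"))
for metric in ("test_roc_auc", "test_recall", "test_f1"):
    print(f"{metric}: {scores[metric].mean():.2f}")
```

Stratification matters here because with rare outcomes an unstratified fold can end up with almost no positives, making recall and F-score unstable.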


Circulation ◽  
2021 ◽  
Vol 144 (Suppl_2) ◽  
Author(s):  
Tsung-Chien Lu ◽  
Eric H Chou ◽  
CHIH-HUNG WANG ◽  
Amir Mostafavi ◽  
Mario Tovar ◽  
...  

Introduction: Only scarce models have been developed for stratifying the risk of cardiac arrest in COVID-19 patients presenting to the ED with suspected pneumonia. Using a machine learning (ML) approach, we aimed to develop and validate ML models to predict in-hospital cardiac arrest (IHCA) in patients admitted from the ED. Hypothesis: We hypothesized that an ML approach can serve as a valuable tool in identifying patients at risk of IHCA in a timely fashion. Methods: We included COVID-19 patients admitted from the EDs of five hospitals in Texas between March and November 2020. All adult (≥ 18 years) patients were included if they had a positive RT-PCR for SARS-CoV-2 and also received a CXR examination for suspected pneumonia. Patients’ demographics, past medical history, vital signs at ED triage, CXR findings, and laboratory results were retrieved from the EMR system. The primary outcome (IHCA) was identified via a resuscitation code. Patients presenting with OHCA or without any blood testing were excluded. A nonrandom splitting strategy based on location was used to divide the dataset into a training cohort (one urban and two suburban hospitals) and a testing cohort (one urban and one suburban hospital) at around a 2-to-1 ratio. Three supervised ML models were trained, and their performances were evaluated and compared with the National Early Warning Score (NEWS) by the area under the receiver operating characteristic curve (AUC). Results: We included 1,485 records for analysis. Of them, 190 (12.8%) developed IHCA. Of the constructed ML models, Random Forest outperformed the others with the best AUC (0.930, 95% CI: 0.896-0.958), followed by Gradient Boosting (0.929, 95% CI: 0.891-0.959) and the Extra Trees classifier (0.909, 95% CI: 0.875-0.943). All constructed ML models performed significantly better than the NEWS scoring system (AUC: 0.787, 95% CI: 0.725-0.840). The top six important features selected included age, oxygen saturation at triage, and the laboratory values APTT, lactic acid, and LDH. Conclusions: The ML approach showed excellent discriminatory performance in identifying IHCA among patients with COVID-19 and suspected pneumonia. It has the potential to save more lives or inform end-of-life decision making if successfully implemented in the EMR system.
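Benchmarking a learned model against a fixed clinical score, as done here against NEWS, can be sketched as comparing AUCs on the same test set. The "fixed score" below is an illustrative weighted sum of two synthetic vitals, not the actual NEWS formula, and the data are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 1485
X = rng.normal(size=(n, 8))  # synthetic stand-ins for vitals/labs
y = (X[:, 0] + 0.7 * X[:, 1] + 0.3 * X[:, 2]
     + rng.normal(size=n) > 1.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

# Fixed rule: an unlearned score using only two of the variables
rule_score = X_te[:, 0] + X_te[:, 1]
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

auc_rule = roc_auc_score(y_te, rule_score)
auc_rf = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"fixed score AUC:  {auc_rule:.2f}")
print(f"random forest AUC: {auc_rf:.2f}")
```

Because both AUCs are computed on the identical held-out patients, the comparison isolates the value added by learning over the fixed rule.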


Surgery Today ◽  
2017 ◽  
Vol 48 (4) ◽  
pp. 388-394 ◽  
Author(s):  
Tsutomu Namikawa ◽  
Yasuhiro Kawanishi ◽  
Kazune Fujisawa ◽  
Eri Munekage ◽  
Jun Iwabu ◽  
...  
