Predicting Survival of Patients With Rectal Neuroendocrine Tumors Using Machine Learning: A SEER-Based Population Study

2021 ◽  
Vol 8 ◽  
Author(s):  
Xiaoyun Cheng ◽  
Jinzhang Li ◽  
Tianming Xu ◽  
Kemin Li ◽  
Jingnan Li

Background: The number of patients diagnosed with rectal neuroendocrine tumors (R-NETs) is increasing year by year. An integrated survival predictive model is required to predict the prognosis of R-NETs. The present study aimed to explore the epidemiological characteristics of R-NETs in a retrospective study based on the Surveillance, Epidemiology, and End Results (SEER) database and to predict survival of R-NETs with machine learning. Methods: Data of patients with R-NETs were extracted from the SEER database (2000–2017), and data were also retrospectively collected from a single medical center in China. The main outcome measure was the 5-year survival status. Risk factors affecting survival were analyzed by Cox regression analysis, and six common machine learning algorithms were chosen to build the predictive models. Data from the SEER database were divided into a training set and an internal validation set, with the year 2010 as the dividing time point. Data from China were used as an external validation set. The best machine learning predictive model was compared with the American Joint Committee on Cancer (AJCC) seventh staging system to evaluate its predictive performance in the internal and external validation datasets. Results: A total of 10,580 patients from the SEER database and 68 patients from a single medical center were included in the analysis. Age, gender, race, histologic type, tumor size, tumor number, summary stage, and surgical treatment were risk factors affecting survival status. After parameter tuning and comparison of the algorithms, the predictive model using the eXtreme Gradient Boosting (XGBoost) algorithm had the best predictive performance in the training set [area under the curve (AUC) = 0.87, 95% CI: 0.86–0.88]. In the internal validation, the predictive ability of XGBoost was better than that of the AJCC seventh staging system (AUC: 0.90 vs. 0.78). In the external validation, the XGBoost predictive model (AUC = 0.89) performed better than the AJCC seventh staging system (AUC = 0.83). Conclusions: The XGBoost algorithm had better predictive power than the AJCC seventh staging system and has potential value for clinical application.
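The temporal split and the AUC comparison described in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' code: the field names (`year`, `died_within_5y`, `score`), the toy records, and the decision to place 2010 itself in the training set are all assumptions, since the abstract only says that the SEER data were divided at the year 2010.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, float]


def temporal_split(records: List[Record], cutoff_year: int = 2010) -> Tuple[List[Record], List[Record]]:
    """Split records into (training, internal validation) by year of diagnosis.

    Assumption: records from the cutoff year itself go to the training set.
    """
    train = [r for r in records if r["year"] <= cutoff_year]
    valid = [r for r in records if r["year"] > cutoff_year]
    return train, valid


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a random
    negative case, counting tied scores as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy records; "score" stands in for a model's predicted risk.
records = [
    {"year": 2004, "died_within_5y": 0, "score": 0.10},
    {"year": 2009, "died_within_5y": 1, "score": 0.80},
    {"year": 2012, "died_within_5y": 0, "score": 0.20},
    {"year": 2013, "died_within_5y": 1, "score": 0.70},
    {"year": 2015, "died_within_5y": 0, "score": 0.75},
    {"year": 2016, "died_within_5y": 1, "score": 0.90},
]

train, valid = temporal_split(records)
valid_auc = auc(
    [int(r["died_within_5y"]) for r in valid],
    [r["score"] for r in valid],
)
print(len(train), len(valid), valid_auc)
```

Splitting by calendar year rather than at random means the validation patients are all diagnosed later than the training patients, which is a somewhat harder test than a random hold-out.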

2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Yue Gao ◽  
Lingxi Chen ◽  
Jianhua Chi ◽  
Shaoqing Zeng ◽  
Xikang Feng ◽  
...  

Abstract Background Immune and inflammatory dysfunction was reported to underpin critical COVID-19 (coronavirus disease 2019). We aimed to develop a machine learning model that enables accurate prediction of critical COVID-19 using immune-inflammatory features at admission. Methods We retrospectively collected 2076 consecutive COVID-19 patients with definite outcomes (discharge or death) between January 27, 2020 and March 30, 2020 from two hospitals in China. Critical illness was defined as admission to intensive care unit, receiving invasive ventilation, or death. Least Absolute Shrinkage and Selection Operator (LASSO) was applied for feature selection. Five machine learning algorithms, including Logistic Regression (LR), Support Vector Machine (SVM), Gradient Boosted Decision Tree (GBDT), K-Nearest Neighbor (KNN), and Neural Network (NN), were built in a training dataset and assessed in an internal validation dataset and an external validation dataset. Results Six features (procalcitonin, [T + B + NK cell] count, interleukin 6, C reactive protein, interleukin 2 receptor, T-helper lymphocyte/T-suppressor lymphocyte) were finally used for model development. The five models displayed varying but consistently promising predictive performance. Notably, the ensemble model, SPMCIIP (severity prediction model for COVID-19 by immune-inflammatory parameters), derived from three contributive algorithms (SVM, GBDT, and NN), achieved the best performance, with an area under the curve (AUC) of 0.991 (95% confidence interval [CI] 0.979–1.000) in the internal validation cohort and 0.999 (95% CI 0.998–1.000) in the external validation cohort for identifying patients with critical COVID-19. SPMCIIP could accurately and expeditiously predict the occurrence of critical COVID-19 approximately 20 days in advance. Conclusions The developed online prediction model SPMCIIP may facilitate intensive monitoring and early intervention for COVID-19 patients at high risk of critical illness.
Trial registration This study was retrospectively registered in the Chinese Clinical Trial Registry (ChiCTR2000032161).


2020 ◽  
Author(s):  
Sunae Ryu ◽  
Woo Jin Jung ◽  
Zheng Jiao ◽  
Jung Woo Chae ◽  
Hwi-yeol Yun

Aim: Several studies have reported population pharmacokinetic models for phenobarbital (PB), but the predictive performance of these models has not been well documented. This study aims to externally validate the predictive performance of published pharmacokinetic models. Methods: Therapeutic drug monitoring data collected in neonates and young infants treated with PB for seizure control were used for external validation. A literature review was conducted through PubMed to identify population pharmacokinetic models. Prediction- and simulation-based diagnostics and Bayesian forecasting were performed for external validation. The incorporation of size or maturity functions into the published models was also tested for prediction improvement. Results: A total of 79 serum concentrations from 28 subjects were included in the external validation dataset. Seven population pharmacokinetic studies of PB were selected for evaluation. The model by Voller et al. [27] showed the best performance in prediction-based evaluation. In simulation-based analyses, the normalized prediction distribution error of two models (those of Shellhaas et al. [24] and Marsot et al. [25]) followed a normal distribution. Bayesian forecasting with more than one observation improved predictive capability. Incorporation of both allometric size scaling and a maturation function generally enhanced the predictive performance, with marked improvement for the adult pharmacokinetic model. Conclusion: The predictive performance of published pharmacokinetic models of PB was diverse, and validation may be necessary before extrapolating to different clinical settings. Our findings suggest that Bayesian forecasting improves the prediction of individual concentrations in pediatric patients.
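To make "allometric size scaling and maturation function" concrete, the sketch below uses one widely used parameterisation: clearance scaled by body weight with a fixed exponent, multiplied by a sigmoid (Hill-type) maturation term in postmenstrual age. The abstract does not state which functional forms or parameter values the study tested, so the equations, the 0.75 exponent, the 70 kg reference weight, and every numeric value here are assumptions for illustration only.

```python
def allometric_clearance(cl_std: float, weight_kg: float,
                         exponent: float = 0.75, ref_weight_kg: float = 70.0) -> float:
    """Scale a standard (reference-weight) clearance to a given body weight.

    Assumed form: CL = CL_std * (WT / WT_ref) ** exponent.
    """
    return cl_std * (weight_kg / ref_weight_kg) ** exponent


def maturation_fraction(pma_weeks: float, tm50_weeks: float, hill: float) -> float:
    """Sigmoid maturation function: fraction of mature clearance reached at a
    given postmenstrual age (PMA). Equals 0.5 when PMA == TM50."""
    return pma_weeks ** hill / (tm50_weeks ** hill + pma_weeks ** hill)


def scaled_clearance(cl_std: float, weight_kg: float, pma_weeks: float,
                     tm50_weeks: float, hill: float) -> float:
    """Combine size scaling and maturation into one predicted clearance."""
    return allometric_clearance(cl_std, weight_kg) * maturation_fraction(
        pma_weeks, tm50_weeks, hill
    )


# Hypothetical neonate and hypothetical parameters (not taken from the study).
neonate_cl = scaled_clearance(
    cl_std=0.3,      # L/h for a 70 kg adult, illustrative only
    weight_kg=3.5,
    pma_weeks=40.0,
    tm50_weeks=40.0,
    hill=3.0,
)
print(neonate_cl)
```

With a form like this, an adult model that lacks any size or age covariates can be extended to neonates by adding two multiplicative terms, which is one plausible reason the adult model benefits most from the adjustment.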


2019 ◽  
Author(s):  
Zied Hosni ◽  
Annalisa Riccardi ◽  
Stephanie Yerdelen ◽  
Alan R. G. Martin ◽  
Deborah Bowering ◽  
...  

Polymorphism is the capacity of a molecule to adopt different conformations or molecular packing arrangements in the solid state. This is a key property to control during pharmaceutical manufacturing because it can impact a range of properties including stability and solubility. In this study, a novel approach based on machine learning classification methods is used to predict the likelihood that an organic compound will crystallise in multiple forms. A training dataset of drug-like molecules was curated from the Cambridge Structural Database (CSD) and filtered according to entries in the Drug Bank database. The number of separate forms in the CSD for each molecule was recorded. A metaclassifier was trained using this dataset to predict the expected number of crystalline forms from the compound descriptors. This approach was used to estimate the number of crystallographic forms for an external validation dataset. These results suggest this novel methodology can be used to predict the extent of polymorphism of new drugs or not-yet experimentally screened molecules. This promising method complements expensive ab initio methods for crystal structure prediction and, as an integral part of experimental physical form screening, may identify systems with unexplored potential.


2020 ◽  
Vol 9 (11) ◽  
pp. 3427 ◽  
Author(s):  
Youn I Choi ◽  
Sung Jin Park ◽  
Jun-Won Chung ◽  
Kyoung Oh Kim ◽  
Jae Hee Cho ◽  
...  

Background: The incidence and global burden of inflammatory bowel disease (IBD) have steadily increased in the past few decades. Improved methods to stratify risk and predict disease-related outcomes are required for IBD. Aim: The aim of this study was to develop and validate a machine learning (ML) model to predict the 5-year risk of starting biologic agents in IBD patients. Method: We applied an ML method to the database of the Korean common data model (K-CDM) network, a data sharing consortium of tertiary centers in Korea, to develop a model to predict the 5-year risk of starting biologic agents in IBD patients. The records analyzed were those of patients diagnosed with IBD between January 2006 and June 2017 at Gil Medical Center (GMC; n = 1299) or present in the K-CDM network (n = 3286). The ML algorithm was developed to predict the 5-year risk of starting biologic agents in IBD patients using data from GMC and was externally validated with the K-CDM network database. Result: The ML model for prediction of IBD-related outcomes at 5 years after diagnosis yielded an area under the curve (AUC) of 0.86 (95% CI: 0.82–0.92) in an internal validation study carried out at GMC. The model performed consistently across a range of other datasets, including that of the K-CDM network (AUC = 0.81; 95% CI: 0.80–0.85), in an external validation study. Conclusion: The ML-based prediction model can be used to identify IBD-related outcomes in patients at risk, enabling physicians to perform close follow-up based on the patient’s risk level, estimated through the ML algorithm.


2019 ◽  
Vol 8 (6) ◽  
pp. 799 ◽  
Author(s):  
Cheng-Shyuan Rau ◽  
Shao-Chun Wu ◽  
Jung-Fang Chuang ◽  
Chun-Ying Huang ◽  
Hang-Tsung Liu ◽  
...  

Background: We aimed to build a model using machine learning for the prediction of survival in trauma patients and compared these model predictions to those predicted by the most commonly used algorithm, the Trauma and Injury Severity Score (TRISS). Methods: Enrolled hospitalized trauma patients from 2009 to 2016 were divided into a training dataset (70% of the original data set) for generation of a plausible model under supervised classification, and a test dataset (30% of the original data set) to test the performance of the model. The training and test datasets comprised 13,208 (12,871 survival and 337 mortality) and 5603 (5473 survival and 130 mortality) patients, respectively. With the provision of additional information such as pre-existing comorbidity status or laboratory data, logistic regression (LR), support vector machine (SVM), and neural network (NN) (with the Stuttgart Neural Network Simulator (RSNNS)) were used to build models of survival prediction and compared to the predictive performance of TRISS. Predictive performance was evaluated by accuracy, sensitivity, and specificity, as well as by area under the curve (AUC) measures of receiver operating characteristic curves. Results: In the validation dataset, NN and the TRISS presented the highest score (82.0%) for balanced accuracy, followed by SVM (75.2%) and LR (71.8%) models. In the test dataset, NN had the highest balanced accuracy (75.1%), followed by the TRISS (70.2%), SVM (70.6%), and LR (68.9%) models. All four models (LR, SVM, NN, and TRISS) exhibited a high accuracy of more than 97.5% and a sensitivity of more than 98.6%. However, NN exhibited the highest specificity (51.5%), followed by the TRISS (41.5%), SVM (40.8%), and LR (38.5%) models. Conclusions: These four models (LR, SVM, NN, and TRISS) exhibited a similar high accuracy and sensitivity in predicting the survival of the trauma patients. In the test dataset, the NN model had the highest balanced accuracy and predictive specificity.
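The gap between overall accuracy and balanced accuracy in this abstract follows from class imbalance, which a short sketch makes visible. The class sizes (5473 survivors, 130 deaths) are the test-set counts given in the abstract; treating survival as the positive class is an assumption inferred from the high sensitivity and low specificity reported, and the "always predict survival" classifier is hypothetical.

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Compute accuracy, sensitivity, specificity and balanced accuracy
    from the four cells of a confusion matrix."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical classifier that labels every test patient as a survivor.
# Positive class = survival (assumption); counts are the test-set class sizes.
trivial = classification_metrics(tp=5473, fn=0, tn=0, fp=130)
print(trivial)
```

On class sizes like these, labelling every patient a survivor already gives about 97.7% accuracy with a balanced accuracy of only 50%, so specificity and balanced accuracy are the more informative columns when comparing the four models.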


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Hojjat Salehinejad ◽  
Jumpei Kitamura ◽  
Noah Ditkofsky ◽  
Amy Lin ◽  
Aditya Bharatha ◽  
...  

Abstract Machine learning (ML) holds great promise in transforming healthcare. While published studies have shown the utility of ML models in interpreting medical imaging examinations, these are often evaluated under laboratory settings. The importance of real world evaluation is best illustrated by case studies that have documented successes and failures in the translation of these models into clinical environments. A key prerequisite for the clinical adoption of these technologies is demonstrating generalizable ML model performance under real world circumstances. The purpose of this study was to demonstrate that ML model generalizability is achievable in medical imaging with the detection of intracranial hemorrhage (ICH) on non-contrast computed tomography (CT) scans serving as the use case. An ML model was trained using 21,784 scans from the RSNA Intracranial Hemorrhage CT dataset while generalizability was evaluated using an external validation dataset obtained from our busy trauma and neurosurgical center. This real world external validation dataset consisted of every unenhanced head CT scan (n = 5965) performed in our emergency department in 2019 without exclusion. The model demonstrated an AUC of 98.4%, sensitivity of 98.8%, and specificity of 98.0%, on the test dataset. On external validation, the model demonstrated an AUC of 95.4%, sensitivity of 91.3%, and specificity of 94.1%. Evaluating the ML model using a real world external validation dataset that is temporally and geographically distinct from the training dataset indicates that ML generalizability is achievable in medical imaging applications.


BMJ ◽  
2019 ◽  
pp. l4293 ◽  
Author(s):  
Mohammed T Hudda ◽  
Mary S Fewtrell ◽  
Dalia Haroun ◽  
Sooky Lum ◽  
Jane E Williams ◽  
...  

Abstract Objectives To develop and validate a prediction model for fat mass in children aged 4-15 years using routinely available risk factors of height, weight, and demographic information without the need for more complex forms of assessment. Design Individual participant data meta-analysis. Setting Four population based cross sectional studies and a fifth study for external validation, United Kingdom. Participants A pooled derivation dataset (four studies) of 2375 children and an external validation dataset of 176 children with complete data on anthropometric measurements and deuterium dilution assessments of fat mass. Main outcome measure Multivariable linear regression analysis, using backwards selection for inclusion of predictor variables and allowing non-linear relations, was used to develop a prediction model for fat-free mass (and subsequently fat mass by subtracting resulting estimates from weight) based on the four studies. Internal validation and then internal-external cross validation were used to examine overfitting and generalisability of the model’s predictive performance within the four development studies; external validation followed using the fifth dataset. Results Model derivation was based on a multi-ethnic population of 2375 children (47.8% boys, n=1136) aged 4-15 years. The final model containing predictor variables of height, weight, age, sex, and ethnicity had extremely high predictive ability (optimism adjusted R²: 94.8%, 95% confidence interval 94.4% to 95.2%) with excellent calibration of observed and predicted values. The internal validation showed minimal overfitting and good model generalisability, with excellent calibration and predictive performance. External validation in 176 children aged 11-12 years showed promising generalisability of the model (R²: 90.0%, 95% confidence interval 87.2% to 92.8%) with good calibration of observed and predicted fat mass (slope: 1.02, 95% confidence interval 0.97 to 1.07).
The mean difference between observed and predicted fat mass was −1.29 kg (95% confidence interval −1.62 to −0.96 kg). Conclusion The developed model accurately predicted levels of fat mass in children aged 4-15 years. The prediction model is based on simple anthropometric measures without the need for more complex forms of assessment and could improve the accuracy of assessments for body fatness in children (compared with those provided by body mass index) for effective surveillance, prevention, and management of clinical and public health obesity.
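The two-step calculation in this abstract (predict fat-free mass, subtract it from weight to obtain fat mass, then compare with the measured value) can be sketched as below. The regression coefficients are not given in the abstract, so the predicted fat-free mass values are supplied directly as hypothetical inputs; the sign convention (observed minus predicted) and all numbers are assumptions for illustration.

```python
from statistics import mean
from typing import List, Sequence


def predicted_fat_mass(weight_kg: float, predicted_ffm_kg: float) -> float:
    """Fat mass obtained by subtracting predicted fat-free mass from weight."""
    return weight_kg - predicted_ffm_kg


def mean_difference(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of (observed - predicted); a negative value means the model
    predicts more fat mass than was measured, on average."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return mean([o - p for o, p in zip(observed, predicted)])


# Hypothetical children: body weight, model-predicted fat-free mass, and
# fat mass measured by a reference method (all values in kg, illustrative).
weights = [30.0, 42.0, 55.0]
predicted_ffm = [24.0, 33.0, 41.0]
observed_fm = [5.5, 9.5, 13.0]

predicted_fm: List[float] = [
    predicted_fat_mass(w, ffm) for w, ffm in zip(weights, predicted_ffm)
]
bias = mean_difference(observed_fm, predicted_fm)
print(predicted_fm, bias)
```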


Author(s):  
Jacopo Burrello ◽  
Martina Amongero ◽  
Fabrizio Buffolo ◽  
Elisa Sconfienza ◽  
Vittorio Forestiero ◽  
...  

Abstract Context The diagnostic work-up of primary aldosteronism (PA) includes screening and confirmation steps. Case confirmation is time-consuming and expensive, and there is no consensus on the tests and thresholds to be used. Diagnostic algorithms to avoid confirmatory testing may be useful for the management of patients with PA. Objective Development and validation of diagnostic models to confirm or exclude PA diagnosis in patients with a positive screening test. Design, Patients and Setting We evaluated 1,024 patients who underwent confirmatory testing for PA. The diagnostic models were developed in a training cohort (n=522), and then tested on an internal validation cohort (n=174) and on an independent external prospective cohort (n=328). Main outcome measure Different diagnostic models and a 16-point score were developed by machine learning and regression analysis to discriminate patients with a confirmed diagnosis of PA. Results Male sex, antihypertensive medication, plasma renin activity, aldosterone, potassium levels and presence of organ damage were associated with a confirmed diagnosis of PA. Machine learning based models displayed an accuracy of 72.9-83.9%. The Primary Aldosteronism Confirmatory Testing (PACT) score correctly classified 84.1% at training and 83.9% and 81.1% at internal and external validation, respectively. A flow chart employing the PACT score to select patients for confirmatory testing correctly managed all patients and resulted in a 22.8% reduction in the number of confirmatory tests. Conclusions The integration of diagnostic modelling algorithms in clinical practice may improve the management of patients with PA by circumventing unnecessary confirmatory testing.


2021 ◽  
Vol 11 ◽  
Author(s):  
Shengnan Zhou ◽  
Shitao Jiang ◽  
Weijie Chen ◽  
Haixin Yin ◽  
Liangbo Dong ◽  
...  

Background: For this study, we explored the prognostic profiles of biliary neuroendocrine neoplasms (NENs) patients and identified factors related to prognosis. Further, we developed and validated an effective nomogram to predict the overall survival (OS) of individual patients with biliary NENs. Methods: We included a total of 446 biliary NENs patients from the SEER database. We used Kaplan-Meier curves to determine survival time. We employed univariate and multivariate Cox analyses to estimate hazard ratios to identify prognostic factors. We constructed a predictive nomogram based on the results of the multivariate analyses. In addition, we included 28 biliary NENs cases from our center as an external validation cohort. Results: The median survival time of biliary NENs from the SEER database was 31 months, and that of gallbladder NENs (23 months) was significantly shorter than that of NENs of the bile duct (45 months) and ampulla of Vater (33.5 months, p=0.023). Multivariate Cox analyses indicated that age, tumor size, pathological classification, SEER stage, and surgery were independent variables associated with survival. The constructed prognostic nomogram demonstrated good calibration and discrimination, with C-index values of 0.783 and 0.795 in the training and validation datasets, respectively. Conclusion: Age, tumor size, pathological classification, SEER stage, and surgery were predictors for the survival of biliary NENs. We developed a nomogram that could determine the 3-year and 5-year OS rates. Having been validated on data from our center, the novel nomogram is a useful tool for clinicians in estimating individual survival among biliary NENs patients.
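The C-index reported for the nomogram measures how often the patient with the higher predicted risk is also the one who dies sooner. The following is a minimal sketch of Harrell's concordance index under simplifying assumptions: pairs with tied survival times are ignored, tied risk scores count as one half, and the toy data and the use of a single numeric `risks` value per patient (for example, a nomogram's total points) are hypothetical.

```python
from typing import Sequence


def concordance_index(times: Sequence[float], events: Sequence[int],
                      risks: Sequence[float]) -> float:
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] == 1) strictly before patient j's follow-up time. The pair is
    concordant when patient i also has the higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier event in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times (months), event flags (1 = death observed,
# 0 = censored) and predicted risk scores for four patients.
times = [5, 10, 20, 30]
events = [1, 0, 1, 1]
risks = [0.9, 0.4, 0.2, 0.6]

c_index = concordance_index(times, events, risks)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the 0.783 and 0.795 reported in the abstract.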


2022 ◽  
Vol 8 ◽  
Author(s):  
Jinzhang Li ◽  
Ming Gong ◽  
Yashutosh Joshi ◽  
Lizhong Sun ◽  
Lianjun Huang ◽  
...  

Background: Acute renal failure (ARF) is the most common major complication following cardiac surgery for acute aortic syndrome (AAS) and worsens the postoperative prognosis. Our aim was to establish a machine learning prediction model for ARF occurrence in AAS patients. Methods: We included AAS patient data from nine medical centers (n = 1,637) and analyzed the incidence of ARF and the risk factors for postoperative ARF. We used data from six medical centers to compare the performance of four machine learning models and performed internal validation to identify AAS patients who developed postoperative ARF. The area under the curve (AUC) of the receiver operating characteristic (ROC) curve was used to compare the performance of the predictive models. We compared the performance of the optimal machine learning prediction model with that of traditional prediction models. Data from three medical centers were used for external validation. Results: The eXtreme Gradient Boosting (XGBoost) algorithm performed best in the internal validation process (AUC = 0.82), which was better than both the logistic regression (LR) prediction model (AUC = 0.77, p < 0.001) and the traditional scoring systems. Upon external validation, the XGBoost prediction model (AUC = 0.81) also performed better than both the LR prediction model (AUC = 0.75, p = 0.03) and the traditional scoring systems. We created an online application based on the XGBoost prediction model. Conclusions: We have developed a machine learning model that has better predictive performance than traditional LR prediction models as well as other existing risk scoring systems for postoperative ARF. This model can be utilized to provide early warnings when high-risk patients are found, enabling clinicians to take prompt measures.

