Development of a Gene-Based Prediction Model for Recurrence of Colorectal Cancer Using an Ensemble Learning Algorithm

2021 ◽  
Vol 11 ◽  
Author(s):  
Han-Ching Chan ◽  
Amrita Chattopadhyay ◽  
Eric Y. Chuang ◽  
Tzu-Pin Lu

It is difficult to determine which patients with stage I and II colorectal cancer are at high risk of recurrence, qualifying them to undergo adjuvant chemotherapy. In this study, we aimed to determine a gene signature using gene expression data that could successfully identify high risk of recurrence among stage I and II colorectal cancer patients. First, a synthetic minority oversampling technique was used to address the problem of imbalanced data due to rare recurrence events. We then applied a sequential workflow of three methods (significance analysis of microarrays, logistic regression, and recursive feature elimination) to identify genes differentially expressed between patients with and without recurrence. To stabilize the prediction algorithm, we repeated the above processes on 10 subsets by bagging the training data set and then used support vector machine methods to construct the prediction models. The final predictions were determined by majority voting. The 10 models, using 51 differentially expressed genes, successfully predicted a high risk of recurrence within 3 years in the training data set, with a sensitivity of 91.18%. For the validation data sets, the sensitivity of the prediction with samples from two other countries was 80.00% and 91.67%. These prediction models can potentially function as a tool to decide if adjuvant chemotherapy should be administered after surgery for patients with stage I and II colorectal cancer.
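The final two steps described above (majority voting across the 10 bagged models, then scoring by sensitivity) can be sketched as follows. This is a minimal illustration, not the authors' code: the vote matrix and true labels are hypothetical, and the tie-breaking rule (a 5-5 split counts as recurrence) is an assumption, since the abstract does not say how ties among 10 models are resolved.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label chosen by most models (1 = recurrence, 0 = none).

    Assumption: an even split is resolved in favour of 1 (recurrence);
    the abstract does not specify a tie rule.
    """
    counts = Counter(votes)
    return 1 if counts[1] >= counts[0] else 0


def sensitivity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN), computed over the true recurrence cases."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


# Hypothetical votes from 10 bagged models for 4 patients (illustrative only).
model_votes = [
    [1, 1, 1, 0, 1, 1, 0, 1, 1, 1],  # 8 of 10 models vote recurrence
    [0, 0, 1, 0, 0, 0, 0, 1, 0, 0],  # 2 of 10
    [1, 0, 1, 1, 0, 1, 1, 0, 1, 1],  # 7 of 10
    [0, 1, 0, 0, 1, 0, 0, 0, 1, 0],  # 3 of 10
]
y_true = [1, 0, 1, 1]  # hypothetical ground truth

y_pred = [majority_vote(v) for v in model_votes]
print(y_pred)                       # [1, 0, 1, 0]
print(sensitivity(y_true, y_pred))  # 2 of 3 recurrences detected
```

Sensitivity is the natural headline metric here because a missed recurrence (false negative) means a high-risk patient is not offered adjuvant chemotherapy.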

2019 ◽  
Author(s):  
Suresh K Bhavnani ◽  
Bryant Dang ◽  
Rebekah Penton ◽  
Shyam Visweswaran ◽  
Kevin E Bassler ◽  
...  

BACKGROUND When older adult patients with hip fracture (HFx) have unplanned hospital readmissions within 30 days of discharge, it doubles their 1-year mortality, resulting in substantial personal and financial burdens. Although such unplanned readmissions are predominantly caused by reasons not related to HFx surgery, few studies have focused on how pre-existing high-risk comorbidities co-occur within and across subgroups of patients with HFx. OBJECTIVE This study aims to use a combination of supervised and unsupervised visual analytical methods to (1) obtain an integrated understanding of comorbidity risk, comorbidity co-occurrence, and patient subgroups, and (2) enable a team of clinical and methodological stakeholders to infer the processes that precipitate unplanned hospital readmission, with the goal of designing targeted interventions. METHODS We extracted a training data set consisting of 16,886 patients (8443 readmitted patients with HFx and 8443 matched controls) and a replication data set consisting of 16,222 patients (8111 readmitted patients with HFx and 8111 matched controls) from the 2010 and 2009 Medicare database, respectively. The analyses consisted of a supervised combinatorial analysis to identify and replicate combinations of comorbidities that conferred significant risk for readmission, an unsupervised bipartite network analysis to identify and replicate how high-risk comorbidity combinations co-occur across readmitted patients with HFx, and an integrated visualization and analysis of comorbidity risk, comorbidity co-occurrence, and patient subgroups to enable clinician stakeholders to infer the processes that precipitate readmission in patient subgroups and to propose targeted interventions. 
RESULTS The analyses helped to identify (1) 11 comorbidity combinations that conferred significantly higher risk (ranging from P<.001 to P=.01) for a 30-day readmission, (2) 7 biclusters of patients and comorbidities with a significant bicluster modularity (P<.001; Medicare=0.440; random mean 0.383 [0.002]), indicating strong heterogeneity in the comorbidity profiles of readmitted patients, and (3) inter- and intracluster risk associations, which enabled clinician stakeholders to infer the processes involved in the exacerbation of specific combinations of comorbidities leading to readmission in patient subgroups. CONCLUSIONS The integrated analysis of risk, co-occurrence, and patient subgroups enabled the inference of processes that precipitate readmission, leading to a comorbidity exacerbation risk model for readmission after HFx. These results have direct implications for (1) the management of comorbidities targeted at high-risk subgroups of patients with the goal of pre-emptively reducing their risk of readmission and (2) the development of more accurate risk prediction models that incorporate information about patient subgroups.


2012 ◽  
Vol 30 (15_suppl) ◽  
pp. e14117-e14117
Author(s):  
Campbell SD Roxburgh ◽  
Alan K Foulis ◽  
Manal Atwan ◽  
Paul G Horgan ◽  
Donald C. Mcmillan

e14117 Background: Venous invasion (VI) is a high-risk characteristic in colorectal cancer (CRC) and in stage II disease guides provision of adjuvant therapy. However, reported rates vary in published studies from 10-90%. We recently reported that use of elastica stains improves reproducibility of reporting, increasing rates to >50% (Roxburgh, Ann Surg, 2010). Furthermore, compared to H&E alone, elastica-detected VI provided superior prediction of 3-yr cancer survival in an unselected cohort of CRC patients. The present study aims to examine how the approach could be used in patients with node-negative CRC. Methods: We retrieved pre-2003 tumour blocks, sectioned and stained them with elastica. Post-2003, elastica-detected VI was routinely reported. A minimum of 3 blocks was required for analysis. Those who died within 30 days of surgery or had neoadjuvant therapy were excluded. Results: 244 stage I/II patients underwent surgery between 1997-2006. 65 cases pre-2003 were analyzed retrospectively. The rate of elastica-detected VI was 54%. Elastica-detected VI related to other high-risk pathology including T stage (p<0.001), serosal invasion (p<0.01), tumour grade (p<0.05) and low-grade lymphocytic infiltrate (p<0.05). Minimum follow-up was 5 yrs; mean follow-up 99 months (60-178), during which there were 99 deaths, 48 from cancer. Absence of VI related to improved 5-yr cancer-specific survival (93% vs 66%). On multivariate analysis, VI independently related to cancer-specific survival (HR=5.5, 95% CI 2-13, p<0.001) with margin involvement (HR=2.4, 95% CI 1-6, p=0.067) and serosal involvement (HR=2.2, 95% CI 1-4, p=0.015). For CRC mortality, the area under the receiver operator curve was highest for VI compared with other pathology (AUC 0.69, 95% CI 0.6-0.8, p<0.001). Absence of VI related to 5-yr survivals of 92% and 97% in colon and rectal cancer respectively.
Conclusions: More objective assessment of VI with routine elastica staining provides accurate prediction of survival in stage I/II CRC. Presence of VI was associated with a 5.5-fold increased risk of cancer death at 5 yrs. Such results support routine use of elastica stains to identify patients with node-negative disease at risk of recurrence.


10.2196/13567 ◽  
2020 ◽  
Vol 8 (10) ◽  
pp. e13567
Author(s):  
Suresh K Bhavnani ◽  
Bryant Dang ◽  
Rebekah Penton ◽  
Shyam Visweswaran ◽  
Kevin E Bassler ◽  
...  

Background When older adult patients with hip fracture (HFx) have unplanned hospital readmissions within 30 days of discharge, it doubles their 1-year mortality, resulting in substantial personal and financial burdens. Although such unplanned readmissions are predominantly caused by reasons not related to HFx surgery, few studies have focused on how pre-existing high-risk comorbidities co-occur within and across subgroups of patients with HFx. Objective This study aims to use a combination of supervised and unsupervised visual analytical methods to (1) obtain an integrated understanding of comorbidity risk, comorbidity co-occurrence, and patient subgroups, and (2) enable a team of clinical and methodological stakeholders to infer the processes that precipitate unplanned hospital readmission, with the goal of designing targeted interventions. Methods We extracted a training data set consisting of 16,886 patients (8443 readmitted patients with HFx and 8443 matched controls) and a replication data set consisting of 16,222 patients (8111 readmitted patients with HFx and 8111 matched controls) from the 2010 and 2009 Medicare database, respectively. The analyses consisted of a supervised combinatorial analysis to identify and replicate combinations of comorbidities that conferred significant risk for readmission, an unsupervised bipartite network analysis to identify and replicate how high-risk comorbidity combinations co-occur across readmitted patients with HFx, and an integrated visualization and analysis of comorbidity risk, comorbidity co-occurrence, and patient subgroups to enable clinician stakeholders to infer the processes that precipitate readmission in patient subgroups and to propose targeted interventions. 
Results The analyses helped to identify (1) 11 comorbidity combinations that conferred significantly higher risk (ranging from P<.001 to P=.01) for a 30-day readmission, (2) 7 biclusters of patients and comorbidities with a significant bicluster modularity (P<.001; Medicare=0.440; random mean 0.383 [0.002]), indicating strong heterogeneity in the comorbidity profiles of readmitted patients, and (3) inter- and intracluster risk associations, which enabled clinician stakeholders to infer the processes involved in the exacerbation of specific combinations of comorbidities leading to readmission in patient subgroups. Conclusions The integrated analysis of risk, co-occurrence, and patient subgroups enabled the inference of processes that precipitate readmission, leading to a comorbidity exacerbation risk model for readmission after HFx. These results have direct implications for (1) the management of comorbidities targeted at high-risk subgroups of patients with the goal of pre-emptively reducing their risk of readmission and (2) the development of more accurate risk prediction models that incorporate information about patient subgroups.


2016 ◽  
Vol 9 ◽  
pp. 6325-6332 ◽  
Author(s):  
Leonardo Muratori ◽  
Giulia Petroni ◽  
Lorenzo Antonuzzo ◽  
Luca Boni ◽  
Jessica Iorio ◽  
...  

2022 ◽  
Vol 34 (2) ◽  
pp. 1-17
Author(s):  
Rahman A. B. M. Salman ◽  
Lee Myeongbae ◽  
Lim Jonghyun ◽  
Yongyun Cho ◽  
Shin Changsun

Energy is one of the key inputs for a country's economic growth and social development. Analysis and modeling of industrial energy is currently a time-consuming process because more and more energy is consumed for economic growth in a smart factory. This study aims to present and analyse the predictive models of the data-driven system to be used by appliances and to identify the most significant product item. With repeated cross-validation, three statistical models were trained and then tested on a test set: 1) General Linear Regression Model (GLM), 2) Support Vector Machine (SVM), and 3) Boosting Tree (BT). The performance of the prediction models was measured by R2, Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Coefficient of Variation (CV). The best model in the study is the Support Vector Machine (SVM), which achieved an R2 of 0.86 for the training data set and 0.85 for the testing data set with a low coefficient of variation, and the most significant product of this smart factory is Skelp.
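The four evaluation metrics named above can be computed as in the sketch below. The numbers are hypothetical and are not taken from the study. One assumption is flagged explicitly: the coefficient of variation is taken here as RMSE divided by the mean of the observed values, a common convention in energy modeling that the abstract does not define.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and CV for paired observations and predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": rmse,
        "mae": mae,
        # Assumption: CV is defined as RMSE relative to the mean observed value.
        "cv": rmse / mean_y,
    }


# Hypothetical energy readings and model predictions (illustrative only).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

metrics = regression_metrics(observed, predicted)
print(metrics)
```

Reporting R2 on both the training and testing sets, as the abstract does (0.86 vs 0.85), is a quick check for overfitting: a large gap between the two would suggest the model does not generalize.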


2019 ◽  
Vol 21 (9) ◽  
pp. 662-669 ◽  
Author(s):  
Junnan Zhao ◽  
Lu Zhu ◽  
Weineng Zhou ◽  
Lingfeng Yin ◽  
Yuchen Wang ◽  
...  

Background: Thrombin is the central protease of the vertebrate blood coagulation cascade, which is closely related to cardiovascular diseases. The inhibitory constant Ki is the most significant property of thrombin inhibitors. Method: This study was carried out to predict Ki values of thrombin inhibitors based on a large data set by using machine learning methods. Because it can find non-intuitive regularities in high-dimensional datasets, machine learning can be used to build effective predictive models. A total of 6554 descriptors for each compound were collected and an efficient descriptor selection method was chosen to find the appropriate descriptors. Four different methods including Multiple Linear Regression (MLR), K Nearest Neighbors (KNN), Gradient Boosting Regression Tree (GBRT) and Support Vector Machine (SVM) were implemented to build prediction models with these selected descriptors. Results: The SVM model was the best one among these methods with R2=0.84, MSE=0.55 for the training set and R2=0.83, MSE=0.56 for the test set. Several validation methods, such as the y-randomization test and applicability domain evaluation, were adopted to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be employed for rapid estimation of the inhibitory constant, which is helpful for designing novel thrombin inhibitors.
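The y-randomization test mentioned above checks that a model's fit is not a chance correlation: the response values are shuffled, the model is refit, and the resulting R2 values should collapse relative to the real fit. The sketch below illustrates the idea only. It swaps in a one-descriptor least-squares fit for the study's SVM, and the data, the function names, the 100 rounds and the seed are all hypothetical.

```python
import random


def fit_r2(x, y):
    """Fit y = slope * x + intercept by least squares and return in-sample R2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot


def y_randomization(x, y, n_rounds=100, seed=0):
    """Refit on shuffled responses and return the R2 of each round."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    scores = []
    for _ in range(n_rounds):
        shuffled = y[:]        # copy so the original responses stay intact
        rng.shuffle(shuffled)
        scores.append(fit_r2(x, shuffled))
    return scores


# Hypothetical descriptor values and responses with a strong linear trend.
x = [float(i) for i in range(20)]
y = [2.0 * xi + 1.0 + ((i * 7) % 5 - 2) * 0.1 for i, xi in enumerate(x)]

real_r2 = fit_r2(x, y)
shuffled_r2 = y_randomization(x, y)
mean_shuffled = sum(shuffled_r2) / len(shuffled_r2)
print(real_r2, mean_shuffled)
```

If the shuffled-response R2 values were close to the real one, the original model would be suspect; a large gap supports the claim that the descriptors carry real signal.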


Author(s):  
Kosuke Mima ◽  
Nobutomo Miyanari ◽  
Keisuke Kosumi ◽  
Takuya Tajiri ◽  
Kosuke Kanemitsu ◽  
...  

2016 ◽  
Vol 49 (1) ◽  
pp. 1600764 ◽  
Author(s):  
Fiona McDonald ◽  
Michèle De Waele ◽  
Lizza E. L. Hendriks ◽  
Corinne Faivre-Finn ◽  
Anne-Marie C. Dingemans ◽  
...  

The incidence of stage I and II nonsmall cell lung cancer is likely to increase with the ageing population and introduction of screening for high-risk individuals. Optimal management requires multidisciplinary collaboration. Local treatments include surgery and radiotherapy and these are currently combined with (neo)adjuvant chemotherapy in specific cases to improve long-term outcome. Targeted therapies and immunotherapy may also become important therapeutic modalities in this patient group. For resectable disease in patients with low cardiopulmonary risk, complete surgical resection with lobectomy remains the gold standard. Minimally invasive techniques, conservative and sublobar resections are suitable for a subset of patients. Data are emerging that radiotherapy, especially stereotactic body radiation therapy, is a valid alternative in compromised patients who are high-risk candidates for surgery. Whether this is also true for good surgical candidates remains to be evaluated in randomised trials. In specific subgroups adjuvant chemotherapy has been shown to prolong survival; however, patient selection remains important. Neoadjuvant chemotherapy may yield similar results as adjuvant chemotherapy. The role of targeted therapies and immunotherapy in early stage nonsmall cell lung cancer has not yet been determined and results of randomised trials are awaited.

