Cancer classification using machine learning and HRV analysis: preliminary evidence from a pilot study

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Marta Vigier ◽  
Benjamin Vigier ◽  
Elisabeth Andritsch ◽  
Andreas R. Schwerdtfeger

Abstract: Most cancer patients exhibit autonomic dysfunction with attenuated heart rate variability (HRV) levels compared to healthy controls. This research aimed to create and evaluate a machine learning (ML) model enabling discrimination between cancer patients and healthy controls based on 5-min ECG recordings. We selected 12 HRV features based on previous research and compared the results between cancer patients and healthy individuals using the Wilcoxon rank-sum test. Recursive Feature Elimination (RFE) identified the top five features, which were averaged over 5 min and used as input to three different ML classifiers. Next, we created an ensemble model based on a stacking method that aggregated the predictions from all three base classifiers. All HRV features were significantly different between the two groups. SDNN, RMSSD, pNN50%, HRV triangular index, and SD1 were the five features selected by RFE. All three base classifiers performed above chance level, with random forest (RF) being the most accurate at a testing accuracy of 83%. The ensemble model showed a classification accuracy of 86% and an AUC of 0.95. The results obtained by the ML algorithms suggest that HRV parameters could be a reliable input for differentiating between cancer patients and healthy controls. Results should be interpreted in light of some limitations that call for replication studies with larger sample sizes.
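
As an illustration of four of the selected features, the sketch below computes SDNN, RMSSD, pNN50 and SD1 from a list of RR intervals using their standard textbook definitions. It is not the authors' pipeline: the function name `hrv_time_domain` and the sample intervals are hypothetical, and the HRV triangular index is omitted because it depends on a histogram bin width the abstract does not state.

```python
import math
import statistics

def hrv_time_domain(rr_ms):
    """Return SDNN, RMSSD, pNN50 (%) and SD1 for RR intervals given in ms."""
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")
    # Successive differences between neighbouring RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    # SDNN: sample standard deviation of the intervals themselves.
    sdnn = statistics.stdev(rr_ms)
    # RMSSD: root mean square of the successive differences.
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # pNN50: share of successive differences larger than 50 ms, in percent.
    pnn50 = 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs)
    # SD1: short axis of the Poincare plot, sqrt(Var(diffs) / 2).
    sd1 = math.sqrt(statistics.variance(diffs) / 2.0)
    return {"SDNN": sdnn, "RMSSD": rmssd, "pNN50": pnn50, "SD1": sd1}

# Hypothetical RR intervals in milliseconds, for illustration only.
features = hrv_time_domain([800, 810, 790, 860, 805, 795])
```

In a study like the one described, such values would be computed per recording and then passed to feature selection and classification.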

2019 ◽  
Vol 37 (15_suppl) ◽  
pp. 3135-3135
Author(s):  
Takeshi Murata ◽  
Takako Yanagisawa ◽  
Toshiaki Kurihara ◽  
Miku Kaneko ◽  
Sana Ota ◽  
...  

3135 Background: Saliva is a non-invasively accessible and informative biological fluid with high potential for the early diagnosis of various diseases. The aim of this study was to develop machine learning methods and to explore new salivary biomarkers to discriminate breast cancer patients from healthy controls. Methods: We conducted a comprehensive metabolite analysis of saliva samples obtained from 101 patients with invasive carcinoma (IC), 23 patients with ductal carcinoma in situ (DCIS) and 42 healthy controls (C), using capillary electrophoresis and liquid chromatography with mass spectrometry to quantify hundreds of hydrophilic metabolites. Saliva samples were collected after 9 h of fasting and were split into training and validation data. Conventional statistical analyses and artificial intelligence-based methods were used to assess the discrimination abilities of the quantified metabolites. A multiple logistic regression (MLR) model and an alternating decision tree (ADTree)-based machine learning method were used. The generalization abilities of these mathematical models were validated in various computational tests, such as cross-validation and resampling methods. Results: Among the 260 quantified metabolites, amino acids and polyamines were significantly elevated in saliva from breast cancer patients; for example, spermine showed the highest area under the receiver operating characteristic curve (AUC) for discriminating IC from C: 0.766 (95% confidence interval [CI]: 0.671 – 0.840, P < 0.0001). These metabolites showed no significant difference between C and DCIS, i.e., they were elevated only in the IC samples. The MLR yielded a higher AUC for discriminating IC from C: 0.790 (95% CI: 0.699 – 0.859, P < 0.0001). The ADTree with an ensemble approach showed the best AUC: 0.912 (95% CI: 0.838 – 0.961, P < 0.0001).
In the subtype-specific comparison of these metabolites, seven metabolites were significantly different between Luminal A-like and Luminal B-like, whereas few metabolites were significantly different among the other subtypes. Conclusions: These data indicate that the combination of salivary metabolomic profiles, including polyamines, shows potential for screening breast cancer in a non-invasive way.
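
The AUC values reported above can be read as the probability that a randomly chosen patient scores higher than a randomly chosen control. A minimal sketch of that pairwise definition follows; the function name `auc_from_scores` and the example scores are hypothetical and are not data from the study.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as P(case score > control score), counting ties as one half."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical model outputs (or single-metabolite levels) per group.
example_auc = auc_from_scores([0.9, 0.7, 0.4], [0.5, 0.3, 0.4])
```

This pairwise count is equivalent to the Mann-Whitney U statistic divided by the number of case-control pairs; it is quadratic in sample size, which is acceptable for cohorts of this size.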


2020 ◽  
Vol 38 (15_suppl) ◽  
pp. e14069-e14069
Author(s):  
Oguz Akbilgic ◽  
Ibrahim Karabayir ◽  
Hakan Gunturkun ◽  
Joseph F Pierre ◽  
Ashley C Rashe ◽  
...  

e14069 Background: There is growing interest in the links between cancer and the gut microbiome. However, the effect of chemotherapy upon the gut microbiome remains unknown. We studied 1) whether machine learning can accurately classify subjects with cancer vs healthy controls and 2) whether this classification model is affected by chemotherapy exposure status. Methods: We used the American Gut Project data to build an extreme gradient boosting (XGBoost) model to distinguish between subjects with cancer and healthy controls using simple demographics and published microbiome data. We then further explored the selected features for cancer subjects based on chemotherapy exposure. Results: The cohort included 7,685 subjects, of whom 561 had cancer; 52.5% were female, 87.3% were White, and the average age was 44.7 (SD 17.7). The binary outcome variable represents cancer status. Among the 561 subjects with cancer, 94 were treated with chemotherapy agents before sampling of microbiomes. As predictors, there were four demographic variables (sex, race, age, BMI) and 1,812 operational taxonomic units (OTUs), each found in at least 2 subjects via RNA sequencing. We randomly split the data into 80% training and 20% hidden test sets. We then built an XGBoost model with 5-fold cross-validation using only the training data, yielding an AUC (with 95% CI) of 0.79 (0.77, 0.80), and obtained almost the same AUC on the hidden test data. Based on feature importance analysis, we identified the 12 most important features (Age, BMI and 12 OTUs; 4C0d-2, Brachyspirae, Methanosphaera, Geodermatophilaceae, Bifidobacteriaceae, Slackia, Staphylococcus, Acidaminoccus, Devosia, Proteus), rebuilt a model using only these features, and obtained an AUC of 0.80 (0.77, 0.83) on the hidden test data. The average predicted probabilities for controls, cancer patients who were exposed to chemotherapy, and cancer patients who were not were 0.071 (0.070, 0.073), 0.125 (0.110, 0.140), and 0.156 (0.148, 0.164), respectively.
There was no statistically significant difference in the levels of these 12 OTUs between cancer subjects treated with and without chemotherapy. Conclusions: Machine learning achieved moderately high accuracy in identifying patients' cancer status based on the microbiome. Despite the literature on microbiome and chemotherapy interaction, the levels of the 12 OTUs used in our model were not significantly different for cancer patients with or without chemotherapy exposure. Testing this model on other large population databases is needed for broader validation.
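
The evaluation design described above (a random 80/20 hold-out split, then 5-fold cross-validation on the training part only) can be sketched with index bookkeeping alone. This is an illustrative outline, not the authors' code: the helper names `holdout_split` and `k_folds`, the seed, and the interleaved fold assignment are assumptions, and model fitting is left out.

```python
import random

def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and set aside a hidden test portion."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)

def k_folds(train_indices, k=5):
    """Yield (fitting, validation) index lists for k-fold cross-validation."""
    folds = [train_indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        fitting = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fitting, validation

# 7,685 is the cohort size reported in the abstract.
train_idx, test_idx = holdout_split(7685)
splits = list(k_folds(train_idx))
```

Because only about 7% of the cohort has cancer, a stratified split that preserves the class ratio in each fold may be preferable in practice; the abstract does not say whether stratification was used.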


2019 ◽  
Vol 5 (suppl) ◽  
pp. 13-13
Author(s):  
Po-Jung SU ◽  
Yu-Ann Fang ◽  
Yung-Chun Chang ◽  
Yung-Chia Kuo ◽  
Yung-Chang Lin

13 Background: The prognosis of de novo metastatic prostate cancer (mPC) patients varies widely. Some of these patients respond very well to hormone therapy with durable survival, but others do not. If patients with a poor prognosis could be identified as high-risk at diagnosis and given aggressive upfront chemotherapy or novel hormonal therapy, they might achieve better treatment outcomes. Methods: We used data on prostate cancer patients from 2000 to 2016 in the Chang Gung Research Database. The cohort comprised 799 de novo mPC patients who underwent castration. We predicted the probability that these patients would progress to metastatic castration-resistant prostate cancer (mCRPC) within 1 year in order to identify the high-risk group. We then identified the best features for prediction from the best classifier with Recursive Feature Elimination. Results: De novo mPC patients who progressed to mCRPC within 1 year had a median overall survival (mOS) of 21.9 months, significantly worse than the 80.7 months of those who progressed to mCRPC beyond 1 year (adjusted hazard ratio [aHR]: 6.43, P<0.001). Among all predictive models for high-risk patients, XGBoost showed the best overall performance (AUC=0.7000, Accuracy=0.7143). Excluding the features with more than 50% missing data and putting all other features in the model gave AUC=0.7042 and Accuracy=0.7239. The best performance, however, was obtained with only 11 features selected by Recursive Feature Elimination: age, time from diagnosis to castration, nadir PSA, hemoglobin, eosinophil/white blood cell ratio, alkaline phosphatase, alanine transaminase, blood urea nitrogen, creatinine, prothrombin time, and secondary primary cancer (AUC=0.7131, Accuracy=0.7267).
Conclusions: We found that the predictive model has better predictive accuracy and shorter manuscript time with fewer features selected by Recursive Feature Elimination. With this XGBoost model, we can predict the high-risk group among de novo mPC patients and make better clinical treatment decisions.
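
Recursive Feature Elimination, as used above, repeatedly refits a model, drops the least important feature, and stops when the desired number of features remains. The sketch below shows only that loop. The scoring callable `toy_importances`, the `noise` column and all data values are hypothetical stand-ins; in the study the importances would come from a refitted classifier such as XGBoost, which is not reproduced here.

```python
import statistics

def recursive_feature_elimination(feature_names, fit_importances, n_keep):
    """Drop the least important feature one at a time until n_keep remain.

    fit_importances(names) is expected to refit a model on exactly those
    features and return a dict mapping each name to its importance.
    """
    remaining = list(feature_names)
    eliminated = []
    while len(remaining) > n_keep:
        importances = fit_importances(remaining)
        weakest = min(remaining, key=lambda name: importances[name])
        remaining.remove(weakest)
        eliminated.append(weakest)
    return remaining, eliminated

# Hypothetical toy data: label 1 = progressed to mCRPC within 1 year.
rows = [
    {"nadir_psa": 9.0, "hemoglobin": 10.1, "noise": 0.3, "label": 1},
    {"nadir_psa": 8.0, "hemoglobin": 10.9, "noise": 0.7, "label": 1},
    {"nadir_psa": 7.5, "hemoglobin": 11.0, "noise": 0.5, "label": 1},
    {"nadir_psa": 1.0, "hemoglobin": 12.0, "noise": 0.6, "label": 0},
    {"nadir_psa": 0.5, "hemoglobin": 11.5, "noise": 0.4, "label": 0},
    {"nadir_psa": 1.5, "hemoglobin": 12.5, "noise": 0.5, "label": 0},
]

def toy_importances(names):
    """Stand-in scorer: absolute class mean difference scaled by overall SD."""
    scores = {}
    for name in names:
        pos = [r[name] for r in rows if r["label"] == 1]
        neg = [r[name] for r in rows if r["label"] == 0]
        spread = statistics.pstdev(pos + neg) or 1.0
        scores[name] = abs(statistics.mean(pos) - statistics.mean(neg)) / spread
    return scores

kept, dropped = recursive_feature_elimination(
    ["nadir_psa", "hemoglobin", "noise"], toy_importances, n_keep=1
)
```

The toy scorer is univariate, so its ranking does not change between rounds; with a real model, importances are recomputed on the reduced feature set each time, which is what distinguishes RFE from a one-shot filter.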


2020 ◽  
Vol 38 (15_suppl) ◽  
pp. e13555-e13555
Author(s):  
Wei Zhou ◽  
Ji He

e13555 Background: Survival analysis is used to establish a connection between covariates and the time to an event with censored data. Compared with traditional statistical methods, machine learning approaches based on sophisticated and effective computational algorithms are better able to handle complex multi-dimensional medical data. Methods: We developed an automated machine learning tool, MLsurvival, to analyze survival data of cancer patients. Its algorithms include statistical Cox regression and machine learning based on a linear model (elastic net), ensemble models (gradient boosting with least squares or regression trees, and random forest) and support vector kernels (linear and non-linear). The workflow of MLsurvival comprises four modules: preprocessing (missing-data removal or imputation and feature standardization), feature selection (unsupervised multi-statistics and supervised machine recursive feature elimination with cross-validation), modeling (hyperparameter and performance evaluation) and prediction. To evaluate the performance of this tool, we analyzed medical data for 222 hepatocellular carcinoma (HCC) patients at stage II-III who underwent surgical resection and developed five machine learning-based estimation models for overall survival (OS). Models were trained on 155 patients with 300 features, including clinical information, somatic mutation and copy number variation, and independently validated on the remaining 67 patients. Results: The gradient boosting ensemble model fitted by MLsurvival using 48 selected features on the data of 155 HCC patients had the highest mean AUC and C-index values. For the 67 patients in the validation set, this model predicted half-year mortality with an AUC of 0.9 (95% CI, 0.771-1.029) and one-year mortality with an AUC of 0.897 (95% CI, 0.816-0.978). In addition, this model was also predictive for the time of recurrence (p-value < 0.0001).
Furthermore, we also applied this tool to survival analysis of extensive real data from patients with breast, lung, and esophagus cancers, and most results showed superior accuracy and stable performance. Conclusions: MLsurvival is an automated tool for survival analysis of cancer patients with good performance. The risk scoring system implemented in this tool offers a novel strategy for incorporating multi-dimensional risk factors to predict clinical outcome, contributes to a better understanding of disease background and helps to optimize clinical follow-up and therapeutic treatment for cancer patients.
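
The C-index mentioned above measures how well predicted risk orders patients by survival time when some observations are censored. A minimal sketch of Harrell's concordance index follows, assuming higher scores mean higher risk. The function name `concordance_index` and the four example patients are hypothetical, and this is not MLsurvival's implementation.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored data.

    times:       observed follow-up time per patient
    events:      1 if the event (death) was observed, 0 if censored
    risk_scores: model output, higher meaning shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had the event strictly
            # before patient j's observed time (event or censoring).
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical follow-up in months, event flags and predicted risks.
c_index = concordance_index([5, 8, 12, 20], [1, 0, 1, 0], [0.9, 0.4, 0.05, 0.1])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering; pairs in which the earlier time is censored are skipped because the true order is unknown.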


MicroRNA ◽  
2019 ◽  
Vol 9 (1) ◽  
pp. 58-63
Author(s):  
Batool Savari ◽  
Sohrab Boozarpour ◽  
Maryam Tahmasebi-Birgani ◽  
Hossein Sabouri ◽  
Seyed Mohammad Hosseini

Background: Breast cancer is the most common cancer diagnosed in women worldwide. There is a good chance of recovery if it is detected in its early stages, even before the appearance of symptoms. Recent studies have shown that miRNAs play an important role during cancer progression. These transcripts can be tracked in liquid samples to reveal whether cancer exists, allowing earlier treatment. MicroRNA-21 (miR-21) has been shown to be a key regulator of carcinogenesis, and breast tumors are no exception. Objective: The present study aimed to track the miR-21 expression level in the serum of breast cancer patients in comparison with that of normal counterparts. Methods: Comparative real-time polymerase chain reaction was applied to determine the expression levels of miR-21 in the serum samples of 57 participants, of whom 42 were patients with breast cancer, including pre-surgery patients (n = 30) and post-surgery patients (n = 12), and the others were healthy controls (n = 15). Results: MiR-21 was significantly overexpressed in the serum of breast cancer patients as compared with healthy controls (P = 0.002). A significant decrease was also observed following tumor resection (P < 0.0001). Moreover, the miR-21 overexpression level was significantly associated with tumor grade (P = 0.004). Conclusion: These findings suggest that miR-21 has the potential to be used as a novel breast cancer biomarker for early detection and prognosis, although further experiments are needed.


Diagnostics ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 574
Author(s):  
Gennaro Tartarisco ◽  
Giovanni Cicceri ◽  
Davide Di Pietro ◽  
Elisa Leonardi ◽  
Stefania Aiello ◽  
...  

In the past two decades, several screening instruments were developed to detect toddlers who may be autistic both in clinical and unselected samples. Among others, the Quantitative CHecklist for Autism in Toddlers (Q-CHAT) is a quantitative and normally distributed measure of autistic traits that demonstrates good psychometric properties in different settings and cultures. Recently, machine learning (ML) has been applied to behavioral science to improve the classification performance of autism screening and diagnostic tools, but mainly in children, adolescents, and adults. In this study, we used ML to investigate the accuracy and reliability of the Q-CHAT in discriminating young autistic children from those without. Five different ML algorithms (random forest (RF), naïve Bayes (NB), support vector machine (SVM), logistic regression (LR), and K-nearest neighbors (KNN)) were applied to investigate the complete set of Q-CHAT items. Our results showed that ML achieved an overall accuracy of 90%, and the SVM was the most effective, being able to classify autism with 95% accuracy. Furthermore, using the SVM–recursive feature elimination (RFE) approach, we selected a subset of 14 items ensuring 91% accuracy, while 83% accuracy was obtained from the 3 best discriminating items in common to ours and the previously reported Q-CHAT-10. This evidence confirms the high performance and cross-cultural validity of the Q-CHAT, and supports the application of ML to create shorter and faster versions of the instrument, maintaining high classification accuracy, to be used as a quick, easy, and high-performance tool in primary-care settings.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Pratyusha Rakshit ◽  
Onintze Zaballa ◽  
Aritz Pérez ◽  
Elisa Gómez-Inhiesto ◽  
Maria T. Acaiturri-Ayesta ◽  
...  

Abstract: This paper presents a novel machine learning approach to perform an early prediction of the healthcare cost of breast cancer patients. The learning phase of our prediction method consists of two steps: (1) first, the patients are clustered taking into account the sequences of actions, grouping those undergoing similar clinical activities and incurring similar healthcare costs, and (2) a Markov chain is then learned for each group to describe the action sequences of the patients in the cluster. A two-step procedure is undertaken in the prediction phase: (1) first, the healthcare cost of a new patient's treatment is estimated based on the average healthcare cost of its k-nearest neighbors in each group, and (2) finally, an aggregate measure of the healthcare cost estimated by each group is used as the final predicted cost. Experiments reveal a mean absolute percentage error as small as 6%, even when only half of the clinical records of a patient are available, substantiating the early prediction capability of the proposed method. Comparative analysis substantiates the superiority of the proposed algorithm over state-of-the-art techniques.
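
Two ingredients named in this abstract, a Markov chain learned from action sequences and the mean absolute percentage error (MAPE), can be sketched as follows. This is a first-order maximum-likelihood estimate under assumed inputs, not the authors' algorithm; the action labels, costs and function names are hypothetical, and the clustering and k-nearest-neighbor steps are omitted.

```python
from collections import Counter, defaultdict

def fit_markov_chain(sequences):
    """Estimate first-order transition probabilities from action sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
        for state, nexts in counts.items()
    }

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values non-zero)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical action sequences for the patients of one cluster.
chain = fit_markov_chain([
    ["visit", "biopsy", "surgery"],
    ["visit", "biopsy", "chemo"],
    ["visit", "imaging", "biopsy", "surgery"],
])
# Hypothetical true and predicted costs.
error_percent = mape([100.0, 200.0], [110.0, 180.0])
```

Each row of `chain` sums to 1, so it can be used to score how likely a new patient's partial action sequence is under a given cluster.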


2021 ◽  
pp. 1-5
Author(s):  
David Samuel Kereh ◽  
John Pieter ◽  
William Hamdani ◽  
Haryasena Haryasena ◽  
Daniel Sampepajung ◽  
...  

BACKGROUND: AGR2 expression is associated with luminal breast cancer. Overexpression of AGR2 is a predictor of poor prognosis. Several studies have found correlations between AGR2 in disseminated tumor cells (DTCs) in breast cancer patients. OBJECTIVE: This study aims to determine the correlation between anterior gradient 2 (AGR2) expression and the incidence of distant metastases in luminal breast cancer. METHODS: This was an observational, cross-sectional study conducted at Wahidin Sudirohusodo Hospital and its network. AGR2 expression in the blood serum of breast cancer patients was measured by ELISA. AGR2 expression in metastatic and non-metastatic patients was compared with the Mann-Whitney test. The correlation between AGR2 expression and metastasis was tested with the Spearman rank test. RESULTS: The mean value of AGR2 antibody expression on ELISA in this study was 2.90 ± 1.82 ng/dl, and its cut-off point was 2.1 ng/dl. Based on this cut-off point, 14 subjects (66.7%) had overexpression of serum AGR2 on ELISA, and 7 subjects (33.3%) did not. The mean AGR2 value was significantly higher in metastatic than in non-metastatic patients, 3.77 versus 1.76 (p < 0.01). The Spearman rank test gave a two-tailed p-value of 0.003 (p < 0.05), indicating a significant correlation, while the correlation coefficient of 0.612 indicated a strong positive correlation between AGR2 overexpression and metastasis. CONCLUSIONS: AGR2 expression is correlated with metastasis in luminal breast cancer.
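
The Spearman rank correlation used above is the Pearson correlation of the ranks, with tied values sharing their average rank. A minimal sketch follows. The function names and the five example subjects are hypothetical, and no p-value is computed.

```python
import math

def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)

# Hypothetical serum AGR2 levels and metastasis status (1 = metastatic).
rho = spearman_rho([1.2, 1.8, 2.5, 3.9, 4.4], [0, 0, 1, 1, 1])
```

With a binary variable such as metastasis status, many ranks are tied, which is why the average-rank handling matters here.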


2021 ◽  
pp. 025576142110273
Author(s):  
Erkan Sülün ◽  
Hüseyin Olgaçer ◽  
Hakkı Cengiz Eren

In this study, the authors evaluated the potential role of an activity-based guitar training program in reducing anxiety and providing fulfillment for younger relatives of cancer patients. Ten active members of KHYD (The Society for Relatives of Cancer Patients), between ages 11 and 17, participated in an 8-week guitar education program. The participants filled out two questionnaires before and after their engagement in the 8-week program, one to measure changes in their anxiety levels (State-Trait Anxiety Inventory) and the other to measure changes in their general fulfillment levels (Multidimensional Students' Life Satisfaction Scale, MSLSS). The Wilcoxon signed-rank test and descriptive statistics were used in the analysis of data. Mean rank differences were statistically significant for total state and trait anxiety scores; in both cases, the participants' scores decreased after their engagement in the program. Statistically significant mean rank differences were also observed in the overall MSLSS scores and in the "friends" and "environment" sub-dimensions; in these, participants' scores increased after their engagement in the program. Recommendations for more comprehensive, larger-scale studies are given at the end.

