Abstract 2449: Unsupervised machine learning methods reveal metabolomic based clusters in breast cancer patients

Author(s):  
Jocelyn Gal ◽  
Caroline Bailleux ◽  
David Chardin ◽  
Thierry Pourcher ◽  
Lun Jing ◽  
...  
2019 ◽  
Vol 37 (15_suppl) ◽  
pp. 3135-3135
Author(s):  
Takeshi Murata ◽  
Takako Yanagisawa ◽  
Toshiaki Kurihara ◽  
Miku Kaneko ◽  
Sana Ota ◽  
...  

3135 Background: Saliva is a non-invasively accessible and informative biological fluid with high potential for the early diagnosis of various diseases. The aim of this study was to develop machine learning methods and to explore new salivary biomarkers to discriminate breast cancer patients from healthy controls. Methods: We conducted a comprehensive metabolite analysis of saliva samples obtained from 101 patients with invasive carcinoma (IC), 23 patients with ductal carcinoma in situ (DCIS), and 42 healthy controls (C), using capillary electrophoresis and liquid chromatography with mass spectrometry to quantify hundreds of hydrophilic metabolites. Saliva samples were collected after 9 h of fasting and were split into training and validation data. Conventional statistical analyses and artificial intelligence-based methods were used to assess the discrimination abilities of the quantified metabolites. A multiple logistic regression (MLR) model and an alternating decision tree (ADTree)-based machine learning method were used. The generalization abilities of these models were validated in computational tests such as cross-validation and resampling. Results: Among the 260 quantified metabolites, amino acids and polyamines were significantly elevated in saliva from breast cancer patients; e.g., spermine showed the highest area under the receiver operating characteristic curve (AUC) for discriminating IC from C: 0.766 (95% confidence interval [CI], 0.671–0.840; P < 0.0001). These metabolites showed no significant difference between C and DCIS, i.e., they were elevated only in the IC samples. The MLR model yielded a higher AUC for discriminating IC from C: 0.790 (95% CI, 0.699–0.859; P < 0.0001). The ADTree with an ensemble approach showed the best AUC: 0.912 (95% CI, 0.838–0.961; P < 0.0001). In the subtype comparison, seven metabolites differed significantly between Luminal A-like and Luminal B-like, but few metabolites differed significantly among the other subtypes. Conclusions: These data indicate that a combination of salivary metabolomic profiles, including polyamines, has potential for non-invasive breast cancer screening.
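For readers unfamiliar with how an AUC and its 95% confidence interval can be obtained for a single marker, the following is a minimal sketch. It assumes a percentile bootstrap for the interval, which the abstract does not specify, and it uses synthetic scores rather than the study's data. The function names `auc` and `bootstrap_ci` are hypothetical.

```python
import random


def auc(scores_pos, scores_neg):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def bootstrap_ci(scores_pos, scores_neg, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the study)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping group sizes fixed.
        pos = [rng.choice(scores_pos) for _ in scores_pos]
        neg = [rng.choice(scores_neg) for _ in scores_neg]
        stats.append(auc(pos, neg))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic marker levels only; group sizes mirror the 101 IC patients
# and 42 controls, but the values are NOT the study's measurements.
rng = random.Random(42)
patients = [rng.gauss(1.0, 1.0) for _ in range(101)]
controls = [rng.gauss(0.0, 1.0) for _ in range(42)]

point = auc(patients, controls)
low, high = bootstrap_ci(patients, controls)
print(f"AUC = {point:.3f} (95% CI {low:.3f} to {high:.3f})")
```

The pairwise definition used here is equivalent to the normalized Mann-Whitney U statistic, so it needs no threshold sweep over the ROC curve.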


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Pratyusha Rakshit ◽  
Onintze Zaballa ◽  
Aritz Pérez ◽  
Elisa Gómez-Inhiesto ◽  
Maria T. Acaiturri-Ayesta ◽  
...  

Abstract: This paper presents a novel machine learning approach for early prediction of the healthcare cost of breast cancer patients. The learning phase of our prediction method comprises two steps: (1) the patients are clustered by their sequences of actions, so that patients in a group undergo similar clinical activities and incur similar healthcare costs, and (2) a Markov chain is then learned for each group to describe the action sequences of the patients in that cluster. The prediction phase also follows a two-step procedure: (1) the healthcare cost of a new patient’s treatment is estimated from the average healthcare cost of its k-nearest neighbors in each group, and (2) an aggregate of the costs estimated by each group is used as the final predicted cost. Experiments reveal a mean absolute percentage error as small as 6%, even when only half of a patient’s clinical records are available, substantiating the early prediction capability of the proposed method. Comparative analysis substantiates the superiority of the proposed algorithm over state-of-the-art techniques.
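The two-phase procedure described in this abstract can be illustrated with a small sketch. Several parts are assumptions, since the abstract does not give them: the groups are taken as already clustered, the neighbor similarity is `difflib.SequenceMatcher` on sequence prefixes, and the per-group estimates are aggregated with weights proportional to the likelihood of the observed prefix under each group's Markov chain. All function names, action labels, and costs below are hypothetical toy values.

```python
import math
from collections import defaultdict
from difflib import SequenceMatcher


def fit_markov_chain(sequences):
    """Count first-order transitions between consecutive actions."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1.0
    return counts


def log_likelihood(chain, prefix, vocab_size, smoothing=1.0):
    """Log-probability of the prefix's transitions, with additive smoothing."""
    ll = 0.0
    for a, b in zip(prefix, prefix[1:]):
        row = chain.get(a, {})
        total = sum(row.values())
        ll += math.log((row.get(b, 0.0) + smoothing)
                       / (total + smoothing * vocab_size))
    return ll


def knn_cost(prefix, group, k=2):
    """Mean cost of the k patients whose same-length prefix is most similar."""
    ranked = sorted(
        group,
        key=lambda item: SequenceMatcher(
            None, prefix, item[0][:len(prefix)]).ratio(),
        reverse=True,
    )
    nearest = ranked[:k]
    return sum(cost for _, cost in nearest) / len(nearest)


def predict_cost(prefix, groups, k=2):
    """Aggregate per-group k-NN estimates, weighted by chain likelihood
    (the weighting scheme is an assumption, not the paper's stated rule)."""
    vocab = {a for g in groups.values() for seq, _ in g for a in seq}
    lls, costs = {}, {}
    for name, group in groups.items():
        chain = fit_markov_chain([seq for seq, _ in group])
        lls[name] = log_likelihood(chain, prefix, len(vocab))
        costs[name] = knn_cost(prefix, group, k)
    top = max(lls.values())
    weights = {name: math.exp(ll - top) for name, ll in lls.items()}
    return sum(weights[n] * costs[n] for n in groups) / sum(weights.values())


# Toy, pre-clustered records: (action sequence, total cost).
groups = {
    "short": [
        (["consult", "surgery", "followup"], 1000.0),
        (["consult", "surgery", "followup", "followup"], 1200.0),
    ],
    "long": [
        (["consult", "chemo", "chemo", "surgery", "radio"], 5000.0),
        (["consult", "chemo", "surgery", "radio", "radio"], 5400.0),
    ],
}

# Only the first part of the new patient's record is observed.
prediction = predict_cost(["consult", "chemo", "chemo"], groups)
print(f"Predicted cost: {prediction:.2f}")
```

Because the observed prefix is far more likely under the "long" group's chain, that group's estimate dominates the aggregate while the other group still contributes a small share.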

