Abstract 2449: Unsupervised machine learning methods reveal metabolomic based clusters in breast cancer patients

3135 Background: Saliva is a non-invasively accessible and informative biological fluid with high potential for the early diagnosis of various diseases. The aim of this study was to develop machine learning methods and to explore new salivary biomarkers that discriminate breast cancer patients from healthy controls. Methods: We conducted a comprehensive metabolite analysis of saliva samples obtained from 101 patients with invasive carcinoma (IC), 23 patients with ductal carcinoma in situ (DCIS), and 42 healthy controls (C), using capillary electrophoresis and liquid chromatography with mass spectrometry to quantify hundreds of hydrophilic metabolites. Saliva samples were collected after 9 h of fasting and were split into training and validation data. Conventional statistical analyses and artificial intelligence-based methods were used to assess the discrimination ability of the quantified metabolites. A multiple logistic regression (MLR) model and an alternating decision tree (ADTree)-based machine learning method were used. The generalization ability of these models was validated in various computational tests, such as cross-validation and resampling methods. Results: Among the 260 quantified metabolites, amino acids and polyamines were significantly elevated in saliva from breast cancer patients; for example, spermine showed the highest area under the receiver operating characteristic curve (AUC) for discriminating IC from C: 0.766 (95% confidence interval [CI], 0.671 – 0.840; P < 0.0001). These metabolites showed no significant difference between C and DCIS; that is, they were elevated only in the IC samples. The MLR model yielded a higher AUC for discriminating IC from C: 0.790 (95% CI, 0.699 – 0.859; P < 0.0001). The ADTree with an ensemble approach showed the best AUC: 0.912 (95% CI, 0.838 – 0.961; P < 0.0001).
In the subtype-specific comparison, seven metabolites differed significantly between the Luminal A-like and Luminal B-like subtypes, whereas few metabolites differed significantly among the other subtypes. Conclusions: These data indicate that a combination of salivary metabolomic profiles, including polyamines, has potential for non-invasive breast cancer screening.
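The AUC values and 95% confidence intervals quoted in this abstract can be illustrated with a minimal sketch. It computes the AUC as the rank-based probability that a patient's marker value exceeds a control's, then estimates a percentile bootstrap interval. The abstract does not say how its intervals were obtained, so the bootstrap is an assumption, as are the function names `auc` and `bootstrap_auc_ci`. The marker values are synthetic; only the group sizes (101 IC, 42 controls) come from the abstract.

```python
import random


def auc(scores_pos, scores_neg):
    """AUC as P(random positive > random negative); ties count as 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def bootstrap_auc_ci(scores_pos, scores_neg, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, not the authors')."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping the group sizes fixed.
        pos = [rng.choice(scores_pos) for _ in scores_pos]
        neg = [rng.choice(scores_neg) for _ in scores_neg]
        stats.append(auc(pos, neg))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Synthetic marker values only; these are NOT the study's measurements.
rng = random.Random(42)
ic = [rng.gauss(1.0, 1.0) for _ in range(101)]        # 101 IC patients
controls = [rng.gauss(0.0, 1.0) for _ in range(42)]   # 42 healthy controls

point = auc(ic, controls)
low, high = bootstrap_auc_ci(ic, controls, n_boot=500)
print(f"AUC = {point:.3f}, 95% bootstrap CI = ({low:.3f}, {high:.3f})")
```

The same two functions would apply unchanged to the predicted probabilities of a fitted model such as the MLR, since the AUC depends only on how the scores rank the two groups.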

Comparison of unsupervised machine-learning methods to identify metabolomic signatures in patients with localized breast cancer

Computational and Structural Biotechnology Journal, 2020, Vol 18, pp. 1509-1524
DOI: 10.1016/j.csbj.2020.05.021
Cited By ~ 1

Author(s): Jocelyn Gal, Caroline Bailleux, David Chardin, Thierry Pourcher, Julia Gilhodes, ...

Keyword(s): Breast Cancer, Machine Learning, Learning Methods, Unsupervised Machine Learning, Machine Learning Methods

Applying a Machine Learning Approach to Predict Acute Toxicities During Radiation for Breast Cancer Patients

International Journal of Radiation Oncology*Biology*Physics, 2018, Vol 102 (3), pp. S59
DOI: 10.1016/j.ijrobp.2018.06.167

Author(s): J. Reddy, W.D. Lindsay, C.G. Berlind, C.A. Ahern, B.D. Smith

Keyword(s): Breast Cancer, Machine Learning, Cancer Patients, Learning Approach, Breast Cancer Patients, Machine Learning Approach

A machine learning approach to predict healthcare cost of breast cancer patients

Scientific Reports, 2021, Vol 11 (1)
DOI: 10.1038/s41598-021-91580-x

Author(s): Pratyusha Rakshit, Onintze Zaballa, Aritz Pérez, Elisa Gómez-Inhiesto, Maria T. Acaiturri-Ayesta, ...

Keyword(s): Breast Cancer, Machine Learning, Cancer Patients, Healthcare Cost, Percentage Error, Learning Approach, Early Prediction, Breast Cancer Patients, Machine Learning Approach, Clinical Records

Abstract: This paper presents a novel machine learning approach for early prediction of the healthcare cost of breast cancer patients. The learning phase of the prediction method consists of two steps: (1) the patients are clustered according to their sequences of actions, so that each cluster groups patients who undergo similar clinical activities and incur similar healthcare costs, and (2) a Markov chain is then learned for each group to describe the action sequences of the patients in that cluster. The prediction phase is also a two-step procedure: (1) the healthcare cost of a new patient’s treatment is estimated from the average healthcare cost of its k-nearest neighbors in each group, and (2) an aggregate measure of the costs estimated by each group is used as the final predicted cost. Experiments reveal a mean absolute percentage error as small as 6%, even when only half of a patient’s clinical records are available, which substantiates the early prediction capability of the proposed method. Comparative analysis indicates that the proposed algorithm outperforms state-of-the-art techniques.
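The learning and prediction steps described in this abstract can be sketched roughly as follows. The abstract does not specify the distance between patients, the aggregate measure, or how the per-group Markov chains enter the prediction, so this sketch makes loud assumptions: a positional-mismatch distance on the observed record prefix (`prefix_distance`), a plain mean across groups (`predict_cost`), and a first-order transition estimate (`transition_probs`) that is shown for the learning step but not used in the prediction. All function names, action labels, and costs are hypothetical toy values.

```python
from collections import Counter, defaultdict


def transition_probs(sequences):
    """Estimate first-order Markov transition probabilities for one group."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
            for a, nxt in counts.items()}


def prefix_distance(prefix, seq):
    """Assumed distance: positions of the partial record that do not match seq."""
    return sum(1 for i, a in enumerate(prefix) if i >= len(seq) or seq[i] != a)


def knn_cost(prefix, group, k):
    """Mean cost of the k nearest patients in one group of (sequence, cost) pairs."""
    nearest = sorted(group, key=lambda rec: prefix_distance(prefix, rec[0]))[:k]
    return sum(cost for _, cost in nearest) / len(nearest)


def predict_cost(prefix, groups, k=2):
    """Aggregate the per-group estimates; the plain mean is an assumption."""
    per_group = [knn_cost(prefix, g, k) for g in groups]
    return sum(per_group) / len(per_group)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Toy clusters of (action sequence, total cost); all values are invented.
groups = [
    [(["visit", "biopsy", "surgery"], 1000.0),
     (["visit", "biopsy", "surgery", "followup"], 1200.0)],
    [(["visit", "biopsy", "chemo", "surgery"], 3000.0),
     (["visit", "imaging", "chemo", "chemo"], 3400.0)],
]

# Learning phase: one transition table per group.
chains = [transition_probs([seq for seq, _ in g]) for g in groups]

# Prediction phase: only the first half of a new patient's record is observed.
partial_record = ["visit", "biopsy"]
print(predict_cost(partial_record, groups, k=2))
print(chains[0]["visit"])
```

Scoring a set of such predictions against the known final costs with `mape` gives the kind of percentage error the abstract reports.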
