Ovarian Cancer Prediction from Ovarian Cysts Based on TVUS Using Machine Learning Algorithms

Cancer is one of the leading causes of death in humans. Breast cancer, a subtype of cancer, causes death in one out of every eight women worldwide. Countering this requires early and accurate diagnosis so that treatment can begin sooner. Achieving such accuracy in a short time is difficult with existing techniques. This paper studies and introduces several machine learning algorithms that physicians can use as tools for early and effective detection and prediction of cancerous cells. The algorithms introduced here are artificial neural network (ANN), decision tree (DT), random forest (RF), naive Bayes classifier (NBC), support vector machine (SVM), and k-nearest neighbors (KNN). These algorithms are trained on a dataset that contains parameters describing the tumor of a person with breast cancer and are then used to classify and predict whether the cell is cancerous.
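As a rough illustration of how one of the listed algorithms classifies a tumor from numeric parameters, the following is a minimal k-nearest neighbors sketch in plain Python. The two features, the six training points, and the labels are toy values invented for this example; they are not taken from the paper's dataset, and the paper's actual implementation is not described here.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data (hypothetical): two made-up tumor measurements per sample.
train_X = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
train_y = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

print(knn_predict(train_X, train_y, (1.1, 1.0)))
print(knn_predict(train_X, train_y, (3.0, 3.0)))
```

In practice the features would be scaled before computing distances, since KNN is sensitive to the units of each parameter.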


Classification of molecular subtypes of high-grade serous ovarian cancer by MALDI-Imaging.

Journal of Clinical Oncology ◽ 10.1200/jco.2021.39.15_suppl.e17544 ◽ 2021 ◽ Vol 39 (15_suppl) ◽ pp. e17544-e17544

Author(s): Wanja Nikolai Kassuhn ◽ Oliver Klein ◽ Silvia Darb-Esfahani ◽ Hedwig Lammert ◽ Sylwia Handzik ◽ ...

Keyword(s): Machine Learning ◽ Ovarian Cancer ◽ Molecular Subtypes ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Clear Correlation ◽ High Grade ◽ Serous Ovarian Cancer ◽ Tissue Samples ◽ MALDI-IMS

e17544 Background: High-grade serous ovarian cancer (HGSOC) can be separated by gene expression profiling into four molecular subtypes that correlate clearly with clinical outcome. However, these gene signatures have not been implemented in clinical practice to stratify patients for targeted therapy. This is mainly due to a lack of easy, cost-effective, and reproducible methods, as well as the high heterogeneity of HGSOC. Hence, we aimed to examine the potential of unsupervised matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI-IMS) to stratify patients who might benefit from targeted therapeutic strategies. Methods: Molecular subtyping of paraffin-embedded tissue samples from 279 HGSOC patients was performed by NanoString analysis (ground truth labeling). Next, we applied MALDI-IMS, a novel technology for identifying distinct mass profiles, to the same paraffin-embedded tissue sections and paired it with machine learning algorithms to identify HGSOC subtypes by proteomic signature. Finally, we devised a novel strategy to annotate spectra of stromal origin. Results: We elucidated a MALDI-derived proteomic signature (135 peptides) able to classify HGSOC subtypes. Random forest classifiers achieved an area under the curve (AUC) of 0.983. Furthermore, we demonstrated that excluding stroma-associated spectra tangibly improves classification quality (AUC = 0.988). False discovery rates (FDR) were reduced from 10.2% to 8.0%. Finally, novel MALDI-based stroma annotation achieved near-perfect classification (AUC = 0.999, FDR < 1.0%). Conclusions: Here, we present a concept integrating MALDI-IMS with machine learning algorithms to classify patients according to distinct molecular subtypes of HGSOC. This has great potential for assigning patients to targeted therapies.
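The abstract does not say how its false discovery rate was computed for a four-class problem. The sketch below assumes the common one-vs-rest definition, FDR = FP / (FP + TP), applied to one subtype at a time. The subtype labels "A" to "D" and the toy predictions are hypothetical placeholders; only the 10.2% and 8.0% figures come from the abstract.

```python
def false_discovery_rate(y_true, y_pred, positive):
    """FDR for one class: the share of predictions of that class that are wrong."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    if tp + fp == 0:
        return 0.0  # the class was never predicted, so no discoveries were made
    return fp / (tp + fp)


# Toy labels (hypothetical): four placeholder subtypes, six samples.
y_true = ["A", "A", "B", "B", "C", "D"]
y_pred = ["A", "B", "B", "B", "C", "D"]

fdr_b = false_discovery_rate(y_true, y_pred, "B")  # 1 wrong out of 3 "B" calls
fdr_a = false_discovery_rate(y_true, y_pred, "A")  # the single "A" call is right

# Relative FDR reduction implied by the reported drop from 10.2% to 8.0%.
relative_reduction = (0.102 - 0.080) / 0.102

print(f"FDR(B) = {fdr_b:.3f}, FDR(A) = {fdr_a:.3f}")
print(f"relative reduction = {relative_reduction:.1%}")
```

Read this way, the reported drop of 2.2 percentage points corresponds to roughly a one-fifth relative reduction in false discoveries.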


Applying Best Machine Learning Algorithms for Breast Cancer Prediction and Classification

2018 International Conference on Electronics, Control, Optimization and Computer Science (ICECOCS) ◽ 10.1109/icecocs.2018.8610632 ◽ 2018 ◽ Cited By ~ 6

Author(s): Youness Khourdifi ◽ Mohamed Bahaj

Keyword(s): Breast Cancer ◽ Machine Learning ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Cancer Prediction


Evaluation of Combined Cancer Markers With Lactate Dehydrogenase and Application of Machine Learning Algorithms for Differentiating Benign Disease From Malignant Ovarian Cancer

Cancer Control ◽ 10.1177/10732748211033401 ◽ 2021 ◽ Vol 28 ◽ pp. 107327482110334

Author(s): Seri Jeong ◽ Dae-Soon Son ◽ Minseob Cho ◽ Nuri Lee ◽ Wonkeun Song ◽ ...

Keyword(s): Machine Learning ◽ Ovarian Cancer ◽ Lactate Dehydrogenase ◽ Total Population ◽ Early Stage ◽ Learning Algorithms ◽ Benign Disease ◽ Machine Learning Algorithms ◽ Cancer Markers ◽ Early Stage Ovarian Cancer

Background: The differential diagnosis of ovarian cancer is important, and there has been ongoing research to identify biomarkers with higher performance. This study aimed to evaluate the diagnostic utility of combinations of cancer markers classified by machine learning algorithms in patients with early stage ovarian cancer, which has rarely been reported. Methods: In total, 730 serum samples were assayed for lactate dehydrogenase (LD), neutrophil-to-lymphocyte ratio (NLR), human epididymis protein 4 (HE4), cancer antigen 125 (CA125), and risk of ovarian malignancy algorithm (ROMA). Of these samples, 53 came from patients diagnosed with early stage ovarian cancer and the remaining 677 from patients diagnosed with benign disease. Results: The areas under the receiver operating characteristic curves (ROC-AUCs) of ROMA, HE4, CA125, LD, and NLR for discriminating ovarian cancer from non-cancerous disease were .707, .680, .643, .657, and .624, respectively. The ROC-AUC of the combination of ROMA and LD (.709) was similar to that of ROMA alone in the total population. In the postmenopausal group, HE4 and CA125 combined with LD yielded the highest ROC-AUC (.718). When machine learning algorithms were applied to ROMA combined with LD, the ROC-AUC of random forest (.757) was higher than that of the other algorithms in the total population, showing acceptable performance. Conclusion: Our data suggest that combinations of ovarian cancer-specific markers with LD, classified by random forest, may be a useful tool for predicting ovarian cancer, particularly in clinical settings, owing to easy accessibility and cost-effectiveness. Applying an optimal combination of cancer markers and algorithms would facilitate appropriate management of ovarian cancer patients.
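The ROC-AUC values quoted above can be read as the probability that a randomly chosen cancer sample receives a higher marker score than a randomly chosen benign sample, with ties counted as one half. The following minimal sketch computes ROC-AUC directly from that pairwise definition. The four labels and scores are toy values chosen for illustration, not data from the study, and the study's own software is not specified in the abstract.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical): 1 = cancer, 0 = benign, score = marker value.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
example_auc = roc_auc(labels, scores)
print(example_auc)  # 3 of the 4 positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of samples, which is fine for a cohort of a few hundred samples; a rank-based formulation would be preferable for much larger datasets.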


Predicting 180-day mortality for women with ovarian cancer using machine learning and patient-reported outcome data.

Journal of Clinical Oncology ◽ 10.1200/jco.2021.39.15_suppl.e13555 ◽ 2021 ◽ Vol 39 (15_suppl) ◽ pp. e13555-e13555

Author(s): Chris Sidey-Gibbons ◽ Charlotte C. Sun ◽ Cai Xu ◽ Amy Schneider ◽ Sheng-Chieh Lu ◽ ...

Keyword(s): Machine Learning ◽ Ovarian Cancer ◽ End Of Life ◽ Learning Algorithms ◽ The United States ◽ Patient Reported Outcome ◽ Machine Learning Algorithms ◽ Electronic Health Record Data ◽ Testing Dataset ◽ Patient Reported

e13555 Background: Contrary to national guidelines, women with ovarian cancer often receive aggressive treatment until the end of life. We trained machine learning algorithms to predict mortality within 180 days for women with ovarian cancer. Methods: Data were collected from a single academic cancer institution in the United States. Women with recurrent ovarian cancer completed biopsychosocial patient-reported outcome measures (PROMs) every 90 days. We randomly partitioned our dataset into training and testing samples with a 2:1 ratio. We used synthetic minority oversampling to reduce class imbalance in the training dataset. We fitted six machine learning algorithms to the training data and combined their classifications on the testing dataset into a voting ensemble. We assessed the accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC) for each algorithm. Results: We recruited 245 patients who completed 1319 assessments. The final voting ensemble performed well across all performance metrics (Accuracy = .79, Sensitivity = .71, Specificity = .80, AUROC = .76). The algorithm correctly identified 25 of the 35 women in the testing dataset who died within 180 days of assessment. Conclusions: Machine learning algorithms trained using PROM data offer state-of-the-art performance in predicting whether a woman with ovarian cancer will reach the end of life within 180 days. We highlight the importance of PROM data in ML models of mortality. Our model exhibits substantial improvements in prediction sensitivity compared to other similar models trained using electronic health record data alone. This model could inform clinical decision making and improve the uptake of appropriate end-of-life care. Further research is warranted to expand on these findings in a larger, more diverse sample.
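The ensemble and evaluation steps described above can be sketched in plain Python. This sketch assumes hard (majority) voting with ties resolved toward the negative class; the abstract does not state the voting rule. It uses three toy models rather than the study's six, and all predictions are hypothetical. The only figures taken from the abstract are the 25 of 35 deaths correctly identified, which reproduce the reported sensitivity.

```python
def majority_vote(predictions_per_model):
    """Combine binary predictions (one list per model) by strict majority."""
    ensemble = []
    for votes in zip(*predictions_per_model):
        # Predict 1 only when more than half of the models vote 1.
        ensemble.append(1 if sum(votes) * 2 > len(votes) else 0)
    return ensemble


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Toy predictions (hypothetical) from three models on six test assessments.
model_votes = [
    [1, 0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [0, 0, 1, 0, 1, 0],
]
y_true = [1, 0, 1, 0, 0, 0]  # 1 = died within 180 days (toy labels)

ensemble_pred = majority_vote(model_votes)
metrics = binary_metrics(y_true, ensemble_pred)
print(ensemble_pred, metrics)

# Check against the abstract: 25 of 35 deaths correctly identified.
reported_sensitivity = 25 / 35
print(round(reported_sensitivity, 2))
```

With an even number of models, as in the study's six, a tie rule is needed; defaulting ties to the negative class is one choice among several, and averaging predicted probabilities (soft voting) is a common alternative.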
