Early Breast Cancer Prediction Using Dermatoglyphics: Data Mining Pilot Study in a General Hospital in Iran (Preprint)

2020 ◽  
Author(s):  
Seyed Mohammad Ayyoubzadeh ◽  
Aysan Almasizand ◽  
Sharareh R. Niakan Kalhori ◽  
Sakineh Abbasi

BACKGROUND Dermatoglyphics is the study of skin patterns on the hands and feet. Some studies have shown that specific fingerprint patterns could be a risk factor for breast cancer. Several studies have used data mining methods to evaluate the risk of breast cancer, but few, if any, have evaluated fingerprint patterns with data mining for breast cancer risk prediction. OBJECTIVE This study aims to evaluate fingerprint patterns, along with other easy-to-obtain features, for predicting the risk of breast cancer. METHODS A dataset of 462 records of female patients at Imam Khomeini Hospital Complex, Tehran, Iran, was obtained. The dataset comprised age, menstruation age, menopause age and menopause situation, whether the patient has a child, age at first live birth, family history of breast cancer, and fingerprint pattern features of the hands. The weight of each factor was determined by the Information Gain index. Predictive models were built once without and once with the fingerprint features, using Naïve Bayes, Decision Tree, Random Forest (RF), Support Vector Machine (SVM), and Deep Learning classifiers. RESULTS The most important factors determining breast cancer were age, having a child, menopause situation, and menopause age. The best performance belonged to the RF model, with an accuracy of 84.43% and an AUC of 0.923. The fingerprint pattern features increased the RF accuracy from 79.44% to 84.43%. CONCLUSIONS An early breast cancer screening model could be built using data mining methods. Fingerprint patterns could increase the performance of these models. The Random Forest model could be used. The results of such models could be used in designing apps for breast cancer self-screening.
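
As a rough illustration of how an Information Gain index can weight factors, the sketch below ranks two categorical features by how much they reduce the entropy of the class label. It is a minimal sketch and not the authors' implementation: the feature names, the records, and the helper functions `entropy` and `information_gain` are all hypothetical, and the toy data are invented for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted label entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - remainder


# Toy, made-up records for illustration only (NOT data from the study).
labels = ["cancer", "cancer", "cancer", "cancer",
          "healthy", "healthy", "healthy", "healthy"]
features = {
    "has_child": ["yes", "yes", "yes", "no", "no", "no", "no", "yes"],
    "fingerprint_pattern": ["loop", "whorl", "loop", "whorl",
                            "loop", "whorl", "loop", "whorl"],
}

# Rank the features from most to least informative about the label.
ranking = sorted(features,
                 key=lambda name: information_gain(features[name], labels),
                 reverse=True)
for name in ranking:
    print(name, round(information_gain(features[name], labels), 3))
```

A feature whose values split the records into groups with the same class mix as the whole dataset receives a gain of zero, which is why such an index can be used to decide which factors matter most.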

2016 ◽  
Vol 51 (20) ◽  
pp. 2853-2862 ◽  
Author(s):  
Serkan Ballı

The aim of this study is to diagnose and classify the failure modes of two serial fastened sandwich composite plates using data mining techniques. The composite material used in the study was manufactured from a glass-fiber-reinforced layer and aluminum sheets. Results obtained in a previous experimental study of sandwich composite plates, which were mechanically fastened with two serial pins or bolts, were used to classify the failure modes. The experimental data from that study cover different geometrical parameters for applied preload moments of 0 (pinned), 2, 3, 4, and 5 Nm (bolted). In this study, data mining methods were applied using these geometrical parameters and the pinned/bolted joint configurations. Three geometrical parameters and 100 test data points were used for classification with support vector machine, Naive Bayes, K-Nearest Neighbors, Logistic Regression, and Random Forest methods. According to the experiments, the Random Forest method achieved better results than the others and was appropriate for diagnosing and classifying the failure modes. The performance of all data mining methods used is discussed in terms of accuracy and error ratios.


Breast cancer classification can be useful for discovering the genetic behavior of tumors and envisioning the outcome of some diseases. In this paper, we predict the noxious behavior of a tumor. The prediction models used are Random Forest, Naïve Bayes, IBK (Instance Based Learner), SMO (Sequential Minimal Optimization), and Multi Class Classifier. The prediction model, which can potentially be used as a biomarker of breast cancer, is based on physical attributes of a breast mass gathered from a digitized image of a Fine Needle Aspirate (FNA). These models can be helpful in the prediction and reduction of invasive tumors.


Author(s):  
Alice Constance Mensah ◽  
Isaac Ofori Asare

Breast cancer is the most common of all cancers and is the leading cause of cancer deaths in women worldwide. The classification of breast cancer data can be useful for predicting the outcome of some diseases or discovering the genetic behavior of tumors. Data mining technology helps to classify cancer patients and to identify potential cancer patients by analyzing the data. This study examines the determinant factors of breast cancer and uses breast cancer patient data to build a useful classification model with a data mining approach. In this study of 2397 women, 1022 (42.64%) were diagnosed with breast cancer. Four main learning techniques were used in the study: Random Forest, Naive Bayes, Classification and Regression Trees (CART), and the Boosted Tree model. The Random Forest technique had the best accuracy, 0.9892 (95% CI, 0.9832-0.9935), and a sensitivity of about 92%. This suggests that, among the models tested, the Random Forest learning model is the best for classifying and predicting breast cancer based on associated factors.
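
For readers unfamiliar with the reported metrics, the sketch below shows how accuracy and sensitivity are derived from the counts of a binary confusion matrix. It is only an illustration under stated assumptions: the function `classification_metrics` is a hypothetical helper, and the counts passed to it are invented and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute basic metrics from binary confusion-matrix counts.

    tp/fn: diseased cases predicted as diseased / as healthy.
    tn/fp: healthy cases predicted as healthy / as diseased.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # share of diseased cases that were detected
        "specificity": tn / (tn + fp),      # share of healthy cases correctly cleared
    }


# Hypothetical counts for illustration only (NOT results from the study).
metrics = classification_metrics(tp=92, fp=5, tn=95, fn=8)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy and sensitivity can diverge when the classes are imbalanced or when the model misses diseased cases more often than it raises false alarms, which is why both are usually reported for a screening model.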


2021 ◽  
Vol 1 (4) ◽  
pp. 362-392
Author(s):  
Haihua Liu ◽  
Shan Huang ◽  
Peng Wang ◽  
Zejun Li ◽  
...  

Financial activities are closely related to human social life. Data mining plays an important role in the analysis and prediction of financial markets, especially in the current era of big data. However, using data mining methods to analyze financial data is not simple, because researchers in different disciplines have different backgrounds. This review summarizes several data mining methods commonly used in financial data analysis. Its purpose is to make it easier for researchers in the financial field to use data mining methods and to expand the application scenarios available to researchers in the computer field. The review introduces the principles and steps of decision trees, support vector machines, Bayesian methods, K-nearest neighbors, k-means, the Expectation-maximization algorithm, and ensemble learning, and points out their advantages, disadvantages, and applicable scenarios. After introducing each algorithm, it summarizes how the algorithm is used in financial data analysis, so that readers can find specific examples of its use. The review also summarizes the difficulties of using data mining methods and the countermeasures to them, and predicts the development trend of using data mining methods to analyze financial data.


Author(s):  
Sarangam Kodati ◽  
Jeeva Selvaraj

Data mining is the best-known knowledge extraction approach for knowledge discovery from data (KDD). Machine learning enables a program to analyze data, recognize correlations, and make use of insights to solve problems, enrich data, and make predictions. The chapter highlights the need for more research on the use of robust data mining methods to help healthcare specialists in the diagnosis of heart disease and other debilitating conditions. Heart disease is the leading cause of death in the world. Nearly 47% of deaths are caused by heart disease. The authors use algorithms including random forest, naïve Bayes, and support vector machine to analyze heart disease. Prediction accuracy is high when a greater number of attributes is used. The goal is to perform predictive analysis using data mining, to use data mining to analyze heart disease, and to show which methods are effective and efficient.

