Holo entropy enabled decision tree classifier for breast cancer diagnosis using Wisconsin (prognostic) data set

Breast cancer is currently regarded as one of the most important health problems among women worldwide. Detecting breast cancer at an early stage can reduce the mortality rate considerably. Mammography is an effective and regularly used technique for the detection and screening of breast cancer. Radiologists use advanced deep learning (DL) techniques to locate and classify findings in medical images accurately. This paper develops a new deep segmentation with residual network (DS-RN) based breast cancer diagnosis model using mammogram images. The presented DS-RN model involves preprocessing, segmentation based on a Faster Region-based Convolutional Neural Network (Faster R-CNN) with the Inception v2 model, feature extraction and classification. A decision tree (DT) classifier is used to classify the mammogram images. A detailed simulation is performed on the Mini-MIAS dataset to assess the performance of the presented model. The experimental results indicate that the DS-RN model reaches a maximum sensitivity, specificity, accuracy and F-Measure of 98.15%, 100%, 98.86% and 99.07%, respectively.
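
The four figures reported in this abstract are standard confusion-matrix metrics. The following is a minimal sketch of how they are usually computed, not the authors' evaluation code. The counts in the first example are hypothetical. The second example assumes that a specificity of 100% means no false positives, so precision is 1.0, and checks that the reported sensitivity then gives an F-Measure close to the reported 99.07%.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute common binary metrics from confusion-matrix counts.

    tp/fn count the positive (e.g. malignant) cases, tn/fp the negative ones.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0   # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_measure": f_measure(precision, sensitivity),
    }


# Hypothetical counts, chosen only to illustrate the formulas.
example = classification_metrics(tp=8, fp=1, tn=9, fn=2)

# Assumption: specificity of 100% implies zero false positives, so precision = 1.0.
reported_check = f_measure(precision=1.0, recall=0.9815)

if __name__ == "__main__":
    print(example)
    print(round(reported_check, 4))
```

Under that assumption the sensitivity and F-Measure quoted in the abstract are mutually consistent.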

PERFORMANCE ANALYSIS OF BREAST CANCER CLASSIFICATION USING DECISION TREE CLASSIFIERS

International Journal of Current Pharmaceutical Research ◽

10.22159/ijcpr.2017v9i2.17383 ◽

2017 ◽

Vol 9 (2) ◽

pp. 19 ◽

Cited By ~ 6

Author(s):

P. Hamsagayathri ◽

P. Sampath

Keyword(s):

Breast Cancer ◽

Decision Tree ◽

Ductal Carcinoma ◽

Research Work ◽

The United States ◽

Breast Cancer Dataset ◽

Decision Tree Classifier ◽

Cancer Dataset ◽

Term Survival ◽

Tree Classifier

Breast cancer is one of the dangerous cancers affecting women above 35 years of age worldwide. The breast is made up of lobules that secrete milk and thin milk ducts that carry milk from the lobules to the nipple. Breast cancer mostly occurs either in the lobules or in the milk ducts. The most common type of breast cancer is ductal carcinoma, which starts in the ducts and spreads across the lobules and surrounding tissues. According to the medical survey, about 125.0 new cases of breast cancer per 100,000 women are diagnosed each year in the United States, and 21.5 per 100,000 women die of the disease. Also, 246,660 new cases of women with cancer are estimated for the year 2016. Early diagnosis of breast cancer is a key factor for the long-term survival of cancer patients. Classification plays an important role in breast cancer detection and is used by researchers to analyse and classify medical data. In this research work, a priority-based decision tree classifier algorithm has been implemented for the Wisconsin breast cancer dataset. This paper analyzes different decision tree classifier algorithms for the Wisconsin original, diagnostic and prognostic datasets using WEKA software. The performance of the classifiers is evaluated against parameters such as accuracy, Kappa statistic, entropy, RMSE, TP rate, FP rate, precision, recall, F-measure, ROC, specificity and sensitivity.
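
Of the evaluation parameters listed above, the Kappa statistic is the least self-explanatory: it measures agreement between predicted and actual classes after discounting the agreement expected by chance. The sketch below computes Cohen's kappa from a confusion matrix. It is an illustration only, the matrix values are hypothetical, and it is not taken from the paper or from WEKA.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows: actual, columns: predicted)."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of cases on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / (total * total)
    if expected == 1.0:
        # Degenerate case (a single class everywhere): kappa is undefined, report 0.0.
        return 0.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical 2x2 confusion matrix for 100 cases (benign vs. malignant).
confusion = [
    [45, 5],   # actual benign:    45 predicted benign, 5 predicted malignant
    [10, 40],  # actual malignant: 10 predicted benign, 40 predicted malignant
]
kappa = cohen_kappa(confusion)

if __name__ == "__main__":
    print(round(kappa, 3))
```

Here the observed agreement is 0.85 and the chance agreement is 0.50, so kappa is 0.35 / 0.50 = 0.7.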

Ensemble Decision Tree Classifier For Breast Cancer Data

International Journal of Information Technology Convergence and Services ◽

10.5121/ijitcs.2012.2103 ◽

2012 ◽

Vol 2 (1) ◽

pp. 17-24 ◽

Cited By ~ 37

Author(s):

D Lavanya

Keyword(s):

Breast Cancer ◽

Decision Tree ◽

Breast Cancer Data ◽

Decision Tree Classifier ◽

Cancer Data ◽

Tree Classifier

A Comparative Study to Evaluate the Performance of Classification Algorithms in Mammogram Analysis

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i3.6.14960 ◽

2018 ◽

Vol 7 (3.6) ◽

pp. 154

Author(s):

S K. Sajan ◽

M Germanus Alex

Keyword(s):

Breast Cancer ◽

Neural Network ◽

Decision Tree ◽

Automated System ◽

Support Vector ◽

Classification Algorithms ◽

Neural Network Classifier ◽

Decision Tree Classifier ◽

Tree Classifier ◽

Mammogram Image

Breast cancer is a major threat that humans face irrespective of geographical limits. Awareness of breast cancer has increased during the last decade, and many preventive measures have been put into practice to detect breast cancer before symptoms are felt. Mammography is a screening methodology currently in practice. In this paper, the mammogram image is analyzed using an automated system designed to classify the image as normal or malignant. This process involves image enhancement and image segmentation at the preprocessing level. The histogram equalization technique is used to transform low-contrast regions of the mammogram into regions with higher contrast, and the Fuzzy C-Means (FCM) algorithm is used to segment the mammogram image into regions suitable for further analysis. After enhancement and segmentation at the preprocessing level, classification is done using three classification algorithms: a decision tree classifier, a neural network classifier and a Support Vector Machine (SVM). The performance of the classification algorithms is evaluated on criteria such as speed, flexibility, robustness, scalability, interpretability and time complexity, and also on accuracy, sensitivity and specificity. The results obtained in classification are compared with other classification algorithms. The neural network classifier is found to produce better results than the other classifiers. The average diagnostic accuracy of the neural network classifier is around 91%. The decision tree approach is also found to be more flexible and easier to use than the other approaches.
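
The enhancement step described above can be illustrated with a small, dependency-free sketch of global histogram equalization. This is a generic textbook formulation and not the authors' implementation. The pixel values are hypothetical, and the mapping uses integer floor division so the output is deterministic. The FCM segmentation step is not covered here.

```python
def equalize_histogram(pixels, levels=256):
    """Histogram-equalize a flat list of integer gray levels in [0, levels - 1]."""
    # Count how often each gray level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of the gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)  # CDF at the darkest level present
    if n == cdf_min:
        return list(pixels)  # constant image: nothing to stretch

    # Map each level so that the CDF is spread over the full output range.
    mapping = [
        max(0, (c - cdf_min) * (levels - 1) // (n - cdf_min)) for c in cdf
    ]
    return [mapping[p] for p in pixels]


# Hypothetical low-contrast patch: all values lie between 100 and 104.
patch = [100, 100, 101, 102, 103, 103, 103, 104]
equalized = equalize_histogram(patch)

if __name__ == "__main__":
    print(equalized)
```

The narrow input range of 100 to 104 is stretched across the full 0 to 255 range, which is the contrast gain the abstract refers to.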

Using T3, an Improved Decision Tree Classifier, for Mining Stroke-related Medical Data

Methods of Information in Medicine ◽

10.1160/me0317 ◽

2007 ◽

Vol 46 (05) ◽

pp. 523-529 ◽

Cited By ~ 8

Author(s):

M. Saraee ◽

B. Theodoulidis ◽

J. A. Keane ◽

C. Tjortjis

Keyword(s):

Data Mining ◽

Decision Tree ◽

Predictive Models ◽

Medical Data ◽

Classification Algorithm ◽

Medical Decision ◽

Classification Error ◽

Decision Tree Classifier ◽

Data Set ◽

Tree Classifier

Summary. Objectives: Medical data are a valuable resource from which novel and potentially useful knowledge can be discovered by using data mining. Data mining can assist and support medical decision making and enhance clinical management and investigative research. The objective of this work is to propose a method for building accurate descriptive and predictive models based on classification of past medical data. We also aim to compare this method with other well-established data mining methods and identify strengths and weaknesses. Method: We propose T3, a decision tree classifier which builds predictive models based on known classes, by allowing for a certain amount of misclassification error in training in order to achieve better descriptive and predictive accuracy. We then experiment with a real medical data set on stroke, and various subsets, in order to identify strengths and weaknesses. We also compare performance with a very successful and well-established decision tree classifier. Results: T3 demonstrated impressive performance when predicting unseen cases of stroke, resulting in as little as 0.4% classification error, while the state-of-the-art decision tree classifier resulted in 33.6% classification error. Conclusions: This paper presents and evaluates T3, a classification algorithm that builds decision trees of depth at most three, and results in high accuracy whilst keeping the tree size reasonably small. T3 demonstrates strong descriptive and predictive power without compromising simplicity and clarity. We evaluate T3 based on real stroke register data and compare it with C4.5, a well-known classification algorithm, showing that T3 produces significantly more accurate and readable classifiers.

Entropy-based feature extraction and decision tree induction for breast cancer diagnosis with standardized thermograph images

Computer Methods and Programs in Biomedicine ◽

10.1016/j.cmpb.2010.04.014 ◽

2010 ◽

Vol 100 (3) ◽

pp. 269-282 ◽

Cited By ~ 26

Author(s):

Ming-Yih Lee ◽

Chi-Shih Yang

Keyword(s):

Breast Cancer ◽

Feature Extraction ◽

Decision Tree ◽

Cancer Diagnosis ◽

Breast Cancer Diagnosis ◽

Decision Tree Induction

Causes of Death and Relative Survival of Older Women After a Breast Cancer Diagnosis

Journal of Clinical Oncology ◽

10.1200/jco.2010.33.0472 ◽

2011 ◽

Vol 29 (12) ◽

pp. 1570-1577 ◽

Cited By ~ 80

Author(s):

Mara A. Schonberg ◽

Edward R. Marcantonio ◽

Long Ngo ◽

Donglin Li ◽

Rebecca A. Silliman ◽

...

Keyword(s):

Breast Cancer ◽

Older Women ◽

Cancer Diagnosis ◽

Breast Cancer Diagnosis ◽

Stage I ◽

Stage Ii ◽

Mortality Data ◽

Data Set ◽

The Impact

Purpose: To understand the impact of breast cancer on older women's survival, we compared survival of older women diagnosed with breast cancer with matched controls. Methods: Using the linked 1992 to 2003 Surveillance, Epidemiology, and End Results (SEER)-Medicare data set, we identified women age 67 years or older who were newly diagnosed with ductal carcinoma in situ (DCIS) or breast cancer. We identified women not diagnosed with breast cancer from the 5% random sample of Medicare beneficiaries residing in SEER areas. We matched patient cases to controls by birth year and registry (99% or 66,039 patient cases matched successfully). We assigned the start of follow-up for controls as the patient cases' date of diagnosis. Mortality data were available through 2006. We compared survival of women with breast cancer by stage with survival of controls using multivariable proportional hazards models adjusting for age at diagnosis, comorbidity, prior mammography use, and sociodemographics. We repeated these analyses stratifying by age. Results: Median follow-up time was 7.7 years. Differences between patient cases and controls in sociodemographics and comorbidities were small (< 4%). Women diagnosed with DCIS (adjusted hazard ratio [aHR], 0.7; 95% CI, 0.7 to 0.7) or stage I disease (aHR, 0.8; 95% CI, 0.8 to 0.8) had slightly lower mortality than controls. Women diagnosed with stage II disease or higher had greater mortality than controls (stage II disease: aHR, 1.2; 95% CI, 1.2 to 1.2). The association of a breast cancer diagnosis with mortality declined with age among women with advanced disease. Conclusion: Compared with matched controls, a diagnosis of DCIS or stage I breast cancer in older women is associated with better survival, whereas a diagnosis of stage II or higher breast cancer is associated with worse survival.

Breast cancer diagnosis using a multi-verse optimizer-based gradient boosting decision tree

SN Applied Sciences ◽

10.1007/s42452-020-2575-9 ◽

2020 ◽

Vol 2 (4) ◽

Cited By ~ 1

Author(s):

Hamed Tabrizchi ◽

Mohammad Tabrizchi ◽

Hamid Tabrizchi

Keyword(s):

Breast Cancer ◽

Decision Tree ◽

Cancer Diagnosis ◽

Breast Cancer Diagnosis ◽

Gradient Boosting

Performance Evaluation of a Proposed Machine Learning Model for Chronic Disease Datasets Using an Integrated Attribute Evaluator and an Improved Decision Tree Classifier

Applied Sciences ◽

10.3390/app10228137 ◽

2020 ◽

Vol 10 (22) ◽

pp. 8137

Author(s):

Sushruta Mishra ◽

Pradeep Kumar Mallick ◽

Hrudaya Kumar Tripathy ◽

Akash Kumar Bhoi ◽

Alfonso González-Briones

Keyword(s):

Breast Cancer ◽

Feature Selection ◽

Heart Disease ◽

Chronic Disease ◽

Decision Tree ◽

Classification Performance ◽

Decision Tree Classifier ◽

Accuracy Rate ◽

Filter Methods ◽

Tree Classifier

There is a consistent rise in chronic diseases worldwide. These diseases decrease immunity and the quality of daily life. The treatment of these disorders is a challenging task for medical professionals. Dimensionality reduction techniques make it possible to handle big data samples, providing decision support in relation to chronic diseases. These datasets contain a series of symptoms that are used in disease prediction. Redundant and irrelevant symptoms in the datasets should be identified and removed using feature selection techniques to improve classification accuracy. Therefore, the main contribution of this paper is a comparative analysis of the impact of wrapper and filter selection methods on classification performance. The filter methods considered are the Correlation Feature Selection (CFS) method, the Information Gain (IG) method and the Chi-Square (CS) method. The wrapper methods considered are the Best First Search (BFS) method, the Linear Forward Selection (LFS) method and the Greedy Step Wise Search (GSS) method. A Decision Tree algorithm has been used as the classifier for this analysis and is implemented through the WEKA tool. An attribute significance analysis has been performed on the diabetes, breast cancer and heart disease datasets used in the study. It was observed that the CFS method outperformed the other filter methods in accuracy rate and execution time. The accuracy rates using the CFS method on the datasets for heart disease, diabetes and breast cancer were 93.8%, 89.5% and 96.8%, respectively. Moreover, latency delays of 1.08 s, 1.02 s and 1.01 s were noted using the same method for the respective datasets. Among the wrapper methods, the performance of BFS was impressive in comparison to the other methods. Maximum accuracies of 94.7%, 95.8% and 96.8% were achieved on the datasets for heart disease, diabetes and breast cancer, respectively. Latency delays of 1.42 s, 1.44 s and 132 s were recorded using the same method for the respective datasets. On the basis of the obtained results, a new hybrid Attribute Evaluator method has been proposed which integrates enhanced K-Means clustering with the CFS filter method and the BFS wrapper method. Furthermore, the hybrid method was evaluated with an improved decision tree classifier, which combines clustering with classification. It was validated on 14 different chronic disease datasets and its performance was recorded. Consistently strong classification performance was observed. The mean values for the accuracy, specificity, sensitivity and f-score metrics were 96.7%, 96.5%, 95.6% and 96.2%, respectively.
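
Of the filter methods named above, Information Gain is the simplest to show in code: it scores a feature by how much knowing its value reduces the entropy of the class label. The sketch below is a generic illustration for discrete features and is not the WEKA attribute evaluator used in the paper. The feature names and toy records are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: four records with a binary diagnosis label.
labels = ["malignant", "malignant", "benign", "benign"]
features = {
    "cell_size": ["large", "large", "small", "small"],  # separates the classes fully
    "record_parity": ["odd", "even", "odd", "even"],    # carries no class information
}

# Rank features from most to least informative.
ranking = sorted(
    features, key=lambda name: information_gain(features[name], labels), reverse=True
)

if __name__ == "__main__":
    for name in ranking:
        print(name, round(information_gain(features[name], labels), 3))
```

A filter method keeps the top-ranked features and discards the rest before the classifier is trained, whereas a wrapper method scores candidate feature subsets by training the classifier on each of them.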

DECISION TREE CLASSIFIERS FOR CLASSIFICATION OF BREAST CANCER

International Journal of Current Pharmaceutical Research ◽

10.22159/ijcpr.2017v9i1.17377 ◽

2017 ◽

Vol 9 (2) ◽

pp. 31 ◽

Cited By ~ 4

Author(s):

P. Hamsagayathri ◽

P. Sampath

Keyword(s):

Breast Cancer ◽

Decision Tree ◽

Ductal Carcinoma ◽

Kappa Statistic ◽

Breast Cancer Dataset ◽

Decision Tree Classifier ◽

Cancer Dataset ◽

Term Survival ◽

Time To Build ◽

Tree Classifier

Objective: Breast cancer is one of the dangerous cancers affecting women above 35 years of age worldwide. The breast is made up of lobules that secrete milk and thin milk ducts that carry milk from the lobules to the nipple. Breast cancer mostly occurs either in the lobules or in the milk ducts. The most common type of breast cancer is ductal carcinoma, which starts in the ducts and spreads across the lobules and surrounding tissues. Survey: According to the medical survey, about 125.0 new cases of breast cancer per 100,000 women are diagnosed each year in the United States, and 21.5 per 100,000 women die of the disease. Also, 246,660 new cases of women with cancer are estimated for the year 2016. Methods: Early diagnosis of breast cancer is a key factor for the long-term survival of cancer patients. Classification is one of the vital techniques used by researchers to analyze and classify medical data. Results: This paper analyzes different decision tree classifier algorithms for the SEER breast cancer dataset using WEKA software. The performance of the classifiers is evaluated against parameters such as accuracy, Kappa statistic, entropy, RMSE, TP rate, FP rate, precision, recall, F-measure, ROC, specificity and sensitivity. Conclusion: The simulation results show that the REPTree classifier classifies the data with 93.63% accuracy and a minimum RMSE of 0.1628. The REPTree algorithm takes less time to build the model, with ROC and PRC values of 0.929 and 0.959. By comparing classification results, we confirm that the REPTree algorithm is better than the other classification algorithms for the SEER dataset.
