Holo entropy enabled decision tree classifier for breast cancer diagnosis using wisconsin (prognostic) data set

Author(s):  
Shabina Sayed ◽  
Shoeb Ahmed ◽  
Rakesh Poonia
Author(s):  
T. Sathya Priya, Et. al.

Right now, breast cancer is considered as a most important health problem among women over the world. The detection of breast cancer in the beginning stage can reduce the mortality rate to a considerable extent. Mammogram is an effective and regularly used technique for the detection and screening of breast cancer. The advanced deep learning (DL) techniques are utilized by radiologists for accurate finding and classification of medical images. This paper develops a new deep segmentation with residual network (DS-RN) based breast cancer diagnosis model using mammogram images. The presented DS-RN model involves preprocessing, Faster Region based Convolution Neural Network (R-CNN) (Faster R-CNN) with Inception v2 model based segmentation, feature extraction and classification. To classify the mammogram images, decision tree (DT) classifier model is used. A detailed simulation process is performed to ensure the betterment of the presented model on the Mini-MIAS dataset. The obtained experimental values stated that the DS-RN model has reached to a maximum classification performance with the maximum sensitivity, specificity, accuracy and F-Measure of 98.15%, 100%, 98.86% and 99.07% respectively.  


Author(s):  
P. Hamsagayathri ◽  
P. Sampath

Breast cancer is one of the dangerous cancers among world’s women above 35 y. The breast is made up of lobules that secrete milk and thin milk ducts to carry milk from lobules to the nipple. Breast cancer mostly occurs either in lobules or in milk ducts. The most common type of breast cancer is ductal carcinoma where it starts from ducts and spreads across the lobules and surrounding tissues. According to the medical survey, each year there are about 125.0 per 100,000 new cases of breast cancer are diagnosed and 21.5 per 100,000 women due to this disease in the United States. Also, 246,660 new cases of women with cancer are estimated for the year 2016. Early diagnosis of breast cancer is a key factor for long-term survival of cancer patients. Classification plays an important role in breast cancer detection and used by researchers to analyse and classify the medical data. In this research work, priority-based decision tree classifier algorithm has been implemented for Wisconsin Breast cancer dataset. This paper analyzes the different decision tree classifier algorithms for Wisconsin original, diagnostic and prognostic dataset using WEKA software. The performance of the classifiers are evaluated against the parameters like accuracy, Kappa statistic, Entropy, RMSE, TP Rate, FP Rate, Precision, Recall, F-Measure, ROC, Specificity, Sensitivity.


2018 ◽  
Vol 7 (3.6) ◽  
pp. 154
Author(s):  
S K. Sajan ◽  
M Germanus Alex

Breast cancer is a major threat humans are facing irrespective of geographical limits. The awareness about breast cancer has increased during the last decade and many preventive measures were in practice to detect the breast cancer before the symptoms were felt. Mammography is a screening methodology currently in practice. In this paper the mammogram image is analyzed using automated system. The automated system is designed to be capable of distinguishing the mammogram image into a normal or malignant. This process involves image enhancement and image segmentation at preprocessing level. Histogram equalization technique is used to transform low contrast region of the mammogram into region with higher contrast and Fuzzy C Means (FCM) algorithm is used to segment the mammogram image into regions suitable for further analysis. After enhancement and segmentation at preprocessing level the classification is done using three classification algorithms like decision tree classifier, Neural Network classifier and Support Vector Machine (SVM). The performance of the classification algorithms is evaluated using the following criteria like speed, flexibility, robustness, scalability, interpretability, Time complexity and also based on accuracy, sensitivity and specificity. The results obtained in classification are compared with other classification algorithms. It is found that the neural network classifier approach produces better results compared to other classifiers.The average accuracy in diagnosis by Neural Network approach classifier is around 91%.  Also it is found that the decision tree approach is much flexible and easy to use compared to other approaches.  


2007 ◽  
Vol 46 (05) ◽  
pp. 523-529 ◽  
Author(s):  
M. Saraee ◽  
B. Theodoulidis ◽  
J. A. Keane ◽  
C. Tjortjis

Summary Objectives: Medical data are a valuable resource from which novel and potentially useful knowledge can be discovered by using data mining. Data mining can assist and support medical decision making and enhance clinical managementand investigative research. The objective of this work is to propose a method for building accurate descriptive and predictive models based on classification of past medical data. We also aim to compare this method with other well established data mining methods and identify strengths and weaknesses. Method: We propose T3, a decision tree classifier which builds predictive models based on known classes, by allowing for a certain amount of misclassification error in training in order to achieve better descriptive and predictive accuracy. We then experiment with a real medical data set on stroke, and various subsets, in order to identify strengths and weaknesses. We also compare performance with a very successful and well established decision tree classifier. Results: T3 demonstrated impressive performance when predicting unseen cases of stroke resulting in as little as 0.4% classification error while the state of the art decision tree classifier resulted in 33.6% classification error respectively. Conclusions: This paper presents and evaluates T3, a classification algorithm that builds decision trees of depth at most three, and results in high accuracy whilst keeping the tree size reasonably small. T3 demonstrates strong descriptive and predictive power without compromising simplicity and clarity. We evaluate T3 based on real stroke register data and compare it with C4.5, a well-known classification algorithm, showing that T3 produces significantly more accurate and readable classifiers.


2011 ◽  
Vol 29 (12) ◽  
pp. 1570-1577 ◽  
Author(s):  
Mara A. Schonberg ◽  
Edward R. Marcantonio ◽  
Long Ngo ◽  
Donglin Li ◽  
Rebecca A. Silliman ◽  
...  

Purpose To understand the impact of breast cancer on older women's survival, we compared survival of older women diagnosed with breast cancer with matched controls. Methods Using the linked 1992 to 2003 Surveillance, Epidemiology, and End Results (SEER) -Medicare data set, we identified women age 67 years or older who were newly diagnosed with ductal carcinoma in situ (DCIS) or breast cancer. We identified women not diagnosed with breast cancer from the 5% random sample of Medicare beneficiaries residing in SEER areas. We matched patient cases to controls by birth year and registry (99% or 66,039 patient cases matched successfully). We assigned the start of follow-up for controls as the patient cases' date of diagnosis. Mortality data were available through 2006. We compared survival of women with breast cancer by stage with survival of controls using multivariable proportional hazards models adjusting for age at diagnosis, comorbidity, prior mammography use, and sociodemographics. We repeated these analyses stratifying by age. Results Median follow-up time was 7.7 years. Differences between patient cases and controls in sociodemographics and comorbidities were small (< 4%). Women diagnosed with DCIS (adjusted hazard ratio [aHR], 0.7; 95% CI, 0.7 to 0.7) or stage I disease (aHR, 0.8; 95% CI, 0.8 to 0.8) had slightly lower mortality than controls. Women diagnosed with stage II disease or higher had greater mortality than controls (stage II disease: aHR, 1.2; 95% CI, 1.2 to 1.2). The association of a breast cancer diagnosis with mortality declined with age among women with advanced disease. Conclusion Compared with matched controls, a diagnosis of DCIS or stage I breast cancer in older women is associated with better survival, whereas a diagnosis of stage II or higher breast cancer is associated with worse survival.


2020 ◽  
Vol 10 (22) ◽  
pp. 8137
Author(s):  
Sushruta Mishra ◽  
Pradeep Kumar Mallick ◽  
Hrudaya Kumar Tripathy ◽  
Akash Kumar Bhoi ◽  
Alfonso González-Briones

There is a consistent rise in chronic diseases worldwide. These diseases decrease immunity and the quality of daily life. The treatment of these disorders is a challenging task for medical professionals. Dimensionality reduction techniques make it possible to handle big data samples, providing decision support in relation to chronic diseases. These datasets contain a series of symptoms that are used in disease prediction. The presence of redundant and irrelevant symptoms in the datasets should be identified and removed using feature selection techniques to improve classification accuracy. Therefore, the main contribution of this paper is a comparative analysis of the impact of wrapper and filter selection methods on classification performance. The filter methods that have been considered include the Correlation Feature Selection (CFS) method, the Information Gain (IG) method and the Chi-Square (CS) method. The wrapper methods that have been considered include the Best First Search (BFS) method, the Linear Forward Selection (LFS) method and the Greedy Step Wise Search (GSS) method. A Decision Tree algorithm has been used as a classifier for this analysis and is implemented through the WEKA tool. An attribute significance analysis has been performed on the diabetes, breast cancer and heart disease datasets used in the study. It was observed that the CFS method outperformed other filter methods concerning the accuracy rate and execution time. The accuracy rate using the CFS method on the datasets for heart disease, diabetes, breast cancer was 93.8%, 89.5% and 96.8% respectively. Moreover, latency delays of 1.08 s, 1.02 s and 1.01 s were noted using the same method for the respective datasets. Among wrapper methods, BFS’ performance was impressive in comparison to other methods. Maximum accuracy of 94.7%, 95.8% and 96.8% were achieved on the datasets for heart disease, diabetes and breast cancer respectively. Latency delays of 1.42 s, 1.44 s and 132 s were recorded using the same method for the respective datasets. On the basis of the obtained result, a new hybrid Attribute Evaluator method has been proposed which effectively integrates enhanced K-Means clustering with the CFS filter method and the BFS wrapper method. Furthermore, the hybrid method was evaluated with an improved decision tree classifier. The improved decision tree classifier combined clustering with classification. It was validated on 14 different chronic disease datasets and its performance was recorded. A very optimal and consistent classification performance was observed. The mean values for accuracy, specificity, sensitivity and f-score metrics were 96.7%, 96.5%, 95.6% and 96.2% respectively.


Author(s):  
P. Hamsagayathri ◽  
P. Sampath

Objective: Breast cancer is one of the dangerous cancers among world’s women above 35 y. The breast is made up of lobules that secrete milk and thin milk ducts to carry milk from lobules to the nipple. Breast cancer mostly occurs either in lobules or in milk ducts. The most common type of breast cancer is ductal carcinoma where it starts from ducts and spreads across the lobules and surrounding tissues. Survey: According to the medical survey, each year there are about 125.0 per 100,000 new cases of breast cancer are diagnosed and 21.5 per 100,000 women due to this disease in united states. Also, 246,660 new cases of women with cancer are estimated for the year 2016.Methods: Early diagnosis of breast cancer is a key factor for long-term survival of cancer patients. Classification is one of the vital techniques used by researchers to analyze and classify the medical data.Results: This paper analyzes the different decision tree classifier algorithms for seer breast cancer dataset using WEKA software. The performance of the classifiers are evaluated against the parameters like accuracy, Kappa statistic, Entropy, RMSE, TP Rate, FP Rate, Precision, Recall, F-Measure, ROC, Specificity, Sensitivity.Conclusion: The simulation results shows REPTree classifier classifies the data with 93.63% accuracy and minimum RMSE of 0.1628 REPTree algorithm consumes less time to build the model with 0.929 ROC and 0.959 PRC values. By comparing classification results, we confirm that a REPTree algorithm is better than other classification algorithms for SEER dataset.


Sign in / Sign up

Export Citation Format

Share Document