Ensemble Comparative Study for Diagnosis of Breast Cancer Datasets

2018 ◽  
Vol 7 (4.15) ◽  
pp. 281
Author(s):  
Bibhuprasad Sahu ◽  
Sujata Dash ◽  
Sachi Nandan Mohanty ◽  
Saroj Kumar Rout

Every disease is curable if a small amount of human effort is applied to early diagnosis. The death rate worldwide increases day by day because patients fail to detect disease before it becomes chronic. Breast cancer is curable if it is detected at an early stage, before it spreads to other parts of the body. Nowadays, computer-aided diagnosis (CAD) systems provide automated assistance that helps doctors make accurate predictions about the stage of a disease. This study presents a CAD system for the diagnosis of breast cancer. The method uses a Neural Network (NN) as the classifier and PCA/LDA for dimensionality reduction to attain a higher classification rate. A multilayer neural network is applied to classify the breast cancer data. The experiments were performed on the Wisconsin Breast Cancer Dataset (WBCD) from the UCI repository. The dataset is divided into two parts, training and test. Performance is measured by accuracy, sensitivity, specificity, precision, and recall. The result obtained in this study is 97% using ANN and PCA-ANN, which is better than other state-of-the-art methods. According to the result analysis, this system outperformed the existing systems.

2020 ◽  
pp. 1410-1421 ◽  
Author(s):  
Aindrila Bhattacherjee ◽  
Sourav Roy ◽  
Sneha Paul ◽  
Payel Roy ◽  
Noreen Kausar ◽  
...  

According to recent surveys, breast cancer has become one of the major causes of mortality among women. Breast cancer can be defined as a group of rapidly growing cells that form a lump or extra mass in the breast tissue, which in turn leads to the formation of a tumor. Tumors can be classified as malignant (cancerous) or benign (non-cancerous). Feature selection is an important consideration in classification systems. Machine learning methods are the most commonly used methods among researchers for breast cancer diagnosis. This paper investigates the WBCD (Wisconsin Breast Cancer Dataset), which comprises 683 patients, and uses the chosen features to train a backpropagation neural network. The performance is then analyzed on the basis of classification accuracy, sensitivity, specificity, positive and negative predictive values, receiver operating characteristic curves, and the confusion matrix. A total of 9 features were used to classify breast cancer with an accuracy of 99.27%.
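The evaluation measures named in this abstract all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of that arithmetic, assuming labels are encoded as 1 (malignant) and 0 (benign); the function name `diagnostic_metrics` and the toy label lists are hypothetical and are not taken from the paper.

```python
def diagnostic_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix counts and the ratios derived from them."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "confusion_matrix": {"tp": tp, "tn": tn, "fp": fp, "fn": fn},
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value (precision)
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


# Toy labels for illustration only (assumption: 1 = malignant, 0 = benign).
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_metrics = diagnostic_metrics(toy_true, toy_pred)
```

Sensitivity and specificity are conditioned on the true class, while the predictive values are conditioned on the predicted class, so the two pairs can diverge sharply when the classes are imbalanced.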


2020 ◽  
pp. 2385-2394
Author(s):  
Kamal R. AL-Rawi ◽  
Saifaldeen K. AL-Rawi

The Wisconsin Breast Cancer Dataset (WBCD) was employed to show the performance of Adaptive Resonance Theory (ART), specifically the supervised ART-I Artificial Neural Network (ANN), in building a smart system for breast cancer diagnosis. The network was fed with different learning parameters and sets. The best result was achieved when the model was trained with 50% of the data and tested with the remaining 50%. Classification accuracy was compared to that of other artificial intelligence algorithms, including a fuzzy classifier, MLP-ANN, and SVM. We achieved the highest accuracy even with such a low learning/testing ratio.


Scientifica ◽  
2016 ◽  
Vol 2016 ◽  
pp. 1-6 ◽  
Author(s):  
Amir Ahmad

The early diagnosis of breast cancer is an important step in the fight against the disease. Machine learning techniques have shown promise in improving our understanding of the disease. As medical datasets contain data points that cannot be precisely assigned to a class, fuzzy methods have been useful for studying these datasets. Breast cancer datasets are sometimes described by categorical features. Many fuzzy clustering algorithms have been developed for categorical datasets. However, most of these methods use Hamming distance to define the distance between two categorical feature values. In this paper, we use a probabilistic distance measure to compute the distance between a pair of categorical feature values. Experiments demonstrate that this distance measure performs better than Hamming distance on the Wisconsin breast cancer data.
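The contrast between Hamming distance and a probabilistic distance can be sketched as follows. The abstract does not specify which probabilistic measure the paper uses, so the co-occurrence-based measure below (the total variation distance between the conditional distributions of a second attribute) is an assumption chosen for illustration; `cooccurrence_distance` and the toy rows are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Hamming distance between two categorical values: 0 if equal, else 1."""
    return 0.0 if a == b else 1.0


def cooccurrence_distance(rows, attr, other, x, y):
    """Distance between values x and y of column `attr`, measured by how
    differently they co-occur with the values of column `other`.

    Illustrative assumption: total variation distance between
    P(other | attr == x) and P(other | attr == y).
    """
    def conditional(value):
        counts = Counter(r[other] for r in rows if r[attr] == value)
        total = sum(counts.values())
        if total == 0:
            raise ValueError(f"value {value!r} does not occur in column {attr}")
        return {k: c / total for k, c in counts.items()}

    px, py = conditional(x), conditional(y)
    keys = set(px) | set(py)
    return 0.5 * sum(abs(px.get(k, 0.0) - py.get(k, 0.0)) for k in keys)


# Toy categorical data: column 0 is the attribute of interest, column 1 a second attribute.
toy_rows = [
    ("low", "a"), ("low", "a"),
    ("mid", "a"), ("mid", "b"),
    ("high", "b"), ("high", "b"),
]
d_low_mid = cooccurrence_distance(toy_rows, 0, 1, "low", "mid")
d_low_high = cooccurrence_distance(toy_rows, 0, 1, "low", "high")
```

Hamming distance treats every pair of distinct values as equally far apart, whereas a data-driven measure of this kind can rank "low" as closer to "mid" than to "high".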


2020 ◽  
Vol 2 ◽  
Author(s):  
Panagiotis Stanitsas ◽  
Anoop Cherian ◽  
Vassilios Morellas ◽  
Resha Tejpaul ◽  
Nikolaos Papanikolopoulos ◽  
...  

Introduction: Cancerous Tissue Recognition (CTR) methodologies are continuously integrating advancements at the forefront of machine learning and computer vision, providing a variety of inference schemes for histopathological data. Histopathological data, in most cases, come in the form of high-resolution images, and thus methodologies operating at the patch level are more computationally attractive. Such methodologies capitalize on pixel-level annotations (tissue delineations) from expert pathologists, which are then used to derive labels at the patch level. In this work, we envision a digital connected health system that augments the capabilities of clinicians by providing powerful feature descriptors that may describe malignant regions. Material and Methods: We start with a patch-level descriptor, termed Covariance-Kernel Descriptor (CKD), capable of compactly describing tissue architectures associated with carcinomas. To extend the recognition capability of the CKDs to larger slide regions, we resort to a multiple instance learning framework. In that direction, we derive the Weakly Annotated Image Descriptor (WAID) as the parameters of classifier decision boundaries in a Multiple Instance Learning framework. The WAID is computed on bags of patches corresponding to larger image regions for which binary labels (malignant vs. benign) are provided, thus obviating the necessity for tissue delineations. Results: The CKD was seen to outperform all the considered descriptors, reaching a classification accuracy (ACC) of 92.83% and an area under the curve (AUC) of 0.98. The CKD captures higher-order correlations between features and was shown to achieve superior performance against a large collection of computer vision features on a private breast cancer dataset. The WAID outperforms all other descriptors on the Breast Cancer Histopathological database (BreakHis), where correctly classified malignant (CCM) instances reached 91.27% and 92.00% at the patient and image level, respectively, and achieves state-of-the-art performance without resorting to a deep learning scheme. Discussion: Our proposed derivation of the CKD and WAID can help medical experts accomplish their work accurately and faster than the current state-of-the-art.


2017 ◽  
Vol 114 (19) ◽  
pp. 4863-4868 ◽  
Author(s):  
Susanne Gerber ◽  
Illia Horenko

The applicability of many computational approaches depends on the identification of reduced models defined on a small set of collective variables (colvars). A methodology for scalable probability-preserving identification of reduced models and colvars directly from the data is derived. It does not rely on the availability of the full relation matrices at any stage of the resulting algorithm, allows for a robust quantification of reduced model uncertainty, and allows us to impose a priori available physical information. We show two applications of the methodology: (i) to obtain a reduced dynamical model for polypeptide dynamics in water and (ii) to identify diagnostic rules from a standard breast cancer dataset. For the first example, we show that the obtained reduced dynamical model can reproduce the full statistics of spatial molecular configurations, opening possibilities for a robust dimension and model reduction in molecular dynamics. For the breast cancer data, this methodology identifies a very simple diagnostic rule that is free of any tuning parameters and exhibits the same performance quality as the state-of-the-art machine-learning applications with multiple tuning parameters reported for this problem.


2020 ◽  
Vol 13 (6) ◽  
pp. 330-337
Author(s):  
Edi Kusuma ◽  
Guruh Shidik ◽  
Ricardus Pramunendar ◽  
...  

Classification is a data mining technique that is considered supervised learning. Classification techniques such as the Backpropagation Neural Network (BPNN) have been utilized in several fields to increase human productivity. BPNN can give better (more natural) results compared with other statistical techniques. However, the learning process of BPNN can yield inefficient synapse weights in each hidden layer. These ineffective weights can affect the performance of the network. In this research, BPNN optimization using Nelder-Mead is proposed to identify the presence of breast cancer. The datasets used are the Breast Cancer Coimbra Dataset (BCCD) and the Wisconsin Breast Cancer Dataset (WBCD). Testing with accuracy and k-fold validation shows better performance compared with the original BPNN. The best average performance on BCCD is seen in the fifth fold, with 76.5217% accuracy. Moreover, the highest average result on WBCD is in the fourth fold, with 91.1765% average accuracy.
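Nelder-Mead is a derivative-free simplex search, which is what lets it adjust weights without gradient information. The abstract does not describe how the search is coupled to the BPNN, so the following is only a minimal sketch of the simplex search itself, assuming the standard reflection, expansion, contraction, and shrink coefficients (1, 2, 0.5, 0.5). The name `toy_loss` is a hypothetical stand-in for a network's training error as a function of its weights.

```python
def nelder_mead(loss, w0, step=0.5, max_iter=500, tol=1e-10):
    """Minimise loss(w) over a list of floats with a basic Nelder-Mead search."""
    n = len(w0)
    # Initial simplex: the start point plus one point offset along each axis.
    simplex = [list(w0)]
    for i in range(n):
        point = list(w0)
        point[i] += step
        simplex.append(point)

    for _ in range(max_iter):
        simplex.sort(key=loss)
        best, second_worst, worst = simplex[0], simplex[-2], simplex[-1]
        if abs(loss(worst) - loss(best)) < tol:
            break
        # Centroid of every vertex except the worst one.
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        if loss(best) <= loss(reflected) < loss(second_worst):
            simplex[-1] = reflected
        elif loss(reflected) < loss(best):
            # Reflection found a new best point, so try going further.
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expanded if loss(expanded) < loss(reflected) else reflected
        else:
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if loss(contracted) < loss(worst):
                simplex[-1] = contracted
            else:
                # Shrink every other vertex halfway towards the best one.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)]
                    for p in simplex[1:]
                ]
    return min(simplex, key=loss)


def toy_loss(w):
    # Hypothetical stand-in for training error as a function of two weights.
    return (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2


start = [0.0, 0.0]
weights = nelder_mead(toy_loss, start)
```

Because the start point is a vertex of the initial simplex and the best vertex is never replaced by a worse one, the returned point is never worse than the start.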


Cancer is a disease that develops in the human body due to gene mutation. Due to various factors, cells turn cancerous and grow rapidly while damaging normal cells. Many women are affected by breast cancer, which may even cause death if not treated at an early stage. Early detection of breast cancer is highly important for increasing the survival rate. Machine learning methods and technologies make it possible to classify and detect the class accurately. Among classifiers, random forest and support vector machine are two that have good classification power. In this research, a combination of these two classifiers, Random Forest and Support Vector Machine (RFSVM), is proposed for early diagnosis of breast cancer using the Wisconsin Breast Cancer Dataset (WBCD). Experiments are performed using different train-test data ratios, and an average accuracy of more than 98% is achieved using this hybrid classifier. This paper overcomes the over-fitting problem of random forest and the need to tune the parameters of the Support Vector Machine. Even with limited data available, the classifier tunes its parameters well enough to give a highly accurate result.

