A Comparison of Open Source Data Mining Tools for Breast Cancer Classification
Data mining is a field that draws on several areas of computer science to discover knowledge from databases and thereby simplify decision making. Classification is a data mining task that learns from a set of labeled instances in order to accurately predict the target class of new instances. Open source data mining tools can be used to perform classification. This paper compares four such tools: KNIME, Orange, Tanagra and Weka. Our goal is to identify the most accurate tool and technique for breast cancer classification. The experimental results show that some tools achieve better results than others. In addition, fusing classifiers proved to be more accurate than using a single classifier across the four datasets used. We also compare the use of complete datasets, obtained by substituting missing feature values, with the use of incomplete ones. The experimental results show that some datasets yield better accuracy when the complete version is used.
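The two ideas summarized above, completing a dataset by substituting missing feature values and fusing the outputs of several classifiers, can be illustrated with a minimal sketch. The abstract does not state which substitution rule or fusion scheme is used, so the column-mean substitution and majority voting below are assumptions chosen for simplicity, and the function names `impute_missing` and `fuse_by_majority_vote` are hypothetical.

```python
from collections import Counter
from statistics import mean


def impute_missing(rows, missing=None):
    """Return a copy of rows with missing entries replaced by the column mean.

    Assumption: mean substitution over the observed values of each column;
    the paper may use a different rule.
    """
    fills = []
    for column in zip(*rows):
        observed = [v for v in column if v is not missing]
        # Fall back to 0.0 if a column has no observed values at all.
        fills.append(mean(observed) if observed else 0.0)
    return [
        [fills[j] if v is missing else v for j, v in enumerate(row)]
        for row in rows
    ]


def fuse_by_majority_vote(predictions):
    """Combine per-classifier label lists into one label per instance.

    Assumption: simple majority voting; on a tie, the label that was
    voted for first (by the earliest classifier) wins.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Toy feature rows with missing values marked as None.
incomplete = [[1.0, None], [3.0, 4.0], [None, 6.0]]
complete = impute_missing(incomplete)

# Toy outputs of three hypothetical classifiers on three instances.
votes = [
    ["benign", "malignant", "benign"],
    ["benign", "benign", "malignant"],
    ["malignant", "malignant", "malignant"],
]
fused = fuse_by_majority_vote(votes)
```

With an odd number of classifiers and two classes, as in the toy example, a tie cannot occur, so the tie-breaking rule only matters in other configurations.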