Data mining classification algorithms: An overview

Data mining is also defined as the process of analyzing a quantity of data (usually a large amount) to find logical relationships that summarize the data in a new way that is understandable and useful to the owner of the data. This paper examines the various types of classification algorithms in data mining and their applications, and states the strengths and limitations of each type. The weaknesses found in each algorithm show that tasks cannot be performed well when only one type of algorithm is applied. For this reason, the writer holds that further research is needed to explore the potential of combining several of these algorithms to solve machine learning problems.

Credit Card Fraud Detection Using Machine Learning Classification Algorithms over Highly Imbalanced Data

Issue 4 - Journal of Science and Technology ◽

10.46243/jst.2020.v5.i3.pp138-146 ◽

2020 ◽

Vol 5 (3) ◽

pp. 138-146 ◽

Cited By ~ 1

Keyword(s):

Machine Learning ◽

Data Mining ◽

Credit Card ◽

Imbalanced Data ◽

Learning Systems ◽

Classification Algorithms ◽

Machine Learning Classification ◽

Information Models ◽

Mining Methods ◽

Do So

Most online customers use cards to pay for their purchases. As credit cards become the most popular payment method, cases of fraud associated with them are also increasing. The primary goal of this project is to distinguish fraudulent transactions from non-fraudulent ones. To do so, data mining methods are first used to examine the patterns and characteristics of fraudulent and non-fraudulent transactions. Machine learning techniques are then used to predict fraudulent and non-fraudulent transactions automatically. The algorithm used is Logistic Regression (LR). The combination of machine learning and data mining techniques is thus used to identify fraudulent and non-fraudulent transactions by learning the patterns in the data. Models are built using this algorithm, and then precision, accuracy, and recall are calculated and compared.
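
The three measures named in the abstract can be computed from the counts of a confusion matrix. The following is a minimal sketch, not the paper's code: the function name and all transaction counts are hypothetical, chosen only to show why accuracy alone can mislead on highly imbalanced data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall), treating fraud as the positive class."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard against division by zero when nothing is predicted or labeled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical data: 10,000 transactions, of which 50 are fraudulent.
# A model that flags nothing catches no fraud, yet its accuracy looks high.
flag_nothing = classification_metrics(tp=0, fp=0, tn=9950, fn=50)

# A hypothetical model that catches 40 of the 50 frauds with 60 false alarms.
some_model = classification_metrics(tp=40, fp=60, tn=9890, fn=10)

print("flag nothing:", flag_nothing)
print("some model:  ", some_model)
```

With these made-up counts, the model that flags nothing has the higher accuracy but zero recall, which is why precision and recall are reported alongside accuracy when the classes are imbalanced.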

Integration of Naive Bayes with the SMOTE Sampling Technique to Handle Imbalanced Data

NUANSA INFORMATIKA ◽

10.25134/nuansa.v14i1.2411 ◽

2020 ◽

Vol 14 (1) ◽

pp. 34

Author(s):

Nina Sulistiyowati ◽

Mohamad Jajuli

Keyword(s):

Machine Learning ◽

Data Mining ◽

Sampling Technique ◽

Unbalanced Data ◽

Classification Algorithms ◽

Customer Data ◽

Naïve Bayes ◽

Almost All ◽

Loan Amount

Classification of data with unbalanced classes is a major problem in machine learning and data mining. On unbalanced data, almost all classification algorithms produce much higher accuracy for majority classes than for minority classes. This research implements the Synthetic Minority Over-sampling Technique (SMOTE) to handle unbalanced credit customer data from the Rawamerta Teachers Cooperative. The research methodology uses SEMMA, with the stages Sample, Explore, Modify, Model, and Assess. The Sample stage selected the cooperative's credit customer data for 2015-2017, a total of 878 records with the attributes income, total deposits, loan amount, duration of installments, services, installments, and credit status. The Explore stage found that the current (performing) class is the majority class with 813 records, while the non-performing class is the minority class with 65 records, showing an imbalance between the two classes. The Modify stage applies SMOTE at 500%. The Model stage classifies using Naïve Bayes. Naïve Bayes with SMOTE classified 1131 records correctly and 72 incorrectly, while without SMOTE it classified 818 records correctly and 60 incorrectly. Keywords: Naïve Bayes, SMOTE, unbalanced data
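
The oversampling step and the record counts in this abstract can be checked with a short sketch. The `smote` function below is a simplified illustration of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), not the authors' implementation; the function name, its parameters, and the toy feature vectors are assumptions. Only the counts 813, 65, 500%, 1131, 72, 818, and 60 come from the abstract.

```python
import math
import random


def smote(minority, n_percent=500, k=5, seed=0):
    """Return synthetic minority samples (simplified SMOTE sketch).

    minority  : list of equal-length numeric feature tuples
    n_percent : oversampling amount; 500 means 5 synthetic points per sample
    k         : number of nearest minority neighbours to interpolate towards
    """
    rng = random.Random(seed)
    per_sample = n_percent // 100
    synthetic = []
    for i, x in enumerate(minority):
        # Up to k nearest minority neighbours of x (Euclidean), excluding x itself.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        for _ in range(per_sample):
            nb = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from x to nb
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Hypothetical two-feature minority samples, for illustration only.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
synthetic = smote(minority, n_percent=500)

# Record counts taken from the abstract.
n_majority, n_minority = 813, 65
minority_after = n_minority + 5 * n_minority   # originals plus 500% synthetic
total_after = n_majority + minority_after

accuracy_with_smote = 1131 / (1131 + 72)       # about 0.940
accuracy_without_smote = 818 / (818 + 60)      # about 0.932

print(len(synthetic), minority_after, total_after)
print(round(accuracy_with_smote, 3), round(accuracy_without_smote, 3))
```

The arithmetic is consistent with the abstract: 65 minority records plus 500% synthetic records gives 390, and 813 + 390 = 1203 = 1131 + 72, the number of records classified in the SMOTE run.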

Data Mining Approach of Accident Occurrences Identification with Effective Methodology and Implementation

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v8i5.pp4033-4041 ◽

2018 ◽

Vol 8 (5) ◽

pp. 4033 ◽

Cited By ~ 3

Author(s):

Meenu Gupta ◽

Vijender Kumar Solanki ◽

Vijay Kumar Singh ◽

Vicente García-Díaz

Keyword(s):

Machine Learning ◽

Data Mining ◽

Structural Data ◽

Machine Learning Algorithms ◽

Support Vector ◽

Classification Algorithms ◽

Research Approach ◽

Word Count ◽

Data Mining Approach ◽

Frequency Calculation

Data mining is used in various research domains to identify new causes of effects observed in societies around the globe. This article applies data mining for the same purpose: to identify accident occurrences in different regions and the most valid reasons why accidents happen. Data mining and advanced machine learning algorithms are used in this research approach, and the article discusses hyperline, classification, pre-processing of the data, and training the machine with sample datasets collected from different regions, which contain structural and semi-structural data. We dive deep into machine learning and data mining classification algorithms to find or predict something novel about accident occurrences around the globe. To narrow the research and the task, we concentrate on two basic and important classification algorithms: SVM (support vector machine) and the CNB classifier. The discussion uses the WEKA tool for the CNB classifier, Bag of Words identification, word count, and frequency calculation.
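
The bag-of-words, word count, and frequency steps mentioned at the end of this abstract can be sketched in a few lines. The paper uses the WEKA tool; the Python below is only a stand-in to show the idea, and the function names and the sample accident report are hypothetical.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split it into alphabetic tokens, and count each token."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


def relative_frequencies(counts):
    """Convert raw word counts into fractions of the total token count."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


# Hypothetical accident report, for illustration only.
report = "Truck hit car on wet road; car skidded on wet road."
counts = bag_of_words(report)
freqs = relative_frequencies(counts)

print(counts.most_common(3))
print(freqs["road"])
```

Vectors of such counts or frequencies are the kind of input that a text classifier is then trained on.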

Data Mining Using $\mathcal{MLC}++$, a Machine Learning Library in C++

International Journal of Artificial Intelligence Tools ◽

10.1142/s021821309700027x ◽

1997 ◽

Vol 06 (04) ◽

pp. 537-566 ◽

Cited By ~ 69

Author(s):

Ron Kohavi ◽

Dan Sommerfield ◽

James Dougherty

Keyword(s):

Machine Learning ◽

Data Mining ◽

Pattern Recognition ◽

Statistical Analysis ◽

Classification Algorithms ◽

Pattern Recognition Techniques ◽

Data Mining Algorithms ◽

Multiple Classification ◽

Mining Algorithms ◽

New Algorithms

Data mining algorithms, including machine learning, statistical analysis, and pattern recognition techniques, can greatly improve our understanding of data warehouses, which are now becoming more widespread. In this paper, we focus on classification algorithms and review the need for multiple classification algorithms. We describe a system called $\mathcal{MLC}++$, which was designed to help choose the appropriate classification algorithm for a given dataset by making it easy to compare the utility of different algorithms on a specific dataset of interest. $\mathcal{MLC}++$ not only provides a workbench for such comparisons, but also provides a library of C++ classes to aid in the development of new algorithms, especially hybrid algorithms and multi-strategy algorithms. Such algorithms are generally hard to code from scratch. We discuss design issues, interfaces to other programs, and visualization of the resulting classifiers.
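
The idea of comparing several classification algorithms on one dataset of interest can be illustrated with a small harness. This is a sketch in Python, not the library's C++ interface: the two classifiers, the leave-one-out evaluation, the function names, and the toy dataset are all assumptions made for illustration.

```python
import math
from collections import Counter


def majority_classifier(train):
    """Always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def nearest_neighbour_classifier(train):
    """Predict the label of the closest training point (1-NN, Euclidean distance)."""
    return lambda x: min(train, key=lambda item: math.dist(x, item[0]))[1]


def leave_one_out_accuracy(make_classifier, data):
    """Train on all examples but one, test on the held-out one, and average."""
    hits = 0
    for i, (x, y) in enumerate(data):
        model = make_classifier(data[:i] + data[i + 1:])
        hits += model(x) == y
    return hits / len(data)


# Hypothetical toy dataset: four points of class "a", two of class "b".
data = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"), ((0.15, 0.15), "a"),
    ((1.0, 1.0), "b"), ((0.9, 1.1), "b"),
]

candidates = [("majority", majority_classifier), ("1-NN", nearest_neighbour_classifier)]
results = {name: leave_one_out_accuracy(make, data) for name, make in candidates}
print(results)
```

Because every classifier is passed through the same evaluation function, adding another candidate only requires one more entry in `candidates`.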

Applied Classification Algorithms Used in Data Mining During the Vocational Guidance Process in Machine Learning

Inventive Systems and Control - Lecture Notes in Networks and Systems ◽

10.1007/978-981-16-1395-1_11 ◽

2021 ◽

pp. 137-146

Author(s):

Pradeep Bedi ◽

S. B. Goyal ◽

Jugnesh Kumar

Keyword(s):

Machine Learning ◽

Data Mining ◽

Vocational Guidance ◽

Classification Algorithms

Applications of Multi-Label Classification

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.d1008.0394s220 ◽

2020 ◽

Vol 9 (4S2) ◽

pp. 86-92

Keyword(s):

Machine Learning ◽

Data Mining ◽

Literature Review ◽

Text Categorization ◽

Research Problem ◽

Learning Problems ◽

Quality Of Data ◽

Challenging Research ◽

Numerous Data

The absence of labels and the poor quality of data are prevailing challenges in numerous data mining and machine learning problems. The performance of a model is limited when the available data samples carry few labels for training. These problems are especially critical in multi-label classification, which usually needs clean data. Multi-label classification is a challenging research problem that arises in several applications, such as multi-object recognition, text categorization, music categorization, and image classification. This paper presents a literature review on multi-label classification, the various evaluation metrics used for analyzing performance, and research challenges.
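
Two evaluation metrics commonly used for multi-label classification are Hamming loss and subset accuracy. The abstract does not say which metrics the review covers, so the choice of these two is an assumption, and the label set and predictions below are hypothetical.

```python
def hamming_loss(true_sets, pred_sets, all_labels):
    """Fraction of (sample, label) pairs where prediction and truth disagree."""
    # The symmetric difference holds the labels that are wrongly added or missed.
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * len(all_labels))


def subset_accuracy(true_sets, pred_sets):
    """Fraction of samples whose predicted label set matches the truth exactly."""
    exact = sum(t == p for t, p in zip(true_sets, pred_sets))
    return exact / len(true_sets)


# Hypothetical text-categorization example with three possible labels.
labels = {"sports", "politics", "music"}
y_true = [{"sports"}, {"politics", "music"}, {"music"}]
y_pred = [{"sports"}, {"politics"}, {"music", "sports"}]

loss = hamming_loss(y_true, y_pred, labels)
exact_match = subset_accuracy(y_true, y_pred)
print(loss, exact_match)
```

Subset accuracy counts a sample as correct only if every label matches, while Hamming loss gives partial credit label by label, so the two can rank the same predictions quite differently.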

On quantum methods for machine learning problems part II: Quantum classification algorithms

Big Data Mining and Analytics ◽

10.26599/bdma.2019.9020018 ◽

2020 ◽

Vol 3 (1) ◽

pp. 56-67

Author(s):

Farid Ablayev ◽

Marat Ablayev ◽

Joshua Zhexue Huang ◽

Kamil Khadiev ◽

Nailya Salikhova ◽

...

Keyword(s):

Machine Learning ◽

Learning Problems ◽

Classification Algorithms

Ensemble Classification Algorithms for Breast Cancer Prognosis

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b6886.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 1499-1502

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Data Mining ◽

Feature Selection ◽

Death Rate ◽

Breast Cancer Prognosis ◽

Ensemble Classification ◽

Cancer Prognosis ◽

Classification Algorithms

Breast cancer is the second-highest cause of death among women, and also affects men, worldwide. In this paper, we use data mining classification algorithms to determine whether a breast cancer case is benign or malignant, and the analysis is done on the basis of accuracy and the time taken to build the model. The data is the Wisconsin dataset from the UCI Machine Learning Repository, which contains patient samples. Different algorithms are applied to the dataset with and without feature selection.

Enhanced Decision Tree Algorithm for Discovering Intra and Inter Class Exceptions

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.k1816.0981119 ◽

2019 ◽

Vol 8 (11) ◽

pp. 1539-1548

Keyword(s):

Machine Learning ◽

Data Mining ◽

Decision Tree ◽

Classification Algorithms ◽

Decision Tree Algorithm ◽

High Confidence ◽

Tree Algorithm ◽

General Rules ◽

Tree Algorithms ◽

Small Disjuncts

Decision tree algorithms, being accurate and comprehensible classifiers, have been among the most widely used classifiers in data mining and machine learning. However, like many other classification algorithms, decision tree algorithms focus on extracting patterns with high generality and, in the process, ignore some rare but useful and interesting patterns that may exist in small disjuncts of data. Such extraordinary patterns with low support and high confidence capture very specific but exceptional behavior present in data. This paper proposes a novel Enhanced Decision Tree Algorithm for Discovering Intra and Inter-class Exceptions (EDTADE). Intra-class exceptions cover objects of unique interest within a class, whereas inter-class exceptions capture rare conditions that force us to shift the class of a few unusual objects. For instance, whales and bats are examples of intra-class exceptions since they have unique characteristics within the class of mammals. Further, most birds are flying creatures, but rare birds like the penguin and ostrich fall into the category of non-flying birds. Here, penguin and ostrich are inter-class exceptions. In fact, without knowing about such exceptional patterns, our knowledge about a domain is incomplete. We have enhanced the decision tree algorithm by defining a framework for capturing intra and inter-class exceptions at the leaf nodes of a decision tree. The proposed algorithm (EDTADE) is applied to many datasets from the UCI Machine Learning Repository. The results show that EDTADE has been successful in discovering many intra and inter-class exceptions. Decision trees augmented with intra and inter-class exceptions are more accurate, comprehensible, and interesting, since they provide additional knowledge in the form of exceptional patterns that deviate from the general rules discovered for classification.
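
The notion of a pattern with low support and high confidence can be made concrete with the bird example from this abstract. The sketch below is not the EDTADE algorithm; it only computes the standard support and confidence of a rule, and the function name and the record counts are hypothetical.

```python
def support_and_confidence(records, antecedent, consequent):
    """Support and confidence of the rule 'antecedent implies consequent'.

    support    = fraction of all records matching both sides of the rule
    confidence = fraction of antecedent-matching records that also match the consequent
    """
    matches_if = [r for r in records if antecedent(r)]
    matches_both = [r for r in matches_if if consequent(r)]
    support = len(matches_both) / len(records)
    confidence = len(matches_both) / len(matches_if) if matches_if else 0.0
    return support, confidence


# Hypothetical bird records: 18 sparrows that fly and 2 penguins that do not.
records = (
    [{"species": "sparrow", "flies": True}] * 18
    + [{"species": "penguin", "flies": False}] * 2
)

# General rule: a bird flies.
general = support_and_confidence(records, lambda r: True, lambda r: r["flies"])

# Exception rule: a penguin does not fly.
exception = support_and_confidence(
    records, lambda r: r["species"] == "penguin", lambda r: not r["flies"]
)

print("general rule:  ", general)
print("exception rule:", exception)
```

In this toy data the exception rule covers only a tenth of the records but holds every time it applies, which is the low-support, high-confidence profile the abstract describes.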

Data Mining and Machine Learning

10.1017/9781108564175 ◽

2020 ◽

Cited By ~ 2

Author(s):

Mohammed J. Zaki ◽

Wagner Meira, Jr

Keyword(s):

Machine Learning ◽

Data Mining
