Research on Classification Algorithm in Data Mining

The detection of fraud in credit card transactions is a major topic in financial research, with profound economic implications. While this has hitherto been tackled through data analysis techniques, its resemblance to other problems, like the design of recommendation systems and of diagnostic/prognostic medical tools, suggests that a complex network approach may yield important benefits. In this paper we present a first hybrid data mining/complex network classification algorithm, able to detect illegal instances in a real card transaction data set. It is based on a recently proposed network reconstruction algorithm that allows creating representations of the deviation of one instance from a reference group. We show how the inclusion of features extracted from the network data representation improves the score obtained by a standard, neural network-based classification algorithm, and additionally how this combined approach can outperform a commercial fraud detection system in specific operation niches. Beyond these specific results, this contribution represents a new example of how complex networks and data mining can be integrated as complementary tools, with the former providing a view of the data beyond the capabilities of the latter.


Analysis of Classification Algorithm in Data Mining

International Journal of Data Mining Techniques and Applications ◽

10.20894/ijdmta.102.003.001.007 ◽

2014 ◽

Vol 3 (1) ◽

pp. 30-32

Author(s):

R. Aruna devi ◽

K. Nirmala

Keyword(s):

Data Mining ◽

Classification Algorithm


Data Mining Based Fuzzy Classification Algorithm for Imbalanced Data

2006 IEEE International Conference on Fuzzy Systems ◽

10.1109/fuzzy.2006.1681806 ◽

2006 ◽

Cited By ~ 1

Author(s):

Le Xu ◽

Mo-Yuen Chow ◽

L.S. Taylor

Keyword(s):

Data Mining ◽

Imbalanced Data ◽

Classification Algorithm ◽

Fuzzy Classification


Analysis the Data Mining Classification Algorithm

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2018.7066 ◽

2018 ◽

Vol 6 (7) ◽

pp. 463-466

Author(s):

Disha A. Katariya

Keyword(s):

Data Mining ◽

Classification Algorithm


An Application of J48 Classification Algorithm in Predicting Students’ Academic Performance

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.d1531.029420 ◽

2020 ◽

Vol 9 (4) ◽

pp. 1172-1176

Keyword(s):

Data Mining ◽

Academic Performance ◽

Knowledge Discovery ◽

Classification Algorithm ◽

Knowledge Discovery In Databases ◽

Struggling Students ◽

Correlational Design ◽

New Information ◽

Exploratory Data ◽

Students Success

This paper uses the J48 classification algorithm to predict students’ academic performance towards the end of the semester in the Data Structure course under the Computer Science Program. The algorithm is intended to help faculty forecast which students are likely to fail and which are likely to make it to the end of the semester. In this way, the faculty can take remedial measures to help struggling students pass the subject and advance to the next level, thus increasing students’ success rate and retention in Higher Education Institutions (HEIs). This research employed a descriptive correlational design using Exploratory Data Analysis (EDA) for Data Mining in testing and verifying data to generate new information. Data mining is part of the Knowledge Discovery in Databases (KDD) process, which follows six steps: data selection, data pre-processing, data transformation, data mining, interpretation, and knowledge discovery. Step 1 involves gathering and selecting data for the study; for this purpose, a total of 103 students’ records were collected from the instructors for a period of two semesters, S.Y. 2014-2015 and 2015-2016. Different evaluative criteria contained in the class records were used as attributes in predicting students’ academic performance. Steps 2 and 3 are pre-processing and transforming the data: discarding the records of students who dropped or withdrew during the semester, and converting the Excel file into a comma-separated values (.csv) file, respectively. In step 4, the J48 classification algorithm was applied to discover classification rules. Step 5 refers to the tree visualization results, which identified the strongest predictor, the one most likely to influence the students’ final average grade. Finally, step 6 presents the information extracted from the tree, that is, the extracted rules that can be used by the administration, faculty and other stakeholders to improve the academic performance of the students. In particular, they might consider redesigning and restructuring teaching pedagogies to assist and focus more on struggling students.
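The idea of a "strongest predictor" at the root of the tree can be illustrated with a small sketch. J48 is WEKA's implementation of C4.5, which ranks candidate split attributes by gain ratio. The code below is only a minimal pure-Python illustration of that criterion for categorical attributes; it is not the study's pipeline. The attribute names (`quizzes`, `attendance`), the records, and the helper names are all hypothetical, and numeric attributes and pruning are left out.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, divided by the split information."""
    total = len(rows)
    # Group the class labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = -sum(len(g) / total * math.log2(len(g) / total) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical, hand-made class records (not the study's data).
records = [
    {"quizzes": "low", "attendance": "good", "result": "fail"},
    {"quizzes": "low", "attendance": "poor", "result": "fail"},
    {"quizzes": "low", "attendance": "good", "result": "fail"},
    {"quizzes": "high", "attendance": "poor", "result": "pass"},
    {"quizzes": "high", "attendance": "good", "result": "pass"},
    {"quizzes": "high", "attendance": "good", "result": "pass"},
]

# Score every candidate attribute and pick the one with the highest gain ratio.
scores = {a: gain_ratio(records, a, "result") for a in ("quizzes", "attendance")}
strongest = max(scores, key=scores.get)
print(strongest, scores)
```

In this toy data the `quizzes` attribute separates the two outcomes completely while `attendance` carries no information, so `quizzes` would be chosen as the root split.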


Performance Analysis of Student Healthcare Dataset using Classification Algorithm

Journal of Applied and Emerging Sciences ◽

10.36785/buitems.jaes.278 ◽

2019 ◽

pp. 130-137

Keyword(s):

Data Mining ◽

Decision Tree ◽

Heterogeneous Data ◽

Health Data ◽

Classification Algorithm ◽

Parametric Data ◽

Diagnosis Method ◽

Iot Devices ◽

Tools And Techniques

Health is now regarded as a backbone of performance, and Internet of Things (IoT) devices have become important in diagnosing a person’s health level, the type of disease the person is suffering from, and its severity. IoT sensors operating on medical devices produce a large volume of dynamic data. The fluctuation in health data makes it necessary to use data mining tools and techniques to extract useful information. Before data mining techniques can be applied, heterogeneous data needs to be preprocessed; refining the collected data in this way lets health parametric data mining yield better results with associated benefits. A decision tree is proposed to consolidate the health attributes of students and decide the metrics of a health scale, which could in turn be used to evaluate a student’s level of performance in class. After mining, the students’ health data is passed to a K-fold cross-validation check to determine the accuracy, error rate, precision and recall. The proposed method is considered an enhanced diagnosis method with fixed patterns for the decision tree to make precise decisions. In a case study of student health prediction based on certain attributes and their levels, pattern-based diagnostics using the K-NN and decision tree algorithms are tested on a trained dataset using the WEKA tool. Finally, the different algorithms are compared in order to generalize the introduction of an optimized classification algorithm.
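The evaluation procedure named in the abstract (K-fold cross-validation reporting accuracy, error rate, precision and recall) can be sketched as follows. This is a pure-Python illustration with a small K-NN classifier, not the authors' WEKA setup. The features (heart rate, hours of sleep), the labels (`healthy`, `at_risk`), the records, and all function names are hypothetical. Features are left unscaled and folds are not shuffled, purely for brevity.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to `query` (squared Euclidean)."""
    nearest = sorted(
        train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query))
    )[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def cross_validate(data, k_folds=5, positive="at_risk"):
    """Pool predictions over all folds and report the four metrics."""
    tp = fp = fn = tn = 0
    for fold in k_fold_indices(len(data), k_folds):
        held_out = set(fold)
        train = [data[i] for i in range(len(data)) if i not in held_out]
        for i in fold:
            features, actual = data[i]
            predicted = knn_predict(train, features)
            if predicted == positive:
                if actual == positive:
                    tp += 1
                else:
                    fp += 1
            elif actual == positive:
                fn += 1
            else:
                tn += 1
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical records: ((heart_rate, sleep_hours), label). Not real health data.
samples = [
    ((62, 8.0), "healthy"), ((98, 4.5), "at_risk"),
    ((65, 7.5), "healthy"), ((102, 4.0), "at_risk"),
    ((68, 7.0), "healthy"), ((95, 5.0), "at_risk"),
    ((64, 7.8), "healthy"), ((100, 4.2), "at_risk"),
    ((99, 4.4), "healthy"),  # deliberately placed inside the at-risk cluster
    ((97, 4.8), "at_risk"),
]

metrics = cross_validate(samples, k_folds=5)
print(metrics)
```

The one deliberately misplaced `healthy` record is predicted as `at_risk` and counted as a false positive, so precision falls below 1 while recall stays at 1. This shows why the abstract reports these metrics alongside plain accuracy.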
