A decision tree classifier for credit assessment problems in big data environments

DEVELOPING A PARALLEL CLASSIFIER FOR MINING IN BIG DATA SETS

IIUM Engineering Journal ◽ 10.31436/iiumej.v22i2.1541 ◽ 2021 ◽ Vol 22 (2) ◽ pp. 119-134

Author(s): Ahad Shamseen ◽ Morteza Mohammadi Zanjireh ◽ Mahdi Bahaghighat ◽ Qin Xin

Keyword(s): Data Mining ◽ Big Data ◽ Decision Tree ◽ Main Memory ◽ Experimental Results ◽ Primary Data ◽ Data Sets ◽ Decision Tree Classifier ◽ Vast Amount ◽ Tree Classifier

Data mining is the extraction of information and its roles from vast amounts of data, and it is one of the most important topics today. Massive amounts of data are generated and stored every day, and this data holds useful information in many fields, which attracts the attention of programmers and engineers. The decision tree is one of the primary classification algorithms in data mining. Decision tree techniques have several advantages but also drawbacks; one of the main drawbacks is that the data must reside in main memory. SPRINT is a decision tree classifier that proposed a fix for this problem. In this paper, we develop a new parallel decision tree classifier that builds on the SPRINT results. Our experimental results show considerable improvements in runtime and memory requirements compared to the SPRINT classifier. The proposed classifier can be implemented in both serial and parallel environments and can handle big data.

A Parallel Weighted Decision Tree Classifier for Complex Spatial Landslide Analysis: Big Data Computation Approach

International Journal of Computer Applications ◽ 10.5120/ijca2015905346 ◽ 2015 ◽ Vol 124 (2) ◽ pp. 5-9 ◽ Cited By ~ 1

Author(s): P. Anbalagan ◽ R.M. Chandrasekaran

Keyword(s): Big Data ◽ Decision Tree ◽ Decision Tree Classifier ◽ Tree Classifier

Improving the Performance of a Proxy Cache Using Very Fast Decision Tree Classifier

Procedia Computer Science ◽ 10.1016/j.procs.2015.04.186 ◽ 2015 ◽ Vol 48 ◽ pp. 304-312 ◽ Cited By ~ 6

Author(s): P. Julian Benadit ◽ F. Sagayaraj Francis

Keyword(s): Decision Tree ◽ Decision Tree Classifier ◽ Proxy Cache ◽ Tree Classifier ◽ Very Fast Decision Tree ◽ Fast Decision

Comparing learning accuracies of neural nets and decision-tree classifier systems

Proceedings of the 1990 Symposium on Applied Computing ◽ 10.1109/soac.1990.82136 ◽ 2002

Author(s): A.K. Rigler ◽ D.C. St. Clair

Keyword(s): Decision Tree ◽ Neural Nets ◽ Classifier Systems ◽ Decision Tree Classifier ◽ Tree Classifier

CMP: a fast decision tree classifier using multivariate predictions

Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073) ◽ 10.1109/icde.2000.839444 ◽ 2002 ◽ Cited By ~ 10

Author(s): H. Wang ◽ C. Zaniolo

Keyword(s): Decision Tree ◽ Decision Tree Classifier ◽ Tree Classifier ◽ Fast Decision

Comparison of Land Cover Characterization Using EOS MISR and MODIS Data and a Decision Tree Classifier

Geocarto International ◽ 10.1080/10106040608542389 ◽ 2006 ◽ Vol 21 (3) ◽ pp. 19-26 ◽ Cited By ~ 2

Author(s): Limin Yang

Keyword(s): Land Cover ◽ Decision Tree ◽ Decision Tree Classifier ◽ Modis Data ◽ Tree Classifier

Traffic Prediction Using Decision Tree Classifier in Hive Metastore

Lecture Notes on Data Engineering and Communications Technologies - Proceeding of the International Conference on Computer Networks, Big Data and IoT (ICCBI - 2018) ◽ 10.1007/978-3-030-24643-3_68 ◽ 2019 ◽ pp. 571-578

Author(s): D. Suvitha ◽ M. Vijayalakshmi

Keyword(s): Decision Tree ◽ Traffic Prediction ◽ Decision Tree Classifier ◽ Tree Classifier

PERFORMANCE ANALYSIS OF BREAST CANCER CLASSIFICATION USING DECISION TREE CLASSIFIERS

International Journal of Current Pharmaceutical Research ◽ 10.22159/ijcpr.2017v9i2.17383 ◽ 2017 ◽ Vol 9 (2) ◽ pp. 19 ◽ Cited By ~ 6

Author(s): P. Hamsagayathri ◽ P. Sampath

Keyword(s): Breast Cancer ◽ Decision Tree ◽ Ductal Carcinoma ◽ Research Work ◽ The United States ◽ Breast Cancer Dataset ◽ Decision Tree Classifier ◽ Cancer Dataset ◽ Term Survival ◽ Tree Classifier

Breast cancer is one of the dangerous cancers affecting women over 35 years of age worldwide. The breast is made up of lobules that secrete milk and thin milk ducts that carry milk from the lobules to the nipple. Breast cancer mostly occurs in either the lobules or the milk ducts. The most common type is ductal carcinoma, which starts in the ducts and spreads to the lobules and surrounding tissues. According to the medical survey, about 125.0 new cases of breast cancer per 100,000 are diagnosed each year in the United States, and 21.5 per 100,000 women die of the disease. Also, 246,660 new cases of women with cancer are estimated for the year 2016. Early diagnosis of breast cancer is a key factor in the long-term survival of cancer patients. Classification plays an important role in breast cancer detection and is used by researchers to analyse and classify medical data. In this research work, a priority-based decision tree classifier algorithm has been implemented for the Wisconsin Breast Cancer dataset. This paper analyzes different decision tree classifier algorithms on the Wisconsin original, diagnostic, and prognostic datasets using WEKA software. The performance of the classifiers is evaluated against parameters such as accuracy, Kappa statistic, entropy, RMSE, TP rate, FP rate, precision, recall, F-measure, ROC, specificity, and sensitivity.

Unified framework for triaxial accelerometer‐based fall event detection and classification using cumulants and hierarchical decision tree classifier

Healthcare Technology Letters ◽ 10.1049/htl.2015.0018 ◽ 2015 ◽ Vol 2 (4) ◽ pp. 101-107 ◽ Cited By ~ 9

Author(s): Satya Samyukta Kambhampati ◽ Vishal Singh ◽ M. Sabarimalai Manikandan ◽ Barathram Ramkumar

Keyword(s): Decision Tree ◽ Event Detection ◽ Triaxial Accelerometer ◽ Decision Tree Classifier ◽ Unified Framework ◽ Tree Classifier ◽ Hierarchical Decision

Rule Induction of Automotive Historic Styles Using Decision Tree Classifier

Advances in Computational Collective Intelligence - Communications in Computer and Information Science ◽ 10.1007/978-3-030-63119-2_1 ◽ 2020 ◽ pp. 3-14

Author(s): Hung-Hsiang Wang ◽ Chih-Ping Chen

Keyword(s): Decision Tree ◽ Rule Induction ◽ Decision Tree Classifier ◽ Tree Classifier