An Extended C4.5 Classification Algorithm using Mathematical Series

2019 ◽ Vol 7 (2) ◽ pp. 54-59
Author(s): R. Raja Aswathi ◽ K. Pazhani Kumar ◽ B. Ramakrishnan

C4.5 is an efficient decision-tree-based classification algorithm derived from the ID3 approach. It is also a rule-based classification algorithm. Its main strengths are that it can handle categorical data, overfitting, and missing values. C4.5 outperforms ID3 even with an equal number of attributes. EC4.5 (Exponential C4.5) is an extension of C4.5 that uses the exponential of the split value to predict the gain of attributes and addresses a setback reported in C4.5. However, EC4.5 still misclassifies some data, and a new technique is introduced to avoid this problem. This paper proposes TMC4.5 (Taylor-Madhava C4.5), a technique that reduces uncertainty in classification by integrating the exponential split value of EC4.5 with a sine split value derived from the Madhava series. The technique yields an optimized gain value that reduces uncertainty. The results show that TMC4.5 performs far better than the C4.5 and EC4.5 algorithms.
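For orientation, the gain that the abstract refers to can be illustrated with the standard C4.5 gain ratio. The following is a minimal sketch for a single categorical attribute. The function names `entropy` and `gain_ratio` are illustrative, and the exponential (EC4.5) and sine (TMC4.5) modifications are not reproduced, since the abstract does not give their formulas.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the value distribution in a list."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Standard C4.5 gain ratio of one categorical attribute.

    values[i] is the attribute value of record i and labels[i] is its class.
    """
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy remaining after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information: entropy of the partition sizes themselves.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0
```

An attribute that separates the classes perfectly into two equal halves scores 1.0, while an attribute with a single value scores 0.0 because its split information is zero.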

2015 ◽ Vol 2015 ◽ pp. 1-8
Author(s): Shah Nazir ◽ Sara Shahzad ◽ Sher Afzal Khan ◽ Norma Binti Alias ◽ Sajid Anwar

A software birthmark is a unique characteristic of a program that can be used to detect software theft. Comparing the birthmarks of two programs can reveal whether one is a copy of the other. Software theft and piracy, that is, copying, stealing, and misusing software without the permission specified in its license agreement, are rapidly growing problems. Estimating a birthmark can play a key role in understanding its effectiveness. In this paper, a new technique is presented to evaluate and estimate a software birthmark based on the two most sought-after properties of birthmarks: credibility and resilience. For this purpose, soft computing concepts such as probabilistic and fuzzy computing are taken into account, and fuzzy logic is used to estimate the properties of the birthmark. The proposed fuzzy-rule-based technique is validated through a case study, and the results show that it successfully assesses the specified properties of the birthmark, namely its resilience and credibility. This, in turn, shows how much effort will be required to establish the originality of the software based on its birthmark.


Author(s): Heni Sulistiani ◽ Ahmad Ari Aldino

During the pandemic, almost everyone has struggled to make a living, and college students are no exception. Many have difficulty paying tuition fees to continue their studies. In response, Universitas Teknokrat Indonesia offers a tuition fee aid program to students with good academic performance. Because many variables are used to determine who receives a grant, decisions are hard to make quickly and can take a very long time. A classification model is needed to make it easier for management to decide which students should receive a grant. The purpose of this study is to classify grant recipients using the C4.5 decision tree algorithm, which determines whether a prospective student should be accepted as an awardee. The classification results are validated with ten-fold cross-validation, giving accuracy, precision, and recall of 87% each. This means the model performs well enough to be implemented in a system.
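The validation step mentioned above can be sketched as follows. This is a minimal illustration rather than the study's own procedure: `k_fold_indices` and `binary_metrics` are hypothetical helper names, the interleaved fold assignment is an assumption, and the positive class label `"yes"` is a placeholder.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Samples are assigned to folds in an interleaved fashion (an assumption;
    shuffled or stratified assignment is equally common).
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for held_out, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train_idx, test_idx


def binary_metrics(y_true, y_pred, positive="yes"):
    """Return (accuracy, precision, recall) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall
```

In each of the ten rounds, a classifier would be trained on `train_idx`, evaluated on `test_idx`, and the three scores averaged over the rounds.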


Author(s): Hananda Hafizan ◽ Anggita Nadia Putri

One of Indonesia's health problems is the nutritional status of children under five. Malnutrition is not only a family problem but also a national one. The nutritional status of children under five can be assessed by measuring the body, a practice known as anthropometry. Anthropometric examinations and measurements to determine a child's nutritional status are available at public health service centers such as a Posyandu. We visited the KENANGA Posyandu in Wonorejo, Kerasaan sub-district, Simalungun district. The purpose of this study is to test a model that classifies the nutritional status of children under five according to the WHO-2005 reference standard, using data mining with the C4.5 decision tree algorithm.


2020 ◽ Vol 2020 ◽ pp. 1-9
Author(s): Fei Yang ◽ Jiazhi Du ◽ Jiying Lang ◽ Weigang Lu ◽ Lei Liu ◽ ...

Electrocardiogram (ECG) signals are critical to the classification of cardiac arrhythmia with machine learning methods. In practice, ECG datasets usually contain multiple missing values due to faults or distortion. Unfortunately, many established classification algorithms require a complete matrix as input. It is therefore necessary to impute the missing data to make classification more effective on datasets with a few missing values. In this paper, we compare the main methods for estimating missing values in electrocardiogram data, namely the “Zero method”, “Mean method”, “PCA-based method”, and “RPCA-based method”, and then propose a novel KNN-based classification algorithm, a modified kernel Difference-Weighted KNN classifier (MKDF-WKNN), which is suited to imbalanced datasets. Experimental results on the UCI database indicate that the “RPCA-based method” can successfully handle missing values in the arrhythmia dataset no matter how many values are missing, and that the proposed MKDF-WKNN classifier is superior to other state-of-the-art algorithms such as KNN, DS-WKNN, DF-WKNN, and KDF-WKNN on imbalanced datasets, where the imbalance affects classification accuracy.
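The two simplest imputation baselines named above can be sketched as follows. This assumes that the “Zero method” fills gaps with 0 and the “Mean method” fills them with the column mean of the observed values, which is the usual reading but is not spelled out in the abstract. The function name `impute` and the use of `None` to mark a missing entry are illustrative choices. The PCA- and RPCA-based methods and the MKDF-WKNN classifier are not reproduced here.

```python
def impute(rows, method="mean"):
    """Return a copy of `rows` with missing entries (None) filled in.

    method="zero" replaces every missing entry with 0.0.
    method="mean" replaces it with the mean of the observed values in the
    same column (falling back to 0.0 if the column has no observed values).
    """
    if method not in ("zero", "mean"):
        raise ValueError("method must be 'zero' or 'mean'")
    n_cols = len(rows[0])
    fill = []
    for c in range(n_cols):
        observed = [row[c] for row in rows if row[c] is not None]
        if method == "zero" or not observed:
            fill.append(0.0)
        else:
            fill.append(sum(observed) / len(observed))
    return [[fill[c] if v is None else v for c, v in enumerate(row)] for row in rows]
```

Either variant produces the complete matrix that most classifiers expect as input; the original rows are left unmodified.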


2015 ◽ Vol 30 (2) ◽ pp. 446-454
Author(s): Wei Zhang ◽ Bing Fu ◽ Melinda S. Peng ◽ Tim Li

This study investigates the classification of developing and nondeveloping tropical disturbances in the western North Pacific (WNP) using the C4.5 algorithm. A decision tree built with this algorithm can be used as a tool to predict future tropical cyclone (TC) genesis events. The results show that the maximum 800-hPa relative vorticity, SST, precipitation rate, divergence averaged between the 1000- and 500-hPa levels, and 300-hPa air temperature anomaly are the five most important variables for separating developing from nondeveloping tropical disturbances. The algorithm also reveals the thresholds of the five variables (i.e., 4.2 × 10⁻⁵ s⁻¹ for maximum 800-hPa relative vorticity, 28.2°C for SST, 0.1 mm h⁻¹ for precipitation rate, −0.7 × 10⁻⁶ s⁻¹ for vertically averaged convergence, and 0.5°C for 300-hPa air temperature anomaly). Six rules are derived from the decision tree. The classification accuracy of this decision tree is 81.7% for the 2004–10 cases. The hindcast accuracy for the 2011–13 dataset is 84.6%.


2008 ◽ Vol 17 (05) ◽ pp. 957-971
Author(s): ATAOLLAH EBRAHIMZADEH ◽ ABOLFAZL RANJBAR ◽ MEHRDAD ARDEBLILPOUR

Demand for the classification of communication signals is increasing. In this paper, we present a new technique that identifies a variety of digital communication signal types. The technique uses a radial basis function neural network (RBFN) as the classifier. Swarm intelligence, an evolutionary algorithm, is used to construct the RBFN. A combination of higher-order moments and higher-order cumulants up to the eighth order is selected as the features of the considered digital signal types. In conjunction with the RBFN, we use k-fold cross-validation to improve generalization. Simulation results show that the proposed technique achieves high classification performance for different communication signals even at very low signal-to-noise ratios.
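As a simplified illustration of the feature type described above, the sketch below estimates raw moments and the fourth-order cumulant of a real-valued, zero-mean sample sequence. The paper works with digital communication signals and orders up to eight; this sketch assumes real samples and stops at order four, and the names `moment` and `fourth_order_cumulant` are illustrative.

```python
def moment(samples, order):
    """Sample estimate of the raw moment E[x**order] of a real-valued signal."""
    return sum(x ** order for x in samples) / len(samples)


def fourth_order_cumulant(samples):
    """Fourth-order cumulant of a zero-mean real signal: m4 - 3 * m2**2.

    The zero-mean assumption is not checked here.
    """
    m2 = moment(samples, 2)
    m4 = moment(samples, 4)
    return m4 - 3 * m2 ** 2
```

For an ideal sequence of ±1 symbols, both the second and fourth moments equal 1, so the cumulant is 1 − 3 = −2. Values like this differ between signal types, which is what makes such statistics usable as classifier inputs.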

