Cross-Project Change Prediction Using Meta-Heuristic Techniques

2019 ◽  
Vol 10 (1) ◽  
pp. 43-61 ◽  
Author(s):  
Ankita Bansal ◽  
Sourabh Jajoria

Changes in software systems are inevitable. Identifying change-prone modules can help developers focus efforts and resources on them. In this article, the authors conduct various intra-project and cross-project change predictions. They use the distributional characteristics of the datasets to generate rules that can be used for successful change prediction, and they analyze the effectiveness of meta-heuristic decision trees in generating such rules for cross-project change prediction. The employed meta-heuristic algorithms are hybrid decision tree genetic algorithms and oblique decision trees with evolutionary learning. The authors compare the performance of these meta-heuristic algorithms with the C4.5 decision tree model. They observe that the accuracy of the C4.5 decision tree is 73.33%, whereas the accuracies of the hybrid decision tree genetic algorithm and the oblique decision tree are 75.00% and 75.56%, respectively. These values indicate that distributional characteristics are helpful in identifying a suitable training set for cross-project change prediction.
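To illustrate the general idea of using distributional characteristics to pick a training project, the following minimal Python sketch selects, among candidate source projects, the one whose per-feature means and standard deviations are closest to the target project, and then trains a decision tree on it. It is only an illustration of the approach described above: scikit-learn's DecisionTreeClassifier implements CART rather than C4.5 or the meta-heuristic trees studied by the authors, and all data and variable names are invented for this sketch.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def characteristics(X):
    # Simple distributional profile: per-feature mean and standard deviation.
    return np.concatenate([X.mean(axis=0), X.std(axis=0)])

def select_training_project(candidates, X_target):
    # Pick the candidate project whose profile is closest to the target project.
    target_profile = characteristics(X_target)
    distances = [np.linalg.norm(characteristics(X_c) - target_profile)
                 for X_c, _ in candidates]
    return candidates[int(np.argmin(distances))]

rng = np.random.default_rng(1)
# Three hypothetical source projects (module metrics + change-proneness labels).
candidates = [(rng.normal(loc=m, size=(100, 4)), rng.integers(0, 2, size=100))
              for m in (0.0, 2.0, 5.0)]
X_target = rng.normal(loc=2.1, size=(80, 4))   # target project metrics

X_train, y_train = select_training_project(candidates, X_target)
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("predicted change-proneness for first 5 target modules:", model.predict(X_target)[:5])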


2019 ◽  
Vol 2019 (1) ◽  
pp. 266-286 ◽  
Author(s):  
Anselme Tueno ◽  
Florian Kerschbaum ◽  
Stefan Katzenbeisser

Decision trees are widespread machine learning models used for data classification and have many applications in areas such as healthcare, remote diagnostics, spam filtering, etc. In this paper, we address the problem of privately evaluating a decision tree on private data. In this scenario, the server holds a private decision tree model and the client wants to classify its private attribute vector using the server’s private model. The goal is to obtain the classification while preserving the privacy of both the decision tree and the client input. After the computation, only the classification result is revealed to the client, while nothing is revealed to the server. Many existing protocols require a constant number of rounds. However, some of these protocols perform as many comparisons as there are decision nodes in the entire tree, and others transform the whole plaintext decision tree into an oblivious program, resulting in higher communication costs. The main idea of our novel solution is to represent the tree as an array. We then execute only d comparisons, where d is the depth of the tree. Each comparison is performed using a small garbled circuit, which outputs secret shares of the index of the next node. We obtain the inputs to the comparison by obliviously indexing the tree and the attribute vector. We implement oblivious array indexing using either garbled circuits, Oblivious Transfer, or Oblivious RAM (ORAM). Using ORAM, this results in the first protocol with sub-linear cost in the size of the tree. We implemented and evaluated our solution using the different array indexing procedures mentioned above. As a result, we are not only able to provide the first protocol with sub-linear cost for large trees, but also to reduce the communication cost for the large real-world data set “Spambase” from 18 MB to 1.2 MB and the computation time from 17 seconds to less than 1 second in a LAN setting, compared to the best related work.
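For intuition, here is a plaintext Python sketch of the array representation and the depth-bounded traversal described above: the tree is stored as an array of nodes and classification touches only d of them, where d is the tree depth. The cryptographic machinery is deliberately omitted; in the actual protocol every array access would be replaced by oblivious indexing (garbled circuits, Oblivious Transfer, or ORAM) and every comparison by a small garbled circuit. The node layout and field names are assumptions made for this illustration only.

# Node layout (illustrative): (attribute_index, threshold, left_child, right_child, leaf_label)
tree = [
    (0, 5.0, 1, 2, None),                  # index 0: root, compare attribute 0 with 5.0
    (1, 3.0, 3, 4, None),                  # index 1: internal node on attribute 1
    (None, None, None, None, "spam"),      # index 2: leaf
    (None, None, None, None, "ham"),       # index 3: leaf
    (None, None, None, None, "spam"),      # index 4: leaf
]
depth = 2                                  # d: the depth of the tree

def classify(tree, x, depth):
    idx = 0
    for _ in range(depth):                 # exactly d comparisons
        attr, thr, left, right, label = tree[idx]
        if label is not None:              # shorter paths simply stay at the leaf
            break
        idx = left if x[attr] <= thr else right
    return tree[idx][4]

print(classify(tree, [4.0, 2.0], depth))   # -> ham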


Probability estimates produced by decision trees may not be directly useful because of their poor quality, yet good probability estimates are desired in many applications. Many techniques have been proposed for obtaining good probability estimates from decision trees. Two such techniques are identified here: the first is single-tree-based aggregation over mismatched attribute values of instances, and the second is bagging, which is costly and less comprehensible. Therefore, in this paper a single aggregated probability estimation decision tree model is proposed to improve the quality of decision-tree probability estimates, and the performance of the new technique is evaluated using the area under the curve (AUC). The proposed technique computes aggregate scores based on the matched attribute values of test tuples.
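As a point of reference for why raw decision-tree probability estimates are poor, leaf frequencies are often smoothed with a Laplace correction. The sketch below shows that standard baseline only; it is not the aggregation technique proposed in the paper above, and the binary-class leaf counts are illustrative.

def leaf_probability(class_count, leaf_total, n_classes):
    # Laplace-corrected leaf estimate: (k + 1) / (n + C) instead of the raw k / n.
    return (class_count + 1.0) / (leaf_total + n_classes)

# A pure leaf with 8 out of 8 positive instances no longer claims probability 1.0:
print(leaf_probability(8, 8, 2))   # 0.9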


Author(s):  
Bin-Bin Yang ◽  
Song-Qing Shen ◽  
Wei Gao

Decision trees have attracted much attention during the past decades. Previous decision trees include axis-parallel and oblique decision trees; both try to find the best splits via exhaustive search or heuristic algorithms in each iteration. Oblique decision trees generally simplify the tree structure and achieve better performance, but they incur higher computational cost and are usually initialized with the best axis-parallel splits. This work presents the Weighted Oblique Decision Tree (WODT), based on continuous optimization with random initialization. We assign each instance a different weight for the child nodes at every internal node, and then obtain a split by optimizing the continuous and differentiable objective function of weighted information entropy. Extensive experiments show the effectiveness of the proposed algorithm.
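The following is a minimal sketch, not the authors' implementation, of what a differentiable weighted-entropy objective for a single oblique split could look like: a sigmoid of a linear projection softly assigns each instance to the left or right child, and the resulting weighted entropy is minimized from a random initialization. The synthetic data, parameter names, and the use of scipy.optimize are assumptions made for this illustration.

import numpy as np
from scipy.optimize import minimize

def weighted_split_entropy(params, X, y, n_classes):
    w, b = params[:-1], params[-1]
    p_left = 1.0 / (1.0 + np.exp(-(X @ w + b)))    # soft membership in the left child
    objective = 0.0
    for side in (p_left, 1.0 - p_left):            # left child, right child
        total = side.sum() + 1e-12
        entropy = 0.0
        for c in range(n_classes):
            pc = side[y == c].sum() / total        # weighted class proportion
            if pc > 0:
                entropy -= pc * np.log2(pc)
        objective += (total / len(y)) * entropy    # weight child entropy by its mass
    return objective

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
x0 = rng.normal(size=X.shape[1] + 1)               # random initialization of the split
result = minimize(weighted_split_entropy, x0, args=(X, y, 2), method="Nelder-Mead")
print("optimized oblique split parameters:", result.x)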


2017 ◽  
Author(s):  
Robbi Rahim ◽  
Efori Buulolo ◽  
Natalia Silalahi ◽  
Fadlina

One of the impacts of an earthquake is heavy damage, and the tsunamis it can trigger kill no fewer people. One reason for the many deaths is that the impact of earthquakes is difficult to predict. Data on earthquakes that occurred earlier can be used to predict earthquakes that may happen in the future. One algorithm that can be used for such prediction is C4.5. The result of the C4.5 algorithm is a decision tree whose branches describe the characteristics or conditions of an earthquake and whose leaves give the decision, where the decision is the outcome of modeling the earthquakes that occurred.


Author(s):  
Sujuan Jia ◽  
Yajing Pang

Vast amounts of data in the higher education system are used to analyse and evaluate teaching quality, so that the key factors that affect the quality of teaching can be predicted. In addition, the learner's personalized behaviour can also serve as a data source for predicting teaching results. This paper proposes a decision tree model that takes the teaching quality data and the statistical analysis results of the learner's personalized behaviour as inputs. The model is based on an improved C4.5 decision tree algorithm, which uses the Fayyad boundary point decision theorem to effectively reduce the computation time spent on selecting split thresholds. In this algorithm, an iterative analysis mechanism is introduced in combination with changes in the data on the learner's personalized behaviour, so as to dynamically adjust the final teaching evaluation result. Finally, based on the actual statistical data of one academic year, the teaching quality evaluation was effectively completed and a direction for future teaching prediction was proposed.
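The boundary point result mentioned above (due to Fayyad and Irani) states that an entropy-minimizing cut point for a continuous attribute always lies between two adjacent examples of different classes, so only those midpoints need to be evaluated. A minimal Python sketch of this candidate reduction, with illustrative data, is given below; it shows the idea only and is not the paper's implementation.

def boundary_point_thresholds(values, labels):
    # Return candidate cut points that lie between adjacent examples of different classes.
    pairs = sorted(zip(values, labels))
    candidates = []
    for (v1, y1), (v2, y2) in zip(pairs, pairs[1:]):
        if y1 != y2 and v1 != v2:          # class change between adjacent distinct values
            candidates.append((v1 + v2) / 2.0)
    return candidates

print(boundary_point_thresholds([1.0, 2.0, 3.0, 4.0, 5.0], ['A', 'A', 'B', 'B', 'A']))
# -> [2.5, 4.5] instead of all four midpoints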


2018 ◽  
Vol 10 (3) ◽  
pp. 106
Author(s):  
Mirza Suljic ◽  
Edin Osmanbegovic ◽  
Željko Dobrović

The subject of this paper is metamodeling and its application in the field of scientific research. The main goal is to explore the possibilities of integrating two methods: questionnaires and decision trees. The questionnaire method is established as one of the methods for collecting data, while the decision tree method represents an alternative way of presenting and analyzing decision-making situations. These two methods are not completely independent; on the contrary, there is a strong natural bond between them. Therefore, the result is a common meta-model that, through shared concepts and the use of metamodeling, connects the two methods: questionnaires and decision trees. The obtained results can be used to create a CASE tool or a repository suitable for exchange between different systems. The proposed meta-model is not necessarily the final product; it could be further developed by adding more entities that capture additional data.


2019 ◽  
Vol 5 (1) ◽  
pp. 23-28
Author(s):  
Astrid Noviriandini ◽  
Nurajijah Nurajijah

This research helps students and teachers anticipate problems early in the learning period in order to obtain maximum learning outcomes. The methods used are the C4.5 decision tree algorithm and the Naïve Bayes algorithm. The purpose of this study was to compare and evaluate the C4.5 decision tree model and Naïve Bayes to find out which algorithm has higher accuracy in predicting student achievement. Learning achievement can be measured by report card grades. After comparing the two algorithms, the learning achievement predictions were obtained. The results showed that the Naïve Bayes algorithm had an accuracy of 95.67% and an AUC of 0.999, which falls into the Excellent Classification range, while the C4.5 algorithm had an accuracy of 90.91% and an AUC of 0.639, which falls into the Poor Classification range. Thus, the Naïve Bayes algorithm can better predict student achievement.
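A hedged sketch of this kind of comparison using scikit-learn is shown below. Note that scikit-learn's DecisionTreeClassifier implements CART rather than C4.5, and the synthetic dataset merely stands in for the report-card data used in the study, so the printed numbers will not match the reported results.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, roc_auc_score

# Placeholder data standing in for student report-card features and achievement labels.
X, y = make_classification(n_samples=400, n_features=8, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

for name, model in [("Naive Bayes", GaussianNB()),
                    ("Decision tree (CART)", DecisionTreeClassifier(random_state=42))]:
    model.fit(X_tr, y_tr)
    scores = model.predict_proba(X_te)[:, 1]
    print(name,
          "accuracy =", round(accuracy_score(y_te, model.predict(X_te)), 4),
          "AUC =", round(roc_auc_score(y_te, scores), 4))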

