Interactive Decision Tree Learning and Decision Rule Extraction Based on the ImbTreeEntropy and ImbTreeAUC Packages

Processes ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 1107
Author(s):  
Krzysztof Gajowniczek ◽  
Tomasz Ząbkowski

This paper presents two new R packages, ImbTreeEntropy and ImbTreeAUC, for building decision trees, including their interactive construction and analysis, a feature valued by field experts who want to be involved in the learning process. ImbTreeEntropy applies generalized entropy functions, such as Rényi, Tsallis, Sharma-Mittal, Sharma-Taneja and Kapur, to measure the impurity of a node. ImbTreeAUC provides non-standard measures to choose an optimal split point for an attribute (as well as the optimal attribute for splitting) by employing local, semi-global and global AUC measures. The contribution of both packages is that, thanks to interactive learning, the user can construct a new tree from scratch or, if required, decide on the optimal split in ambiguous situations during the learning phase, taking into account each attribute and its cut-off. The main difference from existing solutions is that our packages provide mechanisms for analyzing the structures of trees (several trees simultaneously) built after growing and/or pruning. Both packages support cost-sensitive learning by defining a misclassification cost matrix, as well as weight-sensitive learning. Additionally, the tree structure of the model can be represented as a rule-based model, along with various quality measures, such as support, confidence, lift, conviction, addedValue, cosine, Jaccard and Laplace.
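To make the rule quality measures concrete, the following is a minimal Python sketch of four of them (support, confidence, lift and conviction) computed from raw counts for a single rule "IF antecedent THEN consequent". It uses the common association-rule definitions; the function name `rule_measures`, its parameters and the counts are illustrative assumptions, and the packages themselves (written in R) may compute these measures differently.

```python
def rule_measures(n_total, n_antecedent, n_consequent, n_both):
    """Quality measures for a rule 'IF antecedent THEN consequent'.

    n_total      -- number of observations (hypothetical count)
    n_antecedent -- observations that satisfy the rule's conditions
    n_consequent -- observations that belong to the predicted class
    n_both       -- observations that satisfy both
    """
    support = n_both / n_total                # P(antecedent and consequent)
    confidence = n_both / n_antecedent        # P(consequent | antecedent)
    p_consequent = n_consequent / n_total     # P(consequent)
    lift = confidence / p_consequent
    # Conviction is unbounded when the rule is never wrong.
    if confidence == 1.0:
        conviction = float("inf")
    else:
        conviction = (1.0 - p_consequent) / (1.0 - confidence)
    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "conviction": conviction,
    }


# Illustrative counts only, not taken from the paper.
measures = rule_measures(n_total=100, n_antecedent=40, n_consequent=50, n_both=30)
print(measures)
```

A lift above 1 indicates that the rule's conditions make the predicted class more likely than its base rate, which is one way a field expert might rank the rules extracted from a tree.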

Electronics ◽  
2021 ◽  
Vol 10 (6) ◽  
pp. 657
Author(s):  
Krzysztof Gajowniczek ◽  
Tomasz Ząbkowski

This paper presents two R packages, ImbTreeEntropy and ImbTreeAUC, to handle imbalanced data problems. ImbTreeEntropy applies generalized entropy functions, such as Rényi, Tsallis, Sharma–Mittal, Sharma–Taneja and Kapur, to measure the impurity of a node. ImbTreeAUC provides non-standard measures to choose an optimal split point for an attribute (as well as the optimal attribute for splitting) by employing local, semi-global and global AUC (Area Under the ROC Curve) measures. Both packages are applicable to binary and multiclass problems, and they support cost-sensitive learning, by defining a misclassification cost matrix, and weight-sensitive learning. The packages accept all types of attributes, including continuous, ordered and nominal, where the latter type is simplified for multiclass problems to reduce the computational overhead. Both packages enable optimization of the thresholds at which posterior probabilities determine final class labels, so that misclassification costs are minimized. Model overfitting can be managed either during the growing phase or at the end using post-pruning. The packages are mainly implemented in R; however, some computationally demanding functions are written in plain C++. To speed up learning, parallel processing is supported as well.
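As an illustration of generalized entropies used as node impurity, here is a minimal Python sketch of the Rényi and Tsallis entropies of a node's class distribution. It follows the textbook definitions with the natural logarithm; the function names are assumptions made for this example, and the packages' own R/C++ implementations (including their choice of logarithm base and parameter handling) may differ.

```python
import math


def shannon(p):
    """Shannon entropy (natural log) of a class-probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)


def renyi(p, alpha):
    """Renyi entropy of order alpha; reduces to Shannon as alpha -> 1."""
    if abs(alpha - 1.0) < 1e-12:
        return shannon(p)
    return math.log(sum(x ** alpha for x in p if x > 0)) / (1.0 - alpha)


def tsallis(p, q):
    """Tsallis entropy of order q; reduces to Shannon as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return shannon(p)
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)


# Class proportions in a balanced node and in an imbalanced node.
balanced = [0.5, 0.5]
imbalanced = [0.9, 0.1]

for name, p in (("balanced", balanced), ("imbalanced", imbalanced)):
    print(name, shannon(p), renyi(p, 2.0), tsallis(p, 2.0))
```

With q = 2 the Tsallis entropy equals the Gini impurity, so varying the order parameter changes how strongly the impurity measure responds to the minority class.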


Author(s):  
G Deena ◽  
K Raja ◽  
K Kannan

In this competitive world, education has become part of everyday life. Imparting knowledge to the learner through education is the core idea of the Teaching-Learning Process (TLP). Assessment is one way to identify the learner’s weak spots in the subject under discussion, and assessment questions play a major role in judging the learner’s skill. Manually prepared questions are not guaranteed to assess the learner’s cognitive skill well or fairly. Question generation is the most important part of the teaching-learning process, and generating the test questions is also its hardest part. Methods: An Automatic Question Generation (AQG) system is proposed that dynamically generates assessment questions from an input file. Objective: The proposed system generates test questions mapped to Bloom’s taxonomy to determine the learner’s cognitive level. Cloze-type questions are generated using part-of-speech tagging and a random function. Rule-based approaches and Natural Language Processing (NLP) techniques are used to generate procedural questions at the lowest cognitive levels of Bloom’s taxonomy. Analysis: The outputs are dynamic, so a different set of questions is created at each execution. The input paragraphs are selected from the computer science domain, and the efficiency of the output is measured using precision and recall.
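The cloze-question step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' system: the input is assumed to be already part-of-speech tagged (here with Penn Treebank tags typed in by hand), the choice of nouns as blank candidates is an assumption, and `make_cloze` is a hypothetical helper name.

```python
import random


def make_cloze(tagged_tokens, target_tags=("NN", "NNS", "NNP"), seed=None):
    """Blank out one randomly chosen token whose tag is in target_tags.

    tagged_tokens -- list of (word, tag) pairs, assumed to come from a
                     part-of-speech tagger run beforehand
    Returns (question, answer), or None if no token qualifies.
    """
    rng = random.Random(seed)  # a seed makes the choice reproducible
    candidates = [i for i, (_, tag) in enumerate(tagged_tokens) if tag in target_tags]
    if not candidates:
        return None
    blank_index = rng.choice(candidates)
    words = [
        "_____" if i == blank_index else word
        for i, (word, _) in enumerate(tagged_tokens)
    ]
    return " ".join(words), tagged_tokens[blank_index][0]


# Hand-tagged example sentence from the computer science domain.
tagged = [
    ("A", "DT"), ("stack", "NN"), ("stores", "VBZ"), ("elements", "NNS"),
    ("in", "IN"), ("last-in", "JJ"), ("first-out", "JJ"), ("order", "NN"),
]
question, answer = make_cloze(tagged, seed=0)
print(question)
print("Answer:", answer)
```

Omitting the seed gives a different blank on each run, which matches the dynamic behaviour described in the abstract.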


2018 ◽  
Vol 38 (2) ◽  
pp. 42-51 ◽  
Author(s):  
José Antonio Hoyo-Montaño ◽  
Jesús Naim Leon-Ortega ◽  
Guillermo Valencia-Palomo ◽  
Rafael Armando Galaz-Bustamante ◽  
Daniel Fernando Espejel-Blanco ◽  
...  

This paper presents the development of a decision tree for classifying loads in a non-intrusive load monitoring (NILM) system implemented on a single-board computer (Raspberry Pi 3). The decision tree uses the total energy of the power signal of a piece of equipment, which is computed using a discrete wavelet transform and Parseval’s theorem. The power consumption data for different types of equipment were obtained from a public-access database for NILM applications. The best split point for the design of the decision tree was determined using the weighted average Gini index. The tree was validated using loads available in the same public-access database.
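Choosing a split point by the weighted average Gini index can be sketched as follows. This is a generic Python illustration, not the authors' implementation: the function names are assumptions, candidate thresholds are taken as midpoints between consecutive distinct values, and the energy values and load labels are made up for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) minimizing the weighted average Gini."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_score = None, float("inf")
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate identical values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        # Each child's impurity is weighted by its share of the samples.
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = (low + high) / 2.0, score
    return best_threshold, best_score


# Made-up signal energies for two kinds of load (illustrative only).
energies = [1.2, 1.5, 1.9, 7.8, 8.4, 9.1]
loads = ["lamp", "lamp", "lamp", "heater", "heater", "heater"]
threshold, score = best_split(energies, loads)
print(threshold, score)
```

A weighted Gini of zero means the threshold separates the two load types perfectly; a tree is grown by applying the same search recursively to each side.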


Author(s):  
Tomoharu Nakashima ◽  
Yasuyuki Yokota ◽  
Hisao Ishibuchi ◽  
Gerald Schaefer ◽  
...  

We evaluate the performance of cost-sensitive fuzzy-rule-based systems for pattern classification problems. We assume that a misclassification cost is given a priori for each training pattern. The task of classification thus becomes to minimize both classification error and misclassification cost. We examine the performance of two types of fuzzy classification based on fuzzy if-then rules generated from training patterns. The difference is whether or not they consider misclassification costs in rule generation. In our computational experiments, we use several specifications of misclassification cost to evaluate the performance of the two classifiers. Experimental results show that both classification error and misclassification cost are reduced by considering the misclassification cost in fuzzy rule generation.
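The two evaluation criteria named above, classification error and misclassification cost with one cost per pattern, can be computed as in the following minimal Python sketch. It covers only the evaluation step, not the fuzzy rule generation; the function name `evaluate`, the labels and the cost values are illustrative assumptions.

```python
def evaluate(true_labels, predicted_labels, costs):
    """Return (error_rate, total_cost) for per-pattern misclassification costs.

    costs[i] is the cost incurred if pattern i is misclassified.
    """
    errors = [t != p for t, p in zip(true_labels, predicted_labels)]
    error_rate = sum(errors) / len(errors)
    total_cost = sum(cost for wrong, cost in zip(errors, costs) if wrong)
    return error_rate, total_cost


# Illustrative data: patterns of class 1 are five times as costly to miss.
true_labels = [0, 0, 1, 1]
costs = [1.0, 1.0, 5.0, 5.0]

predictions_a = [0, 1, 1, 1]  # one cheap mistake
predictions_b = [0, 0, 1, 0]  # one expensive mistake

result_a = evaluate(true_labels, predictions_a, costs)
result_b = evaluate(true_labels, predictions_b, costs)
print(result_a, result_b)
```

The two sets of predictions have the same error rate but different total costs, which is why a classifier that accounts for cost during rule generation can be preferable even when accuracy is unchanged.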

