Decision Trees
Recently Published Documents


TOTAL DOCUMENTS

3675
(FIVE YEARS 856)

H-INDEX

82
(FIVE YEARS 13)

2022 ◽  
Vol 34 (4) ◽  
pp. 1-17
Author(s):  
Yunhong Xu ◽  
Guangyu Wu ◽  
Yu Chen

Online medical communities have revolutionized the way patients obtain medical information and services. Investigating which factors influence patients' satisfaction with doctors, and predicting that satisfaction, can help patients narrow down their choices and increase their loyalty towards online medical communities. Considering the imbalanced nature of the dataset collected from Good Doctor, we integrated the XGBoost and SMOTE algorithms to examine which factors influence patient satisfaction and how these factors can be used to predict it. The SMOTE algorithm addresses the imbalance by oversampling the minority class of an imbalanced classification dataset, and XGBoost is an ensemble-of-decision-trees algorithm in which new trees correct the errors of existing trees. The experimental results demonstrate that the combination of SMOTE and XGBoost achieves better performance. We further analyzed the role features play in satisfaction prediction at two levels: the individual feature level and the feature combination level.
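
As a rough illustration of the pipeline this abstract describes, the sketch below oversamples an imbalanced binary dataset with SMOTE and fits an XGBoost classifier. The synthetic data, feature count, and hyperparameters are placeholders, not the paper's Good Doctor dataset or settings.

```python
# Minimal sketch of a SMOTE + XGBoost pipeline on a synthetic stand-in
# dataset (NOT the Good Doctor data used in the paper).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from imblearn.over_sampling import SMOTE
from xgboost import XGBClassifier

# Imbalanced binary classification problem (roughly a 9:1 class ratio).
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.9, 0.1], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Oversample only the training split so the test set keeps the true ratio.
X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

# Gradient-boosted decision trees: each new tree fits the errors of the
# ensemble built so far.
model = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1,
                      eval_metric="logloss")
model.fit(X_res, y_res)

print("F1 on held-out data:", f1_score(y_test, model.predict(X_test)))
# Individual feature contributions, loosely analogous to the paper's
# feature-level analysis.
print(model.feature_importances_)
```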


Entropy ◽  
2022 ◽  
Vol 24 (1) ◽  
pp. 116
Author(s):  
Mikhail Moshkov

In this paper, based on results from rough set theory, test theory, and exact learning, we investigate decision trees over infinite sets of binary attributes represented as infinite binary information systems. We define the notion of a problem over an information system and study three functions of the Shannon type, which characterize, in the worst case, how the minimum depth of a decision tree solving a problem depends on the number of attributes in the problem description. The three functions correspond to (i) decision trees using attributes, (ii) decision trees using hypotheses (an analog of equivalence queries from exact learning), and (iii) decision trees using both attributes and hypotheses. The first function has two possible types of behavior: logarithmic and linear (this result follows from more general results published by the author earlier). The second and third functions have three possible types of behavior: constant, logarithmic, and linear (these results were published by the author earlier without the proofs, which are given in the present paper). Based on the obtained results, we divide the set of all infinite binary information systems into four complexity classes. Within each class, the type of behavior of each of the three functions does not change.
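
For concreteness, here is one plausible formalization of such a Shannon-type function, in our own notation (the paper's exact definitions may differ): writing $h^{a}_{U}(z)$ for the minimum depth of a decision tree using attributes that solves a problem $z$ over an information system $U$,

```latex
% Worst-case minimum depth over all problems described by at most n
% attributes (notation ours, inferred from the abstract, not quoted).
h^{a}_{U}(n) = \max \{\, h^{a}_{U}(z) : z \text{ is a problem over } U
  \text{ described by at most } n \text{ attributes} \,\}
```

with analogous functions for trees using hypotheses and for trees using both. The abstract's dichotomy then says that $h^{a}_{U}(n)$ grows either as $\Theta(\log n)$ or as $\Theta(n)$, while the other two functions may additionally be bounded by a constant.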


2022 ◽  
Author(s):  
Géraldin Nanfack ◽  
Paul Temple ◽  
Benoît Frénay

Decision trees have the particularity of being machine learning models that are visually easy to interpret and understand. They are therefore well suited for sensitive domains like medical diagnosis, where decisions need to be explainable. However, when used on complex problems, decision trees can become large, making them hard to grasp. Beyond this aspect, when learning decision trees it may be necessary to consider a broader class of constraints, such as the requirement that two particular variables never be used in the same branch of the tree. This motivates the need to enforce constraints in decision tree learning algorithms. We propose a survey of works that attempt to solve the problem of learning decision trees under constraints. Our contributions are fourfold. First, to the best of our knowledge, this is the first survey dealing with constraints on decision trees. Second, we define a flexible taxonomy of constraints applied to decision trees and of the methods for their treatment in the literature. Third, we benchmark state-of-the-art depth-constrained decision tree learners with respect to predictive accuracy and computational time. Fourth, we discuss potential future research directions of interest for researchers who wish to work in this field.
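
As a simple reference point for the depth-constraint setting the survey benchmarks, the sketch below fits a greedy CART learner under a hard depth limit and reports the accuracy trade-off. It is a baseline stand-in, not one of the optimal depth-constrained learners the survey compares, and the dataset and depth values are illustrative.

```python
# Minimal sketch of learning a decision tree under a depth constraint:
# greedy CART with max_depth, a baseline stand-in for the optimal
# depth-constrained learners benchmarked in the survey.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# The trade-off the survey measures: predictive accuracy vs. tree depth.
for depth in (2, 3, 4, None):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    acc = cross_val_score(tree, X, y, cv=5).mean()
    print(f"max_depth={depth}: mean CV accuracy = {acc:.3f}")
```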


Author(s):  
Oren Fivel ◽  
Moshe Klein ◽  
Oded Maimon

In this paper we develop the foundation of a new theory for decision trees based on modeling phenomena with soft numbers. Soft numbers represent the theory of soft logic, which addresses the need to combine real processes and cognitive ones in the same framework. At the same time, soft logic develops a new concept for modeling and dealing with uncertainty: the uncertainty of time and space. It is a language that can speak in two reference frames, and it also suggests a way to combine them. In classical probability, for continuous random variables there is no distinction between probabilities involving strict and non-strict inequalities. Moreover, a probability involving equality collapses to zero, without distinguishing among the values that we would like the random variable to take for comparison. This work presents soft probability, obtained by incorporating soft numbers into probability theory. Soft numbers are a new set of numbers that are linear combinations of multiples of "ones" and multiples of "zeros". In this work, we develop a probability involving equality as a "soft zero" multiple of a probability density function (PDF). We also extend this notion of soft probability to the classical definitions of complements, unions, intersections, and conditional probabilities, as well as to the expectation, variance, and entropy of a continuous random variable conditioned on being in a union of disjoint intervals and a discrete set of numbers. This extension provides information about a continuous random variable being within a discrete set of numbers, such that its probability does not collapse completely to zero. When we developed the notion of soft entropy, we found potentially another soft axis, multiples of 0log(0), which motivates exploring the properties and applications of these new numbers. We extend the notion of soft entropy to the definitions of cross entropy and Kullback–Leibler divergence (KLD), and we find that the soft KLD is a soft number that does not contain a multiple of 0log(0). Based on the soft KLD, we define a soft mutual information, which can be used as a splitting criterion in decision trees for datasets of continuous random variables consisting of single samples and intervals.
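
One way to read the central construction, in our own hypothetical notation with $\bar{0}$ denoting a soft zero and $f$ the PDF of a continuous random variable $X$ (the authors' notation may well differ):

```latex
% A sketch of soft probability at a point (notation ours, not the
% authors'): equality events carry a soft-zero multiple of the density
% rather than collapsing to a hard zero.
P(X = a) = f(a)\,\bar{0},
\qquad
P(X \le a) = P(X < a) + f(a)\,\bar{0}
```

Under this reading, strict and non-strict inequalities differ by a soft-zero term that still remembers the density at the boundary point, which is exactly the distinction classical probability erases.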


Author(s):  
Chaitanya Manapragada ◽  
Heitor M. Gomes ◽  
Mahsa Salehi ◽  
Albert Bifet ◽  
Geoffrey I. Webb

2022 ◽  
pp. 251-275
Author(s):  
Edgar Cossio Franco ◽  
Jorge Alberto Delgado Cazarez ◽  
Carlos Alberto Ochoa Ortiz Zezzatti

The objective of this chapter is to implement an intelligent machine-learning model for applying macro-ergonomic methods to human resources processes based on the ISO 12207 standard. To achieve this objective, an algorithm written in the Java language is constructed to select the best prospect for a given position. Machine learning is done through decision trees with the J48 algorithm. Among the findings, the model is shown to be useful in identifying the best profiles for a given position, optimizing the time spent in the selection process and on human resources, as well as reducing work stress.
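
The chapter's learner, J48, is Weka's Java implementation of C4.5. As a rough, hypothetical analog, the sketch below trains an entropy-criterion decision tree in scikit-learn on invented candidate features to rank applicants for a position; the feature names, data, and labeling rule are all made up for illustration and are not the chapter's model.

```python
# Hypothetical analog of a J48 (C4.5) candidate-selection tree, using
# scikit-learn's entropy criterion. Features and data are invented.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
# Columns: years_experience, test_score, interview_score (illustrative).
X = rng.uniform([0, 0, 0], [15, 100, 10], size=(200, 3))
# Toy rule standing in for historical hiring decisions.
y = ((X[:, 1] > 60) & (X[:, 2] > 5)).astype(int)

tree = DecisionTreeClassifier(criterion="entropy", max_depth=3,
                              random_state=0)
tree.fit(X, y)

# The learned rules stay human-readable, which is the point of using
# decision trees in a sensitive process like hiring.
print(export_text(tree, feature_names=[
    "years_experience", "test_score", "interview_score"]))

# Score a new applicant profile.
print("hire?", tree.predict([[5, 72, 7.5]]))
```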


Author(s):  
Akinola S. Olayinka ◽  
Charles Oluwaseun Adetunji ◽  
Wilson Nwankwo ◽  
Olaniyan T. Olugbemi ◽  
Tosin C. Olayinka
