Multi-Step Classification Trees

2012 ◽  
Vol 41 (9) ◽  
pp. 1728-1744 ◽  
Author(s):  
Youngjae Chang
Keyword(s):  
1996 ◽  
Vol 6 (3) ◽  
pp. 231-243 ◽  
Author(s):  
Stanislav Keprta

2003 ◽  
Vol 12 (4) ◽  
pp. 340-346 ◽  
Author(s):  
Christian Y. Mardin ◽  
Torsten Hothorn ◽  
Andrea Peters ◽  
Anselm G. Jünemann ◽  
Nhung X. Nguyen ◽  
...  

2018 ◽  
Author(s):  
John Sheldon ◽  
Francisco Alhanati ◽  
Paul Skoczylas

Author(s):  
TAGHI M. KHOSHGOFTAAR ◽  
EDWARD B. ALLEN ◽  
ARCHANA NAIK ◽  
WENDELL D. JONES ◽  
JOHN P. HUDEPOHL

High software reliability is an important attribute of high-assurance systems. Software quality models yield timely predictions of quality indicators on a module-by-module basis, enabling one to focus on finding faults early in development. This paper introduces the Classification And Regression Trees (CART) algorithm to practitioners in high-assurance systems engineering. It also presents practical lessons learned on building classification trees for software quality modeling, including an innovative way to control the balance between misclassification rates. A case study of a very large telecommunications system used CART to build software quality models. The models predicted whether or not modules would have faults discovered by customers, based on various sets of software product and process metrics as independent variables. We found that a model based on two software product metrics had comparable accuracy to a model based on forty product and process metrics.
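The idea of trading one misclassification rate against the other can be illustrated with a minimal sketch. It is not the paper's procedure: it assumes a single hypothetical product metric, a one-split tree chosen by Gini impurity, and a hypothetical `cost_ratio` parameter that weights a missed fault-prone module against a false alarm when a leaf is labeled. The data values are invented for illustration only.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(xs, ys):
    """Return the threshold on one metric that minimizes weighted Gini impurity."""
    best_t, best_score = None, float("inf")
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t


def leaf_label(labels, cost_ratio=1.0):
    """Label a leaf fault-prone (1) when the cost-weighted faulty count
    exceeds the clean count. cost_ratio is an assumed knob: the cost of
    missing a fault-prone module divided by the cost of a false alarm."""
    faulty = sum(labels)
    clean = len(labels) - faulty
    return 1 if faulty * cost_ratio > clean else 0


# Hypothetical module data: one metric value and a fault flag per module.
metric = [1, 2, 3, 4, 5, 6, 7, 8]
faulty = [0, 0, 0, 1, 0, 0, 1, 1]

threshold = best_split(metric, faulty)
left_leaf = [y for x, y in zip(metric, faulty) if x <= threshold]
print(threshold, leaf_label(left_leaf), leaf_label(left_leaf, cost_ratio=6.0))
```

Raising `cost_ratio` makes leaves with few faulty modules flip to fault-prone, which lowers the rate of missed faulty modules at the price of more false alarms.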


2018 ◽  
Vol 82 (7) ◽  
pp. 6980
Author(s):  
Samuel C. Karpen ◽  
Steve C. Ellis

Entropy ◽  
2021 ◽  
Vol 23 (9) ◽  
pp. 1210
Author(s):  
Elzbieta Turska ◽  
Szymon Jurga ◽  
Jaroslaw Piskorski

We apply tree-based classification algorithms, namely classification trees built with the rpart algorithm, random forests, and XGBoost, to detect mood disorder in a group of 2508 lower secondary school students. The dataset presents many challenges, the most important of which are the large amount of missing data and heavy class imbalance (there are few severe mood disorder cases). We find that all algorithms are specific, but only the rpart algorithm is sensitive; i.e., it is able to detect real cases of mood disorder. We conclude that this is because the rpart algorithm uses surrogate variables to handle missing data. The most important social-studies-related result is that adolescents' relationships with their parents are the single most important factor in developing mood disorders, far more important than other factors such as socio-economic status or school success.
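The distinction between a specific and a sensitive classifier on unbalanced data can be made concrete with a short sketch. The labels below are invented toy data, not the study's; the function name `sensitivity_specificity` is a hypothetical helper. It shows that a classifier predicting "no disorder" for everyone is perfectly specific yet detects no real cases.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = case.

    Sensitivity = TP / (TP + FN): share of real cases that are detected.
    Specificity = TN / (TN + FP): share of non-cases correctly cleared.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy, heavily unbalanced sample: 5 cases among 100 individuals.
y_true = [1] * 5 + [0] * 95

# A degenerate classifier that never predicts a case.
always_negative = [0] * 100

# A classifier that finds 4 of the 5 cases and raises 5 false alarms.
some_detection = [1, 1, 1, 1, 0] + [1] * 5 + [0] * 90

print(sensitivity_specificity(y_true, always_negative))
print(sensitivity_specificity(y_true, some_detection))
```

Because overall accuracy is dominated by the majority class in such samples, reporting both rates is what reveals whether the rare cases are being found at all.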

