Mutual information-based multi-output tree learning algorithm

A tree model with low time complexity can support the application of artificial intelligence to industrial systems. Variable selection-based tree learning algorithms are more time-efficient than existing Classification and Regression Tree (CART) algorithms. To the best of our knowledge, no previous work has addressed categorical input variables in variable selection-based multi-output tree learning. Moreover, for multi-output regression trees, conventional variable selection-based algorithms are not suitable for large datasets. We propose a mutual information-based multi-output tree learning algorithm that consists of variable selection and split optimization. The proposed method discretizes each variable into 2–4 clusters using k-means and selects the splitting variable by computing mutual information on the discretized variables. This variable selection component has relatively low time complexity and can be applied regardless of output dimension and type. The proposed split optimization component is more efficient than an exhaustive search. The performance of the proposed tree learning algorithm is similar to or better than that of a multi-output version of the CART algorithm on a specific dataset. In addition, on a large dataset, the time complexity of the proposed algorithm is significantly lower than that of a CART algorithm.
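The variable selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the deterministic k-means initialisation, the fixed `k`, and the assumption that the outputs are already discrete labels (a tuple of labels per sample would cover several outputs) are all choices made here for illustration.

```python
import math
from collections import Counter


def kmeans_1d(values, k, iters=50):
    """Cluster scalar values into k groups with Lloyd's algorithm; return labels."""
    ordered = sorted(values)
    # Assumed initialisation: spread the starting centres over the sorted values.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign every value to its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # Keep the old centre if a cluster ends up empty.
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:
            break
        centres = new_centres
    return labels


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equally long label sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (nxy / n) * math.log(nxy * n / (px[x] * py[y]))
        for (x, y), nxy in pxy.items()
    )


def select_split_variable(columns, target_labels, k=2):
    """Return the column whose discretized form shares the most information with the target."""
    scores = {
        name: mutual_information(kmeans_1d(col, k), target_labels)
        for name, col in columns.items()
    }
    return max(scores, key=scores.get), scores
```

Because each candidate variable is reduced to a handful of cluster labels before scoring, the cost of scoring a variable depends on the number of samples and clusters rather than on the number of candidate split points, which is one plausible reading of why the abstract reports low time complexity for this step.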


A Novel Backward Stepwise Logistic Regression and Classification and Regression Tree Model to Predict 180-day Clinical Outcomes in Hepatitis B Virus-Acute-on-Chronic Liver Failure Patients

Journal of Clinical and Translational Hepatology ◽

10.14218/jcth.2021.00240 ◽

2021 ◽

Vol 000 (000) ◽

pp. 000-000

Author(s):

Shima Ghavimi

Keyword(s):

Logistic Regression ◽

Hepatitis B Virus ◽

Regression Tree ◽

Classification And Regression Tree ◽

Stepwise Logistic Regression ◽

Chronic Liver Failure ◽

Tree Model ◽

B Virus ◽

Classification And Regression ◽

Regression Tree Model


Species Diversity and Community Assembly of Cladocera in the Sand Ponds of the Ulan Buh Desert, Inner Mongolia of China

Diversity ◽

10.3390/d13100502 ◽

2021 ◽

Vol 13 (10) ◽

pp. 502

Author(s):

Yang-Liang Gu ◽

Qi Huang ◽

Lei Xu ◽

Eric Zeus Rizo ◽

Miguel Alonso ◽

...

Keyword(s):

Species Richness ◽

Species Diversity ◽

Community Assembly ◽

Inner Mongolia ◽

Regression Tree ◽

Environmental Variable ◽

Classification And Regression Tree ◽

Tree Model ◽

Environmental Selection ◽

Ulan Buh Desert

In deserts, pond cladocerans face harsh conditions such as low and erratic rainfall, high evaporation, and highly variable salinity, and their species richness is limited. The few species present can rely on ephippia, or resting eggs, to be dispersed by wind in such habitats. Thus, environmental selection is assumed to play a major role in community assembly, especially at a fine spatial scale. Located in Inner Mongolia, the Ulan Buh Desert has plenty of temporary water bodies and a few permanent lakes filled by groundwater. To determine species diversity and the role of environmental selection in community assembly in such a harsh environment, we sampled 37 sand ponds in June 2012. Fourteen species of Cladocera were found in total, including six pelagic species, eight littoral species, and two benthic species. These cladocerans were mainly temperate and cosmopolitan fauna. Our classification and regression tree model showed that conductivity, dissolved oxygen, and pH were the main factors correlated with species richness in the sand ponds. Spatial analysis using a PCNM model demonstrated a broad-scale spatial structure in the cladoceran communities. Conductivity was the most significant environmental variable explaining cladoceran community variation. Two species, Moina cf. brachiata and Ceriodaphnia reticulata, occurred commonly, with an overlap at intermediate conductivity. Our results therefore support the view that environmental selection plays a major role in structuring cladoceran communities in deserts.


Prediction of Sudden Cardiac Death using Classification and Regression Tree Model with Coalesced based ECG and Clinical Data

2018 3rd International Conference on Communication and Electronics Systems (ICCES) ◽

10.1109/cesys.2018.8723979 ◽

2018 ◽

Author(s):

Sean Pereira ◽

Deepak Karia

Keyword(s):

Sudden Cardiac Death ◽

Clinical Data ◽

Cardiac Death ◽

Regression Tree ◽

Classification And Regression Tree ◽

Tree Model ◽

Classification And Regression ◽

Regression Tree Model


Rainfall Forecasting using the Classification and Regression Tree (CART) Algorithm and Adaptive Synthetic Sampling (Study Case: Bandung Regency)

2019 7th International Conference on Information and Communication Technology (ICoICT) ◽

10.1109/icoict.2019.8835308 ◽

2019 ◽

Cited By ~ 1

Author(s):

Siti Nur Lathifah ◽

Fhira Nhita ◽

Annisa Aditsania ◽

Deni Saepudin

Keyword(s):

Regression Tree ◽

Classification And Regression Tree ◽

Rainfall Forecasting ◽

Study Case ◽

Classification And Regression ◽

Cart Algorithm


Using machine learning techniques to develop prediction models for detecting unpaid credit card customers

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189080 ◽

2020 ◽

Vol 39 (5) ◽

pp. 6073-6087

Author(s):

Meltem Yontar ◽

Özge Hüsniye Namli ◽

Seda Yanik

Keyword(s):

Decision Tree ◽

Credit Card ◽

Banking Sector ◽

Performance Metrics ◽

Prediction Models ◽

Regression Tree ◽

Classification And Regression Tree ◽

Machine Learning Techniques ◽

Support Vector ◽

Cart Algorithm

Customer behavior prediction has recently been gaining importance in the banking sector, as in other sectors. This study proposes a model to predict whether credit card users will pay their debts. Using the proposed model, the risk of non-payment can be predicted and necessary actions can be taken in time. To predict customers' payment status for the following months, we use Artificial Neural Network (ANN), Support Vector Machine (SVM), Classification and Regression Tree (CART), and C4.5, which are widely used artificial intelligence and decision tree algorithms. Our dataset includes 10,713 customer records obtained from a well-known bank in Taiwan. These records consist of customer information such as the amount of credit, gender, education level, marital status, age, past payment records, invoice amount, and amount of credit card payments. We apply cross-validation and hold-out methods to divide our dataset into training and test sets. We then evaluate the algorithms with the proposed performance metrics. We also optimize the parameters of the algorithms to improve prediction performance. The results show that the model built with the CART algorithm, one of the decision tree algorithms, predicts customers' payment status for the next month with high accuracy (about 86%). When the algorithm parameters are optimized, classification accuracy and performance increase.
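The hold-out and cross-validation splits mentioned in this abstract can be sketched as follows. This is only an illustration: the abstract does not state the split ratio, the number of folds, or how samples were shuffled, so `test_fraction`, `k`, `seed`, and all function names below are assumptions.

```python
import random


def holdout_split(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))  # assumed ratio, not from the paper
    return indices[n_test:], indices[:n_test]


def kfold_splits(n_samples, k=5, seed=0):
    """Yield (train, test) index lists; each sample lands in exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

A classifier would be fitted on the `train` indices and scored with `accuracy` on the `test` indices; averaging the score over the folds gives the cross-validated estimate.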
