classification and regression tree Latest Research Papers

Abstract. Background: Inappropriate antibiotic use in lower respiratory tract infections (LRTI) is a major contributor to resistance. We aimed to design an algorithm based on clinical signs and host biomarkers to identify bacterial community-acquired pneumonia (CAP) among patients with LRTI. Methods: Participants with LRTI were selected from a prospective cohort of febrile (≥ 38 °C) adults presenting to outpatient clinics in Dar es Salaam. Participants underwent chest X-ray, multiplex PCR for respiratory pathogens, and measurements of 13 biomarkers. We evaluated the predictive accuracy of clinical signs and biomarkers using logistic regression and classification and regression tree analysis. Results: Of 110 patients with LRTI, 17 had bacterial CAP. Procalcitonin (PCT), interleukin-6 (IL-6) and soluble triggering receptor expressed by myeloid cells-1 (sTREM-1) showed excellent predictive accuracy for identifying bacterial CAP (AUROC 0.88, 95%CI 0.78–0.98; 0.84, 0.72–0.99; 0.83, 0.74–0.92, respectively). Combining respiratory rate with PCT or IL-6 significantly improved the model compared to respiratory rate alone (p = 0.006, p = 0.033, respectively). An algorithm with respiratory rate (≥ 32/min) and PCT (≥ 0.25 μg/L) had 94% sensitivity and 82% specificity. Conclusions: PCT, IL-6 and sTREM-1 had excellent predictive accuracy in differentiating bacterial CAP from other LRTIs. An algorithm combining respiratory rate and PCT displayed even better performance in this sub-Saharan African setting.
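As a rough illustration of how a two-threshold rule like the one in this abstract can be scored, the sketch below computes sensitivity and specificity for a respiratory rate and PCT rule. The abstract does not say how the two cutoffs are combined, so the OR combination here is an assumption, and the function names and the toy patient records are hypothetical and not taken from the study.

```python
from typing import Iterable, Tuple

# (respiratory rate per minute, PCT in ug/L, truly bacterial CAP?)
Record = Tuple[float, float, bool]


def flags_bacterial_cap(resp_rate: float, pct: float,
                        rr_cutoff: float = 32.0,
                        pct_cutoff: float = 0.25) -> bool:
    """Return True when the rule flags a patient as likely bacterial CAP.

    ASSUMPTION: the two thresholds are combined with OR. The abstract only
    reports the cutoffs, not the combination logic.
    """
    return resp_rate >= rr_cutoff or pct >= pct_cutoff


def sensitivity_specificity(records: Iterable[Record]) -> Tuple[float, float]:
    """Count the confusion-matrix cells and derive the two rates."""
    tp = fn = tn = fp = 0
    for resp_rate, pct, is_bacterial in records:
        predicted = flags_bacterial_cap(resp_rate, pct)
        if is_bacterial and predicted:
            tp += 1
        elif is_bacterial:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy records, invented purely for illustration.
TOY_PATIENTS = [
    (36, 0.10, True),   # flagged by respiratory rate
    (24, 0.90, True),   # flagged by PCT
    (22, 0.05, True),   # missed by the rule
    (20, 0.05, False),
    (34, 0.02, False),  # false positive
    (18, 0.10, False),
    (26, 0.20, False),
]

sens, spec = sensitivity_specificity(TOY_PATIENTS)
```

Swapping `or` for `and` in `flags_bacterial_cap` trades sensitivity for specificity, which is why the combination logic matters when reading the reported 94% and 82%.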

Main Factors That Explain the Use of Fertilisers on Farms in the European Union

10.4018/978-1-7998-9557-2.ch009 ◽

2022 ◽

pp. 155-184

Author(s):

Vítor João Pereira Domingues Martinho

Keyword(s):

European Union ◽

Agricultural Sector ◽

Regression Tree ◽

Classification And Regression Tree ◽

The European Union ◽

Sustainable Solutions ◽

Farm Accountancy Data Network ◽

Classification And Regression ◽

Main Factors ◽

Main Determinants

A deeper assessment of the main determinants associated with the use of fertilisers on European Union farms, for example, may bring relevant insights into the respective frameworks and support the search for more sustainable solutions. In this context, the main objective of this study is to identify factors that influence the use of fertilisers in the agricultural sector of the European Union regions. To achieve this objective, farm-level statistical information from the European Farm Accountancy Data Network was considered. These data were first analysed through exploratory approaches and then assessed with classification and regression tree methodologies. The results provide insights for promoting more sustainable development on European farms, namely by supporting policymakers in designing better-adjusted measures and instruments. In addition, fertiliser costs on European Union farms are mainly explained by crop output, input costs, current subsidies, utilised agricultural area, and gross investment.

Missing Data Imputation – A Survey

International Journal of Decision Support System Technology ◽

10.4018/ijdsst.292446 ◽

2022 ◽

Vol 14 (1) ◽

Keyword(s):

Missing Data ◽

Linear Regression ◽

Missing Values ◽

Computational Cost ◽

Machine Learning Algorithms ◽

Classification And Regression Tree ◽

High Dimensional ◽

Missing Data Imputation ◽

Real World Datasets ◽

Incomplete Datasets

Many real-world datasets contain missing values for various reasons. These incomplete datasets can pose severe problems for the underlying machine learning algorithms and decision support systems, resulting in high computational cost, skewed output and invalid deductions. Various solutions exist to mitigate this issue; the most popular strategy is to estimate the missing values by applying inferential techniques such as linear regression, decision trees or Bayesian inference. In this paper, the missing data problem is discussed in detail, with a comprehensive review of the approaches to tackle it. The paper concludes with a discussion of the effectiveness of three imputation methods, namely imputation based on Multiple Linear Regression (MLR), Predictive Mean Matching (PMM) and Classification And Regression Tree (CART), in the context of subspace clustering. The experimental results obtained on real benchmark datasets and high-dimensional synthetic datasets show that the MLR-based imputation method is more efficient on high-dimensional incomplete datasets.

Intraoperative lactic acid concentration during liver transplantation and cutoff values to predict early mortality: a retrospective analysis of 3,338 cases

Anesthesia and Pain Medicine ◽

10.17085/apm.21056 ◽

2021 ◽

Author(s):

Kyoung-Sun Kim ◽

Sang-Ho Lee ◽

Bo-Hyun Sang ◽

Gyu-Sam Hwang

Keyword(s):

Liver Transplantation ◽

Lactic Acid ◽

Regression Tree ◽

Mortality Rates ◽

Early Mortality ◽

Classification And Regression Tree ◽

Lactic Acid Concentration ◽

Optimal Cutoff ◽

Cutoff Values ◽

Classification And Regression

Background: We aimed to explore intraoperative lactic acid (LA) level distribution during liver transplantation (LT) and determine the optimal cutoff values to predict post-LT 30-day and 90-day mortality. Methods: Intraoperative LA data from 3,338 patients were collected between 2008 and 2019, and all-cause mortalities within 30 and 90 days were retrospectively reviewed. Of the three LA levels measured during the preanhepatic, anhepatic, and neohepatic phases of LT, the peak LA level was selected to explore the distribution and predict early post-LT mortality. To determine the best cutoff values of LA, we used a classification and regression tree algorithm and maximally selected rank statistics with the smallest P value. Results: The median intraoperative LA level was 4.4 mmol/L (range: 0.5–34.7, interquartile range: 3.0–6.2 mmol/L). Of the 3,338 patients, 1,884 (56.4%) had LA levels > 4.0 mmol/L and 188 (5.6%) had LA levels > 10 mmol/L. Patients with LA levels > 16.7 mmol/L and 13.5–16.7 mmol/L showed significantly higher 30-day mortality rates of 58.3% and 21.2%, respectively. For the prediction of 90-day mortality, 8.4 mmol/L of intraoperative LA was the best cutoff value. Conclusions: Approximately 6% of the LT recipients showed intraoperative hyperlactatemia of > 10 mmol/L during LT, and those with LA > 8.4 mmol/L were associated with significantly higher early post-LT mortality.
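To make the idea of a tree-derived cutoff concrete, the following minimal sketch finds the single split on one continuous predictor that minimises weighted Gini impurity, which is the criterion a CART classifier typically uses at each node. It is not the authors' procedure (they also used maximally selected rank statistics), and the lactate values and outcomes below are hypothetical toy data.

```python
from typing import List, Optional, Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a set of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_cutoff(values: Sequence[float],
                labels: Sequence[int]) -> Tuple[Optional[float], float]:
    """Return (cutoff, weighted impurity) of the best single split.

    Candidate cutoffs are midpoints between consecutive distinct values,
    as in a depth-1 classification tree.
    """
    pairs: List[Tuple[float, int]] = sorted(zip(values, labels))
    n = len(pairs)
    best_value: Optional[float] = None
    best_impurity = float("inf")
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best_impurity:
            best_impurity = weighted
            best_value = (low + high) / 2.0
    return best_value, best_impurity


# Hypothetical peak lactate values (mmol/L) and early-mortality outcomes.
TOY_LACTATE = [2.0, 3.0, 4.0, 5.0, 9.0, 12.0]
TOY_DIED = [0, 0, 0, 0, 1, 1]

cutoff, impurity = best_cutoff(TOY_LACTATE, TOY_DIED)
```

On real data the classes overlap, so the minimum impurity is above zero and the chosen cutoff depends on the sample; a full tree repeats this search recursively within each resulting subgroup.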

Thresholding of prominent biomarkers of breast cancer on overall survival using classification and regression tree

Cancer Biomarkers ◽

10.3233/cbm-210470 ◽

2021 ◽

pp. 1-10

Author(s):

Pragya Kumari ◽

Gajendra K. Vishwakarma ◽

Atanu Bhattacharjee

Keyword(s):

Breast Cancer ◽

Overall Survival ◽

Survival Analysis ◽

Regression Tree ◽

Vital Role ◽

Classification And Regression Tree ◽

Brier Score ◽

Risk Of Death ◽

Cart Analysis ◽

Classification And Regression

BACKGROUND: HER2, ER, PR, and ERBB2 play a vital role in treating breast cancer. These are significant predictive and prognostic biomarkers of breast cancer. OBJECTIVE: We aim to obtain a unique biomarker-specific prediction of overall survival in order to assess patients' survival and death risk. METHODS: Survival analysis is performed on classified data using Classification and Regression Tree (CART) analysis. Hazard ratios and confidence intervals are computed using MLE and the Bayesian approach with the CPH model for univariate and multivariable illustrations. Validation of CART is performed with the Brier score, and accuracy and sensitivity are obtained using the k-nn classifier. RESULTS: Using CART analysis, the cut-off values of the continuous-valued biomarkers HER2, ER, PR, and ERBB2 are obtained as 14.707, 8.128, 13.153, and 6.884, respectively. The Brier score of CART is 0.16, supporting the validity of the methodology. Survival analysis demonstrates the survival estimates with significant statistical strategies. CONCLUSIONS: Patients with breast cancer whose HER2 value is below its cut-off value and whose ER, PR, and ERBB2 values are above their cut-off values are at low risk of death. This comparison is relative to patients whose values fall on the opposite side of these cut-offs for the same biomarkers.

Body Weight Prediction of Thalli Sheep Reared in Southern Punjab Using Different Data Mining Algorithms

Proceedings of the Pakistan Academy of Sciences: A. Physical and Computational Sciences ◽

10.53560/ppasa(58-2)603 ◽

2021 ◽

Vol 58 (2) ◽

pp. 29-38

Author(s):

Ansar Abbas ◽

Muhammad Aman Ullah ◽

Abdul Waheed

Keyword(s):

Data Mining ◽

Body Weight ◽

Goodness Of Fit ◽

The Body ◽

Classification And Regression Tree ◽

Body Measurements ◽

Data Set ◽

Data Mining Algorithms ◽

Exhaustive Chaid ◽

Mining Algorithms

This study is conducted to predict the body weight (BW) of Thalli sheep of southern Punjab from different body measurements. In the BW prediction, several body measurements, viz. withers height, body length (BL), head length, head width, ear length, ear width, neck length, neck width, heart girth, rump length, rump width, tail length, barrel depth and sacral pelvic width, are used as predictors. Data mining algorithms, namely Chi-square Automatic Interaction Detector (CHAID), Exhaustive CHAID, Classification and Regression Tree (CART) and Artificial Neural Network (ANN), are used to predict the BW for a total of 85 female Thalli sheep. The data set is partitioned into training (80%) and test (20%) sets before the algorithms are used. The minimum numbers for parent (4) and child (2) nodes are set in order to ensure their predictive ability. The R² (%) values, with RMSE in parentheses, for the CHAID, Exhaustive CHAID, ANN and CART algorithms are 67.38 (1.003), 64.37 (1.049), 61.45 (1.093) and 59.02 (1.125), respectively. The most significant predictor in the BW prediction of Thalli sheep is BL. The heaviest average BW of 9.596 kg is obtained from the subgroup with BL > 25.000 inches. Based on several goodness-of-fit criteria, we conclude that the CHAID algorithm performs better in predicting the BW of Thalli sheep and yields a more visually suitable decision tree diagram. The CHAID results may also help to determine body measurements positively associated with BW for developing better selection strategies within the scope of indirect selection criteria.
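The evaluation protocol described here (an 80/20 split followed by R² and RMSE on predictions) can be sketched as below. This is only an illustration of the two metrics and the split arithmetic, assuming a seeded random shuffle; the helper names, the seed and the toy weights are hypothetical, and no CHAID, CART or ANN model is fitted.

```python
import math
import random
from typing import List, Sequence, Tuple


def train_test_split(rows: Sequence[int], train_fraction: float = 0.8,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle with a fixed seed and cut at the requested fraction."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error of the predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# 85 animal IDs, matching the sample size in the abstract.
train_ids, test_ids = train_test_split(range(85))

# Hypothetical observed and predicted body weights (kg).
TOY_OBSERVED = [8.0, 9.0, 10.0, 11.0]
TOY_PREDICTED = [8.5, 9.0, 9.5, 11.0]

toy_rmse = rmse(TOY_OBSERVED, TOY_PREDICTED)
toy_r2 = r_squared(TOY_OBSERVED, TOY_PREDICTED)
```

With 85 animals the split leaves only 17 in the test set, so differences of a few points in R² between algorithms should be read with that small sample in mind.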

Flood Susceptibility Modeling in a Subtropical Humid Low-Relief Alluvial Plain Environment: Application of Novel Ensemble Machine Learning Approach

Frontiers in Earth Science ◽

10.3389/feart.2021.659296 ◽

2021 ◽

Vol 9 ◽

Author(s):

Manish Pandey ◽

Aman Arora ◽

Alireza Arabameri ◽

Romulus Costache ◽

Naveen Kumar ◽

...

Keyword(s):

Machine Learning ◽

Regression Tree ◽

Classification And Regression Tree ◽

Ground Subsidence ◽

Ensemble Model ◽

Ganga Plain ◽

Humid Climate ◽

Area Index ◽

Flood Susceptibility ◽

Middle Ganga Plain

This study developed a new ensemble model and tested another ensemble model for flood susceptibility mapping in the Middle Ganga Plain (MGP). The results of the two models were quantitatively compared to analyse their performance in zoning flood-susceptible areas of this low-relief, humid subtropical fluvial floodplain environment. This part of the MGP, which lies in the central Ganga River Basin (GRB), is experiencing worse floods under the changing climate, causing increased loss of life and property. With its monsoonal subtropical humid climate, ground subsidence induced by active tectonics, increasing population, and shifting land use/land cover trends and patterns, the MGP is the best natural laboratory for testing all genres of susceptibility prediction models, in order to choose the best-performing model with a constant number of input parameters for this type of topoclimatic setting. This will help in achieving the goal of model universality, i.e., finding the best-performing susceptibility prediction model for this type of topoclimatic setting with a similar number and type of input variables. Based on a highly accurate flood inventory and 12 flood predictors (FPs) (selected using field experience of the study area and a literature survey), two machine learning (ML) ensemble models, CART-FR and CART-EBF, developed by bagging frequency ratio (FR) and evidential belief function (EBF) with classification and regression tree (CART), were applied for flood susceptibility zonation mapping. Flood and non-flood points randomly generated from the flood inventory were apportioned in a 70:30 ratio for training and validation of the ensembles. Based on performance evaluation using a threshold-independent statistic, the area under the receiver operating characteristic (AUROC) curve, 14 threshold-dependent evaluation metrics, and the seed cell area index (SCAI), which together assess different aspects of the ensembles, the study suggests that CART-EBF (AUCSR = 0.843; AUCPR = 0.819) performed better than CART-FR (AUCSR = 0.828; AUCPR = 0.802). The variability in the performance of these novel ensembles, and their comparison with the results of other published models, supports the need to test these and other genres of susceptibility models in other topoclimatic environments. Results of this study are important for natural hazard managers and can be used to compute damages through risk analysis.
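Since the comparison above rests on the area under the ROC curve, a minimal sketch of that statistic may help. It uses the pairwise (Mann-Whitney) definition: the fraction of flood/non-flood pairs in which the flood point receives the higher susceptibility score, with ties counted as half. The function name and the toy scores are hypothetical and unrelated to the study's data.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a positive (e.g. flood) point, 0 for a negative point.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical susceptibility scores for three flood and three non-flood points.
TOY_SCORES = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
TOY_LABELS = [1, 1, 1, 0, 0, 0]

toy_auc = auroc(TOY_SCORES, TOY_LABELS)
```

Because the statistic depends only on the ranking of scores, it needs no classification threshold, which is why the abstract calls it threshold-independent. The pairwise loop is quadratic in the number of points; a rank-based formulation is preferable for large validation sets.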

MALARIA PREDICTION MODEL USING ADVANCED ENSEMBLE MACHINE LEARNING TECHNIQUES

Journal of medical pharmaceutical and allied sciences ◽

10.22270/jmpas.v10i6.1701 ◽

2021 ◽

Vol 10 (6) ◽

pp. 3794-3801

Author(s):

Yusuf Aliyu Adamu

Keyword(s):

Machine Learning ◽

Malaria Incidence ◽

Regression Tree ◽

Ensemble Method ◽

Classification And Regression Tree ◽

Machine Learning Techniques ◽

Ensemble Machine Learning ◽

Suggested Technique ◽

Life Threatening ◽

Classification And Regression

Malaria is a life-threatening disease that causes deaths globally; its early prediction is necessary to prevent rapid transmission. In this work, an enhanced ensemble learning approach for predicting malaria outbreaks is suggested. Using a mean-based splitting strategy, the dataset is randomly partitioned into smaller groups. The splits are then modelled using a classification and regression tree, and an accuracy-based weighted aging classifier ensemble is used to construct a homogeneous ensemble from the several classification and regression tree models. This approach ensures that higher performance is achieved. Seven different algorithms were tested, along with one ensemble method that combines all seven classifiers. The accuracy, precision, and sensitivity achieved by the proposed method are 93%, 92%, and 100%, respectively, outperforming the machine learning classifiers and the ensemble method used in this research. The correlation between the variables used is established, along with how each factor contributes to malaria incidence. The results indicate that malaria outbreaks can be predicted successfully using the suggested technique.
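The general idea of weighting ensemble members by their accuracy can be sketched as follows. This is a simplified illustration only: it omits the "aging" component and the mean-based data splitting mentioned in the abstract, the one-split rules merely stand in for fitted CART models, and every name, feature and data point is hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Features = Tuple[float, float, float]   # (rainfall mm, temperature C, humidity %)
Classifier = Callable[[Features], int]  # returns 1 for "outbreak", 0 otherwise


def rain_rule(x: Features) -> int:
    return 1 if x[0] >= 100 else 0


def temp_rule(x: Features) -> int:
    return 1 if x[1] >= 25 else 0


def humidity_rule(x: Features) -> int:
    return 1 if x[2] >= 80 else 0


def accuracy(model: Classifier, rows: Sequence[Features],
             labels: Sequence[int]) -> float:
    """Fraction of validation rows the model classifies correctly."""
    hits = sum(1 for x, y in zip(rows, labels) if model(x) == y)
    return hits / len(labels)


def weighted_vote(models: Sequence[Classifier], weights: Sequence[float],
                  x: Features) -> int:
    """Predict 1 when members voting 1 hold more than half the total weight."""
    positive = sum(w for m, w in zip(models, weights) if m(x) == 1)
    return 1 if positive > sum(weights) / 2.0 else 0


# Hypothetical validation data used only to derive the member weights.
VALIDATION_ROWS: List[Features] = [
    (150, 30, 40), (120, 20, 45), (80, 28, 90), (60, 18, 50),
]
VALIDATION_LABELS = [1, 1, 0, 0]

MODELS: List[Classifier] = [rain_rule, temp_rule, humidity_rule]
WEIGHTS = [accuracy(m, VALIDATION_ROWS, VALIDATION_LABELS) for m in MODELS]

# Two weak members vote "outbreak" here, but the accurate member outweighs them.
prediction = weighted_vote(MODELS, WEIGHTS, (80, 28, 90))
```

An unweighted majority vote would return 1 for that last case; weighting by validation accuracy lets the more reliable member decide. In practice the weights should be estimated on data not used to fit the members.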

Novel intelligent adjustment height method of Shearer drum based on adaptive fuzzy reasoning Petri net

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-211193 ◽

2021 ◽

pp. 1-15

Author(s):

Weibing Wang ◽

Shenquan Wang ◽

Shuanfeng Zhao ◽

Zhengxiong Lu ◽

Haitao He

Keyword(s):

Petri Net ◽

Fuzzy Inference ◽

Regression Tree ◽

Fuzzy Reasoning ◽

Classification And Regression Tree ◽

Gradient Boosting ◽

Adaptive Fuzzy ◽

Fuzzy Petri Net ◽

Non Linear ◽

Height Model

The complexity of the coalface environment determines the non-linear and fuzzy characteristics of the drum adjustment height. To overcome this challenge, this study proposes an adaptive fuzzy reasoning Petri net (AFRPN) model based on fuzzy reasoning and fuzzy Petri net (FPN) and then applies it to the intelligent height adjustment of the shearer drum. This study constructs adaptive and reasoning algorithms: the former is used to optimize the AFRPN parameters, and the latter runs the AFRPN model. AFRPN can represent rules that have non-linear and attribute mapping relationships and can adjust the parameters adaptively to improve the accuracy of the output. Subsequently, the drum adjustment height model was established and compared to three models: neural network (NN), classification and regression tree (CART), and gradient boosting decision tree (GBDT). The experimental results showed that this method is superior to the other drum height adjustment methods and that AFRPN can achieve intelligent adjustment of the shearer drum height by constructing fuzzy inference rules.

classification and regression tree
Recently Published Documents

Artificial intelligence for classification and regression tree based feature selection method for network intrusion detection system in various telecommunication technologies

Clinical sign and biomarker-based algorithm to identify bacterial pneumonia among outpatients with lower respiratory tract infection in Tanzania

Main Factors That Explain the Use of Fertilisers on Farms in the European Union

Missing Data Imputation – A Survey

Intraoperative lactic acid concentration during liver transplantation and cutoff values to predict early mortality: a retrospective analysis of 3,338 cases

Thresholding of prominent biomarkers of breast cancer on overall survival using classification and regression tree

Body Weight Prediction of Thalli Sheep Reared in Southern Punjab Using Different Data Mining Algorithms

Flood Susceptibility Modeling in a Subtropical Humid Low-Relief Alluvial Plain Environment: Application of Novel Ensemble Machine Learning Approach

MALARIA PREDICTION MODEL USING ADVANCED ENSEMBLE MACHINE LEARNING TECHNIQUES

Novel intelligent adjustment height method of Shearer drum based on adaptive fuzzy reasoning Petri net
