classification and regression tree
Recently Published Documents


TOTAL DOCUMENTS

879
(FIVE YEARS 399)

H-INDEX

48
(FIVE YEARS 6)

2022 ◽  
Vol 22 (1) ◽  
Author(s):  
Sarika K. L. Hogendoorn ◽  
Loïc Lhopitallier ◽  
Melissa Richard-Greenblatt ◽  
Estelle Tenisch ◽  
Zainab Mbarack ◽  
...  

Background: Inappropriate antibiotic use in lower respiratory tract infections (LRTI) is a major contributor to resistance. We aimed to design an algorithm based on clinical signs and host biomarkers to identify bacterial community-acquired pneumonia (CAP) among patients with LRTI. Methods: Participants with LRTI were selected from a prospective cohort of febrile (≥ 38 °C) adults presenting to outpatient clinics in Dar es Salaam. Participants underwent chest X-ray, multiplex PCR for respiratory pathogens, and measurements of 13 biomarkers. We evaluated the predictive accuracy of clinical signs and biomarkers using logistic regression and classification and regression tree analysis. Results: Of 110 patients with LRTI, 17 had bacterial CAP. Procalcitonin (PCT), interleukin-6 (IL-6) and soluble triggering receptor expressed by myeloid cells-1 (sTREM-1) showed excellent predictive accuracy for identifying bacterial CAP (AUROC 0.88, 95% CI 0.78–0.98; 0.84, 0.72–0.99; 0.83, 0.74–0.92, respectively). Combining respiratory rate with PCT or IL-6 significantly improved the model compared to respiratory rate alone (p = 0.006, p = 0.033, respectively). An algorithm with respiratory rate (≥ 32/min) and PCT (≥ 0.25 μg/L) had 94% sensitivity and 82% specificity. Conclusions: PCT, IL-6 and sTREM-1 had excellent predictive accuracy in differentiating bacterial CAP from other LRTIs. An algorithm combining respiratory rate and PCT displayed even better performance in this sub-Saharan African setting.
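To illustrate how a two-threshold rule of this kind is scored, here is a minimal Python sketch. The cutoffs (32/min, 0.25 μg/L) come from the abstract; everything else is an assumption. In particular, the abstract does not say whether the two criteria are combined with AND or OR, so the hypothetical `require_both` flag exposes both options, and the patient records are invented toy values, not study data.

```python
def flag_bacterial_cap(resp_rate, pct, rr_cutoff=32.0, pct_cutoff=0.25,
                       require_both=False):
    """Hypothetical rule: flag a patient from respiratory rate and PCT.

    How the study combines the two criteria is not stated, so the
    combination (OR by default, AND if require_both) is an assumption.
    """
    rr_positive = resp_rate >= rr_cutoff
    pct_positive = pct >= pct_cutoff
    if require_both:
        return rr_positive and pct_positive
    return rr_positive or pct_positive


def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for boolean predictions and labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy records: (respiratory rate per min, PCT in ug/L, bacterial CAP?)
toy_patients = [
    (36, 0.50, True),
    (34, 0.10, True),
    (20, 0.05, False),
    (28, 0.30, False),
    (22, 0.10, False),
]

predicted = [flag_bacterial_cap(rr, pct) for rr, pct, _ in toy_patients]
actual = [label for _, _, label in toy_patients]
sens, spec = sensitivity_specificity(predicted, actual)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Switching `require_both` to `True` trades sensitivity for specificity on the same records, which is the trade-off any such rule has to balance.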


2022 ◽  
pp. 155-184
Author(s):  
Vítor João Pereira Domingues Martinho

A deeper assessment of the main determinants of fertiliser use, for example on European Union farms, may yield relevant insights into the respective frameworks and support the search for more sustainable solutions. In this context, the main objective of this study is to identify the factors that influence fertiliser use in the agricultural sector of the European Union regions. To achieve this objective, farm-level statistical information from the European Farm Accountancy Data Network was considered. These data were first analysed through exploratory approaches and then assessed with classification and regression tree methodologies. The results provide insights for promoting more sustainable development on European farms, namely by supporting policymakers in designing better-adjusted measures and instruments. In addition, fertiliser costs on European Union farms are mainly explained by crop output, input costs, current subsidies, utilised agricultural area, and gross investment.


2022 ◽  
Vol 14 (1) ◽  

Many real-world datasets contain missing values for various reasons. These incomplete datasets can pose severe problems for the underlying machine learning algorithms and decision support systems, resulting in high computational cost, skewed output and invalid deductions. Various solutions exist to mitigate this issue; the most popular strategy is to estimate the missing values by applying inferential techniques such as linear regression, decision trees or Bayesian inference. In this paper, the missing data problem is discussed in detail, with a comprehensive review of the approaches to tackle it. The paper concludes with a discussion of the effectiveness of three imputation methods, namely imputation based on Multiple Linear Regression (MLR), Predictive Mean Matching (PMM) and Classification And Regression Tree (CART), in the context of subspace clustering. The experimental results obtained on real benchmark datasets and high-dimensional synthetic datasets show that the MLR-based imputation method is more efficient on high-dimensional incomplete datasets.


Author(s):  
Kyoung-Sun Kim ◽  
Sang-Ho Lee ◽  
Bo-Hyun Sang ◽  
Gyu-Sam Hwang

Background: We aimed to explore intraoperative lactic acid (LA) level distribution during liver transplantation (LT) and determine the optimal cutoff values to predict post-LT 30-day and 90-day mortality. Methods: Intraoperative LA data from 3,338 patients were collected between 2008 and 2019, and all-cause mortality within 30 and 90 days was retrospectively reviewed. Of the three LA levels measured during the preanhepatic, anhepatic, and neohepatic phases of LT, the peak LA level was selected to explore the distribution and predict early post-LT mortality. To determine the best cutoff values of LA, we used a classification and regression tree algorithm and maximally selected rank statistics with the smallest P value. Results: The median intraoperative LA level was 4.4 mmol/L (range: 0.5–34.7, interquartile range: 3.0–6.2 mmol/L). Of the 3,338 patients, 1,884 (56.4%) had LA levels > 4.0 mmol/L and 188 (5.6%) had LA levels > 10 mmol/L. Patients with LA levels > 16.7 mmol/L and 13.5–16.7 mmol/L showed significantly higher 30-day mortality rates of 58.3% and 21.2%, respectively. For the prediction of 90-day mortality, 8.4 mmol/L of intraoperative LA was the best cutoff value. Conclusions: Approximately 6% of the LT recipients showed intraoperative hyperlactatemia of > 10 mmol/L during LT, and those with LA > 8.4 mmol/L had significantly higher early post-LT mortality.
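The idea of letting a classification tree choose a cutoff can be sketched as a single CART-style split: try every midpoint between consecutive sorted values and keep the one with the lowest weighted Gini impurity. This is a minimal illustration only. The abstract does not state the impurity criterion or software the authors used, the function names are hypothetical, the values are invented toy numbers, and the maximally selected rank statistics step is not shown.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_cutoff(values, labels):
    """Return (threshold, weighted_gini) of the best single split.

    Candidate thresholds are midpoints between consecutive distinct
    sorted values, as in a one-level CART classification tree.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_impurity = None, float("inf")
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best_impurity:
            best_threshold, best_impurity = (low + high) / 2.0, weighted
    return best_threshold, best_impurity


# Invented toy data: peak lactate (mmol/L) and death within 30 days (1 = yes)
toy_lactate = [2.1, 3.0, 4.4, 5.8, 9.0, 12.5, 14.0, 18.2]
toy_died = [0, 0, 0, 0, 0, 1, 1, 1]

threshold, impurity = best_cutoff(toy_lactate, toy_died)
print(threshold, impurity)
```

A full tree repeats this search recursively inside each resulting group, which is how more than one cutoff (such as separate risk bands) can arise.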


2021 ◽  
pp. 1-10
Author(s):  
Pragya Kumari ◽  
Gajendra K. Vishwakarma ◽  
Atanu Bhattacharjee

BACKGROUND: HER2, ER, PR, and ERBB2 play a vital role in treating breast cancer. They are significant predictive and prognostic biomarkers of breast cancer. OBJECTIVE: We aim to obtain a unique biomarker-specific prediction of overall survival to assess patients' survival and death risk. METHODS: Survival analysis is performed on classified data using Classification and Regression Tree (CART) analysis. The hazard ratio and confidence interval are computed using MLE and the Bayesian approach with the CPH model for univariate and multivariable illustrations. Validation of CART is carried out with the Brier score, and accuracy and sensitivity are obtained using the k-nn classifier. RESULTS: Using CART analysis, the cut-off values of the continuous-valued biomarkers HER2, ER, PR, and ERBB2 are obtained as 14.707, 8.128, 13.153, and 6.884, respectively. The Brier score of CART is 0.16, supporting the validity of the methodology. Survival analysis demonstrates the survival estimates with statistically significant results. CONCLUSIONS: Patients with breast cancer whose HER2 value is below its cut-off value, and whose ER, PR, and ERBB2 values are above their cut-off values, are at low risk of death. This comparison is with patients on the opposite side of these cut-off values for the same biomarkers.
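For readers unfamiliar with the Brier score used for validation above, the following Python sketch shows the plain binary form: the mean squared difference between predicted probabilities and observed 0/1 outcomes, where lower is better. This is a simplification under stated assumptions. Survival analyses usually use a time-dependent version that accounts for censoring, which is not shown, and the probabilities below are invented toy values.

```python
def brier_score(predicted_probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(predicted_probs) != len(outcomes):
        raise ValueError("inputs must have the same length")
    if not outcomes:
        raise ValueError("inputs must not be empty")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted_probs, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Invented toy example: predicted death probabilities and observed outcomes
toy_probs = [0.9, 0.2, 0.6, 0.1]
toy_outcomes = [1, 0, 1, 0]
score = brier_score(toy_probs, toy_outcomes)
print(round(score, 3))
```

A model that always predicts 0.5 scores 0.25 on any binary outcome, which gives a rough reference point for reading a reported value.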


Author(s):  
Ansar Abbas ◽  
Muhammad Aman Ullah ◽  
Abdul Waheed

This study is conducted to predict the body weight (BW) of Thalli sheep of southern Punjab from different body measurements. In the BW prediction, several body measurements, viz. withers height, body length (BL), head length, head width, ear length, ear width, neck length, neck width, heart girth, rump length, rump width, tail length, barrel depth and sacral pelvic width, are used as predictors. Data mining algorithms, namely Chi-square Automatic Interaction Detector (CHAID), Exhaustive CHAID, Classification and Regression Tree (CART) and Artificial Neural Network (ANN), are used to predict the BW of a total of 85 female Thalli sheep. The data set is partitioned into training (80%) and test (20%) sets before the algorithms are applied. The minimum numbers of parent (4) and child (2) nodes are set to ensure their predictive ability. The R² (%) and RMSE values for the CHAID, Exhaustive CHAID, ANN and CART algorithms are 67.38 (1.003), 64.37 (1.049), 61.45 (1.093) and 59.02 (1.125), respectively. The most significant predictor of BW in Thalli sheep is BL. The heaviest average BW of 9.596 kg is obtained from the subgroup with BL > 25.000 inches. Based on several goodness-of-fit criteria, we conclude that the CHAID algorithm performs better in predicting the BW of Thalli sheep and yields a visually more suitable decision tree diagram. The CHAID results may also help to determine body measurements positively associated with BW for developing better selection strategies within the scope of indirect selection criteria.
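The two goodness-of-fit measures reported above can be computed as in the following Python sketch. It assumes the standard definitions, R² = 1 − SS_res / SS_tot and RMSE as the square root of the mean squared error. The abstract does not spell out its formulas, and the weights below are invented toy values, not the study's data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented toy body weights (kg) and model predictions
toy_actual = [8.0, 9.0, 10.0, 11.0]
toy_predicted = [8.5, 9.0, 9.5, 11.0]

print(f"R2 = {100 * r_squared(toy_actual, toy_predicted):.2f} %")
print(f"RMSE = {rmse(toy_actual, toy_predicted):.3f}")
```

Computing both on the held-out test set, rather than on the training data, keeps the comparison between algorithms honest.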


2021 ◽  
Vol 9 ◽  
Author(s):  
Manish Pandey ◽  
Aman Arora ◽  
Alireza Arabameri ◽  
Romulus Costache ◽  
Naveen Kumar ◽  
...  

This study developed a new ensemble model and tested another ensemble model for flood susceptibility mapping in the Middle Ganga Plain (MGP). The results of the two models were quantitatively compared to analyse their performance in zoning flood-susceptible areas of this low-altitude, humid subtropical fluvial floodplain environment. This part of the MGP, which lies in the central Ganga River Basin (GRB), is experiencing worse floods under the changing climate, causing increased loss of life and property. With its monsoonal humid subtropical climate, ground subsidence induced by active tectonics, increasing population, and shifting land use/land cover trends and patterns, the MGP is the best natural laboratory for testing all genres of susceptibility prediction models and identifying the best-performing model with a constant number of input parameters for this type of topoclimatic setting. This will help achieve the goal of model universality, i.e., finding the best-performing susceptibility prediction model for this type of topoclimatic setting with a similar number and type of input variables. Based on a highly accurate flood inventory and 12 flood predictors (FPs) (selected using field experience of the study area and a literature survey), two machine learning (ML) ensemble models, CART-FR and CART-EBF, developed by bagging frequency ratio (FR) and evidential belief function (EBF) with classification and regression tree (CART), were applied for flood susceptibility zonation mapping. Flood and non-flood points randomly generated from the flood inventory were apportioned in a 70:30 ratio for training and validation of the ensembles. 
Based on an evaluation using the threshold-independent area under the receiver operating characteristic (AUROC) curve, 14 threshold-dependent evaluation metrics, and the seed cell area index (SCAI), which assess different aspects of the ensembles, the study suggests that CART-EBF (AUCSR = 0.843; AUCPR = 0.819) performed better than CART-FR (AUCSR = 0.828; AUCPR = 0.802). The variability in the performance of these novel ensembles, and their comparison with the results of other published models, underlines the need to test these and other genres of susceptibility models in other topoclimatic environments. The results of this study are important for natural hazard managers and can be used to compute damages through risk analysis.


2021 ◽  
Vol 10 (6) ◽  
pp. 3794-3801
Author(s):  
Yusuf Aliyu Adamu

Malaria is a life-threatening disease that causes deaths globally, and its early prediction is necessary to prevent rapid transmission. In this work, an enhanced ensemble learning approach for predicting malaria outbreaks is suggested. Using a mean-based splitting strategy, the dataset is randomly partitioned into smaller groups. The splits are then modelled using a classification and regression tree, and an accuracy-based weighted aging classifier ensemble is used to construct a homogeneous ensemble from the several classification and regression tree models. This approach is intended to achieve higher performance. Seven different algorithms were tested, along with one ensemble method that combines all seven classifiers. The accuracy, precision, and sensitivity achieved by the proposed method are 93%, 92%, and 100%, respectively, outperforming the machine learning classifiers and the ensemble method used in this research. The correlation between the variables used is established, along with how each factor contributes to malaria incidence. The results indicate that malaria outbreaks can be predicted successfully using the suggested technique.


2021 ◽  
pp. 1-15
Author(s):  
Weibing Wang ◽  
Shenquan Wang ◽  
Shuanfeng Zhao ◽  
Zhengxiong Lu ◽  
Haitao He

The complexity of the coalface environment determines the non-linear and fuzzy characteristics of the drum adjustment height. To overcome this challenge, this study proposes an adaptive fuzzy reasoning Petri net (AFRPN) model based on fuzzy reasoning and fuzzy Petri net (FPN) and then applies it to the intelligent height adjustment of the shearer drum. This study constructs adaptive and reasoning algorithms. The former was used to optimize the AFRPN parameters, and the latter was used to run the AFRPN model. AFRPN could represent rules that had non-linear and attribute mapping relationships and could adjust the parameters adaptively to improve the accuracy of the output. Subsequently, the drum adjustment height model was established and compared to three models: neural network (NN), classification and regression tree (CART) and gradient boosting decision tree (GBDT). The experimental results showed that this method is superior to the other drum height adjustment methods and that AFRPN can achieve intelligent adjustment of the shearer drum height by constructing fuzzy inference rules.

