Distance-based Classification and Regression Trees for the analysis of complex predictors in health and medical research

2021 ◽  
pp. 096228022110327
Author(s):  
Hannah Johns ◽  
Julie Bernhardt ◽  
Leonid Churilov

Predicting patient outcomes based on patient characteristics and care processes is a common task in medical research. Such predictive features are often multifaceted and complex, and are usually simplified into one or more scalar variables to facilitate statistical analysis. This process, while necessary, results in a loss of important clinical detail. While this loss may be prevented by using distance-based predictive methods, which better represent complex healthcare features, the statistical literature on such methods is limited, and the range of tools facilitating distance-based analysis is substantially smaller than that available for other methods. Consequently, medical researchers must either reduce complex predictive features to scalar variables to facilitate analysis, or use one of a limited number of distance-based predictive methods, which may not fulfil the needs of the analysis problem at hand. We address this limitation by developing a Distance-Based extension of Classification and Regression Trees (DB-CART) capable of making distance-based predictions of categorical, ordinal and numeric patient outcomes. We also demonstrate how this extension is compatible with other extensions to CART, including a recently published method for predicting care trajectories in chronic disease. We demonstrate DB-CART by using it to expand upon a previously published dose–response analysis of stroke rehabilitation data. Our method identified additional detail not captured by the previously published analysis, reinforcing previous conclusions. We also demonstrate how, by combining DB-CART with other extensions to CART, the method can make predictions about complex, multifaceted outcome data based on complex, multifaceted predictive features.

2021 ◽  
pp. 175045892096263
Author(s):  
Margaret O Lewen ◽  
Jay Berry ◽  
Connor Johnson ◽  
Rachael Grace ◽  
Laurie Glader ◽  
...  

Aim: To assess the relationship of preoperative hematology laboratory results with intraoperative estimated blood loss and transfusion volumes during posterior spinal fusion for pediatric neuromuscular scoliosis. Methods: Retrospective chart review of 179 children with neuromuscular scoliosis undergoing spinal fusion at a tertiary children’s hospital between 2012 and 2017. The main outcome measure was estimated blood loss. Secondary outcomes were volumes of packed red blood cells, fresh frozen plasma, and platelets transfused intraoperatively. Independent variables were preoperative blood counts, coagulation studies, and demographic and surgical characteristics. Relationships between estimated blood loss, transfusion volumes, and independent variables were assessed using bivariable analyses. Classification and Regression Trees were used to identify variables most strongly correlated with outcomes. Results: In bivariable analyses, increased estimated blood loss was significantly associated with higher preoperative hematocrit and lower preoperative platelet count but not with abnormal coagulation studies. Preoperative laboratory results were not associated with intraoperative transfusion volumes. In Classification and Regression Trees analysis, binary splits associated with the largest increase in estimated blood loss were hematocrit ≥44% vs. <44% and platelets ≥308 vs. <308 × 10⁹/L. Conclusions: Preoperative blood counts may identify patients at risk of increased bleeding, but they do not predict intraoperative transfusion requirements. Abnormal coagulation studies often prompted preoperative intervention but were not associated with increased intraoperative bleeding or transfusion needs.
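The binary splits reported in this abstract (for example, hematocrit ≥44% vs. <44%) are the kind of threshold a regression tree searches for on a single predictor. The following is a minimal sketch of that search, assuming the common least-squares criterion; the abstract does not state which criterion or software the authors used. The function names and the data values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_binary_split(x, y):
    """Return (threshold, total_sse) for the split `x >= t` vs. `x < t`
    that minimizes the combined within-group SSE of y."""
    best = None
    # Skip the smallest value: splitting there would leave the '<' side empty.
    for t in sorted(set(x))[1:]:
        below = [yi for xi, yi in zip(x, y) if xi < t]
        at_or_above = [yi for xi, yi in zip(x, y) if xi >= t]
        total = sse(below) + sse(at_or_above)
        if best is None or total < best[1]:
            best = (t, total)
    return best


# Made-up illustrative values, NOT data from the study.
hematocrit = [36, 38, 40, 42, 44, 45, 47]          # percent
blood_loss = [500, 550, 520, 560, 900, 950, 920]   # mL

threshold, total_sse = best_binary_split(hematocrit, blood_loss)
print(f"best split: hematocrit >= {threshold} vs. < {threshold}")
```

A full tree repeats this search over every candidate predictor and then recurses into each resulting group; the sketch covers only one predictor and one split.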


2021 ◽  
Vol 13 (12) ◽  
pp. 2300
Author(s):  
Samy Elmahdy ◽  
Tarig Ali ◽  
Mohamed Mohamed

Mapping groundwater potential beneath sand sheets in remote arid and semi-arid regions at a broad regional scale is challenging and requires an accurate classifier. The Classification and Regression Trees (CART) model is a robust machine learning classifier used in groundwater potential mapping at this scale. Ten essential groundwater conditioning factors (GWCFs) were constructed using remote sensing data. The spatial relationship between these conditioning factors and the observed groundwater well locations was identified and optimized using the chi-square method. A total of 185 groundwater well locations were randomly divided into 129 (70%) for training the model and 56 (30%) for validation. The model was applied to groundwater potential mapping using the optimal parameter values: 186 additive trees, a learning rate of 0.1, and a maximum tree size of five. The validation result demonstrated that the area under the curve (AUC) of the CART was 0.920, which represents a predictive accuracy of 92%. The resulting map showed that the Mondafan, Khujaymah and Wajid Mutaridah depressions and the southern gulf salt basin (SGSB), near the borders of Saudi Arabia, Oman and the United Arab Emirates (UAE), hold fresh fossil groundwater, as indicated by the observed lakes and recovered paleolakes. The proposed model and the new maps are effective at enhancing groundwater potential mapping at a broad regional scale using machine learning algorithms, which are rarely used for this purpose in the literature, and the approach can be applied to the Sahara and the Kalahari Desert.
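The validation setup described in this abstract, a random 70/30 split of 185 well locations followed by an AUC calculation, can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names, the random seed, and the toy labels and scores are hypothetical, and the AUC is computed with the standard pairwise-ranking definition rather than any particular library.

```python
import random


def train_validation_split(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of `items` and cut it into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary assumption
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def auc(labels, scores):
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive case has the higher score; ties count as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 185 placeholder well identifiers, standing in for the real well locations.
wells = list(range(185))
train, validation = train_validation_split(wells)
print(len(train), len(validation))

# Toy labels and scores, NOT results from the study.
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(toy_auc)
```

Truncating 185 × 0.7 gives the 129/56 division reported in the abstract. Note that AUC measures how well the scores rank positive cases above negative ones, which is a different quantity from the proportion of cases classified correctly.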


2010 ◽  
Vol 57 (4) ◽  
pp. 560-561
Author(s):  
Alberto Briganti ◽  
Umberto Capitanio ◽  
Nazareno Suardi ◽  
Andrea Gallina ◽  
Patrizio Rigatti ◽  
...  
