Local Additive Regression of Decision Stumps

Author(s):  
Sotiris B. Kotsiantis ◽  
Dimitris Kanellopoulos ◽  
Panayiotis E. Pintelas
Author(s):  
О.Ю. Бушуева

Common and often comorbid cardio- and cerebrovascular diseases (CCVD), including arterial hypertension (AH), ischemic heart disease (IHD), and cerebral stroke (CS), are the leading cause of death worldwide. Oxidative stress has many pathological effects on vascular homeostasis and is currently regarded as one of the common mechanisms underlying the development of CCVD. The aim of the study was to investigate the association of single nucleotide polymorphisms (SNPs) of the redox-homeostasis genes rs2070424 SOD1, rs4880 SOD2, rs769214 CAT, rs713041 GPX4, rs41303970 GCLM, rs17883901 GCLC, rs854560 PON1, rs7493 PON2, rs1695 GSTP1, rs2266782 FMO3 with the development of isolated and comorbid forms of CCVD. The study sample comprised 2702 unrelated individuals of Slavic origin. The patient group included 1815 subjects with various CCVD and their combinations: isolated AH (IAH); isolated IHD (IIHD); a combination of AH and IHD (AH+IHD); CS with underlying AH (AH+CS); and comorbid cardio- and cerebrovascular pathology (AH+IHD+CS). From the total sample of healthy individuals (N=887), 5 control groups were formed, each matched by sex and age to one of the disease groups. SNPs were genotyped by real-time PCR with allelic discrimination using TaqMan probes. To analyze the associations of genotypes with the risk of disease, a log-additive regression model was used. All calculations were performed relative to the minor allele and adjusted for sex and age. SNP rs1695 GSTP1 was associated exclusively with IAH (OR=1.19, 95%CI=1.01-1.39, P=0.034). SNP rs7493 PON2 was associated with the development of all studied comorbid CCVD: AH+IHD (adjOR=1.32, adj95%CI=1.07-1.63, adjP=0.01); AH+CS (adjOR=1.79, adj95%CI=1.45-2.21, adjP<0.0001); AH+IHD+CS (adjOR=1.51, adj95%CI=1.09-2.09, adjP=0.01), as well as with a shortening of prothrombin time (adjDifference=-0.35; adjP=0.01). SNP rs2266782 FMO3 was associated with the AH+CS phenotype (adjOR=1.24, adj95%CI=1.02-1.51, adjP=0.03) and with a lower age at manifestation of CS (adjDifference=-2.31; adjP=0.03). Thus, single nucleotide polymorphisms of redox-homeostasis genes may represent an important genetic component underlying the differentiation of cardio- and cerebrovascular phenotypes.
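
To illustrate how a log-additive model leads to per-allele odds ratios of the kind reported above, the following minimal sketch codes each genotype as its number of minor-allele copies (0, 1 or 2) and converts a logistic-regression coefficient and its standard error into an odds ratio with a Wald 95% confidence interval. The function names, genotype strings and coefficient values are hypothetical and are not taken from the study, which does not state the software it used.

```python
import math

def minor_allele_count(genotype, minor_allele):
    """Log-additive coding: number of minor-allele copies (0, 1 or 2)."""
    return sum(1 for allele in genotype if allele == minor_allele)

def odds_ratio_with_ci(beta, se, z=1.96):
    """Per-allele odds ratio and Wald interval from a logit coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical genotypes for one SNP whose minor allele is "G".
genotypes = ["AA", "AG", "GG", "AG"]
allele_counts = [minor_allele_count(g, "G") for g in genotypes]
print(allele_counts)

# Hypothetical coefficient and standard error, for illustration only.
or_, lower, upper = odds_ratio_with_ci(beta=0.2, se=0.1)
print(f"OR={or_:.2f}, 95% CI={lower:.2f}-{upper:.2f}")
```

Under this coding, each additional copy of the minor allele multiplies the odds of disease by the same factor, which is what the term "log-additive" refers to.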


2019 ◽  
Vol 5 ◽  
pp. 237802311982588 ◽  
Author(s):  
Nicole Bohme Carnegie ◽  
James Wu

Our goal for the Fragile Families Challenge was to develop a hands-off approach that could be applied in many settings to identify relationships that theory-based models might miss. Data processing was our first and most time-consuming task, particularly handling missing values. Our second task was to reduce the number of variables for modeling, and we compared several techniques for variable selection: the least absolute shrinkage and selection operator (LASSO), regression with a horseshoe prior, Bayesian generalized linear models, and Bayesian additive regression trees (BART). We found minimal differences in final performance based on the choice of variable selection method. We proceeded with BART for modeling because it requires minimal assumptions, permits great flexibility in fitting surfaces, and had previously performed well in black-box modeling competitions. In addition, BART allows for probabilistic statements about the predictions and other inferences, which is an advantage over most machine learning algorithms. A drawback to BART, however, is that it is often difficult to identify or characterize individual predictors that have strong influences on the outcome variable.


2020 ◽  
Vol 79 (Suppl 1) ◽  
pp. 1252.2-1253
Author(s):  
R. Garofoli ◽  
M. Resche-Rigon ◽  
M. Dougados ◽  
D. Van der Heijde ◽  
C. Roux ◽  
...  

Background: Axial spondyloarthritis (axSpA) is a chronic rheumatic disease that encompasses various clinical presentations: inflammatory chronic back pain, peripheral manifestations and extra-articular manifestations. The current nomenclature divides axSpA into radiographic (in the presence of radiographic sacroiliitis) and non-radiographic (in the absence of radiographic sacroiliitis, with or without MRI sacroiliitis) forms. Given that the functional burden of the disease appears to be greater in patients with radiographic forms, it seems crucial to be able to predict which patients will be more likely to develop structural damage over time. Predictive factors for radiographic progression in axSpA have been identified through use of traditional statistical models like logistic regression. However, these models present some limitations. In order to overcome these limitations and to improve the predictive performance, machine learning (ML) methods have been developed.

Objectives: To compare ML models to traditional models to predict radiographic progression in patients with early axSpA.

Methods: Study design: prospective French multicentric cohort study (DESIR cohort) with 5 years of follow-up. Patients: all patients included in the cohort, i.e. 708 patients with inflammatory back pain for >3 months but <3 years, highly suggestive of axSpA. Data on the first 5 years of follow-up were used. Statistical analyses: radiographic progression was defined as progression either at the spine (increase of at least 1 point per 2 years of mSASSS scores) or at the sacroiliac joint (worsening of at least one grade of the mNY score between 2 visits). Traditional modelling: we first performed a bivariate analysis between our outcome (radiographic progression) and explanatory variables at baseline to select the variables to be included in our models and then built a logistic regression model (M1). Variable selection for traditional models was performed with 2 different methods: stepwise selection based on the Akaike Information Criterion (stepAIC) method (M2), and the Least Absolute Shrinkage and Selection Operator (LASSO) method (M3). We also performed a sensitivity analysis on all patients with a manual backward method (M4) after multiple imputation of missing data. Machine learning modelling: using the "SuperLearner" package in R, we modelled radiographic progression with stepAIC, LASSO, random forest, Discrete Bayesian Additive Regression Trees Samplers (DBARTS), Generalized Additive Models (GAM), multivariate adaptive polynomial spline regression (polymars), Recursive Partitioning And Regression Trees (RPART) and Super Learner. Finally, the accuracy of traditional and ML models was compared based on their 10-fold cross-validated AUC (cv-AUC).

Results: 10-fold cv-AUC for traditional models were 0.79 and 0.78 for M2 and M3, respectively. The 3 best models among the ML algorithms were the GAM, the DBARTS and the Super Learner models, with 10-fold cv-AUC of 0.77, 0.76 and 0.74, respectively (Table 1).

Table 1. Comparison of 10-fold cross-validated AUC between best traditional and machine learning models.

Best models                                                          Cross-validated AUC
Traditional models
  M2 (stepAIC method)                                                0.79
  M3 (LASSO method)                                                  0.78
Machine learning approach
  SL Discrete Bayesian Additive Regression Trees Samplers (DBARTS)   0.76
  SL Generalized Additive Models (GAM)                               0.77
  Super Learner                                                      0.74

AUC: Area Under the Curve; AIC: Akaike Information Criterion; LASSO: Least Absolute Shrinkage and Selection Operator; SL: SuperLearner. N = 295.

Conclusion: Traditional models predicted radiographic progression better than ML models in this early axSpA population. Other ML algorithms, whether image-based or relying on other artificial intelligence methods (e.g. deep learning), might perform better than traditional models in this setting.

Acknowledgments: Thanks to the French National Society of Rheumatology and the DESIR cohort.

Disclosure of Interests: Romain Garofoli: None declared; Matthieu Resche-Rigon: None declared; Maxime Dougados: Grant/research support from: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Consultant of: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Speakers bureau: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma; Désirée van der Heijde: Consultant of: AbbVie, Amgen, Astellas, AstraZeneca, BMS, Boehringer Ingelheim, Celgene, Cyxone, Daiichi, Eisai, Eli-Lilly, Galapagos, Gilead Sciences, Inc., Glaxo-Smith-Kline, Janssen, Merck, Novartis, Pfizer, Regeneron, Roche, Sanofi, Takeda, UCB Pharma; Director of Imaging Rheumatology BV; Christian Roux: None declared; Anna Moltó: Grant/research support from: Pfizer, UCB, Consultant of: Abbvie, BMS, MSD, Novartis, Pfizer, UCB
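
For readers unfamiliar with the cross-validated AUC used to compare the models in this abstract, here is a minimal sketch of one common definition: the AUC is computed on each held-out fold from the model's out-of-fold predictions, and the fold values are averaged. The study itself used R; the Python function names and the two small folds of labels and scores below are hypothetical and serve only to show the arithmetic.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cross_validated_auc(folds):
    """Mean AUC over held-out folds; each fold is (labels, out-of-fold scores)."""
    return sum(auc(y, s) for y, s in folds) / len(folds)

# Hypothetical held-out predictions for two folds (a real analysis would use 10).
folds = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.6]),
]
print(cross_validated_auc(folds))
```

Because every prediction is made on data the model did not see during fitting, this estimate is less optimistic than an AUC computed on the training data.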


Author(s):  
Scott A. McDonald ◽  
Fuminari Miura ◽  
Eric R. A. Vos ◽  
Michiel van Boven ◽  
Hester E. de Melker ◽  
...  

Background: The proportion of SARS-CoV-2 positive persons who are asymptomatic, and whether this proportion is age-dependent, are still open research questions. Because an unknown proportion of reported symptoms among SARS-CoV-2 positives will be attributable to another infection or affliction, the observed, or 'crude' proportion without symptoms may underestimate the proportion of persons without symptoms that are caused by SARS-CoV-2 infection.

Methods: Based on two rounds of a large population-based serological study comprising test results on seropositivity and self-reported symptom history conducted in April/May and June/July 2020 in the Netherlands (n = 7517), we estimated the proportion of reported symptoms among those persons infected with SARS-CoV-2 that is attributable to this infection, where the set of relevant symptoms fulfills the ECDC case definition of COVID-19, using inferential methods for the attributable risk (AR). Generalised additive regression modelling was used to estimate the age-dependent relative risk (RR) of reported symptoms, and the AR and asymptomatic proportion (AP) were calculated from the fitted RR.

Results: Using age-aggregated data, the 'crude' AP was 37% but the model-estimated AP was 65% (95% CI 63–68%). The estimated AP varied with age, from 74% (95% CI 65–90%) for < 20 years, to 61% (95% CI 57–65%) for the 50–59 years age-group.

Conclusion: Whereas the 'crude' AP represents a lower bound for the proportion of persons infected with SARS-CoV-2 without COVID-19 symptoms, the AP as estimated via an attributable risk approach represents an upper bound. Age-specific AP estimates can inform the implementation of public health actions such as targeted virological testing and therefore enhance containment strategies.
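
The relationship between relative risk, attributable risk and asymptomatic proportion can be sketched as follows. The attributable risk among the exposed, AR = (RR - 1) / RR, is the standard epidemiological formula. The step from AR to AP, taken here as one minus the symptomatic fraction that is attributable to infection, is an assumption made for illustration and may differ from the authors' exact calculation. The RR value of 2.0 is hypothetical; only the 63% symptomatic fraction follows from the crude AP of 37% given in the abstract.

```python
def attributable_risk(rr):
    """Attributable risk among the exposed: share of symptoms due to infection."""
    return (rr - 1.0) / rr

def asymptomatic_proportion(p_symptomatic, rr):
    """Assumed form: AP = 1 - (fraction reporting symptoms) * AR."""
    return 1.0 - p_symptomatic * attributable_risk(rr)

crude_ap = 0.37                    # crude AP from the abstract
p_symptomatic = 1.0 - crude_ap     # 63% of seropositives reported symptoms
hypothetical_rr = 2.0              # illustrative value, not a study estimate

ap = asymptomatic_proportion(p_symptomatic, hypothetical_rr)
print(f"AR={attributable_risk(hypothetical_rr):.2f}, AP={ap:.3f}")
```

With any RR above 1, the AP computed this way exceeds the crude AP, which matches the abstract's description of the crude value as a lower bound.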


2020 ◽  
Vol 71 (Supplement_3) ◽  
pp. S266-S275
Author(s):  
Caitlin Hemlock ◽  
Stephen P Luby ◽  
Shampa Saha ◽  
Farah Qamar ◽  
Jason R Andrews ◽  
...  

Background: Blood culture is the current standard for diagnosing bacteremic illnesses, yet it is not clear how physicians in many low- and middle-income countries utilize blood culture for diagnostic purposes and to inform treatment decisions.

Methods: We screened suspected enteric fever cases from 6 hospitals in Bangladesh, Nepal, and Pakistan, and enrolled patients if blood culture was prescribed by the treating physician. We used generalized additive regression models to analyze the probability of receiving blood culture by age, and linear regression models to analyze changes by month to the proportion of febrile cases prescribed a blood culture compared with the burden of febrile illness, stratified by hospital. We used logistic regression to analyze predictors for receiving antibiotics empirically. We descriptively reviewed changes in antibiotic therapy by susceptibility patterns and coverage, stratified by country.

Results: We screened 30 809 outpatients resulting in 1819 enteric fever cases; 1935 additional cases were enrolled from other hospital locations. Younger outpatients were less likely to receive a blood culture. The association between the number of febrile outpatients and the proportion prescribed blood culture varied by hospital. Antibiotics prescribed empirically were associated with severity and provisional diagnoses, but 31% (1147/3754) of enteric fever cases were not covered by initial therapy; this was highest in Pakistan (50%) as many isolates were resistant to cephalosporins, which were commonly prescribed empirically.

Conclusions: Understanding hospital-level communication between laboratories and physicians may improve patient care and timeliness of appropriate antibiotics, which is important considering the rise of antimicrobial resistance.


Author(s):  
Alexander Razen ◽  
Wolfgang Brunauer ◽  
Nadja Klein ◽  
Stefan Lang ◽  
Nikolaus Umlauf

2019 ◽  
Vol 116 (10) ◽  
pp. 4156-4165 ◽  
Author(s):  
Sören R. Künzel ◽  
Jasjeet S. Sekhon ◽  
Peter J. Bickel ◽  
Bin Yu

There is growing interest in estimating and analyzing heterogeneous treatment effects in experimental and observational studies. We describe a number of metaalgorithms that can take advantage of any supervised learning or regression method in machine learning and statistics to estimate the conditional average treatment effect (CATE) function. Metaalgorithms build on base algorithms—such as random forests (RFs), Bayesian additive regression trees (BARTs), or neural networks—to estimate the CATE, a function that the base algorithms are not designed to estimate directly. We introduce a metaalgorithm, the X-learner, that is provably efficient when the number of units in one treatment group is much larger than in the other and can exploit structural properties of the CATE function. For example, if the CATE function is linear and the response functions in treatment and control are Lipschitz-continuous, the X-learner can still achieve the parametric rate under regularity conditions. We then introduce versions of the X-learner that use RF and BART as base learners. In extensive simulation studies, the X-learner performs favorably, although none of the metalearners is uniformly the best. In two persuasion field experiments from political science, we demonstrate how our X-learner can be used to target treatment regimes and to shed light on underlying mechanisms. A software package is provided that implements our methods.
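
The abstract does not spell out the X-learner's steps, so the following is a minimal sketch of its usual three-stage form under simplifying assumptions: (1) fit separate outcome models for control and treated units, (2) impute individual treatment effects by comparing each unit's observed outcome with the other group's model and regress those imputed effects on the covariate, (3) combine the two resulting CATE estimates with a weight. Here the base learner is a one-covariate least-squares line rather than RF or BART, and the weight is the constant treated fraction instead of an estimated propensity score; the function names and the noiseless toy data are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns a prediction function."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x

def x_learner(x, y, w):
    """x: covariate, y: outcome, w: 1 if treated else 0. Returns a CATE function."""
    x0 = [xi for xi, wi in zip(x, w) if wi == 0]
    y0 = [yi for yi, wi in zip(y, w) if wi == 0]
    x1 = [xi for xi, wi in zip(x, w) if wi == 1]
    y1 = [yi for yi, wi in zip(y, w) if wi == 1]
    # Stage 1: outcome models for each group.
    mu0, mu1 = fit_line(x0, y0), fit_line(x1, y1)
    # Stage 2: imputed effects, then a CATE model per group.
    d1 = [yi - mu0(xi) for xi, yi in zip(x1, y1)]
    d0 = [mu1(xi) - yi for xi, yi in zip(x0, y0)]
    tau1, tau0 = fit_line(x1, d1), fit_line(x0, d0)
    # Stage 3: weighted combination (constant weight is a simplification).
    g = len(x1) / len(x)
    return lambda q: g * tau0(q) + (1 - g) * tau1(q)

# Toy data: 10 controls, 4 treated; the true effect is 3 + x.
x = list(range(10)) + [0, 3, 6, 9]
w = [0] * 10 + [1] * 4
y = [1 + 2 * xi + (3 + xi) * wi for xi, wi in zip(x, w)]
cate = x_learner(x, y, w)
print(cate(2))
```

The toy data use far more control than treated units, the unbalanced setting in which the abstract says the X-learner is particularly efficient. The treated-group CATE model borrows strength from the well-estimated control outcome model, and with the constant weight used here that model receives most of the weight in the final combination.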

