Multiple smoothing parameters selection in additive regression quantiles

2020 ◽  
pp. 1471082X2092980
Author(s):  
Vito M.R. Muggeo ◽  
Federico Torretta ◽  
Paul H. C. Eilers ◽  
Mariangela Sciandra ◽  
Massimo Attanasio

We propose an iterative algorithm to select the smoothing parameters in additive quantile regression, wherein the functional forms of the covariate effects are unspecified and expressed via B-spline bases with difference penalties on the spline coefficients. The proposed algorithm relies on viewing the penalized coefficients as random effects from the symmetric Laplace distribution, and it turns out to be very efficient and particularly attractive with multiple smooth terms. Through simulations we compare our proposal with some alternative approaches, including the traditional ones based on minimization of the Schwarz Information Criterion. A real-data analysis is presented to illustrate the method in practice.
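The objective behind such a model can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes an L1 penalty on differences of the spline coefficients (the form suggested by a Laplace random-effects view), and the names `check_loss`, `differences` and `penalized_objective` are hypothetical.

```python
def check_loss(residual, tau):
    """Quantile (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def differences(values, order=1):
    """Return the order-th differences of a coefficient sequence."""
    out = list(values)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def penalized_objective(y, fitted, coefs_per_term, lambdas, tau, order=2):
    """Check loss of the fit plus one L1 difference penalty per smooth term.

    Assumption: each smooth term j has its own smoothing parameter
    lambdas[j] and coefficient vector coefs_per_term[j].
    """
    fit = sum(check_loss(yi - fi, tau) for yi, fi in zip(y, fitted))
    penalty = sum(
        lam * sum(abs(d) for d in differences(coefs, order))
        for lam, coefs in zip(lambdas, coefs_per_term)
    )
    return fit + penalty
```

With several smooth terms, each entry of `lambdas` has to be chosen, which is why a grid search over an information criterion becomes expensive as the number of terms grows.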

2020 ◽  
Vol 79 (Suppl 1) ◽  
pp. 1252.2-1253
Author(s):  
R. Garofoli ◽  
M. Resche-Rigon ◽  
M. Dougados ◽  
D. Van der Heijde ◽  
C. Roux ◽  
...  

Background: Axial spondyloarthritis (axSpA) is a chronic rheumatic disease that encompasses various clinical presentations: inflammatory chronic back pain, peripheral manifestations and extra-articular manifestations. The current nomenclature divides axSpA into radiographic (in the presence of radiographic sacroiliitis) and non-radiographic (in the absence of radiographic sacroiliitis, with or without MRI sacroiliitis) forms. Given that the functional burden of the disease appears to be greater in patients with radiographic forms, it seems crucial to be able to predict which patients will be more likely to develop structural damage over time. Predictive factors for radiographic progression in axSpA have been identified through use of traditional statistical models like logistic regression. However, these models present some limitations. In order to overcome these limitations and to improve the predictive performance, machine learning (ML) methods have been developed.

Objectives: To compare ML models with traditional models for predicting radiographic progression in patients with early axSpA.

Methods: Study design: prospective French multicentric cohort study (DESIR cohort) with 5 years of follow-up. Patients: all patients included in the cohort, i.e. 708 patients with inflammatory back pain for >3 months but <3 years, highly suggestive of axSpA. Data on the first 5 years of follow-up were used. Statistical analyses: radiographic progression was defined as progression either at the spine (increase of at least 1 point per 2 years of mSASSS scores) or at the sacroiliac joint (worsening of at least one grade of the mNY score between 2 visits). Traditional modelling: we first performed a bivariate analysis between our outcome (radiographic progression) and explanatory variables at baseline to select the variables to be included in our models and then built a logistic regression model (M1). Variable selection for traditional models was performed with 2 different methods: stepwise selection based on the Akaike Information Criterion (stepAIC) (M2), and the Least Absolute Shrinkage and Selection Operator (LASSO) method (M3). We also performed a sensitivity analysis on all patients with a manual backward method (M4) after multiple imputation of missing data. Machine learning modelling: using the “SuperLearner” package in R, we modelled radiographic progression with stepAIC, LASSO, random forest, Discrete Bayesian Additive Regression Trees Samplers (DBARTS), Generalized Additive Models (GAM), multivariate adaptive polynomial spline regression (polymars), Recursive Partitioning And Regression Trees (RPART) and Super Learner. Finally, the accuracy of traditional and ML models was compared based on their 10-fold cross-validated AUC (cv-AUC).

Results: The 10-fold cv-AUC for traditional models was 0.79 and 0.78 for M2 and M3, respectively. The 3 best models in the ML approach were the GAM, the DBARTS and the Super Learner models, with 10-fold cv-AUC of 0.77, 0.76 and 0.74, respectively (Table 1).

Table 1. Comparison of 10-fold cross-validated AUC between best traditional and machine learning models.

Best models                                                          Cross-validated AUC
Traditional models
  M2 (stepAIC method)                                                0.79
  M3 (LASSO method)                                                  0.78
Machine learning approach
  SL Discrete Bayesian Additive Regression Trees Samplers (DBARTS)   0.76
  SL Generalized Additive Models (GAM)                               0.77
  Super Learner                                                      0.74

AUC: Area Under the Curve; AIC: Akaike Information Criterion; LASSO: Least Absolute Shrinkage and Selection Operator; SL: SuperLearner. N = 295.

Conclusion: Traditional models predicted radiographic progression better than ML models in this early axSpA population. Further ML algorithms, image-based or using other artificial intelligence methods (e.g. deep learning), might perform better than traditional models in this setting.

Acknowledgments: Thanks to the French National Society of Rheumatology and the DESIR cohort.

Disclosure of Interests: Romain Garofoli: None declared, Matthieu Resche-Rigon: None declared, Maxime Dougados Grant/research support from: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Consultant of: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Speakers bureau: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Désirée van der Heijde Consultant of: AbbVie, Amgen, Astellas, AstraZeneca, BMS, Boehringer Ingelheim, Celgene, Cyxone, Daiichi, Eisai, Eli-Lilly, Galapagos, Gilead Sciences, Inc., Glaxo-Smith-Kline, Janssen, Merck, Novartis, Pfizer, Regeneron, Roche, Sanofi, Takeda, UCB Pharma; Director of Imaging Rheumatology BV, Christian Roux: None declared, Anna Moltó Grant/research support from: Pfizer, UCB, Consultant of: Abbvie, BMS, MSD, Novartis, Pfizer, UCB
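A cross-validated AUC of the kind used for this comparison can be sketched as below. This is an illustrative stand-in, not the SuperLearner workflow from the study: it assumes the cv-AUC is the mean of the per-fold AUCs, that each fold supplies held-out labels (0/1) and predicted scores, and the function names are hypothetical.

```python
def auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for lab, s in zip(labels, scores) if lab == 1]
    negatives = [s for lab, s in zip(labels, scores) if lab == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def cross_validated_auc(folds):
    """Average the AUC over folds; each fold is (held_out_labels, scores)."""
    return sum(auc(labels, scores) for labels, scores in folds) / len(folds)
```

For a 10-fold comparison, `folds` would hold ten such pairs per model, with every model scored on the same held-out splits so that the resulting values are comparable.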


2017 ◽  
Vol 7 (1) ◽  
pp. 72 ◽  
Author(s):  
Lamya A Baharith

The truncated type I generalized logistic distribution has been used in a variety of applications. In this article, new bivariate truncated type I generalized logistic (BTTGL) distributional models derived from three different copula functions are introduced. Some of their properties are studied. Parametric and semiparametric methods are used to estimate the parameters of the BTTGL models. Maximum likelihood and inference functions for margins estimates of the BTTGL parameters are compared with semiparametric estimates using a real data set. Further, a comparison between BTTGL, bivariate generalized exponential and bivariate exponentiated Weibull models is conducted using the Akaike information criterion and the maximized log-likelihood. An extensive Monte Carlo simulation study is carried out for different values of the parameters and different sample sizes to compare the performance of parametric and semiparametric estimators based on relative mean square error.
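A model comparison by Akaike information criterion reduces to a few lines once each model's maximized log-likelihood and parameter count are known. The sketch below uses the standard definition AIC = 2k - 2 log L; the function names and any numbers passed to it are placeholders, not results from the article.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 * (maximized log-likelihood)."""
    return 2 * n_params - 2 * log_likelihood


def best_by_aic(models):
    """Return the name of the model with the smallest AIC.

    `models` maps a model name to (maximized log-likelihood, number of
    estimated parameters).
    """
    return min(models, key=lambda name: aic(*models[name]))
```

A model with a higher log-likelihood can still lose the comparison if it needs enough extra parameters to achieve it.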


2015 ◽  
Vol 2015 ◽  
pp. 1-23 ◽  
Author(s):  
Francesco Cartella ◽  
Jan Lemeire ◽  
Luca Dimiccoli ◽  
Hichem Sahli

Realistic predictive maintenance approaches are essential for condition monitoring and predictive maintenance of industrial machines. In this work, we propose Hidden Semi-Markov Models (HSMMs) with (i) no constraints on the state duration density function and (ii) applicability to continuous or discrete observations. To deal with this type of HSMM, we also propose modifications to the learning, inference, and prediction algorithms. Finally, automatic model selection has been made possible using the Akaike Information Criterion. This paper describes the theoretical formalization of the model as well as several experiments performed on simulated and real data with the aim of validating the methodology. In all performed experiments, the model is able to correctly estimate the current state and to effectively predict the time to a predefined event with a low overall average absolute error. As a consequence, its applicability to real-world settings can be beneficial, especially where the Remaining Useful Lifetime (RUL) of the machine is calculated in real time.
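What distinguishes a semi-Markov state process from an ordinary Markov chain is that each state is held for an explicitly drawn duration. The sketch below illustrates only that generative step, under stated assumptions: `transitions` holds between-state probabilities without self-transitions, `sample_duration` is any user-supplied duration sampler, and all names are hypothetical rather than taken from the paper.

```python
import random


def sample_state_path(transitions, sample_duration, start_state, length, rng=None):
    """Draw a hidden-state path of a semi-Markov chain.

    transitions: dict state -> dict next_state -> probability
    sample_duration: callable (state, rng) -> number of steps spent in state,
        so the duration density is left unconstrained
    """
    rng = rng or random.Random()
    path = []
    state = start_state
    while len(path) < length:
        duration = max(1, int(sample_duration(state, rng)))
        path.extend([state] * duration)
        next_states = list(transitions[state])
        weights = [transitions[state][s] for s in next_states]
        state = rng.choices(next_states, weights=weights, k=1)[0]
    return path[:length]
```

Observations, continuous or discrete, would then be emitted from a state-specific distribution at each step of the path.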


Geophysics ◽  
2009 ◽  
Vol 74 (4) ◽  
pp. J35-J48 ◽  
Author(s):  
Bernard Giroux ◽  
Abderrezak Bouchedda ◽  
Michel Chouteau

We introduce two new traveltime picking schemes developed specifically for crosshole ground-penetrating radar (GPR) applications. The main objective is to automate, at least partially, the traveltime picking procedure and to provide first-arrival times that are closer in quality to those of manual picking approaches. The first scheme is an adaptation of a method based on crosscorrelation of radar traces collated in gathers according to their associated transmitter-receiver angle. A detector is added to isolate the first cycle of the radar wave and to suppress secondary arrivals that might be mistaken for first arrivals. To improve the accuracy of the arrival times obtained from the crosscorrelation lags, a time-rescaling scheme is implemented to resize the radar wavelets to a common time-window length. The second method is based on the Akaike information criterion (AIC) and continuous wavelet transform (CWT). It is not tied to the restrictive criterion of waveform similarity that underlies crosscorrelation approaches, which is not guaranteed for traces sorted in common ray-angle gathers. It has the advantage of being fully automated. Performances of the new algorithms are tested with synthetic and real data. In all tests, the approach that adds first-cycle isolation to the original crosscorrelation scheme improves the results. In contrast, the time-rescaling approach brings limited benefits, except when strong dispersion is present in the data. In addition, the performance of crosscorrelation picking schemes degrades for data sets with disparate waveforms despite the high signal-to-noise ratio of the data. In general, the AIC-CWT approach is more versatile and performs well on all data sets. Only with data showing low signal-to-noise ratios is the AIC-CWT superseded by the modified crosscorrelation picker.
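The AIC half of such a picker can be illustrated with the widely used variance-based form, in which the trace is split at every candidate sample and the split that minimizes AIC(k) = k log var(x[0:k]) + (N - k - 1) log var(x[k:N]) is taken as the onset. This is a generic sketch of that idea only; it omits the wavelet-transform stage, and `aic_pick` and its `eps` guard are assumptions for illustration.

```python
import math


def variance(samples):
    """Population variance of a list of samples."""
    mean = sum(samples) / len(samples)
    return sum((v - mean) ** 2 for v in samples) / len(samples)


def aic_pick(trace, eps=1e-12):
    """Return the index that best separates noise from signal by AIC.

    Index k is the first sample of the second (signal) segment. `eps`
    avoids log(0) when a segment is exactly constant.
    """
    n = len(trace)
    best_k, best_aic = None, math.inf
    for k in range(2, n - 1):  # keep at least two samples on each side
        value = (k * math.log(variance(trace[:k]) + eps)
                 + (n - k - 1) * math.log(variance(trace[k:]) + eps))
        if value < best_aic:
            best_k, best_aic = k, value
    return best_k
```

Because the criterion compares the two segments' variances rather than waveform shapes, it does not rely on similarity between neighbouring traces.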


2022 ◽  
Vol 13 (1) ◽  
Author(s):  
Zachary R. McCaw ◽  
Thomas Colthurst ◽  
Taedong Yun ◽  
Nicholas A. Furlotte ◽  
Andrew Carroll ◽  
...  

Genome-wide association studies (GWASs) examine the association between genotype and phenotype while adjusting for a set of covariates. Although the covariates may have non-linear or interactive effects, GWASs often neglect such terms because of the challenge of specifying the model. Here we introduce DeepNull, a method that identifies and adjusts for non-linear and interactive covariate effects using a deep neural network. In analyses of simulated and real data, we demonstrate that DeepNull maintains tight control of the type I error while increasing statistical power by up to 20% in the presence of non-linear and interactive effects. Moreover, in the absence of such effects, DeepNull incurs no loss of power. When applied to 10 phenotypes from the UK Biobank (n = 370K), DeepNull discovered more hits (+6%) and loci (+7%), on average, than conventional association analyses, many of which are biologically plausible or have previously been reported. Finally, DeepNull improves upon linear modeling for phenotypic prediction (+23% on average).


2019 ◽  
Vol 13 (4) ◽  
pp. 317-328
Author(s):  
Johannes Bureick ◽  
Hamza Alkhatib ◽  
Ingo Neumann

B-spline curve approximation is a crucial task in many applications and disciplines. The most challenging part of B-spline curve approximation is the determination of a suitable knot vector. Finding a solution to this multimodal and multivariate continuous nonlinear optimization problem, known as the knot adjustment problem, becomes even more complicated when data gaps occur. In this paper we present a new approach, an elitist genetic algorithm, which solves the knot adjustment problem faster and more precisely than existing approaches. We demonstrate the performance of our elitist genetic algorithm by applying it to two challenging test functions and a real data set. We demonstrate that our algorithm is more efficient and more robust against data gaps than existing approaches.
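To see why the knot vector matters, it helps to evaluate the basis it induces. The sketch below uses the standard Cox-de Boor recursion; it is not the genetic algorithm from the paper, and the function names are chosen here for illustration.

```python
def bspline_basis(i, degree, knots, x):
    """Value of the i-th B-spline basis function of a given degree at x."""
    if degree == 0:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left = 0.0
    span = knots[i + degree] - knots[i]
    if span > 0:  # repeated knots give a zero-width span; skip that term
        left = (x - knots[i]) / span * bspline_basis(i, degree - 1, knots, x)
    right = 0.0
    span = knots[i + degree + 1] - knots[i + 1]
    if span > 0:
        right = ((knots[i + degree + 1] - x) / span
                 * bspline_basis(i + 1, degree - 1, knots, x))
    return left + right


def basis_row(knots, degree, x):
    """All basis function values at x: one row of the design matrix."""
    n_basis = len(knots) - degree - 1
    return [bspline_basis(i, degree, knots, x) for i in range(n_basis)]
```

Moving an interior knot changes every row of the design matrix whose `x` lies near it, so the least-squares fit depends on the knot positions in a nonlinear way. That dependence is what a knot adjustment search has to optimize over.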


2019 ◽  
Vol 08 (04) ◽  
pp. 1950014 ◽  
Author(s):  
Yunlong Wang ◽  
Changliang Zou ◽  
Zhaojun Wang ◽  
Guosheng Yin

Change-point detection is an integral component of statistical modeling and estimation. For high-dimensional data, classical methods based on the Mahalanobis distance are typically inapplicable. We propose a novel test statistic by combining a modified Euclidean distance and an extreme statistic, and its null distribution is asymptotically normal. The new method naturally strikes a balance between the detection abilities for both dense and sparse changes, which gives it an edge to potentially outperform existing methods. Furthermore, the number of change-points is determined by a new Schwarz’s information criterion together with a pre-screening procedure, and the locations of the change-points can be estimated via the dynamic programming algorithm in conjunction with the intrinsic order structure of the objective function. Under some mild conditions, we show that the new method provides consistent estimation with an almost optimal rate. Simulation studies show that the proposed method performs satisfactorily in identifying multiple change-points in terms of power and estimation accuracy, and two real data examples are used for illustration.
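The dynamic-programming step can be illustrated on a much simpler problem than the one treated in the paper: shifts in the mean of a univariate sequence, with a fixed penalty per change-point standing in for an information criterion. The cost function, the `penalty` argument and the function names below are all assumptions made for this sketch.

```python
import math


def optimal_partition(data, penalty):
    """Change-points minimizing within-segment squared error plus a penalty.

    Returns the sorted start indices of every segment after the first.
    """
    n = len(data)
    prefix, prefix_sq = [0.0], [0.0]
    for v in data:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    def segment_cost(start, end):
        """Sum of squared deviations from the mean of data[start:end]."""
        total = prefix[end] - prefix[start]
        return (prefix_sq[end] - prefix_sq[start]) - total * total / (end - start)

    best = [-penalty] + [math.inf] * n  # best[t]: optimal cost of data[:t]
    last = [0] * (n + 1)                # last[t]: start of the final segment
    for t in range(1, n + 1):
        for s in range(t):
            cost = best[s] + segment_cost(s, t) + penalty
            if cost < best[t]:
                best[t], last[t] = cost, s

    change_points = []
    t = n
    while last[t] > 0:  # walk back through the stored segment starts
        change_points.append(last[t])
        t = last[t]
    return sorted(change_points)
```

A Schwarz-type choice would let `penalty` grow with the logarithm of the sample size, so that spurious change-points become harder to justify as more data arrive.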


2017 ◽  
Vol 29 (2) ◽  
pp. 332-367 ◽  
Author(s):  
Takeru Matsuda ◽  
Fumiyasu Komaki

Many time series are naturally considered as a superposition of several oscillation components. For example, electroencephalogram (EEG) time series include oscillation components such as alpha, beta, and gamma. We propose a method for decomposing time series into such oscillation components using state-space models. Based on the concept of random frequency modulation, Gaussian linear state-space models for oscillation components are developed. In this model, the frequency of an oscillator fluctuates by noise. Time series decomposition is accomplished by this model in the same way as in the Bayesian seasonal adjustment method. Since the model parameters are estimated from data by the empirical Bayes method, the amplitudes and the frequencies of oscillation components are determined in a data-driven manner. Also, the appropriate number of oscillation components is determined with the Akaike information criterion (AIC). In this way, the proposed method provides a natural decomposition of the given time series into oscillation components. In neuroscience, the phase of neural time series plays an important role in neural information processing. The proposed method can be used to estimate the phase of each oscillation component and has several advantages over a conventional method based on the Hilbert transform. Thus, the proposed method enables an investigation of the phase dynamics of time series. Numerical results show that the proposed method succeeds in extracting intermittent oscillations like ripples and detecting the phase reset phenomena. We apply the proposed method to real data from various fields such as astronomy, ecology, tidology, and neuroscience.
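One way to write a linear Gaussian oscillator whose frequency fluctuates by noise is as a two-dimensional state that is rotated and slightly damped at each step, then perturbed. The sketch below assumes that form; the abstract does not spell out the state equation, so the parameterization (`damping`, `frequency`, `dt`) and the function names are illustrative only.

```python
import math


def oscillator_step(state, damping, frequency, dt, noise=(0.0, 0.0)):
    """Advance a 2-D oscillator state by one time step.

    The state is rotated by 2*pi*frequency*dt, scaled by `damping`, and
    shifted by `noise` (to be drawn from a Gaussian in a full model).
    """
    angle = 2.0 * math.pi * frequency * dt
    c, s = math.cos(angle), math.sin(angle)
    x, y = state
    return (damping * (c * x - s * y) + noise[0],
            damping * (s * x + c * y) + noise[1])


def phase(state):
    """Instantaneous phase of the oscillator, in radians."""
    return math.atan2(state[1], state[0])
```

In a decomposition of this kind, the observed series would be modelled as the sum of the first coordinates of several such oscillators plus observation noise. The phase of each component then follows directly from its estimated state, with no Hilbert transform needed.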


2017 ◽  
Vol 5 (1) ◽  
pp. 48-73
Author(s):  
Zhaoyuan Li ◽  
Maozai Tian

It is well known that the change-point problem is an important part of statistical model analysis. Most existing methods are not robust to the criteria used to evaluate change-point problems. In this article, we consider the “mean-shift” problem in change-point studies. A test based on a single quantile is proposed using the saddlepoint approximation method. In order to utilize the information at different quantiles of the sequence, we further construct a “composite quantile test” to calculate the probability of every location of the sequence being a change-point. The location of a change-point can be pinpointed rather than estimated within an interval. The proposed tests make no assumptions about the functional form of the sequence distribution and remain sensitive for both large and small samples, for change-points in the tails, and in multiple change-point situations. The good performance of the tests is confirmed by simulations and real data analysis. The saddlepoint-approximation-based distribution of the test statistic developed in the paper may be of independent interest to readers in this research area.

