A Comparison Study on the Parameter Estimates and Prediction Performance of Penalized Regression Models for the Multilevel Data

2021 ◽  
Vol 34 (1) ◽  
pp. 183-205
Author(s):  
Hyewon Chung ◽  
Soyoung Park
2019 ◽  
Author(s):  
Josh Colston ◽  
Pablo Peñataro Yori ◽  
Lawrence H. Moulton ◽  
Maribel Paredes Olortegui ◽  
Peter S. Kosek ◽  
...  

Author(s):  
Jeremy Freese

This article presents a method and program for identifying poorly fitting observations for maximum-likelihood regression models for categorical dependent variables. After estimating a model, the program leastlikely will list the observations that have the lowest predicted probabilities of observing the value of the outcome category that was actually observed. For example, when run after estimating a binary logistic regression model, leastlikely will list the observations with a positive outcome that had the lowest predicted probabilities of a positive outcome and the observations with a negative outcome that had the lowest predicted probabilities of a negative outcome. These can be considered the observations in which the outcome is most surprising given the values of the independent variables and the parameter estimates and, like observations with large residuals in ordinary least squares regression, may warrant individual inspection. Use of the program is illustrated with examples using binary and ordered logistic regression.
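The ranking idea in this abstract can be sketched in a few lines of Python. This is only an illustration, not the leastlikely program itself: the function name `least_likely`, its arguments, and the toy probabilities are hypothetical, and the predicted probabilities are assumed to come from an already fitted model.

```python
from collections import defaultdict


def least_likely(probs, outcomes, n=5):
    """Return, per observed outcome category, the n observations whose
    predicted probability of that observed category is lowest.

    probs    -- list of rows; probs[i][k] is the predicted probability
                that observation i falls in category k (hypothetical input)
    outcomes -- list of observed category indices, one per observation
    """
    grouped = defaultdict(list)
    for i, (row, y) in enumerate(zip(probs, outcomes)):
        # Store (probability of the observed category, observation index).
        grouped[y].append((row[y], i))
    # Sorting ascending puts the most "surprising" observations first.
    return {y: sorted(pairs)[:n] for y, pairs in grouped.items()}


# Toy binary example: columns are P(outcome = 0) and P(outcome = 1).
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
outcomes = [1, 1, 0, 0]
result = least_likely(probs, outcomes, n=1)
print(result)
```

In the toy data, observation 0 has a positive outcome despite a predicted probability of only 0.1, so it is the first candidate for individual inspection among the positive outcomes.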


2013 ◽  
Vol 2013 ◽  
pp. 1-11 ◽  
Author(s):  
Yeuntyng Lai ◽  
Morihiro Hayashida ◽  
Tatsuya Akutsu

Because every disease has its own survival pattern, it is necessary to find a suitable model to simulate follow-ups. DNA microarray is a useful technique for detecting thousands of gene expressions at one time and is usually employed to classify different types of cancer. We propose methods that combine penalized regression models with nonnegative matrix factorization (NMF) for predicting survival. We tried L1- (lasso), L2- (ridge), and L1-L2 combined (elastic net) penalized regression on microarray data from diffuse large B-cell lymphoma (DLBCL) patients and found that the L1-L2 combined method predicts survival best, with the smallest log-rank P value. Furthermore, 80% of the selected genes have been reported to correlate with carcinogenesis or lymphoma. Through NMF we found that DLBCL patients can be clearly divided into 4 groups, which implies that DLBCL may have 4 subtypes with slightly different survival patterns. Next, we excluded patients whom NMF indicated as hard to classify and ran the three penalized regression models again. We found that survival prediction improved, with lower log-rank P values. Therefore, we conclude that after preselection of patients by NMF, penalized regression models can predict DLBCL patients' survival successfully.
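The three penalties named in this abstract differ only in how they weight the L1 and L2 norms of the coefficient vector. The sketch below assumes one common parameterization (a mixing weight `alpha` and an overall strength `lam`); the abstract does not state which parameterization the authors used, so the function name and arguments are hypothetical.

```python
def elastic_net_penalty(coefs, lam, alpha):
    """Penalty term lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2).

    alpha = 1 gives the lasso (L1) penalty, alpha = 0 the ridge (L2)
    penalty, and values in between the combined elastic net penalty.
    This parameterization is an assumption, not taken from the abstract.
    """
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)


coefs = [1.0, -2.0]
lasso = elastic_net_penalty(coefs, lam=1.0, alpha=1.0)
ridge = elastic_net_penalty(coefs, lam=1.0, alpha=0.0)
mixed = elastic_net_penalty(coefs, lam=1.0, alpha=0.5)
print(lasso, ridge, mixed)
```

In a penalized survival model, a term of this kind would be added to the negative log partial likelihood that the fitting routine minimizes.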


Author(s):  
Taylor Arnold ◽  
Michael Kane ◽  
Bryan W. Lewis

2007 ◽  
Vol 13 (2) ◽  
pp. 261-272 ◽  
Author(s):  
Helmut Küchenhoff ◽  
Ralf Bender ◽  
Ingo Langner

1997 ◽  
Vol 1 (1) ◽  
pp. 71-80 ◽  
Author(s):  
P. S. P. Cowpertwait ◽  
P. E. O'Connell

Abstract. A single-site Neyman-Scott Poisson cluster model of rainfall, with convective and stratiform cells, is fitted to data for 112 sites scattered throughout the UK using harmonic variables to account for seasonality. The model is regionalised by regressing the estimates of the harmonic variables on site dependent variables (e.g. altitude) to enable rainfall to be simulated at any ungauged site in the UK. An assessment of the residual errors indicates that the regression models can be used with reasonable confidence for urban sites. Furthermore, the regional variations of the model parameter estimates are found to be in agreement with meteorological knowledge and observation. Simulated 1 h extreme rainfalls are found to compare favourably with observed historical values, although some lack-of-fit is evident for higher aggregation levels.
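One way to picture harmonic variables that account for seasonality is a model parameter that varies over the year as a mean plus a cosine and a sine term. The abstract does not give the harmonic form used, so the first-order form below, the function name, and the example numbers are all assumptions made for illustration.

```python
import math


def harmonic_parameter(month, mean, cos_coef, sin_coef, period=12):
    """Seasonally varying parameter under an assumed first-order harmonic:

        value(month) = mean + cos_coef * cos(2*pi*month/period)
                            + sin_coef * sin(2*pi*month/period)

    mean, cos_coef and sin_coef play the role of the harmonic variables
    that would then be regressed on site variables such as altitude.
    """
    angle = 2.0 * math.pi * month / period
    return mean + cos_coef * math.cos(angle) + sin_coef * math.sin(angle)


# Hypothetical values for a single site and a single model parameter.
january = harmonic_parameter(0, mean=2.0, cos_coef=0.5, sin_coef=0.25)
april = harmonic_parameter(3, mean=2.0, cos_coef=0.5, sin_coef=0.25)
print(january, april)
```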


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Lorenzo Dall’Olio ◽  
Nico Curti ◽  
Daniel Remondini ◽  
Yosef Safi Harb ◽  
Folkert W. Asselbergs ◽  
...  

Abstract. Photoplethysmography (PPG) measured by smartphone has the potential for a large scale, non-invasive, and easy-to-use screening tool. Vascular aging is linked to increased arterial stiffness, which can be measured by PPG. We investigate the feasibility of using PPG to predict healthy vascular aging (HVA) based on two approaches: machine learning (ML) and deep learning (DL). We performed data preprocessing, including detrending, demodulating, and denoising on the raw PPG signals. For ML, ridge penalized regression has been applied to 38 features extracted from PPG, whereas for DL several convolutional neural networks (CNNs) have been applied to the whole PPG signals as input. The analysis has been conducted using the crowd-sourced Heart for Heart data. The prediction performance of ML using two features (AUC of 94.7%) – the a wave of the second derivative PPG and tpr, including four covariates, sex, height, weight, and smoking – was similar to that of the best performing CNN, 12-layer ResNet (AUC of 95.3%). Without having the heavy computational cost of DL, ML might be advantageous in finding potential biomarkers for HVA prediction. The whole workflow of the procedure is clearly described, and open software has been made available to facilitate replication of the results.
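The AUC figures quoted in this abstract measure how often a model scores a positive case above a negative one. A minimal sketch of that computation follows; it is not the authors' software, and the function name and toy scores are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy scores: three of the four positive/negative pairs are ordered correctly.
value = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
print(value)
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a rank-based formula would be the usual choice for large data sets.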

