A L1 Regularized Logistic Regression Model for Highdimensional Questionnaire Data Analysis

2021 ◽  
Vol 2078 (1) ◽  
pp. 012052
Author(s):  
Jiasheng Wang

Abstract The L1 regularization method, or Lasso, is a technique for feature selection in high-dimensional statistical analysis. It shrinks the model coefficients by using the sum of their absolute values as a penalty term. Adding an L1 penalty to the log-likelihood function of the logistic model yields a variable screening method based on logistic regression. The process of variable selection via Lasso is illustrated in Figure 1. The purpose of the experiment is to identify the important factors that influence interviewees' subjective well-being using L1-regularized logistic regression. Experiments were performed on CGSS 2017 data, and important features were successfully selected using the L1 regularization method.
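The idea of adding an L1 penalty to the logistic log-likelihood can be sketched in a few lines. The example below is an illustration only, not the paper's implementation: it assumes proximal gradient descent (soft-thresholding) as the solver, uses synthetic data rather than CGSS 2017, and the names `fit_l1_logistic`, `lam`, and `step` are hypothetical.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink a value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, lam=0.1, step=0.1, n_iter=500):
    """Minimize mean negative log-likelihood + lam * ||w||_1.

    The intercept b is left unpenalized (an assumption of this sketch).
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        # Gradient step on the smooth loss, then soft-threshold for the L1 term.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


# Synthetic data (assumption): only feature 0 influences the outcome.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [1 if random.random() < sigmoid(2.0 * row[0]) else 0 for row in X]

w, b = fit_l1_logistic(X, y, lam=0.1)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print("coefficients:", w)
print("selected feature indices:", selected)
```

Coefficients that the penalty drives exactly to zero drop out of `selected`, which is the variable screening effect the abstract describes. Larger `lam` values remove more features.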

2020 ◽  
Vol 30 (Supplement_5) ◽  
Author(s):  
J Matos ◽  
C Matias Dias ◽  
A Félix

Abstract Background Studies on the impact of multimorbidity on absence from work indicate that the number and type of chronic diseases may increase absenteeism and that the risk of absence from work is higher in people with two or more chronic diseases. This study analyzed the association between multimorbidity and greater frequency and duration of work absence in the Portuguese population between the ages of 25 and 65 during 2015. Methods This is an epidemiological, observational, cross-sectional study with an analytical component, using data from the 1st National Health Examination Survey. Univariate, bivariate, and multivariate analyses were performed on the study variables, and a multivariate logistic regression model was constructed. Results The prevalence of absenteeism was 55.1%. Education was associated with absence from work (p = 0.0157), as was professional activity (p = 0.0086). No association could be verified between absence from work and the presence of chronic diseases (p = 0.9358) or of multimorbidity (p = 0.4309). The prevalence of multimorbidity was 31.8%. Multimorbidity was associated with age (p < 0.0001), education (p < 0.001), and income (p = 0.0009). The number of days of absence from work did not increase with the number of chronic diseases. In the optimized logistic regression model, the only variables associated with absence from work were age (p = 0.0391) and education (p = 0.0089). Conclusions The scientific evidence generated will contribute to the current discussion on the need for the health and social security system to develop policies for patients with multimorbidity. Key messages The prevalence of absenteeism and multimorbidity in Portugal was 55.1% and 31.8%, respectively. In the optimized model, age and education were associated with absence from work.


2021 ◽  
Vol 11 (14) ◽  
pp. 6594
Author(s):  
Yu-Chia Hsu

The interdisciplinary nature of sports and the presence of various systemic and non-systemic factors introduce challenges in predicting sports match outcomes using a single-discipline approach. In contrast to previous studies that use sports performance metrics and statistical models, this study is the first to apply a deep learning approach used in financial time series modeling to predict sports match outcomes. The proposed approach has two main components: a convolutional neural network (CNN) classifier for implicit pattern recognition and a logistic regression model for match outcome judgment. First, the raw data used in the prediction are derived from the betting market odds and actual scores of each game, which are transformed into sports candlesticks. Second, the CNN is used to classify the candlestick time series on a graphical basis. To this end, the original 1D time series are encoded into 2D matrix images using a Gramian angular field and are then fed into the CNN classifier. In this way, the winning probability of each team in a matchup can be derived based on historically implied behavioral patterns. Third, to further account for the differences between strong and weak teams, the winning probability from the CNN classifier is adjusted using the logistic regression model, which then makes a final judgment regarding the match outcome. We empirically test this approach using data from 18,944 National Football League games spanning 32 years and find that using the individual historical data of each team in the CNN classifier for pattern recognition is better than using the data of all teams. The CNN in conjunction with the logistic regression judgment model outperforms the CNN in conjunction with SVM, Naïve Bayes, Adaboost, J48, and random forest, and its accuracy surpasses that of betting market prediction.
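The encoding of a 1D time series into a 2D image can be illustrated with a minimal sketch. It assumes the summation variant of the Gramian angular field (the abstract does not say whether the summation or difference form was used) and a toy input series. The function name is hypothetical.

```python
import math


def gramian_angular_field(series):
    """Encode a 1D series as a 2D Gramian angular summation field.

    Steps: rescale to [-1, 1], map each value to an angle with arccos,
    then fill the matrix with cos(angle_i + angle_j).
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Min-max rescaling to [-1, 1] so that arccos is defined.
    scaled = [(2.0 * x - hi - lo) / (hi - lo) for x in series]
    # Clamp to guard against floating-point rounding outside [-1, 1].
    angles = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    return [[math.cos(a + b) for b in angles] for a in angles]


# Toy series (assumption); a real input would be a candlestick-derived series.
image = gramian_angular_field([1.0, 2.0, 4.0, 3.0])
for row in image:
    print([round(v, 3) for v in row])
```

The result is a symmetric n × n matrix with values in [-1, 1], which can be treated as a single-channel image and passed to a CNN.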


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Hang-Yu Chen ◽  
Wei-Long Zhang ◽  
Lei Zhang ◽  
Ping Yang ◽  
Fang Li ◽  
...  

Abstract Background Although R-CHOP (rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone) remains the standard chemotherapy regimen for diffuse large B cell lymphoma (DLBCL) patients, not all patients respond to this regimen, and there is no effective method to predict treatment response. Methods We utilized 5hmC-Seal to generate genome-wide 5hmC profiles in plasma cell-free DNA (cfDNA) from 86 DLBCL patients before they received R-CHOP chemotherapy. To investigate the correlation between 5hmC modifications and curative effectiveness, we separated patients into training (n = 56) and validation (n = 30) cohorts and developed a 5hmC-based logistic regression model from the training cohort to predict the treatment response in the validation cohort. Results In this study, we identified thirteen 5hmC markers associated with treatment response. The logistic regression model achieved 0.82 sensitivity and 0.75 specificity (AUC = 0.78), outperforming existing clinical indicators such as LDH and stage. Conclusions Our findings suggest that the 5hmC modifications in cfDNA before R-CHOP treatment are associated with treatment response and that 5hmC-Seal may potentially serve as a clinically applicable, minimally invasive approach to predict R-CHOP treatment response for DLBCL patients.
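For readers unfamiliar with the reported metrics, the following sketch shows how sensitivity, specificity, and AUC are computed from a classifier's outputs on a validation set. The labels and scores are made-up illustrative values, not the study's data, and the helper names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative values only (assumption): 1 = responder, 0 = non-responder.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(sens, spec, area)
```

Sensitivity and specificity depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds.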


Author(s):  
Yusuke Katayama ◽  
Tetsuhisa Kitamura ◽  
Kosuke Kiyohara ◽  
Kenichiro Ishida ◽  
Tomoya Hirose ◽  
...  

Abstract Purpose The aim of this study was to assess the effect of fluid administration by emergency life-saving technicians (ELST) on the prognosis of traffic accident patients by using a propensity score (PS)-matching method. Methods The study included traffic accident patients registered in the JTDB database from January 2016 to December 2017. The main outcome was hospital mortality, and the secondary outcome was cardiopulmonary arrest on hospital arrival (CPAOA). To reduce potential confounding effects in the comparisons between the two groups, we estimated a PS by fitting a logistic regression model that was adjusted for 17 variables before the implementation of fluid administration by ELST at the scene. Results During the study period, 10,908 traffic accident patients were registered in the JTDB database, and we included 3502 patients in this study. Of these patients, 142 were administered fluid by ELST and 3360 were not. After PS matching, 141 patients were selected from each group. In the PS-matched model, fluid administration by ELST at the scene was not associated with discharge to death (crude OR: 0.859 [95% CI, 0.500–1.475]; p = 0.582). However, the fluid group showed a statistically significantly better outcome for CPAOA than the no-fluid group in the multiple logistic regression model (adjusted OR: 0.231 [95% CI, 0.055–0.967]; p = 0.045). Conclusion In this study, fluid administration to traffic accident patients by ELST was associated not with hospital mortality but with a lower proportion of CPAOA.
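The matching step can be sketched as follows. The abstract does not state which matching algorithm or caliper was used, so this example assumes greedy 1:1 nearest-neighbour matching without replacement and a hypothetical caliper of 0.05. The propensity scores are invented values standing in for the output of the fitted logistic regression model.

```python
def match_on_propensity(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, control: dicts mapping a unit id to its propensity score.
    A treated unit stays unmatched if no remaining control lies within
    the caliper.
    """
    available = dict(control)
    pairs = []
    for t_id, t_ps in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute propensity score distance.
        c_id = min(available, key=lambda cid: abs(available[cid] - t_ps))
        if abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Invented scores (assumption): "t" units received fluid, "c" units did not.
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
control = {"c1": 0.28, "c2": 0.33, "c3": 0.60, "c4": 0.10}

pairs = match_on_propensity(treated, control)
print(pairs)  # [('t1', 'c1'), ('t2', 'c3')]
```

Here `t3` has no control within the caliper and is dropped, which mirrors how a matched cohort can end up slightly smaller than the original treated group.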

