elastic net
Recently Published Documents



2023 ◽  
Author(s):  
Jingyi Kenneth Tay ◽  
Nima Aghaeepour ◽  
Trevor Hastie ◽  
Robert Tibshirani

Author(s):  
Xavier Bry ◽  
Ndèye Niang ◽  
Thomas Verron ◽  
Stéphanie Bougeard

2022 ◽  
Vol 24 (1) ◽  
Author(s):  
K. S. ARAVIND ◽  
ANANTA VASHISTH ◽  
P. KRISHANAN ◽  
B. DAS

Wheat yield is largely determined by weather parameters. Models developed with multiple linear regression, neural network, and penalised regression techniques using weather data have the potential to provide reliable, timely, and cost-effective predictions of wheat yield. Wheat yield data and weather parameters during the crop growing period (46th to 15th SMW) for more than 30 years were collected for the study area, and models were developed using stepwise multiple linear regression (SMLR), principal component analysis (PCA) in combination with SMLR, artificial neural network (ANN) alone and in combination with PCA, least absolute shrinkage and selection operator (LASSO), and elastic net (ENET) techniques. The analysis used 70% of the data for calibration and the remaining data for validation. Among these models, LASSO and elastic net performed excellently, with nRMSE values below 10% at four of the five locations, and performed well at the remaining location, because penalization shrinks the regression coefficients and prevents overfitting.
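The evaluation described in this abstract (a 70% calibration split scored by nRMSE) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes nRMSE is the RMSE divided by the mean of the observed values, expressed in percent, and it assumes the commonly used rating bands (below 10% excellent, 10 to 20% good, 20 to 30% fair, otherwise poor). The abstract only states that values below 10% count as excellent. The function names and the ordered (unshuffled) split are hypothetical choices.

```python
import math


def nrmse_percent(observed, predicted):
    """RMSE divided by the mean observed value, in percent (assumed definition)."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and of equal length")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_observed


def rate_nrmse(value):
    """Map an nRMSE percentage to a rating (assumed bands, not taken from the abstract)."""
    if value < 10:
        return "excellent"
    if value < 20:
        return "good"
    if value < 30:
        return "fair"
    return "poor"


def split_calibration_validation(records, calibration_fraction=0.7):
    """Keep the first 70% of records for calibration and the rest for validation.

    Splitting in the given order (no shuffling) is an assumption of this sketch.
    """
    cut = int(round(len(records) * calibration_fraction))
    return records[:cut], records[cut:]


if __name__ == "__main__":
    years = list(range(1990, 2020))  # hypothetical 30 seasons
    calibration, validation = split_calibration_validation(years)
    print(len(calibration), len(validation))
    score = nrmse_percent([4.0, 5.0, 6.0], [4.5, 5.0, 5.5])
    print(round(score, 2), rate_nrmse(score))
```

Normalizing by the mean observed yield makes errors comparable across locations with different yield levels, which is presumably why the study reports nRMSE rather than raw RMSE.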


Author(s):  
Meghna Chakraborty ◽  
Md Shakir Mahmud ◽  
Timothy J. Gates ◽  
Subhrajit Sinha

Since the United States started grappling with the COVID-19 pandemic, with the highest number of confirmed cases and deaths in the world as of August 2020, most states have enforced travel restrictions resulting in drastic reductions in mobility and travel. However, the long-term implications of this crisis for mobility remain uncertain. To this end, this study proposes an analytical framework that determines the most significant factors affecting human mobility in the United States during the early days of the pandemic. In particular, the study uses least absolute shrinkage and selection operator (LASSO) regularization to identify the most significant variables influencing human mobility and uses linear regularization algorithms, including ridge, LASSO, and elastic net modeling techniques, to predict human mobility. State-level data were obtained from various sources from January 1, 2020 to June 13, 2020. The entire data set was divided into a training and a test data set, and the variables selected by LASSO were used to train models by the linear regularization algorithms, using the training data set. Finally, the prediction accuracy of the developed models was examined on the test data. The results indicate that several factors, including the number of new cases, social distancing, stay-at-home orders, domestic travel restrictions, mask-wearing policy, socioeconomic status, unemployment rate, transit mode share, percent of population working from home, and percent of older (60+ years) and African and Hispanic American populations, among others, significantly influence daily trips. Moreover, among all models, ridge regression performed best, with the lowest error, whereas both LASSO and elastic net performed better than the ordinary linear model.
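The three regularized models compared in this abstract differ only in their penalty, which a single coordinate-descent routine can illustrate. The sketch below is a minimal pure-Python version, not the study's implementation. It assumes the objective (1/(2n)) * sum of squared residuals + lam * (alpha * L1 norm + (1 - alpha)/2 * squared L2 norm), so that alpha = 0 gives ridge, alpha = 1 gives LASSO, and values in between give elastic net. The names `elastic_net_fit`, `lam`, and `alpha` are hypothetical.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become exactly 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_fit(X, y, lam, alpha, n_iter=200):
    """Cyclic coordinate descent for the assumed elastic net objective.

    X is a list of rows, y a list of responses. Returns (intercept, coefficients).
    """
    n, p = len(X), len(X[0])
    # Center the columns and the response so the intercept can be recovered at the end.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    resid = [yi - y_mean for yi in y]  # residuals when all coefficients are zero
    beta = [0.0] * p
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]

    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant column carries no information
            # Covariance of column j with the partial residual (column j's own fit added back).
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                beta[j] = new_bj

    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return intercept, beta


if __name__ == "__main__":
    X = [[1.0], [2.0], [3.0], [4.0]]
    y = [3.0, 5.0, 7.0, 9.0]  # exactly y = 1 + 2x
    print(elastic_net_fit(X, y, lam=0.0, alpha=0.5))   # no penalty: ordinary least squares
    print(elastic_net_fit(X, y, lam=0.5, alpha=1.0))   # LASSO: slope shrunk by soft-thresholding
    print(elastic_net_fit(X, y, lam=1.25, alpha=0.0))  # ridge: slope shrunk proportionally
```

The L1 part can set coefficients exactly to zero, which is what lets LASSO act as the variable selector in the study, while the L2 part only shrinks them.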


2022 ◽  
Vol 22 (1) ◽  
Author(s):  
Vahid Ebrahimi ◽  
Mehrdad Sharifi ◽  
Razieh Sadat Mousavi-Roknabadi ◽  
Robab Sadegh ◽  
Mohammad Hossein Khademian ◽  
...  

Background: Narrowing a large set of features to a smaller one can improve our understanding of the main risk factors for in-hospital mortality in patients with COVID-19. This study aimed to derive a parsimonious model for predicting overall survival (OS) among re-infected COVID-19 patients using machine-learning algorithms.
Methods: The retrospective data of 283 re-infected COVID-19 patients admitted to twenty-six medical centers (affiliated with Shiraz University of Medical Sciences) from 10 June to 26 December 2020 were reviewed and analyzed. An elastic-net regularized Cox proportional hazards (PH) regression and model approximation via backward elimination were utilized to optimize a predictive model of time to in-hospital death. The model was further reduced to its core features to maximize simplicity and generalizability.
Results: The empirical in-hospital mortality rate among the re-infected COVID-19 patients was 9.5%. In addition, the mortality rate among the intubated patients was 83.5%. Using the Kaplan-Meier approach, the OS (95% CI) rates for days 7, 14, and 21 were 87.5% (81.6-91.6%), 78.3% (65.0-87.0%), and 52.2% (20.3-76.7%), respectively. The elastic-net Cox PH regression retained 8 out of 35 candidate features of death. Transfer by Emergency Medical Services (EMS) (HR=3.90, 95% CI: 1.63-9.48), SpO2≤85% (HR=8.10, 95% CI: 2.97-22.00), increased serum creatinine (HR=1.85, 95% CI: 1.48-2.30), and increased white blood cells (WBC) count (HR=1.10, 95% CI: 1.03-1.15) were associated with higher in-hospital mortality rates in the re-infected COVID-19 patients.
Conclusion: The results of the machine-learning analysis demonstrated that transfer by EMS, profound hypoxemia (SpO2≤85%), increased serum creatinine (more than 1.6 mg/dL), and increased WBC count (more than 8.5 × 10⁹ cells/L) reduced the OS of the re-infected COVID-19 patients. We recommend that future machine-learning studies should further investigate these relationships and the associated factors in these patients for a better prediction of OS.
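For readers unfamiliar with the method named in this abstract, an elastic-net regularized Cox PH model is commonly written as a penalized partial likelihood. The form below is a sketch under assumptions: it uses the Breslow partial likelihood without ties and a glmnet-style scaling, neither of which the abstract specifies. The symbols are introduced here for illustration: δ_i is the event indicator, R(t_i) the risk set at time t_i, λ the penalty strength, and α the L1/L2 mixing weight.

```latex
% Cox partial log-likelihood (no ties), summed over observed deaths
\ell(\beta) = \sum_{i:\,\delta_i = 1} \Big[ x_i^\top \beta
    - \log \sum_{j \in R(t_i)} \exp\big(x_j^\top \beta\big) \Big]

% Elastic net estimate: penalized maximization of the partial likelihood
\hat\beta = \arg\max_{\beta} \; \frac{2}{n}\,\ell(\beta)
    - \lambda \Big[ \alpha \lVert \beta \rVert_1
    + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \Big]

% A retained coefficient maps to a hazard ratio
\mathrm{HR}_k = \exp\big(\hat\beta_k\big)
```

A feature is dropped when its coefficient is shrunk exactly to zero by the L1 term, which is the mechanism by which such a model can retain a subset (here 8 of 35) of the candidate features.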


2022 ◽  
Author(s):  
Dichen Quan ◽  
Jiahui Ren ◽  
Hao Ren ◽  
Liqin Linghu ◽  
Xuchun Wang ◽  
...  

This study aimed to construct Bayesian networks (BNs) to analyze the network relationships between influencing factors and COPD, and to explore the strength of their effects on COPD through network reasoning. Elastic net was used to screen variables in COPD monitoring data from residents of Shanxi Province, China, from 2014 to 2015, and the Max-Min Hill-Climbing (MMHC) hybrid algorithm was used to construct the BNs. After variable selection by elastic net, 10 variables closely related to COPD were retained. The BNs constructed by MMHC showed that smoking status, household air pollution, family history, cough, and air hunger or dyspnea were directly related to COPD, and gender was indirectly linked to COPD through smoking status. Moreover, smoking status, household air pollution, and family history were the parent nodes of COPD, while cough and air hunger or dyspnea were its child nodes. In other words, smoking status, household air pollution, and family history were related to the occurrence of COPD, and COPD would worsen patients’ cough and air hunger or dyspnea. Overall, BNs revealed the complex network relationships between COPD and its relevant factors well, making targeted prevention and control of COPD more convenient.


Genes ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 87
Author(s):  
Sean M. Burnard ◽  
Rodney A. Lea ◽  
Miles Benton ◽  
David Eccles ◽  
Daniel W. Kennedy ◽  
...  

Conventional genome-wide association studies (GWASs) of complex traits, such as Multiple Sclerosis (MS), are reliant on per-SNP p-values and are therefore heavily burdened by multiple testing correction. Thus, in order to detect more subtle alterations, ever increasing sample sizes are required, while ignoring potentially valuable information that is readily available in existing datasets. To overcome this, we used penalised regression incorporating elastic net with a stability selection method by iterative subsampling to detect the potential interaction of loci with MS risk. Through re-analysis of the ANZgene dataset (1617 cases and 1988 controls) and an IMSGC dataset as a replication cohort (1313 cases and 1458 controls), we identified new association signals for MS predisposition, including SNPs above and below conventional significance thresholds while targeting two natural killer receptor loci and the well-established HLA loci. For example, rs2844482 (98.1% iterations), otherwise ignored by conventional statistics (p = 0.673) in the same dataset, was independently strongly associated with MS in another GWAS that required more than 40 times the number of cases (~45 K). Further comparison of our hits to those present in a large-scale meta-analysis, confirmed that the majority of SNPs identified by the elastic net model reached conventional statistical GWAS thresholds (p < 5 × 10⁻⁸) in this much larger dataset. Moreover, we found that gene variants involved in oxidative stress, in addition to innate immunity, were associated with MS. Overall, this study highlights the benefit of using more advanced statistical methods to (re-)analyse subtle genetic variation among loci that have a biological basis for their contribution to disease risk.
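The "stability selection by iterative subsampling" mentioned in this abstract can be sketched as a loop that refits a selector on random subsamples and records how often each feature is kept. A figure such as "98.1% iterations" is a selection frequency of this kind. The code below is an illustration under assumptions, not the authors' pipeline: the subsample fraction, iteration count, and function names are hypothetical, and `correlation_selector` is a deliberately simple stand-in for an elastic net fit so that the example stays self-contained.

```python
import random


def selection_frequencies(X, y, select, n_iter=100, fraction=0.5, seed=0):
    """Fraction of random subsamples in which each feature index is selected.

    `select(X_sub, y_sub)` must return an iterable of selected column indices.
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    size = max(1, int(n * fraction))
    counts = [0] * p
    for _ in range(n_iter):
        idx = rng.sample(range(n), size)  # subsample rows without replacement
        chosen = select([X[i] for i in idx], [y[i] for i in idx])
        for j in set(chosen):
            counts[j] += 1
    return [c / n_iter for c in counts]


def correlation_selector(threshold=0.5):
    """Hypothetical stand-in selector (NOT elastic net): keep features whose
    absolute Pearson correlation with y exceeds `threshold`."""
    def select(X, y):
        n, p = len(X), len(X[0])
        y_mean = sum(y) / n
        sy = sum((v - y_mean) ** 2 for v in y) ** 0.5
        keep = []
        for j in range(p):
            col = [row[j] for row in X]
            m = sum(col) / n
            sx = sum((v - m) ** 2 for v in col) ** 0.5
            if sx == 0.0 or sy == 0.0:
                continue  # correlation is undefined for a constant column
            r = sum((a - m) * (b - y_mean) for a, b in zip(col, y)) / (sx * sy)
            if abs(r) > threshold:
                keep.append(j)
        return keep
    return select


if __name__ == "__main__":
    X = [[float(i), 1.0] for i in range(20)]  # column 0 informative, column 1 constant
    y = [2.0 * i for i in range(20)]
    print(selection_frequencies(X, y, correlation_selector()))
```

In practice the selector would be a penalized regression fit, and features would be reported when their frequency exceeds a chosen cutoff. Because the frequency is computed across many subsamples, a variant can score highly even when its single-dataset p-value is unremarkable.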


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Vahid Ebrahimi ◽  
Zahra Bagheri ◽  
Zahra Shayan ◽  
Peyman Jafari

Assessing differential item functioning (DIF) using the ordinal logistic regression (OLR) model highly depends on the asymptotic sampling distribution of the maximum likelihood (ML) estimators. The ML estimation method, which is often used to estimate the parameters of the OLR model for DIF detection, may be substantially biased with small samples. This study proposes a new application of the elastic net regularized OLR model, as a special type of machine learning method, for assessing DIF between two groups with small samples. Accordingly, a simulation study was conducted to compare the power and type I error rates of the regularized and nonregularized OLR models in detecting DIF under various conditions, including moderate and severe magnitudes of DIF (DIF = 0.4 and 0.8), sample size (N), sample size ratio (R), scale length (I), and weighting parameter (w). The simulation results revealed that for I = 5 and regardless of R, the elastic net regularized OLR model with w = 0.1, as compared with the nonregularized OLR model, increased the power of detecting moderate uniform DIF (DIF = 0.4) by approximately 35% and 21% for N = 100 and 150, respectively. Moreover, for I = 10 and severe uniform DIF (DIF = 0.8), the average power of the elastic net regularized OLR model with 0.03 ≤ w ≤ 0.06, as compared with the nonregularized OLR model, increased by approximately 29.3% and 11.2% for N = 100 and 150, respectively. In these cases, the type I error rates of the regularized and nonregularized OLR models were below or close to the nominal level of 0.05. In general, this simulation study showed that the elastic net regularized OLR model outperformed the nonregularized OLR model, especially in groups with extremely small sample sizes. Furthermore, the present research provides a guideline and some recommendations for researchers who conduct DIF studies with small sample sizes.

