lasso regression
Recently Published Documents





Prince Nathan S

Abstract: The Travelling Salesman Problem (TSP) is a very popular problem in the world of computer programming. It concerns the optimization of algorithms in an ever-changing scenario, since the problem becomes more and more complex as the number of variables increases. Existing solutions are optimal only for a small, fixed number of cases. They cannot take into account the various factors that arise when the problem is solved for the real world, where conditions change continuously. There is a need to adapt to these changes and to find optimized solutions as the application runs. The ability to adapt to any kind of data, whether static or ever-changing, and to understand and solve it, is a quality shown by Machine Learning algorithms. As Machine Learning advances, a good amount of research has addressed how to solve NP-hard problems with it. This report is a survey of the types of Machine Learning algorithms that can be used to solve the TSP. Different approaches, such as Ant Colony Optimization and Q-learning, are explored and compared. Ant Colony Optimization uses the concept of ants following pheromone levels, which tell them where the most food is. It is widely used for the TSP, where the path with the most pheromone is chosen. Q-learning rewards an agent for taking the right action in its current state and accumulates those rewards. It relies heavily on exploitation, in which the agent keeps learning on its own to maximize its reward. This can be applied to the TSP: the agent is rewarded for a short path and rewarded more if the chosen path is the shortest. Keywords: LINEAR REGRESSION, LASSO REGRESSION, RIDGE REGRESSION, DECISION TREE REGRESSOR, MACHINE LEARNING, HYPERPARAMETER TUNING, DATA ANALYSIS
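The Q-learning idea described in this abstract can be illustrated with a minimal tabular sketch. It is not the survey's implementation: the distance matrix, the hyperparameters (`alpha`, `gamma`, `epsilon`, `episodes`) and all function names are assumptions chosen for illustration. The reward for each move is the negative distance travelled, so a shorter tour accumulates a larger total reward.

```python
import itertools
import random

# Hypothetical symmetric distance matrix for 5 cities (illustrative values only).
DIST = [
    [0, 2, 9, 10, 7],
    [2, 0, 6, 4, 3],
    [9, 6, 0, 8, 5],
    [10, 4, 8, 0, 6],
    [7, 3, 5, 6, 0],
]
N = len(DIST)


def tour_length(tour):
    """Length of a closed tour that returns to its first city."""
    return sum(DIST[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def brute_force_optimum():
    """Shortest closed tour length from city 0, found by enumeration."""
    return min(tour_length([0] + list(p))
               for p in itertools.permutations(range(1, N)))


def q_learning_tsp(episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning; a state is (current city, set of visited cities)."""
    rng = random.Random(seed)
    q = {}  # (city, visited, next_city) -> estimated return

    def options(visited):
        return [c for c in range(N) if c not in visited]

    def best_value(city, visited):
        # Value of the best follow-up move; 0.0 once every city is visited.
        return max((q.get((city, visited, c), 0.0) for c in options(visited)),
                   default=0.0)

    for _ in range(episodes):
        city, visited = 0, frozenset([0])
        while len(visited) < N:
            choices = options(visited)
            if rng.random() < epsilon:      # explore a random move
                nxt = rng.choice(choices)
            else:                           # exploit the current estimates
                nxt = max(choices, key=lambda c: q.get((city, visited, c), 0.0))
            new_visited = visited | {nxt}
            reward = -DIST[city][nxt]       # shorter step, larger reward
            if len(new_visited) == N:
                reward -= DIST[nxt][0]      # cost of closing the tour
            target = reward + gamma * best_value(nxt, new_visited)
            old = q.get((city, visited, nxt), 0.0)
            q[(city, visited, nxt)] = old + alpha * (target - old)
            city, visited = nxt, new_visited

    # Greedy rollout using the learned Q-values.
    tour, visited = [0], frozenset([0])
    while len(visited) < N:
        nxt = max(options(visited),
                  key=lambda c: q.get((tour[-1], visited, c), 0.0))
        tour.append(nxt)
        visited = visited | {nxt}
    return tour


tour = q_learning_tsp()
print(tour, tour_length(tour), brute_force_optimum())
```

Because the table is indexed by the set of visited cities, this formulation only scales to a handful of cities; it is meant to show the reward structure, not to compete with dedicated TSP solvers.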

2022 ◽  
Vol 11 ◽  
Yingyun Guo ◽  
Yuan Li ◽  
Jiao Li ◽  
Weiping Tao ◽  
Weiguo Dong

Low-grade gliomas (LGG) are heterogeneous, and the current predictive models for LGG are either unsatisfactory or not user-friendly. The objective of this study was to establish a nomogram based on methylation-driven genes, combined with clinicopathological parameters for predicting prognosis in LGG. Differential expression, methylation correlation, and survival analysis were performed in 516 LGG patients using RNA and methylation sequencing data, with accompanying clinicopathological parameters from The Cancer Genome Atlas. LASSO regression was further applied to select optimal prognosis-related genes. The final prognostic nomogram was implemented together with prognostic clinicopathological parameters. The predictive efficiency of the nomogram was internally validated in training and testing groups, and externally validated in the Chinese Glioma Genome Atlas database. Three DNA methylation-driven genes, ARL9, CMYA5, and STEAP3, were identified as independent prognostic factors. Together with IDH1 mutation status, age, and sex, the final prognostic nomogram achieved the highest AUC value of 0.930, and demonstrated stable consistency in both internal and external validations. The prognostic nomogram could predict personal survival probabilities for patients with LGG, and serve as a user-friendly tool for prognostic evaluation, optimizing therapeutic regimes, and managing LGG patients.

2022 ◽  
Vol 11 ◽  
Lingge Yang ◽  
Yuan Wu ◽  
Huan Xu ◽  
Jingnan Zhang ◽  
Xinjie Zheng ◽  

Objective: This study was conducted in order to establish a long non-coding RNA (lncRNA)-based model for predicting overall survival (OS) in patients with lung adenocarcinoma (LUAD). Methods: Original RNA-seq data of LUAD samples were extracted from The Cancer Genome Atlas (TCGA) database. Univariate Cox survival analysis was performed to select lncRNAs associated with OS. The least absolute shrinkage and selection operator (LASSO) regression analysis and multivariate Cox analysis were performed for building an OS-associated lncRNA prognostic model. Moreover, receiver operating characteristic (ROC) curves were generated to assess predictive values of the hub lncRNAs. Consequently, qRT-PCR was conducted to validate its prognostic value. The potential roles of these lncRNAs in immunotherapy and anti-angiogenic therapy were also investigated. Results: The lncRNA-associated risk score of OS (LARSO) was established based on the LASSO coefficient of six individual lncRNAs, including CTD-2124B20.2, CTD-2168K21.1, DEPDC1-AS1, RP1-290I10.3, RP11-454K7.3, and RP11-95M5.1. Kaplan–Meier analysis revealed that LUAD patients with higher LARSO values had a shorter OS. Furthermore, a new risk score (NRS), including LARSO, stage, and N stage, could better predict the prognosis of LUAD patients compared with LARSO alone. Evaluation of the prognostic model in our cohort demonstrated that patients with higher scores had a worse prognosis. In addition, correlation analysis between these six lncRNAs and immune checkpoints or anti-angiogenic targets suggested that LUAD patients with high LARSO might not be sensitive to immunotherapy or anti-angiogenic therapy. Conclusions: This robust six-lncRNA prognostic signature may be used as a novel and powerful prognostic biomarker for lung adenocarcinoma.
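As a rough illustration of why LASSO works as a selection step in signatures like this one, the sketch below implements LASSO by coordinate descent for an ordinary squared-error loss. This is an assumption made for brevity: the study itself combines LASSO with Cox survival models, which this sketch does not reproduce. The function names, the toy data and the penalty `lam` are all hypothetical. The point is that the soft-thresholding step sets weak coefficients exactly to zero, which is what "selects" features.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; return exactly 0.0 inside the threshold band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 (no intercept).

    X is a list of rows; columns are assumed to be centred already.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k]
                                      for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return beta


# Toy data (hypothetical): y depends strongly on feature 0, weakly on
# feature 1, and not at all on feature 2.
X = [
    [-1.0, -1.0, 1.0],
    [-1.0, 1.0, -1.0],
    [1.0, -1.0, -1.0],
    [1.0, 1.0, 1.0],
]
y = [2.0 * row[0] + 0.1 * row[1] for row in X]

beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(beta, selected)
```

With the penalty set to zero the same routine reduces to ordinary least squares, so the gap between the two fits shows how much shrinkage the penalty applies.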

2022 ◽  
Vol 22 (1) ◽  
Jihua Yang ◽  
XiaoHong Wei ◽  
Fang Hu ◽  
Wei Dong ◽  
Liao Sun

Background: Molecular markers play an important role in predicting clinical outcomes in pancreatic adenocarcinoma (PAAD) patients. Analysis of the ferroptosis-related genes may provide novel potential targets for the prognosis and treatment of PAAD. Methods: RNA-sequencing and clinical data of PAAD were downloaded from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) public databases. The PAAD samples were clustered by a non-negative matrix factorization (NMF) algorithm. Differentially expressed genes (DEGs) between subtypes were identified with the “limma_3.42.2” package. The R software package clusterProfiler was used for functional enrichment analysis. Then, multivariate Cox proportional hazards and LASSO regression were used to develop a ferroptosis-related gene signature for pancreatic adenocarcinoma. A nomogram and corrected curves were constructed. Finally, the expression and function of these signature genes were explored by qRT-PCR, immunohistochemistry (IHC), and proliferation, migration, and invasion assays. Results: The 173 samples were divided into 3 categories (C1, C2, and C3) and a 3-gene signature model (ALOX5, ALOX12, and CISD1) was constructed. The prognostic model showed good independent prognostic ability in PAAD. In the GSE62452 external validation set, the molecular model also showed good risk prediction. KM-curve analysis showed significant differences between the high- and low-risk groups; samples with a high risk score had a worse prognosis. The predictive efficiency of the 3-gene signature-based nomogram was significantly better than that of traditional clinical features. Compared with other models, our model, with a reasonable number of genes, yields a more effective result. The results obtained with qPCR and IHC assays showed that ALOX5 was highly expressed, whereas ALOX12 and CISD1 were expressed at low levels in tissue samples. Finally, functional assay results suggested that ALOX5 may be an oncogene and ALOX12 and CISD1 may be tumor suppressor genes. Conclusions: We present a novel prognostic molecular model for PAAD based on ferroptosis-related genes, which serves as a potentially effective tool for prognostic differentiation in pancreatic cancer patients.

2022 ◽  
Vol 20 (1) ◽  
Jianqiu Kong ◽  
Junjiong Zheng ◽  
Jieying Wu ◽  
Shaoxu Wu ◽  
Jinhua Cai ◽  

Background: Accurate preoperative diagnosis of pheochromocytoma (PHEO) impacts preoperative preparation and surgical outcome in PHEO patients. A highly reliable model to diagnose PHEO is lacking. We aimed to develop a magnetic resonance imaging (MRI)-based radiomic-clinical model to distinguish PHEO from other adrenal lesions. Methods: In total, 305 patients with 309 adrenal lesions were included and divided into different sets. The least absolute shrinkage and selection operator (LASSO) regression model was used for data dimension reduction, feature selection, and radiomics signature building. In addition, a nomogram incorporating the obtained radiomics signature and selected clinical predictors was developed by using multivariable logistic regression analysis. The performance of the radiomic-clinical model was assessed with respect to its discrimination, calibration, and clinical usefulness. Results: Seven radiomics features were selected among the 1301 features obtained as they could differentiate PHEOs from other adrenal lesions in the training (area under the curve [AUC], 0.887), internal validation (AUC, 0.880), and external validation cohorts (AUC, 0.807). Predictors contained in the individualized prediction nomogram included the radiomics signature and symptom number (symptoms include headache, palpitation, and diaphoresis). The training set yielded an AUC of 0.893 for the nomogram, which was confirmed in the internal and external validation sets with AUCs of 0.906 and 0.844, respectively. Decision curve analyses indicated the nomogram was clinically useful. In addition, 25 patients with 25 lesions were recruited for prospective validation, which yielded an AUC of 0.917 for the nomogram. Conclusion: We propose a radiomic-based nomogram incorporating clinically useful signatures as an easy-to-use, predictive and individualized tool for PHEO diagnosis.

2022 ◽  
Vol 11 ◽  
Shengsen Chen ◽  
Chao Wang ◽  
Yuwei Gu ◽  
Rongwei Ruan ◽  
Jiangping Yu ◽  

Background and Aims: As a key pathological factor, microvascular invasion (MVI), especially its M2 grade, greatly affects the prognosis of liver cancer patients. Accurate preoperative prediction of MVI and its M2 classification can help clinicians to make the best treatment decision. Therefore, we aimed to establish effective nomograms to predict MVI and its M2 grade. Methods: A total of 111 patients who underwent radical resection of hepatocellular carcinoma (HCC) from January 2015 to September 2020 were retrospectively collected. We utilized logistic regression and least absolute shrinkage and selection operator (LASSO) regression to identify the independent predictive factors of MVI and its M2 classification. Integrated discrimination improvement (IDI) and net reclassification improvement (NRI) were calculated to select the potential predictive factors from the results of LASSO and logistic regression. Nomograms for predicting MVI and its M2 grade were then developed by incorporating these factors. Area under the curve (AUC), calibration curve, and decision curve analysis (DCA) were respectively used to evaluate the efficacy, accuracy, and clinical utility of the nomograms. Results: Combining the results of LASSO regression, logistic regression, and IDI and NRI analyses, we found that clinical tumor-node-metastasis (TNM) stage, tumor size, Edmondson–Steiner classification, α-fetoprotein (AFP), tumor capsule, tumor margin, and tumor number were independent risk factors for MVI. Among the MVI-positive patients, only clinical TNM stage, tumor capsule, tumor margin, and tumor number were highly correlated with M2 grade. The nomograms established by incorporating the above variables performed well in predicting MVI (AUC = 0.926) and its M2 classification (AUC = 0.803). The calibration curve confirmed that predictions and actual observations were in good agreement. Significant clinical utility of our nomograms was demonstrated by DCA. Conclusions: The nomograms of this study make individualized predictions of MVI and its M2 classification possible, which may help us select an appropriate treatment plan.

2022 ◽  
Yu Lin ◽  
Zhenyu Wang ◽  
Gang Chen ◽  
Wenge Liu

Background: Spinal and pelvic osteosarcoma is a rare type of osteosarcoma, and distant metastasis is an important factor in the poor prognosis of this disease. There are no similar studies on prediction of distant metastasis of spinal and pelvic osteosarcoma. We aim to construct and validate a nomogram to predict the risk of distant metastasis of spinal and pelvic osteosarcoma. Methods: We retrospectively collected data on patients with spinal and pelvic osteosarcoma from the Surveillance, Epidemiology, and End Results (SEER) database. The Kaplan-Meier curve was used to compare survival time between patients with and without metastasis. All patients were randomly divided into a training cohort and a validation cohort. The risk factors for distant metastasis were identified via least absolute shrinkage and selection operator (LASSO) regression and multivariate logistic analysis. The nomogram we constructed was validated internally and externally by C-index, calibration curves, receiver operating characteristic (ROC) curve, and decision curve analysis (DCA). Results: The Kaplan-Meier curve showed that the survival time of non-metastatic patients was longer than that of metastatic patients (P < 0.001). All patients (n = 358) were divided into a training cohort (n = 269) and a validation cohort (n = 89). The LASSO regression selected five meaningful variables in the training cohort. The multivariate logistic regression analysis demonstrated that surgery (yes, OR = 0.175, 95% CI = 0.095-0.321, p = 0.000) was an independent predictor of distant metastasis in patients with spinal and pelvic osteosarcoma. The C-index and calibration curves showed good agreement between the predicted and actual results. The area under the receiver operating characteristic curve (AUC) values were 0.748 (95% CI = 0.687-0.817) and 0.758 (95% CI = 0.631-0.868) in the training and validation cohorts, respectively. The DCA showed that the nomogram has good clinical usefulness and net benefit. Conclusion: The absence of surgery is an independent risk factor for distant metastasis of spinal and pelvic osteosarcoma. Internal and external verification indicate that the nomogram we constructed to predict the probability of distant metastasis in patients with spinal and pelvic osteosarcoma is reliable and effective.

2022 ◽  
Vol 12 ◽  
Yanbing Hou ◽  
Lingyu Zhang ◽  
Qianqian Wei ◽  
Ruwei Ou ◽  
Jing Yang ◽  

Background: Idiopathic blepharospasm (BSP) is a common adult-onset focal dystonia. Neuroimaging technology can be used to visualize functional and microstructural changes of the whole brain. Method: We used resting-state functional MRI (rs-fMRI) and graph theoretical analysis to explore the functional connectome in patients with BSP. Altogether 20 patients with BSP and 20 age- and gender-matched healthy controls (HCs) were included in the study. Measures of network topology were calculated, such as small-world parameters (clustering coefficient [Cp], the shortest path length [Lp]), network efficiency parameters (global efficiency [Eglob], local efficiency [Eloc]), and the nodal parameter (nodal efficiency [Enod]). In addition, the least absolute shrinkage and selection operator (LASSO) regression was adopted to determine the most critical imaging features, and the classification model using critical imaging features was constructed. Results: Compared with HCs, the BSP group showed significantly decreased Eloc. Imaging features of nodal centrality (Enod) were entered into the LASSO method, and the classification model was constructed with nine imaging nodes. The area under the curve (AUC) was 0.995 (95% CI: 0.973–1.000), and the sensitivity and specificity were 95% and 100%, respectively. Specifically, four imaging nodes within the sensorimotor network (SMN), cerebellum, and default mode network (DMN) held the prominent information. Compared with HCs, the BSP group showed significantly increased Enod in the postcentral region within the SMN, decreased Enod in the precentral region within the SMN, increased Enod in the medial cerebellum, and increased Enod in the precuneus within the DMN. Conclusion: The network model in BSP showed reduced local connectivity. Baseline connectomic measures derived from rs-fMRI data may be capable of identifying patients with BSP, and regions from the SMN, cerebellum, and DMN may provide key insights into the underlying pathophysiology of BSP.
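The AUC, sensitivity and specificity figures quoted in abstracts like this one can be computed from a classifier's scores as sketched below. The scores, labels and cut-off are made-up illustrative values, not data from the study, and the function names are assumptions. AUC is computed here as the fraction of (positive, negative) pairs that the score ranks correctly, with ties counted as half.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold):
    """Classify score >= threshold as positive and return (sens, spec)."""
    tp = sum(1 for s, lab in zip(scores, labels) if lab == 1 and s >= threshold)
    fn = sum(1 for s, lab in zip(scores, labels) if lab == 1 and s < threshold)
    tn = sum(1 for s, lab in zip(scores, labels) if lab == 0 and s < threshold)
    fp = sum(1 for s, lab in zip(scores, labels) if lab == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical model outputs: 1 = patient, 0 = healthy control.
scores = [0.9, 0.8, 0.4, 0.35, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

print(auc(scores, labels))
print(sensitivity_specificity(scores, labels, threshold=0.5))
```

AUC does not depend on a cut-off, whereas sensitivity and specificity change with the chosen threshold, which is why abstracts usually report them as a pair at one operating point.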

2022 ◽  
Vol 12 ◽  
Mengjing Cui ◽  
Qianqian Xia ◽  
Xing Zhang ◽  
Wenjing Yan ◽  
Dan Meng ◽  

Ovarian cancer (OC), one of the most common malignancies of the female reproductive system, is characterized by high incidence and poor prognosis. Tumor mutation burden (TMB), as an important biomarker that can represent the degree of tumor mutation, is emerging as a key indicator for predicting the efficacy of tumor immunotherapy. In our study, the gene expression profiles of OC were downloaded from TCGA and GEO databases. Subsequently, we analyzed the prognostic value of TMB in OC and found that a higher TMB score was significantly associated with a better prognosis (p = 0.004). According to the median score of TMB, 9 key TMB related immune prognostic genes were selected by LASSO regression for constructing a TMB associated immune risk score (TMB-IRS) signature, which can effectively predict the prognosis of OC patients (HR = 2.32, 95% CI = 1.68–3.32; AUC = 0.754). Interestingly, TMB-IRS is also closely related to the level of immune cell infiltration and immune checkpoint molecules (PD1, PD-L1, CTLA4, PD-L2) in OC. Furthermore, the nomogram combined with TMB-IRS and a variety of clinicopathological features can more comprehensively evaluate the prognosis of patients. In conclusion, we explored the relationship between TMB and prognosis and validated the TMB-IRS signature based on TMB score in an independent database (HR = 1.60, 95% CI = 1.13–2.27; AUC = 0.639), which may serve as a novel biomarker for predicting OC prognosis as well as possible therapeutic targets.

Guangyu Chen ◽  
Junyu Long ◽  
Ruizhe Zhu ◽  
Gang Yang ◽  
Jiangdong Qiu ◽  

Background: Pancreatic cancer (PC) is a highly aggressive gastrointestinal tumor and has a poor prognosis. A valid method for evaluating prognosis is urgently needed for PC patients. In this study, we comprehensively utilized RNA-sequencing (RNA-seq) profiles and DNA methylation expression data to develop and validate a prognostic signature in patients with PC. Methods: The integrated analysis of RNA-seq, DNA methylation expression profiles, and relevant clinical information was performed to select four DNA methylation-driven genes. Then, a prognostic signature was established by the univariate, multivariate Cox, and least absolute shrinkage and selection operator (LASSO) regression analyses in The Cancer Genome Atlas (TCGA) dataset. The GSE62452 cohort was utilized for external validation. Finally, a nomogram model was set up and evaluated by calibration curves. Results: Nine DNA methylation-driven genes that were related to overall survival (OS) were identified. After multivariate Cox and LASSO regression analyses, four of these genes (RIC3, MBOAT2, SEZ6L, and OAS2) were selected to establish the predictive signature. The PC patients were stratified into two groups according to the median risk score; the low-risk group displayed a markedly more favorable OS than the high-risk group in both the training (p < 0.001) and validation (p < 0.01) cohorts. Then, the univariate and multivariate Cox regression analyses showed that age, grade, risk score, and the number of positive lymph nodes were significantly associated with OS in PC patients. Therefore, we used these clinical variables to construct a nomogram, and its performance in predicting the 1-, 2-, and 3-year OS of patients with PC was assessed via calibration curves. Conclusion: A prognostic risk score signature was built with the four selected DNA methylation-driven genes. Furthermore, in combination with the risk score, age, grade, and the number of positive lymph nodes, a nomogram was established for conveniently predicting the individualized prognosis of PC patients.
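The median-based stratification described in this abstract can be sketched as follows. The gene names come from the abstract, but the coefficients, the expression values and the patient identifiers are hypothetical placeholders, not results from the study. The risk score is assumed to be the usual coefficient-weighted sum of expression values.

```python
from statistics import median

# Placeholder coefficients (hypothetical; NOT the values fitted in the study).
COEFFICIENTS = {"RIC3": -0.3, "MBOAT2": 0.5, "SEZ6L": -0.2, "OAS2": 0.4}

# Hypothetical expression values for four made-up patients.
EXPRESSION = {
    "P1": {"RIC3": 1.0, "MBOAT2": 2.0, "SEZ6L": 1.0, "OAS2": 1.0},
    "P2": {"RIC3": 2.0, "MBOAT2": 0.0, "SEZ6L": 1.0, "OAS2": 0.0},
    "P3": {"RIC3": 3.0, "MBOAT2": 1.0, "SEZ6L": 0.0, "OAS2": 1.0},
    "P4": {"RIC3": 0.0, "MBOAT2": 1.0, "SEZ6L": 0.0, "OAS2": 2.0},
}


def risk_score(expression, coefficients):
    """Weighted sum of gene expression values, one weight per signature gene."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def stratify_by_median(scores):
    """Label each patient 'high' if above the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


scores = {pid: risk_score(expr, COEFFICIENTS) for pid, expr in EXPRESSION.items()}
groups = stratify_by_median(scores)
print(scores)
print(groups)
```

In a validation cohort, a common choice is to reuse the training-cohort cut-off rather than recompute the median; which of the two this study did is not stated in the abstract.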
