Feature selection and validated predictive performance in the domain of Legionella pneumophila: a comparative study

2016 ◽  
Vol 9 (1) ◽  
Author(s):  
Tjeerd van der Ploeg ◽  
Ewout W. Steyerberg
2021 ◽  
pp. 153537022199201
Author(s):  
Runmin Li ◽  
Guosheng Wang ◽  
ZhouJie Wu ◽  
HuaGuang Lu ◽  
Gen Li ◽  
...  

High-throughput multi-omics sequencing data have laid a solid foundation for identifying genes associated with cancer prognosis. Multi-omics analysis can reveal the mechanisms of cancer occurrence and development from several angles. The prognosis of osteosarcoma remains poor, so a genetic marker is needed to predict clinically relevant overall survival. First, RNA-Seq data, copy number variation data, and clinical follow-up data were obtained from the Office of Cancer Genomics (OCG Target). Prognosis-associated genes and genes with copy number differences were screened in the training group, and these genes were integrated for feature selection with the least absolute shrinkage and selection operator (Lasso) to identify effective biomarkers. Finally, a gene-based prognostic model was built and validated on the test set and a Gene Expression Omnibus validation set. In total, 512 prognosis-related genes (P < 0.01), 336 genes with copy number amplification (P < 0.05), and 36 genes with copy number deletion (P < 0.05) were obtained, and the genes carrying these genomic variants are closely associated with tumor occurrence and development. Integrating the genomic variant genes with the prognosis-related genes yielded 10 candidate genes. Six representative genes (MYC, CHIC2, CCDC152, LYL1, GPR142, and MMP27) were obtained by Lasso feature selection and stepwise multivariate regression, many of which have been reported to be related to tumor progression. Cox regression was used to build a 6-gene signature, a single prognostic factor for patients with osteosarcoma. In addition, the signature stratified samples by risk in the training group, the test set, and the external validation set. The AUC for five-year survival in the training group and the validation set exceeded 0.85, a better predictive performance than that of existing studies. The 6-gene signature is thus proposed as a new prognostic marker for assessing the survival of patients with osteosarcoma.
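To illustrate how Lasso performs feature selection, the following is a minimal sketch and not the authors' pipeline. It assumes a plain linear-regression Lasso solved by cyclic coordinate descent, whereas the study applies Lasso in a survival setting followed by stepwise multivariate and Cox regression. The data are synthetic, and every name in it (`lasso_coordinate_descent`, `soft_threshold`, the penalty `lam=0.5`) is hypothetical.

```python
import random


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values inside [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimise (1/2n) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b, and b starts at zero
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Synthetic stand-in for an expression matrix (hypothetical, not study data):
# only features 0 and 1 influence the outcome; features 2 to 4 are pure noise.
rng = random.Random(0)
n_samples, n_features = 200, 5
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0.0, 0.1) for row in X]

beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("coefficients:", [round(b, 3) for b in beta])
print("selected feature indices:", selected)
```

The L1 penalty sets the coefficients of uninformative features to exactly zero, which is what makes Lasso usable as a selection step before fitting a smaller multivariate model.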


2021 ◽  
Author(s):  
Zahra Mungloo-Dilmohamud ◽  
Yasmina Jaufeerally-Fakim ◽  
Carlos Pena-Reyes

2010 ◽  
Vol 68 (2) ◽  
pp. 74 ◽  
Author(s):  
Jiwon Lyu ◽  
Jin Woo Song ◽  
Chang-Min Choi ◽  
Yeon-Mok Oh ◽  
Sang Do Lee ◽  
...  

2021 ◽  
Author(s):  
Hyeyoung Koh ◽  
Hannah Beth Blum

This study presents a machine learning-based approach for sensitivity analysis to examine how parameters affect a given structural response while accounting for uncertainty. Reliability-based sensitivity analysis involves repeated evaluations of the performance function incorporating uncertainties to estimate the influence of a model parameter, which can lead to prohibitive computational costs. This challenge is exacerbated for large-scale engineering problems, which often carry a large number of uncertain parameters. The proposed approach is based on feature selection algorithms that rank feature importance and remove redundant predictors during model development, which improves model generality and training performance by focusing only on the significant features. The approach allows sensitivity analysis of structural systems to be performed by providing feature rankings with reduced computational effort. The proposed approach is demonstrated with two designs of a two-bay, two-story planar steel frame with different failure modes: inelastic instability of a single member and progressive yielding. The feature variables in the data are uncertainties including material yield strength, Young’s modulus, frame sway imperfection, and residual stress. The Monte Carlo sampling method is used to generate random realizations of the frames from published distributions of the feature parameters, and the response variable is the frame ultimate strength obtained from finite element analyses. Decision trees are trained to identify important features. Feature rankings are derived by four feature selection techniques: impurity-based, permutation, SHAP, and Spearman's correlation. The predictive performance of the model with the important features is discussed using the Matthews correlation coefficient, an evaluation metric for imbalanced datasets. Finally, the results are compared with those from reliability-based sensitivity analysis on the same example frames to show the validity of the feature selection approach. As the proposed machine learning-based approach produces the same results as the reliability-based sensitivity analysis with improved computational efficiency and accuracy, it could be extended to other structural systems.
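One of the four ranking techniques named above, Spearman's correlation, can be sketched together with the Matthews correlation coefficient as follows. This is only an illustration under loud assumptions: the features are drawn as standardized normal variables instead of the published distributions, a hypothetical linear toy response (`TOY_WEIGHTS`) stands in for the finite element analyses, and no decision tree is trained.

```python
import math
import random


def average_ranks(values):
    """Rank values from 1..n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


FEATURES = ["yield_strength", "youngs_modulus", "sway_imperfection", "residual_stress"]
# Hypothetical weights for the toy response; they are NOT results from the study.
TOY_WEIGHTS = {
    "yield_strength": 1.0,
    "youngs_modulus": 0.1,
    "sway_imperfection": -0.3,
    "residual_stress": -0.05,
}

# Monte Carlo sampling of standardized features (assumed distributions).
rng = random.Random(42)
samples = [{name: rng.gauss(0.0, 1.0) for name in FEATURES} for _ in range(2000)]
response = [
    sum(TOY_WEIGHTS[name] * s[name] for name in FEATURES) + rng.gauss(0.0, 0.2)
    for s in samples
]

# Rank features by the absolute Spearman correlation with the response.
scores = {
    name: abs(spearman([s[name] for s in samples], response)) for name in FEATURES
}
ranking = sorted(FEATURES, key=scores.get, reverse=True)
for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
```

A rank-based measure is used here because it captures monotonic but nonlinear dependence between a parameter and the response, which a plain linear correlation would understate.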


1993 ◽  
Vol 31 (5) ◽  
pp. 1213-1216 ◽  
Author(s):  
F F Reinthaler ◽  
J Sattler ◽  
K Schaffler-Dullnig ◽  
B Weinmayr ◽  
E Marth
