A simple and efficient algorithm for gene selection using sparse logistic regression

2003 ◽  
Vol 19 (17) ◽  
pp. 2246-2253 ◽  
Author(s):  
S. K. Shevade ◽  
S. S. Keerthi
2018 ◽  
Vol 8 (9) ◽  
pp. 1569 ◽  
Author(s):  
Shengbing Wu ◽  
Hongkun Jiang ◽  
Haiwei Shen ◽  
Ziyi Yang

In recent years, gene selection for cancer classification based on the expression of a small number of gene biomarkers has been the subject of much research in genetics and molecular biology. The successful identification of gene biomarkers helps classify different types of cancer and improves prediction accuracy. Recently, logistic regression with $L_1$ regularization has been successfully applied to high-dimensional cancer classification, where it estimates gene coefficients and performs gene selection simultaneously. However, $L_1$ regularization yields biased gene selection and does not have the oracle property. To address these problems, we investigate $L_{1/2}$ regularized logistic regression for gene selection in cancer classification. Experimental results on three DNA microarray datasets demonstrate that our proposed method outperforms other commonly used sparse methods ($L_1$ and $L_{EN}$) in terms of classification performance.
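To make the idea of simultaneous coefficient estimation and gene selection concrete, the following is a minimal sketch of the L1 baseline mentioned above, fitted by proximal gradient descent with soft-thresholding. It is not the L1/2 method proposed in the abstract; the function names, the toy data, and the values of `lam`, `step`, and `iters` are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    # Proximal operator of t * |v|: shrinks v toward zero and
    # returns exactly zero when |v| <= t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_l1_logistic(X, y, lam=0.1, step=1.0, iters=500):
    # Minimizes mean negative log-likelihood + lam * sum(|w_j|).
    # The intercept b is not penalized (an assumption of this sketch).
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(iters):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Residual: predicted probability minus the 0/1 label.
            r = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += r / n
            for j in range(p):
                grad_w[j] += r * xi[j] / n
        b -= step * grad_b
        # Gradient step on the smooth loss, then soft-thresholding.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

# Made-up toy data: column 0 matches the label on every sample,
# column 1 agrees with it on only 6 of 8 samples.
X = [[1, 1], [1, 1], [1, 1], [1, -1],
     [-1, -1], [-1, -1], [-1, -1], [-1, 1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
w, b = fit_l1_logistic(X, y)
print(w, b)
```

Features whose coefficient ends at exactly zero are the ones the penalty has deselected. In this toy example the weaker second column is the one that gets dropped.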


2013 ◽  
Vol 14 (1) ◽  
Author(s):  
Yong Liang ◽  
Cheng Liu ◽  
Xin-Ze Luan ◽  
Kwong-Sak Leung ◽  
Tak-Ming Chan ◽  
...  

2005 ◽  
Vol 01 (01) ◽  
pp. 129-145 ◽  
Author(s):  
XIAOBO ZHOU ◽  
XIAODONG WANG ◽  
EDWARD R. DOUGHERTY

In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables (gene expressions) and the small number of experimental conditions. Many gene-selection and classification methods have been proposed; however, most of these treat gene selection and classification separately, and not under the same model. We propose a Bayesian approach to gene selection using the logistic regression model. The Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the minimum description length (MDL) principle are used in constructing the posterior distribution of the chosen genes. The same logistic regression model is then used for cancer classification. Fast implementation issues for these methods are discussed. The proposed methods are tested on several data sets including those arising from hereditary breast cancer, small round blue-cell tumors, lymphoma, and acute leukemia. The experimental results indicate that the proposed methods achieve high classification accuracies on these data sets. Some robustness and sensitivity properties of the proposed methods are also discussed. Finally, mixing logistic-regression-based gene selection with other classification methods and mixing logistic-regression-based classification with other gene-selection methods are considered.
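As a rough illustration of how information criteria can rank candidate gene subsets, the sketch below computes the standard AIC and BIC from a fitted model's log-likelihood and converts scores into normalized weights via exp(-score/2). That weighting is a common approximation to posterior model probabilities and is assumed here; the abstract does not state the exact construction used in the paper. The candidate subsets and their log-likelihoods are made-up numbers.

```python
import math

def aic(log_lik, k):
    # Akaike information criterion: 2k - 2 ln L (k = number of parameters).
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    # Bayesian information criterion: k ln n - 2 ln L (n = sample size).
    return k * math.log(n) - 2 * log_lik

def posterior_weights(scores):
    # Approximate posterior model weights proportional to exp(-score / 2).
    # Subtracting the minimum score first avoids underflow.
    m = min(scores)
    raw = [math.exp(-(s - m) / 2) for s in scores]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical candidates: (log-likelihood, number of parameters).
candidates = {"subset A": (-20.0, 2), "subset B": (-18.5, 5)}
n_samples = 40  # hypothetical number of microarray samples

names = list(candidates)
bic_scores = [bic(ll, k, n_samples) for ll, k in candidates.values()]
for name, weight in zip(names, posterior_weights(bic_scores)):
    print(name, round(weight, 3))
```

Lower scores are better under both criteria, so the smaller subset wins here even though the larger one fits slightly better.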


2021 ◽  
Vol 29 ◽  
pp. 287-295
Author(s):  
Zhiming Zhou ◽  
Haihui Huang ◽  
Yong Liang

BACKGROUND: In genome research, it is particularly important to identify molecular biomarkers or signaling pathways related to phenotypes. The logistic regression model is a powerful discriminative method that offers a clear statistical interpretation and yields class probabilities for the predicted labels. However, on its own it cannot perform biomarker selection. OBJECTIVE: The aim of this paper is to give the model efficient gene selection capability. METHODS: In this paper, we propose a new logsum-penalized, network-based regularized logistic regression model for gene selection and cancer classification. RESULTS: Experimental results on simulated data sets show that our method is effective in the analysis of high-dimensional data. For a large data set, the proposed method achieved AUC values of 89.66% (training) and 90.02% (testing), which are, on average, 5.17% (training) and 4.49% (testing) better than mainstream methods. CONCLUSIONS: The proposed method can be considered a promising tool for gene selection and cancer classification of high-dimensional biological data.
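The abstract does not give the penalty's formula, so the following is only a sketch of two commonly used building blocks that such a model might combine: a log-sum (logsum) sparsity penalty of the form sum(log(1 + |b_j| / eps)) and a graph-Laplacian smoothness penalty summed over the edges of a gene network. The function names, the `eps` default, and the example network are assumptions for illustration.

```python
import math

def logsum_penalty(beta, eps=0.01):
    # Log-sum penalty: grows slowly for large coefficients, so it
    # penalizes them less than an L1 penalty of comparable scale would.
    return sum(math.log(1.0 + abs(bj) / eps) for bj in beta)

def network_penalty(beta, edges):
    # Unnormalized graph-Laplacian penalty: sum over edges (i, j) of
    # (beta_i - beta_j)^2, encouraging linked genes to share similar weights.
    return sum((beta[i] - beta[j]) ** 2 for i, j in edges)

# Hypothetical coefficients for three genes and a two-edge network.
beta = [1.0, 1.0, 0.0]
edges = [(0, 1), (1, 2)]
print(logsum_penalty(beta), network_penalty(beta, edges))
```

In a full model, weighted versions of these two terms would be added to the logistic loss; the weights would be tuning parameters.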


Author(s):  
M. A. Al-Shabi

Fraudulent credit card transactions remain a problem for companies and banks, costing them billions of dollars every year. Designing an efficient detection algorithm is one of the most important challenges in this area. This paper proposes an efficient approach that automatically detects credit card fraud related to insurance companies using a deep learning algorithm called an autoencoder. The effectiveness of the proposed method is demonstrated on real data from credit card transactions made by European cardholders in September 2013. In addition, the paper provides a solution to data imbalance, which affects most current algorithms. The suggested solution trains the autoencoder to reconstruct normal data. Anomalies are detected by defining a reconstruction error threshold and treating cases whose error exceeds the threshold as anomalies. The algorithm detected 64% of fraudulent transactions at a threshold of 5, 79% at a threshold of 3, and 91% at a threshold of 0.7, outperforming logistic regression, which detected 57% on the unbalanced dataset.
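The thresholding step described above can be sketched independently of any deep learning library. In the example below, `reconstruct` is a hypothetical stand-in for a trained autoencoder (here it simply clips values to the range seen in "normal" data), and mean squared error is assumed as the reconstruction error; the abstract does not specify either detail.

```python
def reconstruction_error(x, x_hat):
    # Mean squared error between an input and its reconstruction.
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def flag_anomalies(samples, reconstruct, threshold):
    # A sample is flagged when its reconstruction error exceeds the threshold.
    return [reconstruction_error(x, reconstruct(x)) > threshold
            for x in samples]

def reconstruct(x):
    # Hypothetical stand-in for a trained autoencoder: it reproduces
    # values inside [-1, 1] exactly and clips everything else.
    return [max(-1.0, min(1.0, v)) for v in x]

samples = [
    [0.2, -0.5, 0.9],   # looks like normal data, reconstructed exactly
    [4.0, 0.1, -3.0],   # far outside the normal range
]
print(flag_anomalies(samples, reconstruct, threshold=3.0))
print(flag_anomalies(samples, reconstruct, threshold=5.0))
```

Lowering the threshold flags more transactions, which is consistent with the higher detection rates the abstract reports at smaller thresholds, at the likely cost of more false alarms.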

