Classification of mislabelled microarrays using robust sparse logistic regression

2013 ◽  
Vol 29 (7) ◽  
pp. 870-877 ◽  
Author(s):  
Jakramate Bootkrajang ◽  
Ata Kabán
2018 ◽  
Vol 8 (9) ◽  
pp. 1569 ◽  
Author(s):  
Shengbing Wu ◽  
Hongkun Jiang ◽  
Haiwei Shen ◽  
Ziyi Yang

In recent years, gene selection for cancer classification based on the expression of a small number of gene biomarkers has been the subject of much research in genetics and molecular biology. The successful identification of gene biomarkers helps in classifying different types of cancer and improves prediction accuracy. Recently, logistic regression with L1 regularization has been successfully applied in high-dimensional cancer classification to estimate gene coefficients and perform gene selection simultaneously. However, the L1 penalty yields biased gene selection and does not have the oracle property. To address these problems, we investigate L1/2 regularized logistic regression for gene selection in cancer classification. Experimental results on three DNA microarray datasets demonstrate that our proposed method outperforms other commonly used sparse methods (L1 and L_EN) in terms of classification performance.
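For orientation, the penalized objective behind the methods compared in this abstract can be written as below. This is a sketch in generic notation, not the paper's own formulation: the symbols (n samples, p genes, labels y_i in {0,1}, tuning parameters lambda and alpha) are assumptions, the elastic-net form is one common parameterization, and reading the EN subscript as "elastic net" is itself an assumption.

```latex
% Penalized logistic regression for n samples, p genes, labels y_i in {0,1}
\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n}\Big[\, y_i\,(\beta_0 + x_i^{\top}\beta)
  - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
  + \lambda\, P(\beta)

% Penalties compared in the abstract
P_{L_1}(\beta)     = \sum_{j=1}^{p} |\beta_j|
P_{L_{1/2}}(\beta) = \sum_{j=1}^{p} |\beta_j|^{1/2}
P_{EN}(\beta)      = \alpha \sum_{j=1}^{p} |\beta_j| + (1-\alpha) \sum_{j=1}^{p} \beta_j^{2}
```

The L1/2 penalty is non-convex and penalizes large coefficients less heavily than L1, which is the usual motivation for expecting less biased estimates and sparser gene sets.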


2018 ◽  
Vol 45 (9) ◽  
pp. 4112-4124 ◽  
Author(s):  
Hoda Nemat ◽  
Hamid Fehri ◽  
Nasrin Ahmadinejad ◽  
Alejandro F. Frangi ◽  
Ali Gooya

NeuroImage ◽  
2010 ◽  
Vol 51 (2) ◽  
pp. 752-764 ◽  
Author(s):  
Srikanth Ryali ◽  
Kaustubh Supekar ◽  
Daniel A. Abrams ◽  
Vinod Menon

Nowadays cancer has become a deadly disease caused by abnormal cell growth. Many researchers are working on the early prediction of cancer. Proper classification of cancer data demands the identification of a proper set of genes by analyzing genomic data. Most researchers have used microarrays to identify cancerous genomes. However, such data are high-dimensional, with far more genes than samples, and contain many irrelevant features and noise. How a classification technique deals with such data influences the performance of the algorithm. A popular classification algorithm, logistic regression, is considered in this work for gene classification. Regularization techniques such as Lasso with the L1 penalty, Ridge with the L2 penalty, and hybrid Lasso with the L1/2+2 penalty are used to minimize irrelevant features and avoid overfitting. However, these methods are sparse parametric methods and are limited to linear data. They have also not produced promising performance when applied to high-dimensional genome data. To solve these problems, this paper presents an Additive Sparse Logistic Regression with Additive Regularization (ASLR) method to discriminate linear and non-linear variables in gene classification. The results show that the proposed method is the best regularized method for classifying microarray data compared to standard methods.
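To make the baseline concrete, here is a minimal sketch of Lasso-style (L1-penalized) logistic regression fitted by proximal gradient descent on synthetic data. It illustrates only the standard L1 baseline mentioned in the abstract, not the ASLR method; the function names, the toy data generator, and the values of `lam`, `step`, and `n_iter` are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(v, t):
    # Proximal operator of t * |v|: shrinks v toward zero, exactly zero if |v| <= t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.1, step=0.1, n_iter=500):
    """Proximal gradient descent (ISTA) for L1-penalized logistic regression."""
    n, p = len(X), len(X[0])
    beta, intercept = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad, grad0 = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(intercept + sum(b * v for b, v in zip(beta, xi))) - yi
            grad0 += err / n
            for j in range(p):
                grad[j] += err * xi[j] / n
        intercept -= step * grad0  # the intercept is not penalized
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return intercept, beta


def make_toy_data(n=80, p=20, seed=0):
    # Synthetic "expression" matrix (assumption): only feature 0 carries signal.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        row[0] += 2.0 if label == 1 else -2.0
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
intercept, beta = fit_l1_logistic(X, y)
# Features with a non-zero coefficient are the "selected genes".
selected = [j for j, b in enumerate(beta) if b != 0.0]
```

The soft-thresholding step is what sets coefficients exactly to zero and so performs feature selection. A Ridge (L2) penalty would instead shrink every coefficient multiplicatively and leave all features in the model.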


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 3044-3044
Author(s):  
David Haan ◽  
Anna Bergamaschi ◽  
Yuhong Ning ◽  
William Gibb ◽  
Michael Kesling ◽  
...  

3044 Background: Epigenomic assays have recently become popular tools for the identification of molecular biomarkers, both in tissue and in plasma. In particular, the 5-hydroxymethyl-cytosine (5hmC) method has been shown to enable the epigenomic regulation of gene expression and subsequent gene activity, with different patterns across several tumor and normal tissue types. In this study we show that 5hmC profiles enable discrete classification of tumor and normal tissue for breast, colorectal, lung, ovary and pancreas. Such classification was also recapitulated in cfDNA from patients with breast, colorectal, lung, ovarian and pancreatic cancers. Methods: DNA was isolated from 176 fresh frozen tissues from breast, colorectal, lung, ovary and pancreas (44 per tumor per tissue type and up to 11 tumor tissues for each stage (I-IV)) and up to 10 normal tissues per tissue type. cfDNA was isolated from plasma from 783 non-cancer individuals and 569 cancer patients. Plasma-isolated cfDNA and tumor genomic DNA were enriched for the 5hmC fraction using chemical labelling, sequenced, and aligned to a reference genome to construct feature sets of 5hmC patterns. Results: 5hmC multinomial logistic regression analysis was employed across tumor and normal tissues and identified a set of specific and discrete tumor and normal tissue gene-based features. This indicates that we can classify samples regardless of source, with a high degree of accuracy, based on tissue of origin and also distinguish between normal and tumor status. Next, we applied a stacked ensemble machine learning algorithm, combining multiple logistic regression models across diverse feature sets, to the cfDNA dataset composed of 783 non-cancers and 569 cancers comprising 67 breast, 118 colorectal, 210 lung, 71 ovarian and 100 pancreatic cancers.
We identified a genomic signature that enables the classification of non-cancer versus cancer with an outer-fold cross-validation sensitivity of 49% (CI 45%-53%) at 99% specificity. Further, individual cancer outer-fold cross-validation sensitivity at 99% specificity was measured as follows: breast 30% (CI 19%-42%); colorectal 41% (CI 32%-50%); lung 49% (CI 42%-56%); ovarian 72% (CI 60%-82%); pancreatic 56% (CI 46%-66%). Conclusions: This study demonstrates that 5hmC profiles can distinguish cancer and normal tissues based on their origin. Further, 5hmC changes in cfDNA enable detection of several cancer types: breast, colorectal, lung, ovarian and pancreatic cancers. Our technology provides a non-invasive tool for cancer detection with low-risk sample collection, enabling better compliance than current screening methods. Among other utilities, we believe our technology could be applied to asymptomatic high-risk individuals, thus enabling enrichment for those subjects who most need a diagnostic imaging follow-up.
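The "sensitivity at 99% specificity" figures can be read as follows: a score cutoff is chosen so that 99% of non-cancer samples fall at or below it, and sensitivity is the fraction of cancer samples scoring above that cutoff. The sketch below shows that calculation on made-up scores. The function names and all score values are hypothetical, and the study's actual cross-validation and cutoff procedure is not described in the abstract.

```python
import math


def threshold_for_specificity(control_scores, specificity=0.99):
    """Smallest observed cutoff with at least `specificity` of controls at or below it."""
    ordered = sorted(control_scores)
    # Number of controls that must be called negative (score <= cutoff);
    # rounding guards against floating-point noise before taking the ceiling.
    needed = math.ceil(round(specificity * len(ordered), 9))
    return ordered[needed - 1]


def sensitivity_at_threshold(case_scores, cutoff):
    # A case is called positive when its score is strictly above the cutoff.
    positives = sum(1 for s in case_scores if s > cutoff)
    return positives / len(case_scores)


# Hypothetical classifier scores, for illustration only (not data from the study).
controls = [i / 100 for i in range(100)]  # 0.00, 0.01, ..., 0.99
cases = [0.50, 0.80, 0.985, 0.995, 1.20, 1.50]

cutoff = threshold_for_specificity(controls, 0.99)
sens = sensitivity_at_threshold(cases, cutoff)
```

Fixing specificity this high keeps false positives rare in a screening population, at the cost of the lower sensitivities reported for some cancer types.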


Author(s):  
Lina Li ◽  
Xinpei Wang ◽  
Xiaping Du ◽  
Yuanyuan Liu ◽  
Changchun Liu ◽  
...  
