Deep learning-based gene selection in comprehensive gene analysis in pancreatic cancer

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yasukuni Mori ◽  
Hajime Yokota ◽  
Isamu Hoshino ◽  
Yosuke Iwatate ◽  
Kohei Wakamatsu ◽  
...  

Abstract: The selection of genes that are important for obtaining gene expression data is challenging. Here, we developed a deep learning-based feature selection method suitable for gene selection. Our novel deep learning model includes an additional feature-selection layer. After model training, the units in this layer with high weights correspond to the genes that worked effectively in the processing of the networks. Cancer tissue samples and adjacent normal pancreatic tissue samples were collected from 13 patients with pancreatic ductal adenocarcinoma during surgery and subsequently frozen. After processing, gene expression data were extracted from the specimens using RNA sequencing. Task 1 for the model training was to discriminate between cancerous and normal pancreatic tissue in six patients. Task 2 was to discriminate between patients with pancreatic cancer (n = 13) who survived for more than one year after surgery. The most frequently selected genes were ACACB, ADAMTS6, NCAM1, and CADPS in Task 1, and CD1D, PLA2G16, DACH1, and SOWAHA in Task 2. According to The Cancer Genome Atlas dataset, these genes are all prognostic factors for pancreatic cancer. Thus, the feasibility of using our deep learning-based method for the selection of genes associated with pancreatic cancer development and prognosis was confirmed.
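The idea of a feature-selection layer can be illustrated with a small sketch. This is not the authors' architecture: it assumes a single logistic output, one gate weight per gene placed in front of the classifier weights, an L1 penalty on the gates, and synthetic data in which only genes 0 and 1 carry signal. The function name `train_gated_classifier` and all hyperparameters are hypothetical.

```python
import numpy as np

def train_gated_classifier(X, y, lam=0.01, lr=0.1, steps=500):
    """Logistic classifier with a per-gene gate layer s in front of weights w."""
    n, d = X.shape
    s = np.ones(d)    # feature-selection layer: one gate weight per gene
    w = np.zeros(d)   # classifier weights applied after the gates
    b = 0.0
    for _ in range(steps):
        z = (X * s) @ w + b
        p = 1.0 / (1.0 + np.exp(-z))
        g = X.T @ (p - y) / n                  # loss gradient w.r.t. each product s_j * w_j
        grad_s = g * w + lam * np.sign(s)      # assumed L1 penalty shrinks unused gates
        grad_w = g * s
        s -= lr * grad_s
        w -= lr * grad_w
        b -= lr * np.mean(p - y)
    return s, w, b

# Synthetic stand-in for expression data: 200 samples, 20 genes,
# with the class label driven by genes 0 and 1 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = (X[:, 0] - X[:, 1] + 0.3 * rng.normal(size=200) > 0).astype(float)

s, w, b = train_gated_classifier(X, y)
top_genes = np.argsort(-np.abs(s))[:2]   # genes with the largest gate weights
print(sorted(top_genes.tolist()))
```

After training, genes are ranked by the magnitude of their gate weight, which mirrors the abstract's statement that high-weight units correspond to the genes the network relied on.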

Author(s):  
Samarendra Das ◽  
Shesh N. Rai

Selection of biologically relevant genes from high-dimensional expression data is a key research problem in gene expression genomics. Most of the available gene selection methods are based on either a relevancy or a redundancy measure, and are usually assessed through post-selection classification accuracy. In these methods, genes are ranked on a single high-dimensional expression dataset, which leads to the selection of spuriously associated and redundant genes. Hence, we developed a statistical approach that combines a Support Vector Machine with Maximum Relevance and Minimum Redundancy under a sound statistical setup for the selection of biologically relevant genes. Here, the genes are selected through statistical significance values computed using a non-parametric test statistic under a bootstrap-based subject sampling model. Further, a systematic and rigorous evaluation of the proposed approach against nine existing competitive methods was carried out on six different real crop gene expression datasets. This performance analysis was carried out under three comparison settings, i.e. subject classification, biologically relevant criteria based on quantitative trait loci, and gene ontology. Our analytical results showed that the proposed approach selects genes that are more biologically relevant than those selected by the existing methods. Moreover, the proposed approach was also found to be better with respect to the competitive existing methods. The proposed statistical approach provides a framework for combining filter and wrapper methods of gene selection.


Gut ◽  
2017 ◽  
Vol 67 (3) ◽  
pp. 521-533 ◽  
Author(s):  
Mingfeng Zhang ◽  
Soren Lykke-Andersen ◽  
Bin Zhu ◽  
Wenming Xiao ◽  
Jason W Hoskins ◽  
...  

Objective: To elucidate the genetic architecture of gene expression in pancreatic tissues. Design: We performed expression quantitative trait locus (eQTL) analysis in histologically normal pancreatic tissue samples (n=95) using RNA sequencing and the corresponding 1000 genomes imputed germline genotypes. Data from pancreatic tumour-derived tissue samples (n=115) from The Cancer Genome Atlas were included for comparison. Results: We identified 38 615 cis-eQTLs (in 484 genes) in histologically normal tissues and 39 713 cis-eQTLs (in 237 genes) in tumour-derived tissues (false discovery rate <0.1), with the strongest effects seen near transcriptional start sites. Approximately 23% and 42% of genes with significant cis-eQTLs appeared to be specific for tumour-derived and normal-derived tissues, respectively. Significant enrichment of cis-eQTL variants was noted in non-coding regulatory regions, in particular for pancreatic tissues (1.53-fold to 3.12-fold, p≤0.0001), indicating tissue-specific functional relevance. A common pancreatic cancer risk locus on 9q34.2 (rs687289) was associated with ABO expression in histologically normal (p=5.8×10⁻⁸) and tumour-derived (p=8.3×10⁻⁵) tissues. The high linkage disequilibrium between this variant and the O blood group generating deletion variant in ABO (exon 6) suggested that nonsense-mediated decay (NMD) of the ‘O’ mRNA might explain this finding. However, knockdown of crucial NMD regulators did not influence decay of the ABO ‘O’ mRNA, indicating that a gene regulatory element influenced by pancreatic cancer risk alleles may underlie the eQTL. Conclusions: We have identified cis-eQTLs representing potential functional regulatory variants in the pancreas and generated a rich data set for further studies on gene expression and its regulation in pancreatic tissues.
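At its core, a single eQTL test asks whether expression of a gene changes with the number of copies of an allele. The sketch below is an illustration under simplifying assumptions, not the study's pipeline: it regresses simulated expression on allele dosage (0, 1 or 2) by ordinary least squares and reports the slope and its t-statistic. Covariate adjustment, imputation uncertainty and false discovery rate control are omitted, and the function name `eqtl_slope_t` and the simulated effect size are hypothetical.

```python
import numpy as np

def eqtl_slope_t(dosage, expression):
    """Least-squares slope of expression on allele dosage, with its t-statistic."""
    x = np.asarray(dosage, dtype=float)
    y = np.asarray(expression, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    slope = (xc @ yc) / (xc @ xc)
    resid = yc - slope * xc
    dof = len(x) - 2                              # intercept and slope were estimated
    se = np.sqrt((resid @ resid) / dof / (xc @ xc))
    return slope, slope / se

# Simulated data: 95 samples (the size of the normal-tissue set in the abstract),
# with an assumed true effect of 0.8 expression units per allele copy.
rng = np.random.default_rng(1)
dosage = rng.integers(0, 3, size=95)
expression = 0.8 * dosage + rng.normal(size=95)

slope, t_stat = eqtl_slope_t(dosage, expression)
print(round(float(slope), 2), round(float(t_stat), 1))
```

In a genome-wide analysis this test would be repeated for every variant-gene pair within a cis window, which is why multiple-testing control such as the false discovery rate threshold quoted above is needed.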


Entropy ◽  
2020 ◽  
Vol 22 (11) ◽  
pp. 1205
Author(s):  
Samarendra Das ◽  
Shesh N. Rai

Selection of biologically relevant genes from high-dimensional expression data is a key research problem in gene expression genomics. Most of the available gene selection methods are based on either a relevancy or a redundancy measure, and are usually assessed through post-selection classification accuracy. In these methods, the ranking of genes was conducted on a single high-dimensional expression dataset, which led to the selection of spuriously associated and redundant genes. Hence, we developed a statistical approach that combines a support vector machine with Maximum Relevance and Minimum Redundancy under a sound statistical setup for the selection of biologically relevant genes. Here, the genes were selected through statistical significance values computed using a nonparametric test statistic under a bootstrap-based subject sampling model. Further, a systematic and rigorous evaluation of the proposed approach against nine existing competitive methods was carried out on six different real crop gene expression datasets. This performance analysis was carried out under three comparison settings, i.e., subject classification, biologically relevant criteria based on quantitative trait loci, and gene ontology. Our analytical results showed that the proposed approach selects genes which are more biologically relevant than those selected by the existing methods. Moreover, the proposed approach was also found to be better with respect to the competitive existing methods. The proposed statistical approach provides a framework for combining filter and wrapper methods of gene selection.
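The bootstrap-based subject sampling idea can be sketched as follows. This is a simplified stand-in rather than the proposed method: in place of the SVM and Maximum Relevance and Minimum Redundancy score it uses the absolute correlation between each gene and the class label, and in place of the nonparametric significance test it simply reports how often each gene lands in the top ranks across resamples. The function name `bootstrap_selection_frequency`, the synthetic data and all parameters are hypothetical.

```python
import numpy as np

def bootstrap_selection_frequency(X, y, top_k=3, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each gene ranks in the top_k."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    counts = np.zeros(d)
    used = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)       # resample subjects with replacement
        Xb, yb = X[idx], y[idx]
        if yb.min() == yb.max():               # skip resamples containing one class only
            continue
        Xc = Xb - Xb.mean(axis=0)
        yc = yb - yb.mean()
        # Stand-in relevance score: absolute correlation of each gene with the label.
        denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12
        score = np.abs(Xc.T @ yc) / denom
        counts[np.argsort(-score)[:top_k]] += 1
        used += 1
    return counts / max(used, 1)

# Synthetic data: 60 subjects, 50 genes; genes 0-2 are shifted in class 1.
rng = np.random.default_rng(2)
y = np.repeat([0.0, 1.0], 30)
X = rng.normal(size=(60, 50))
X[y == 1, :3] += 2.0

freq = bootstrap_selection_frequency(X, y)
print(np.argsort(-freq)[:3].tolist())
```

Ranking on many resampled subject sets, rather than on the single observed dataset, is what guards against genes that look relevant only because of a few particular subjects.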


2009 ◽  
Vol 13 (2) ◽  
pp. 410-413 ◽  
Author(s):  
Mohd Saberi Mohamad ◽  
Sigeru Omatu ◽  
Safaai Deris ◽  
Muhammad Faiz Misman ◽  
Michifumi Yoshioka

Gene expression profiling with microarray technology is chip-based. Studying gene expression data helps in understanding and detecting various diseases. Recently, cancer prediction from gene expression data has become a promising area of bioinformatics. However, using all the gene attributes of the samples does not necessarily yield efficient classification. To overcome this limitation, a robust method is required for selecting the relevant gene features so that the classification model can be built effectively. Least absolute shrinkage and selection operator (LASSO) and recursive feature elimination (RFE) are automatic gene feature selection methods used for classification. In our proposed work, we combine these two methods into a hybrid for selecting the features, which are then passed to a support vector machine (SVM) for classification. Evaluated on six publicly available cancer datasets, the hybrid performed best against the existing techniques on the reported performance measures. It also gives good insight into the selection of features.
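One plausible way to chain LASSO, RFE and an SVM is sketched below with scikit-learn. The abstract does not say how the two selectors are combined, so the ordering here (LASSO as a pre-filter, then RFE wrapped around a linear SVM) is an assumption, as are the synthetic dataset, the `alpha` value and the final feature count of 10.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for a microarray dataset: few samples, many genes.
X, y = make_classification(n_samples=120, n_features=200, n_informative=8,
                           n_redundant=0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Stage 1 (assumed order): LASSO keeps genes with non-zero coefficients.
lasso = Lasso(alpha=0.02, max_iter=10000).fit(X_tr, y_tr)
kept = np.flatnonzero(lasso.coef_)

# Stage 2: RFE with a linear SVM prunes the survivors one gene at a time.
n_final = min(10, kept.size)
rfe = RFE(SVC(kernel="linear"), n_features_to_select=n_final, step=1)
rfe.fit(X_tr[:, kept], y_tr)
selected = kept[rfe.support_]

# Final classifier trained on the selected genes only.
clf = SVC(kernel="linear").fit(X_tr[:, selected], y_tr)
accuracy = clf.score(X_te[:, selected], y_te)
print(selected.size, round(float(accuracy), 2))
```

Both selection stages are fitted on the training split only, so the held-out accuracy is not inflated by genes chosen with knowledge of the test samples.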

