MicroPath-A pathway-based pipeline for the comparison of multiple gene expression profiles to identify common biological signatures

2009 ◽  
Vol 02 (02) ◽  
pp. 106-116
Author(s):  
Mohsin Khan ◽  
Chandrasekhar Babu Gorle ◽  
Ping Wang ◽  
Xiao-Hui Liu ◽  
Su-Ling Li
PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5285 ◽  
Author(s):  
Mei Sze Tan ◽  
Siow-Wee Chang ◽  
Phaik Leng Cheah ◽  
Hwa Jen Yap

Although most cervical cancer cases are reported to be closely related to Human Papillomavirus (HPV) infection, there is a need to study genes that stand out differentially in the eventual development of cervical cancer following HPV infection. In this study, we propose an integrative machine learning approach to analyse multiple gene expression profiles in cervical cancer in order to identify a set of genetic markers that are associated with, and may eventually aid in, the diagnosis or prognosis of cervical cancer. The proposed integrative analysis is composed of three steps: (i) gene expression analysis of individual datasets; (ii) meta-analysis of multiple datasets; and (iii) feature selection and machine learning analysis. As a result, 21 gene expressions were identified through the integrative machine learning analysis, which included seven supervised methods and one unsupervised method. A functional analysis with GSEA (Gene Set Enrichment Analysis) was performed on the selected 21-gene expression set and showed significant enrichment in a potential nine-gene expression signature, namely PEG3, SPON1, BTD and RPLP2 (upregulated genes) and PRDX3, COPB2, LSM3, SLC5A3 and AS1B (downregulated genes).
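Step (ii) above, the meta-analysis across datasets, can be illustrated with a minimal sketch. The abstract does not state which meta-analysis method the authors used; the example below assumes Fisher's combined probability test as a stand-in, and the function names, gene labels and p-values are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent per-study p-values with Fisher's method.

    Assumption: Fisher's method is used purely for illustration; the
    abstract does not name the meta-analysis technique.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    # Fisher statistic: X = -2 * sum(ln p_i); clamp p to avoid log(0).
    stat = -2.0 * sum(math.log(max(p, 1e-300)) for p in p_values)
    k = len(p_values)
    # Under the null, X follows a chi-square distribution with 2k degrees
    # of freedom, whose survival function has a closed form for even df:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = stat / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return min(1.0, math.exp(-half) * total)


def rank_genes(per_study_p):
    """Rank genes by combined p-value (smallest first).

    per_study_p maps a gene label to a list of p-values, one per dataset.
    """
    combined = {gene: fisher_combined_p(ps) for gene, ps in per_study_p.items()}
    return sorted(combined.items(), key=lambda item: item[1])


# Hypothetical per-dataset p-values for two placeholder genes.
example = {"GENE_A": [0.01, 0.02], "GENE_B": [0.50, 0.60]}
ranking = rank_genes(example)
```

Genes that are consistently significant across datasets rise to the top of `ranking`, and a shortlist of this kind would then be passed to the feature selection step.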


2021 ◽  
Author(s):  
Taguchi Y-h. ◽  
Turki Turki

The integrated analysis of multiple gene expression profiles measured in distinct studies is always problematic. In particular, the lack of sample matching and of common labeling between distinct studies prevents the integration of multiple studies in a fully data-driven and unsupervised manner. In this study, we propose a strategy enabling the integration of multiple gene expression profiles among multiple independent studies without either labeling or sample matching, using tensor decomposition-based unsupervised feature extraction. As an example, we applied this strategy to Alzheimer’s disease (AD)-related gene expression profiles that lack exact correspondence among samples, as well as to AD single-cell RNA-seq (scRNA-seq) data. We found that we could select biologically reasonable genes with integrated analysis. Overall, integrated gene expression profiles can function analogously to prior learning and/or transfer learning strategies in other machine learning applications. For scRNA-seq, the proposed approach was able to drastically reduce the required computational memory.


2006 ◽  
Vol 24 (18_suppl) ◽  
pp. 7026-7026
Author(s):  
D. H. Harpole ◽  
R. Petersen ◽  
S. Mukherjee ◽  
A. Bild ◽  
H. Dressman ◽  
...  

Background. Although stage-specific classification identifies appropriate populations for adjuvant chemotherapy, this is likely an imprecise predictor for the individual patient with early stage NSCLC. Methods. Using previously described methodologies that employ DNA microarray data, multiple gene expression profiles (‘metagenes’) that predict risk of recurrence in patients with stage I disease were identified. This analysis used an initial ‘training’ cohort of patients with NSCLC (n = 89) that represented an equal mix of squamous cell and adenocarcinoma. Also, each histologic subset had an equal number of patients who survived more than 5 years and those who died within 2.5 years of initial diagnosis. The performance of the metagene-based model generated on the training cohort was then evaluated in independent ‘validation’ sets, including two multi-center cooperative group studies (ACOSOG Z0030 and CALGB 9761). Importantly, the CALGB validation was performed in a blinded fashion. Results. Classification tree analyses that sample multiple gene expression profiles were used to develop a model of recurrence, termed the Lung Metagene Model, that accurately assesses prognosis (risk of recurrence and survival), performing significantly (p<0.001, odds ratio: 16.1, multivariate analysis) better than pathologic stage, T-size, nodal status, age, gender, histologic subtype and smoking history. The accuracy of prognosis using the Lung Metagene Model exceeded 90% (leave-one-out cross validation) in the initial training set (n = 89), 72% in the ACOSOG (n = 25), and 81% in the CALGB (n = 84) datasets. The prognostic accuracy was consistent across histologic subtypes and stages of NSCLC. Importantly, this provides an opportunity to re-classify stage IA patients to identify a subset of ‘high risk’ patients that may benefit from adjuvant chemotherapy. Further, stage IB and II patients identified as ‘low risk’ for recurrence, and who present co-morbidities, could potentially be candidates for observation, and those patients predicted to be at ‘high risk’ may benefit from novel therapeutic trials. Conclusions. The Lung Metagene Model provides a mechanism to refine the estimation of an individual patient’s risk for disease recurrence and thus guide the use of adjuvant chemotherapy in NSCLC. No significant financial relationships to disclose.
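The leave-one-out cross validation mentioned in the results can be sketched as follows. This is only an illustration of the validation scheme: the abstract describes classification tree analyses, whereas the sketch assumes a simple nearest-centroid classifier as a stand-in, and the feature values and labels are invented toy data.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to `sample`.

    Assumption: a nearest-centroid rule replaces the classification trees
    named in the abstract, purely to keep the example short.
    """
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [value / counts[label] for value in acc]
        # Squared Euclidean distance to the class centroid.
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and score the hit rate."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Toy data: two hypothetical metagene scores per patient.
toy_x = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
toy_y = ["low", "low", "low", "high", "high", "high"]
toy_accuracy = leave_one_out_accuracy(toy_x, toy_y)
```

Because every sample is scored by a model that never saw it, the resulting accuracy is less optimistic than training-set accuracy, though it remains an internal estimate and does not replace the independent validation cohorts.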


FEBS Letters ◽  
2004 ◽  
Vol 565 (1-3) ◽  
pp. 93-100 ◽  
Author(s):  
Jung Kyoon Choi ◽  
Jong Young Choi ◽  
Dae Ghon Kim ◽  
Dong Wook Choi ◽  
Bu Yeo Kim ◽  
...  

2020 ◽  
Vol 26 ◽  
Author(s):  
Jianhua Zhai ◽  
Anlong Qi ◽  
Yan Zhang ◽  
Lina Jiao ◽  
Yancun Liu ◽  
...  
