Gene Expression Network Reconstruction by LEP Method Using Microarray Data

2012 ◽  
Vol 2012 ◽  
pp. 1-6
Author(s):  
Na You ◽  
Peng Mou ◽  
Ting Qiu ◽  
Qiang Kou ◽  
Huaijin Zhu ◽  
...  

Gene expression network reconstruction from microarray data is widely studied as a way to investigate the joint behavior of a cluster of genes. Under the Gaussian assumption, the conditional dependence between genes in the network is fully described by the partial correlation coefficient matrix. Because this matrix is high-dimensional and sparse, we use the LEP method to estimate it in this paper. Compared with existing methods, LEP reaches the highest positive predictive value (PPV) while keeping sensitivity at a satisfactory level. A set of gene expression data from the HapMap project is analyzed for illustration.
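As a rough illustration of the quantity being estimated, the sketch below converts a precision (inverse covariance) matrix into a partial correlation matrix using the standard Gaussian identity, in which the partial correlation between genes i and j is the negated off-diagonal precision entry divided by the square root of the product of the two diagonal entries. It does not implement the LEP method; the function name `partial_correlation` and the toy 3-gene precision matrix are assumptions made for this example only.

```python
import numpy as np

def partial_correlation(precision):
    """Convert a precision matrix into a partial correlation matrix.

    Hypothetical helper for illustration; assumes `precision` is a
    symmetric positive-definite matrix that has already been estimated.
    """
    d = np.sqrt(np.diag(precision))
    # rho_ij = -omega_ij / sqrt(omega_ii * omega_jj) for i != j
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Toy precision matrix for a chain of three genes: 0 - 1 - 2.
# Genes 0 and 2 have a zero entry, so they are conditionally
# independent given gene 1.
precision = np.array([
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
])
pcor = partial_correlation(precision)
print(pcor)
```

A zero entry in the resulting matrix corresponds to a missing edge in the network, which is why a sparse estimate of this matrix translates directly into a sparse gene network.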


2008 ◽  
Vol 5 (2) ◽  
Author(s):  
Krzysztof Borowski ◽  
Jung Soh ◽  
Christoph W. Sensen

Summary: The need for novel methods of visualizing microarray data is growing. New perspectives are beneficial to finding patterns in expression data. The Bluejay genome browser provides an integrative way of visualizing gene expression datasets in a genomic context. We have now developed the functionality to display multiple microarray datasets simultaneously in Bluejay, in order to provide researchers with a comprehensive view of their datasets linked to a graphical representation of gene function. This will enable biologists to obtain valuable insights into expression patterns by allowing them to analyze the expression values in relation to the gene locations, as well as to compare expression profiles of related genomes or of different experiments for the same genome.



2005 ◽  
Vol 03 (02) ◽  
pp. 303-316 ◽  
Author(s):  
ZHENQIU LIU ◽  
DECHANG CHEN ◽  
HALIMA BENSMAIL ◽  
YING XU

Kernel principal component analysis (KPCA) has been applied to data clustering and graphic cut in the last couple of years. This paper discusses the application of KPCA to microarray data clustering. A new algorithm based on KPCA and fuzzy C-means is proposed. Experiments with microarray data show that the proposed algorithm is in general superior to traditional algorithms.



2013 ◽  
Vol 11 (03) ◽  
pp. 1341006
Author(s):  
QIANG LOU ◽  
ZORAN OBRADOVIC

In order to more accurately predict an individual's health status, in clinical applications it is often important to perform analysis of high-dimensional gene expression data that varies with time. A major challenge in predicting from such temporal microarray data is that the number of biomarkers used as features is typically much larger than the number of labeled subjects. One way to address this challenge is to perform feature selection as a preprocessing step and then apply a classification method on selected features. However, traditional feature selection methods cannot handle multivariate temporal data without applying techniques that flatten temporal data into a single matrix in advance. In this study, a feature selection filter that can directly select informative features from temporal gene expression data is proposed. In our approach, we measure the distance between multivariate temporal data from two subjects. Based on this distance, we define the objective function of temporal margin based feature selection to maximize each subject's temporal margin in its own relevant subspace. The experimental results on synthetic and two real flu data sets provide evidence that our method outperforms the alternatives, which flatten the temporal data in advance.



2011 ◽  
Vol 39 (2) ◽  
pp. 1627-1637 ◽  
Author(s):  
Yan Xu ◽  
Huizi DuanMu ◽  
Zhiqiang Chang ◽  
Shanzhen Zhang ◽  
Zhenqi Li ◽  
...  


2019 ◽  
Vol 9 (6) ◽  
pp. 1294-1300 ◽  
Author(s):  
A. Sampathkumar ◽  
P. Vivekanandan

In bioinformatics research, a large volume of genetic data has been generated. The availability of higher-throughput devices at lower cost has contributed to this growth. Handling such large volumes of data makes selecting the relevant disease-causing genes extremely challenging. Microarray technology improves the prospects for cancer diagnosis by making it possible to measure the expression levels of multiple genes at the same time. Selecting the relevant genes by applying classifiers to gene expression data is a complicated process. Proper identification of genes from gene expression datasets plays a vital role in improving classification accuracy. In this article, identification of highly relevant genes from gene expression data for cancer treatment is discussed in detail. A modified meta-heuristic approach, known as 'parallel lion optimization' (PLOA), is used to select genes from microarray data that can classify various cancer subtypes more accurately. The experimental results show that PLOA outperforms LOA and other well-known approaches on five benchmark cancer gene expression datasets. It returns 99% classification accuracy for the Prostate, Lung, Leukemia and Central Nervous System (CNS) datasets with the top 200 genes. For the Prostate and Lymphoma datasets, PLOA reaches 99.19% and 99.93%, respectively. Compared with other algorithms, the proposed algorithm achieves a higher level of accuracy in gene selection.



2011 ◽  
Vol 1 (1) ◽  
pp. 27 ◽  
Author(s):  
Konstantina Dimitrakopoulou ◽  
Charalampos Tsimpouris ◽  
George Papadopoulos ◽  
Claudia Pommerenke ◽  
Esther Wilk ◽  
...  


2017 ◽  
Author(s):  
Princy Parsana ◽  
Claire Ruberman ◽  
Andrew E. Jaffe ◽  
Michael C. Schatz ◽  
Alexis Battle ◽  
...  

Background: Gene co-expression networks capture diverse biological relationships between genes, and are important tools in predicting gene function and understanding disease mechanisms. Functional interactions between genes have not been fully characterized for most organisms, and therefore reconstruction of gene co-expression networks has been of common interest in a variety of settings. However, methods routinely used for reconstruction of gene co-expression networks do not account for confounding artifacts known to affect high dimensional gene expression measurements. Results: In this study, we show that artifacts such as batch effects in gene expression data confound commonly used network reconstruction algorithms. Both theoretically and empirically, we demonstrate that removing the effects of top principal components from gene expression measurements prior to network inference can reduce false discoveries, especially when well annotated technical covariates are not available. Using expression data from the GTEx project in multiple tissues and hundreds of individuals, we show that this latent factor residualization approach often reduces false discoveries in the reconstructed networks. Conclusion: Network reconstruction is susceptible to confounders that affect measurements of gene expression. Even controlling for major individual known technical covariates fails to fully eliminate confounding variation from the data. In studies where a wide range of annotated technical factors are measured and available, correcting gene expression data with multiple covariates can also improve network reconstruction, but such extensive annotations are not always available. Our study shows that principal component correction, which does not depend on study design or annotation of all relevant confounders, removes patterns of artifactual variation and improves network reconstruction in both simulated data and gene expression data from the GTEx project.
We have implemented our PC correction approach in the Bioconductor package sva, which can be used prior to network reconstruction with a range of methods.
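The idea of removing the top principal components before network inference can be sketched in a few lines of NumPy. This is only an illustration of the general residualization step under simple assumptions, not the sva implementation; the function name `remove_top_pcs`, the simulated batch effect, and the choice of one component are all hypothetical.

```python
import numpy as np

def remove_top_pcs(expr, n_pcs):
    """Residualize a samples-by-genes matrix on its top principal components.

    Hypothetical helper for illustration: centers each gene, takes the SVD,
    zeroes the leading `n_pcs` singular values, and reconstructs the matrix.
    """
    centered = expr - expr.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    s_kept = s.copy()
    s_kept[:n_pcs] = 0.0          # drop the dominant axes of variation
    return (u * s_kept) @ vt      # same shape as the input

# Simulated data (an assumption of this sketch): independent noise plus a
# strong batch effect shared across genes.
rng = np.random.default_rng(0)
n_samples, n_genes = 50, 20
batch = rng.integers(0, 2, size=n_samples).astype(float)
loadings = rng.normal(size=n_genes)
expr = rng.normal(size=(n_samples, n_genes)) + 5.0 * np.outer(batch, loadings)

residual = remove_top_pcs(expr, n_pcs=1)
print(residual.shape)
```

The residual matrix would then be passed to whichever network reconstruction method is in use. How many components to remove is a modelling decision that this sketch does not address.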



2020 ◽  
Vol 15 (4) ◽  
pp. 359-367
Author(s):  
Yong-Jing Hao ◽  
Mi-Xiao Hou ◽  
Ying-Lian Gao ◽  
Jin-Xing Liu ◽  
Xiang-Zhen Kong

Background: Non-negative Matrix Factorization (NMF) has been extensively used in gene expression data. However, most NMF-based methods have single-layer structures, which may achieve poor performance for complex data. Deep learning, with its carefully designed hierarchical structure, has shown significant advantages in learning data features. Objective: To discover differentially expressed genes in gene expression data and to obtain better sample clustering results, which can provide reference value for the prevention and treatment of cancer. Method: In this paper, we apply a deep NMF method called Deep Semi-NMF to the integrated gene expression data. In each layer, the coefficient matrix is directly decomposed into the basis and coefficient matrices of the next layer. We apply this factorization model to The Cancer Genome Atlas (TCGA) genomic data. Results: The experimental results demonstrate the superiority of the Deep Semi-NMF method in identifying differentially expressed genes and clustering samples. Conclusion: The Deep Semi-NMF model decomposes a matrix into multiple matrices whose product approximates the original matrix. It can also improve the clustering performance of samples while identifying more accurate key genes for disease treatment.
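The layer-wise decomposition described in the Method can be written out as follows. The notation is an assumption made for illustration and is not taken from the abstract: X is the data matrix, Z_i is the basis matrix of layer i, H_i is the coefficient matrix of layer i, and m is the number of layers. The non-negativity constraint on the coefficient matrices follows the usual Semi-NMF convention.

```latex
\begin{aligned}
X &\approx Z_1 H_1, \\
H_1 &\approx Z_2 H_2, \\
&\;\;\vdots \\
H_{m-1} &\approx Z_m H_m, \\
\text{so that}\quad X &\approx Z_1 Z_2 \cdots Z_m H_m, \qquad H_i \ge 0 .
\end{aligned}
```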


