Access and Visualise High Quality Gene Expression Data with Stemformatics

In recent years, clustering analysis has become a valuable tool for in-silico analysis of microarray or gene expression data. Although a number of clustering methods have been proposed, they struggle to deliver automation, high quality, and high efficiency at the same time. In this chapter, we discuss parameterless clustering techniques for gene expression analysis. We introduce two novel, parameterless, and efficient clustering methods suited to the analysis of gene expression data. The distinguishing feature of our methods is that they incorporate validation techniques into the clustering process, so that high-quality results can be obtained. In experimental evaluation, these methods substantially outperform other clustering methods in clustering quality, efficiency, and automation on both synthetic and real data sets.

Biclustering of Gene Expression Data Using Cuckoo Search and Genetic Algorithm

International Journal of Pattern Recognition and Artificial Intelligence, 2018, Vol 32 (11), pp. 1850039

DOI: 10.1142/s0218001418500398

Cited By ~ 1

Author(s): Lu Yin, Junlin Qiu, Shangbing Gao

Keyword(s): Gene Expression, Genetic Algorithm, Local Search, Gene Expression Data, Heuristic Algorithms, Expression Patterns, Cuckoo Search, Global Search, Expression Data, High Quality

Biclustering analysis of gene expression data can reveal a large number of biologically significant local gene expression patterns. Many biclustering algorithms therefore apply meta-heuristic algorithms such as the genetic algorithm (GA) and cuckoo search (CS) to find biclusters. However, different meta-heuristic algorithms have different applicability and characteristics. For example, the CS algorithm can obtain high-quality biclusters and has strong global search ability, but its local search ability is relatively poor. In contrast, the GA has strong local search ability, but its global search ability is poor. To improve both the global search ability and coverage of the biclusters and the local search ability and quality of the biclusters, this paper proposes a meta-heuristic algorithm based on the GA and the CS algorithm (GA-CS Biclustering, GACSB) to solve the problem of biclustering gene expression data. The algorithm uses the CS algorithm as the main framework, and uses the tournament strategy and the elite retention strategy from the GA to generate the next generation of the population. Compared with the experimental results of common biclustering algorithms such as CC, FLOC, ISA, SEBI, SSB and CSB, the GACSB algorithm obtains biclusters of both high quality and high biological significance. In addition, we use different bicluster evaluation indicators, namely Average Correlation Value (ACV), Mean-Squared Residue (MSR) and Virtual Error (VE), and verify that the GACSB algorithm scales well.
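
The selection step and one of the evaluation indicators named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the fitness function, the candidate encoding, or any parameter values. The sketch assumes that a candidate is a (row indices, column indices) pair, that MSR follows the standard mean-squared-residue definition with lower values being better, and that the next generation is formed by copying the best candidates unchanged (elite retention) and filling the remainder by tournament selection. All function and parameter names are hypothetical, and the cuckoo search update itself is omitted.

```python
import random


def mean_squared_residue(matrix, rows, cols):
    """Standard MSR of the bicluster formed by the given rows and columns."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            # Residue: deviation of a cell from its row, column and overall means.
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue * residue
    return total / (n_rows * n_cols)


def tournament_select(population, scores, size, rng):
    """Draw `size` distinct candidates at random; return the lowest-scoring one."""
    contestants = rng.sample(range(len(population)), size)
    best = min(contestants, key=lambda idx: scores[idx])
    return population[best]


def select_next_generation(population, scores, n_elite, tournament_size, rng):
    """Keep the `n_elite` best candidates, fill the rest by tournaments.

    Assumption: lower score is better (e.g. MSR used directly as the score).
    """
    ranked = sorted(range(len(population)), key=lambda idx: scores[idx])
    elites = [population[idx] for idx in ranked[:n_elite]]
    rest = [tournament_select(population, scores, tournament_size, rng)
            for _ in range(len(population) - n_elite)]
    return elites + rest
```

In a full algorithm, the selected candidates would then be perturbed by the cuckoo search step before the next evaluation; that part depends on details the abstract does not state.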
