Biclustering Gene Expression Data Using Genetic Simulated Annealing Algorithm

DNA microarray technology produces gene expression matrices that inevitably contain missing entries, often because of poor experimental procedures. Predicting these missing values is essential, since most algorithms that analyse gene expression data require a matrix without missing values. To address this issue, the present study proposes a biclustering algorithm based on Genetic Simulated Annealing (Genetic SA) to predict the entries that are missing in gene expression data. Biclustering is used because it is considered essential for clustering gene expression data. The performance evaluation shows that the proposed Genetic SA predicts the missing entries more accurately than the existing methods.

Hybrid Genetic Algorithm and Simulated Annealing for Clustering Microarray Gene Expression data

Journal of Physics Conference Series ◽

10.1088/1742-6596/1767/1/012034 ◽

2021 ◽

Vol 1767 (1) ◽

pp. 012034

Author(s):

M Pandi ◽

T Sivakumar ◽

N Senthil Madasamy ◽

N Sadhasivam

Keyword(s):

Gene Expression ◽

Genetic Algorithm ◽

Simulated Annealing ◽

Gene Expression Data ◽

Hybrid Genetic Algorithm ◽

Microarray Gene Expression Data ◽

Expression Data ◽

Microarray Gene Expression ◽

Microarray Gene

Effectiveness of Different Partition Based Clustering Algorithms for Estimation of Missing Values in Microarray Gene Expression Data

Advances in Computing and Information Technology - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-642-31552-7_5 ◽

2013 ◽

pp. 37-47 ◽

Cited By ~ 2

Author(s):

Shilpi Bose ◽

Chandra Das ◽

Abirlal Chakraborty ◽

Samiran Chattopadhyay

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Missing Values ◽

Clustering Algorithms ◽

Microarray Gene Expression Data ◽

Expression Data ◽

Microarray Gene Expression ◽

Microarray Gene

Missing-Values Imputation Algorithms for Microarray Gene Expression Data

Methods in Molecular Biology - Microarray Bioinformatics ◽

10.1007/978-1-4939-9442-7_12 ◽

2019 ◽

pp. 255-266 ◽

Cited By ~ 3

Author(s):

Kohbalan Moorthy ◽

Aws Naser Jaber ◽

Mohd Arfian Ismail ◽

Ferda Ernawan ◽

Mohd Saberi Mohamad ◽

...

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Missing Values ◽

Microarray Gene Expression Data ◽

Expression Data ◽

Microarray Gene Expression ◽

Microarray Gene

A Simulated Annealing and Resampling Method for Training Perceptrons to Classify Gene-Expression Data

Artificial Neural Networks — ICANN 2002 - Lecture Notes in Computer Science ◽

10.1007/3-540-46084-5_65 ◽

2002 ◽

pp. 401-407

Author(s):

Andreas A. Albrecht ◽

Staal A. Vinterbo ◽

C. K. Wong ◽

Lucila Ohno-Machado

Keyword(s):

Gene Expression ◽

Simulated Annealing ◽

Gene Expression Data ◽

Expression Data ◽

Resampling Method

SBi-MSREimpute: A Sequential Biclustering Technique Based on Mean Squared Residue and Euclidean Distance to Predict Missing Values in Microarray Gene Expression Data

Advances in Intelligent Systems and Computing - Emerging Technologies in Data Mining and Information Security ◽

10.1007/978-981-13-1498-8_59 ◽

2018 ◽

pp. 673-685 ◽

Cited By ~ 1

Author(s):

Sourav Dutta ◽

Mithila Hore ◽

Faraz Ahmad ◽

Anam Saba ◽

Manuraj Kumar ◽

...

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Euclidean Distance ◽

Missing Values ◽

Microarray Gene Expression Data ◽

Expression Data ◽

Microarray Gene Expression ◽

Microarray Gene

Mining the Gene Expression Matrix: Inferring Gene Relationships from Large Scale Gene Expression Data

Information Processing in Cells and Tissues ◽

10.1007/978-1-4615-5345-8_22 ◽

1998 ◽

pp. 203-212 ◽

Cited By ~ 35

Author(s):

Patrik D’haeseleer ◽

Xiling Wen ◽

Stefanie Fuhrman ◽

Roland Somogyi

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Large Scale ◽

Expression Data ◽

Gene Expression Matrix ◽

Expression Matrix

A novel interpolation based missing value estimation method to predict missing values in microarray gene expression data

2012 International Conference on Communications, Devices and Intelligent Systems (CODIS) ◽

10.1109/codis.2012.6422202 ◽

2012 ◽

Cited By ~ 3

Author(s):

Shilpi Bose ◽

Chandra Das ◽

Sourav Dutta ◽

Samiran Chattopadhyay

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Missing Values ◽

Estimation Method ◽

Microarray Gene Expression Data ◽

Expression Data ◽

Microarray Gene Expression ◽

Value Estimation ◽

Missing Value Estimation ◽

Microarray Gene

Missing Values Estimation for Time Course Gene Expression Data Using the Sequential Partial Least Squares Regression Fitting

Korean Journal of Applied Statistics ◽

10.5351/kjas.2008.21.2.275 ◽

2008 ◽

Vol 21 (2) ◽

pp. 275-290

Keyword(s):

Gene Expression ◽

Least Squares ◽

Partial Least Squares ◽

Gene Expression Data ◽

Time Course ◽

Partial Least Squares Regression ◽

Missing Values ◽

Expression Data ◽

Least Squares Regression

Maximizing the Reusability of Public Gene Expression Data by Predicting Missing Metadata

10.1101/792382 ◽

2019 ◽

Author(s):

Pei-Yau Lung ◽

Xiaodong Pang ◽

Yan Li ◽

Jinfeng Zhang

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Gene Expression Data ◽

Missing Values ◽

Expression Data ◽

New Approach ◽

Machine Learning Methods ◽

Differential Gene ◽

Missing Variables ◽

Better Than

Reusability is part of the FAIR data principle, which aims to make data Findable, Accessible, Interoperable, and Reusable. One of the current efforts to increase the reusability of public genomics data has been to focus on the inclusion of quality metadata associated with the data. When necessary metadata are missing, most researchers will consider the data useless. In this study, we develop a framework to predict the missing metadata of gene expression datasets to maximize their reusability. We propose a new metric called Proportion of Cases Accurately Predicted (PCAP), which is optimized in our specifically-designed machine learning pipeline. The new approach performed better than pipelines using commonly used metrics such as F1-score in terms of maximizing the reusability of data with missing values. We also found that different variables might need to be predicted using different machine learning methods and/or different data processing protocols. Using differential gene expression analysis as an example, we show that when missing variables are accurately predicted, the corresponding gene expression data can be reliably used in downstream analyses.

Microarray missing values imputation methods: Critical analysis review

Computer Science and Information Systems ◽

10.2298/csis0902165h ◽

2009 ◽

Vol 6 (2) ◽

pp. 165-190 ◽

Cited By ~ 6

Author(s):

Mou'ath Hourani ◽

Emary El

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Missing Values ◽

Gene Array ◽

Estimation Method ◽

Least Square ◽

Support Vector ◽

Expression Data ◽

Imputation Methods ◽

Value Estimation

Gene expression data often contain missing expression values. Because effective clustering analysis, like many algorithms for gene expression data analysis, requires a complete matrix of gene array values, choosing the most effective missing value estimation method is necessary. In this paper, the most commonly used imputation methods from the literature are critically reviewed and analyzed to explain the proper use and weaknesses of each published method and to record observations on it. From this analysis, we conclude that the Local Least Square (LLS) and Support Vector Regression (SVR) algorithms achieve the best performance. SVR can be considered a complement to LLS, especially when applied to noisy data. However, both algorithms suffer from deficiencies in choosing the Number of Selected Genes (K) and the appropriate kernel function. To overcome these drawbacks, a new method is needed that chooses these parameters automatically and has acceptable computational complexity.
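
To make the role of K in the abstract above concrete, the following is a minimal sketch of the general local least squares idea. It is not necessarily the exact procedure evaluated in the review. The function name `lls_impute_one`, its parameters, the use of Euclidean distance to pick neighbours, and the restriction to neighbour genes with no missing values are all assumptions made for illustration.

```python
import numpy as np

def lls_impute_one(data, gene, col, k):
    """Estimate data[gene, col] from the k most similar complete genes.

    data: 2-D float array (genes x samples) with np.nan marking missing entries.
    All names here are hypothetical; this is an illustrative sketch only.
    """
    target = data[gene]
    observed = ~np.isnan(target)      # columns where the target gene is known
    observed[col] = False             # never use the column being estimated

    # Assumption: only genes without any missing value may serve as neighbours.
    complete = ~np.isnan(data).any(axis=1)
    complete[gene] = False
    candidates = np.flatnonzero(complete)

    # Assumption: similarity is Euclidean distance over the observed columns.
    diffs = data[candidates][:, observed] - target[observed]
    dists = np.linalg.norm(diffs, axis=1)
    neighbours = candidates[np.argsort(dists)[:k]]

    # Least squares: express the target's observed values as a linear
    # combination of the neighbours' values in the same columns.
    A = data[neighbours][:, observed]          # shape (k, n_observed)
    w = target[observed]                       # shape (n_observed,)
    coeffs, *_ = np.linalg.lstsq(A.T, w, rcond=None)

    # Apply the same combination to the neighbours' values in the missing column.
    return float(data[neighbours, col] @ coeffs)

# Toy matrix: gene 2 equals 2 * gene 0 + gene 1; its last value (12) is hidden.
expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 1.0, 0.0, 1.0, 2.0],
    [4.0, 5.0, 6.0, 9.0, np.nan],
])
estimate = lls_impute_one(expr, gene=2, col=4, k=2)
print(estimate)
```

In this toy matrix the hidden entry is an exact linear combination of the two neighbour genes, so the estimate should recover 12 up to floating-point error. On real data the result depends on K, which is the sensitivity the abstract points out.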
