Probability Based Most Informative Gene Selection From Microarray Data

2018 ◽  
Vol 5 (1) ◽  
pp. 1-12
Author(s):  
Sunanda Das ◽  
Asit Kumar Das

Microarray datasets are widely used in bioinformatics research. Analysing this kind of high-throughput data, which measures the expression levels of thousands of genes, can help identify the cause of a disease and guide its treatment. Many gene analysis techniques exist for extracting biologically relevant information from inconsistent and ambiguous data. In this paper, the database concepts of functional dependency and attribute closure are used to find the most important set of genes for cancer detection. First, the method computes a similarity factor between each pair of genes. Based on the similarity factors, a set of gene dependencies is formed, from which the closure set is obtained. Subsequently, conditional-probability-based interestingness measures are used to determine the most informative genes for disease classification. The proposed method is applied to several publicly available cancer gene expression datasets. The results show the effectiveness and robustness of the algorithm.
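The closure step mentioned in this abstract is borrowed from relational database theory. The following is a minimal sketch of the standard attribute-closure computation only. It assumes gene dependencies are already available as (left-hand side, right-hand side) pairs. The gene names `g1` to `g5` and the function name `closure` are hypothetical, and the authors' similarity factor and conditional probability measures are not reproduced here.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A dependency "lhs -> rhs": once every gene in lhs is in the set,
# every gene in rhs is implied as well.
Dependency = Tuple[FrozenSet[str], FrozenSet[str]]


def closure(genes: Iterable[str], dependencies: List[Dependency]) -> Set[str]:
    """Return the closure of `genes` under `dependencies`.

    Standard fixed-point algorithm: keep applying dependencies whose
    left-hand side is already covered until nothing new is added.
    """
    result: Set[str] = set(genes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in dependencies:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Hypothetical dependencies, for illustration only.
deps: List[Dependency] = [
    (frozenset({"g1"}), frozenset({"g2"})),
    (frozenset({"g2"}), frozenset({"g3"})),
    (frozenset({"g4"}), frozenset({"g5"})),
]

g1_closure = closure({"g1"}, deps)  # g1 implies g2, which implies g3
```

In this toy example the closure of `g1` covers three genes while the closure of `g4` covers two, so a gene with a larger closure can be read as standing in for more of the gene pool.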

2014 ◽  
Author(s):  
Hong-Dong Li ◽  
Qing-Song Xu ◽  
Yi-Zeng Liang

Identifying a small subset of discriminative genes is important for predicting clinical outcomes and facilitating disease diagnosis. Based on the model population analysis framework, we present a method called PHADIA, which outputs a phase diagram displaying the predictive ability of each variable and thus provides an intuitive way of selecting informative variables. Using two publicly available microarray datasets, we demonstrate that our method selects a few informative genes and achieves significantly better or comparable classification accuracy compared with results reported in the literature. The source code is freely available at www.libpls.net.


2021 ◽  
Vol 5 (2) ◽  
pp. 15-21
Author(s):  
Fathima Fajila ◽  
Yuhanis Yusof

Although numerous methods for classification using microarray data have been reported, there is still room in the field of cancer classification for new approaches to informative gene selection. This study introduces a new incremental search-based gene selection approach for cancer classification. The strength of wrappers in determining relevant genes in a gene pool can be increased as they evaluate each possible gene subset. Nevertheless, the search algorithm plays a major role in gene subset selection. Hence, there is the possibility of finding more informative genes with incremental application. Thus, we introduce an approach that uses two search algorithms in gene subset selection. The approach classified five out of six microarray datasets with 100% accuracy using only a few biomarkers, while the remaining dataset was classified with only one misclassification.
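To illustrate what a wrapper-style search over gene subsets looks like, here is a minimal sketch of generic greedy forward selection. It is not the authors' two-algorithm incremental approach. The scoring function (leave-one-out 1-nearest-neighbour accuracy), the toy expression matrix `X`, the labels `y`, and all function names are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Toy expression matrix (rows = samples, columns = genes); hypothetical values.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 1.0, 0.9],
    [0.9, 3.0, 0.5],
    [3.0, 4.8, 0.3],
    [3.2, 1.2, 0.8],
    [2.9, 3.1, 0.6],
]
y = [0, 0, 0, 1, 1, 1]


def loo_1nn_accuracy(genes: Sequence[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the given gene columns."""
    correct = 0
    for i, row in enumerate(X):
        others = [k for k in range(len(X)) if k != i]
        nearest = min(
            others,
            key=lambda k: sum((row[j] - X[k][j]) ** 2 for j in genes),
        )
        correct += int(y[nearest] == y[i])
    return correct / len(X)


def forward_select(
    n_genes: int,
    score: Callable[[List[int]], float],
    max_genes: int,
) -> Tuple[List[int], float]:
    """Greedy wrapper search: add the gene that most improves the score,
    and stop when no candidate gives a strict improvement."""
    selected: List[int] = []
    best = float("-inf")
    while len(selected) < max_genes:
        candidates = [j for j in range(n_genes) if j not in selected]
        if not candidates:
            break
        top_score, top_gene = max((score(selected + [j]), j) for j in candidates)
        if top_score <= best:
            break
        selected.append(top_gene)
        best = top_score
    return selected, best


selected, best = forward_select(n_genes=3, score=loo_1nn_accuracy, max_genes=2)
```

On this toy data only the first gene separates the two classes, so the search stops after selecting a single gene. The example also shows why the choice of search strategy matters: a greedy search evaluates only a small fraction of the possible subsets.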


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Gulden Olgun ◽  
Afshan Nabi ◽  
Oznur Tastan

Background: While some non-coding RNAs (ncRNAs) are assigned critical regulatory roles, most remain functionally uncharacterized. This presents a challenge whenever an interesting set of ncRNAs needs to be analyzed in a functional context. Transcripts located close-by on the genome are often regulated together. This genomic proximity on the sequence can hint at a functional association.
Results: We present a tool, NoRCE, that performs cis enrichment analysis for a given set of ncRNAs. Enrichment is carried out using the functional annotations of the coding genes located proximal to the input ncRNAs. Other biologically relevant information such as topologically associating domain (TAD) boundaries, co-expression patterns, and miRNA target prediction information can be incorporated to conduct a richer enrichment analysis. To this end, NoRCE includes several relevant datasets as part of its data repository, including cell-line specific TAD boundaries, functional gene sets, and expression data for coding and ncRNAs specific to cancer. Additionally, the users can utilize custom data files in their investigation. Enrichment results can be retrieved in a tabular format or visualized in several different ways. NoRCE is currently available for the following species: human, mouse, rat, zebrafish, fruit fly, worm, and yeast.
Conclusions: NoRCE is a platform-independent, user-friendly, comprehensive R package that can be used to gain insight into the functional importance of a list of ncRNAs of any type. The tool offers flexibility to conduct the users’ preferred set of analyses by designing their own pipeline of analysis. NoRCE is available in Bioconductor and https://github.com/guldenolgun/NoRCE.


2018 ◽  
Vol 14 (6) ◽  
pp. 868-880 ◽  
Author(s):  
Shilan S. Hameed ◽  
Fahmi F. Muhammad ◽  
Rohayanti Hassan ◽  
Faisal Saeed

2018 ◽  
Vol 8 (9) ◽  
pp. 1569 ◽  
Author(s):  
Shengbing Wu ◽  
Hongkun Jiang ◽  
Haiwei Shen ◽  
Ziyi Yang

In recent years, gene selection for cancer classification based on the expression of a small number of gene biomarkers has been the subject of much research in genetics and molecular biology. The successful identification of gene biomarkers will help in the classification of different types of cancer and improve the prediction accuracy. Recently, regularized logistic regression using $L_1$ regularization has been successfully applied in high-dimensional cancer classification to estimate gene coefficients and perform gene selection simultaneously. However, $L_1$ regularization yields biased gene selection and does not have the oracle property. To address these problems, we investigate $L_{1/2}$ regularized logistic regression for gene selection in cancer classification. Experimental results on three DNA microarray datasets demonstrate that our proposed method outperforms other commonly used sparse methods ($L_1$ and $L_{EN}$) in terms of classification performance.
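For orientation, the difference between the two penalties can be written as one penalized objective. This is the standard textbook form, given as a sketch rather than the authors' exact formulation. The symbols are assumptions introduced here: $n$ samples, $d$ genes, expression vector $x_i$, binary label $y_i$, coefficient vector $\beta$, and tuning parameter $\lambda > 0$.

```latex
\hat{\beta} = \arg\min_{\beta}
\left\{
  -\sum_{i=1}^{n} \Bigl[ y_i \log \pi_i + (1 - y_i) \log (1 - \pi_i) \Bigr]
  + \lambda \sum_{j=1}^{d} |\beta_j|^{q}
\right\},
\qquad
\pi_i = \frac{1}{1 + e^{-x_i^{\top} \beta}}
```

Setting $q = 1$ gives the $L_1$ (lasso) penalty, and $q = 1/2$ gives the $L_{1/2}$ penalty. The $L_{1/2}$ penalty is non-convex and, relative to $L_1$, penalizes large coefficients less and small coefficients more, which is the usual motivation for expecting sparser and less biased gene selection.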


2021 ◽  
Vol 288 (1953) ◽  
pp. 20210774
Author(s):  
Beth Mortimer ◽  
James A. Walker ◽  
David S. Lolchuragi ◽  
Michael Reinwald ◽  
David Daballen

African elephants (Loxodonta africana) use many sensory modes to gather information about their environment, including the detection of seismic, or ground-based, vibrations. Seismic information is known to include elephant-generated signals, but also potentially encompasses biotic cues that are commonly referred to as ‘noise’. To investigate seismic information transfer in elephants beyond communication, here we tested the hypothesis that wild elephants detect and discriminate between seismic vibrations that differ in their noise types, whether elephant- or human-generated. We played three types of seismic vibrations to elephants: seismic recordings of elephants (elephant-generated), white noise (human-generated) and a combined track (elephant- and human-generated). We found evidence of both detection of seismic noise and discrimination between the two treatments containing human-generated noise. In particular, we found evidence of retreat behaviour, where seismic tracks with human-generated noise caused elephants to move further away from the trial location. We conclude that seismic noise is a cue that contains biologically relevant information for elephants, which they can associate with risk. This expands our understanding of how elephants use seismic information, with implications for elephant sensory ecology and conservation management.


Author(s):  
Rollin McCraty ◽  
Stephen Brock Schafer

The earth's magnetic fields are carriers of biologically relevant information that connects all living systems. The electromagnetic coupling of the human brain, cardiovascular and nervous systems, and geomagnetic frequencies supports the hypothesis that the mediated reality of electromagnetic bandwidths can be correlated with bio-energetic and geomagnetic frequencies. Understood as bio-energetic functions (Thinking, Feeling, Sensing, & Intuiting), the media-sphere becomes measurable according to principles of coherency (measured as heart-rate variability, HRV) and principles of Jungian dream analysis (compensation and dramatic structure). It has been demonstrated that the rhythmic patterns in beat-to-beat heart rate variability reflect emotional functions, permeate every bodily cell, and play a central role in the generation and transmission of system-wide information via the electromagnetic field. So, the “media dream” becomes susceptible to psychological analysis leading to a better understanding of unconscious cognitive archetypal patterns of contextual collectives.

