motif finding algorithm Latest Research Papers

ProSampler: an ultrafast and accurate motif finder in large ChIP-seq datasets for combinatory motif discovery

Motivation: The availability of numerous ChIP-seq datasets for transcription factors (TFs) has provided an unprecedented opportunity to identify all TF binding sites in genomes. However, progress has been hindered by the lack of a highly efficient and accurate tool to find not only the target motifs, but also cooperative motifs in very big datasets.

Results: We herein present an ultrafast and accurate motif-finding algorithm, ProSampler, based on a novel numeration method and Gibbs sampler. ProSampler runs orders of magnitude faster than the fastest existing tools while often more accurately identifying motifs of both the target TFs and cooperators. Thus, ProSampler can greatly facilitate the efforts to identify the entire cis-regulatory code in genomes.

Availability and implementation: Source code and binaries are freely available for download at https://github.com/zhengchangsulab/prosampler. It is implemented in C++ and supported on Linux, macOS and MS Windows platforms.

Supplementary information: Supplementary materials are available at Bioinformatics online.
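
For orientation, a textbook Gibbs sampler for motif finding can be sketched as below. This is not ProSampler's implementation (which is written in C++ and pairs the sampler with its own numeration method); the function name, the parameters, the pseudocount of 1, and the assumption of one site per sequence over the alphabet ACGT are all illustrative choices.

```python
import random

def gibbs_motif_search(sequences, width, iterations=200, seed=0):
    """Generic Gibbs sampler: returns one motif start offset per sequence.

    Hypothetical helper for illustration only; assumes every sequence
    contains only A, C, G, T and is at least `width` long.
    """
    rng = random.Random(seed)
    alphabet = "ACGT"
    # Start from a random site in every sequence.
    starts = [rng.randrange(len(s) - width + 1) for s in sequences]
    for _ in range(iterations):
        for held_out, seq in enumerate(sequences):
            # Build a profile from all other sequences (pseudocount 1).
            counts = [{b: 1.0 for b in alphabet} for _ in range(width)]
            for j, other in enumerate(sequences):
                if j == held_out:
                    continue
                site = other[starts[j]:starts[j] + width]
                for pos, base in enumerate(site):
                    counts[pos][base] += 1.0
            total = len(sequences) - 1 + len(alphabet)
            # Weight every window of the held-out sequence by its
            # probability under the profile, then resample its start.
            weights = []
            for i in range(len(seq) - width + 1):
                w = 1.0
                for pos in range(width):
                    w *= counts[pos][seq[i + pos]] / total
                weights.append(w)
            starts[held_out] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts
```

Because the sampler is stochastic, different seeds can return different site sets; fixing `seed` only makes a single run reproducible.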

De Novo Motif Prediction Using the Fireworks Algorithm

Biotechnology ◽ 10.4018/978-1-5225-8903-7.ch041 ◽ 2019 ◽ pp. 1069-1085

Author(s): Andrei Lihu ◽ Ștefan Holban

Keyword(s): Gene Expression ◽ Motif Discovery ◽ De Novo ◽ Fireworks Algorithm ◽ Regulatory Processes ◽ Motif Prediction ◽ Leibler Divergence ◽ De Novo Motif Discovery ◽ Low Performance ◽ Motif Finding Algorithm

De novo motif discovery is essential to understanding the cis-regulatory processes that play a role in gene expression. Finding unknown patterns of unknown lengths in massive amounts of data has long been a major challenge in computational biology. Because algorithms for motif prediction have always suffered from low performance, there is a constant effort to find better techniques. Evolutionary methods, including swarm intelligence algorithms, have been applied to motif prediction with limited success. However, recently developed methods, such as the Fireworks Algorithm (FWA), which simulates the explosion process of fireworks, may show better prospects. This paper describes a motif finding algorithm based on FWA that maximizes the Kullback-Leibler divergence between candidate solutions and the background noise. Following the terminology of FWA's framework, the candidate motifs are fireworks that generate additional sparks (i.e. derived motifs) in their neighborhood. During the iterations, better sparks can replace the fireworks, as the Fireworks Motif Finder (FW-MF) assumes a one-occurrence-per-sequence mode. The results obtained on a standard benchmark for promoter analysis show that our proof of concept is promising.
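
The objective named in this abstract, the Kullback-Leibler divergence between a candidate motif and the background, can be illustrated with a minimal sketch. It assumes a candidate is a list of start offsets, one per sequence (the one-occurrence-per-sequence mode), and that the background is a single base-frequency table; the function names and this encoding are hypothetical and are not taken from FW-MF.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def column_frequencies(sites):
    """Per-position base frequencies of equal-length sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    columns = []
    for pos in range(width):
        counts = Counter(site[pos] for site in sites)
        columns.append({b: counts[b] / len(sites) for b in ALPHABET})
    return columns

def kl_divergence(columns, background):
    """Sum over positions of sum_b p(b) * log2(p(b) / q(b))."""
    total = 0.0
    for column in columns:
        for base, p in column.items():
            if p > 0.0:  # the term 0 * log(0) is taken as 0
                total += p * math.log2(p / background[base])
    return total

def score_candidate(sequences, starts, width, background):
    """Score one candidate: one start offset per sequence (assumed encoding)."""
    sites = [seq[s:s + width] for seq, s in zip(sequences, starts)]
    return kl_divergence(column_frequencies(sites), background)
```

Under a uniform background, a perfectly conserved column contributes 2 bits and a column that matches the background contributes 0, so a search procedure that maximizes this score favors sets of sites that are both conserved and unlike the background.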

IndeCut evaluates performance of network motif discovery algorithms

10.1101/156836 ◽ 2017

Author(s): Mitra Ansariola ◽ Molly Megraw ◽ David Koslicki

Keyword(s): Motif Discovery ◽ Network Motif ◽ Practical Method ◽ Algorithm Performance ◽ Complex Processes ◽ Open Source Software Package ◽ Discovery Algorithms ◽ Motif Discovery Algorithm ◽ Motif Finding Algorithm

Genomic networks represent a complex map of molecular interactions which are descriptive of the biological processes occurring in living cells. Identifying the small over-represented circuitry patterns in these networks helps generate hypotheses about the functional basis of such complex processes. Network motif discovery is a systematic way of achieving this goal. However, a reliable network motif discovery outcome requires generating random background networks which are the result of a uniform and independent graph sampling method. To date, there has been no sound practical method to numerically evaluate whether any network motif discovery algorithm performs as intended; thus it was not possible to assess the validity of resulting network motifs. In this work, we present IndeCut, the first and only method that allows characterization of network motif finding algorithm performance on any network of interest. We demonstrate that it is critical to use IndeCut prior to running any network motif finder for two reasons. First, IndeCut estimates the minimally required number of samples that each network motif discovery tool needs in order to produce an outcome that is both reproducible and accurate. Second, IndeCut allows users to choose the most accurate network motif discovery tool for their network of interest among many available options. IndeCut is an open source software package and is available at https://github.com/megrawlab/IndeCut.
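
To make "small circuitry patterns" concrete, the sketch below counts one such pattern, the feed-forward loop (a→b, b→c, a→c), in a directed edge list. The choice of pattern, the function name, and the brute-force enumeration are assumptions made for illustration; the sketch does not reproduce IndeCut or any discovery tool, and it leaves out the step the abstract treats as critical, namely comparing the count against uniformly sampled random background networks.

```python
from itertools import permutations

def count_feed_forward_loops(edges):
    """Count ordered node triples (a, b, c) with edges a->b, b->c and a->c.

    Brute-force enumeration, suitable only for small illustrative graphs.
    """
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    count = 0
    for a, b, c in permutations(nodes, 3):
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set:
            count += 1
    return count
```

A pattern is called over-represented only relative to the counts obtained in the background networks, which is why the quality of the background sampling affects every downstream conclusion.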

Motif-Based Text Mining of Microbial Metagenome Redundancy Profiling Data for Disease Classification

BioMed Research International ◽ 10.1155/2016/6598307 ◽ 2016 ◽ Vol 2016 ◽ pp. 1-11 ◽ Cited By ~ 2

Author(s): Yin Wang ◽ Rudong Li ◽ Yuhua Zhou ◽ Zongxin Ling ◽ Xiaokui Guo ◽ ...

Keyword(s): Dental Caries ◽ 16S rRNA ◽ High Dimension ◽ Disease Classification ◽ Motif Finding ◽ Text Data ◽ Feature Spaces ◽ Efficiency And Reliability ◽ Motif Finding Algorithm ◽ Better Than

Background. Text data of 16S rRNA are informative for classifications of microbiota-associated diseases. However, the raw text data need to be systematically processed so that features for classification can be defined/extracted; moreover, the high-dimension feature spaces generated by the text data also pose an additional difficulty. Results. Here we present a Phylogenetic Tree-Based Motif Finding algorithm (PMF) to analyze 16S rRNA text data. By integrating phylogenetic rules and other statistical indexes for classification, we can effectively reduce the dimension of the large feature spaces generated by the text datasets. Using the retrieved motifs in combination with common classification methods, we can discriminate different samples of both pneumonia and dental caries better than other existing methods. Conclusions. We extend the phylogenetic approaches to perform supervised learning on microbiota text data to discriminate the pathological states for pneumonia and dental caries. The results have shown that PMF may enhance the efficiency and reliability in analyzing high-dimension text data.

Comparison of result differences in multiple implementations of a stochastic motif finding algorithm

2015 E-Health and Bioengineering Conference (EHB) ◽ 10.1109/ehb.2015.7391409 ◽ 2015

Author(s): Mihai Isaroiu ◽ Luca Dan Serbanati

Keyword(s): Motif Finding ◽ Motif Finding Algorithm

De novo Motif Prediction using the Fireworks Algorithm

International Journal of Swarm Intelligence Research ◽ 10.4018/ijsir.2015070102 ◽ 2015 ◽ Vol 6 (3) ◽ pp. 24-40 ◽ Cited By ~ 6

Author(s): Andrei Lihu ◽ Ștefan Holban

Keyword(s): Motif Discovery ◽ De Novo ◽ Proof Of Concept ◽ Fireworks Algorithm ◽ Regulatory Processes ◽ Motif Prediction ◽ Leibler Divergence ◽ De Novo Motif Discovery ◽ Low Performance ◽ Motif Finding Algorithm

De novo motif discovery is essential to understanding the cis-regulatory processes that play a role in gene expression. Finding unknown patterns of unknown lengths in massive amounts of data has long been a major challenge in computational biology. Because algorithms for motif prediction have always suffered from low performance, there is a constant effort to find better techniques. Evolutionary methods, including swarm intelligence algorithms, have been applied to motif prediction with limited success. However, recently developed methods, such as the Fireworks Algorithm (FWA), which simulates the explosion process of fireworks, may show better prospects. This paper describes a motif finding algorithm based on FWA that maximizes the Kullback-Leibler divergence between candidate solutions and the background noise. Following the terminology of FWA's framework, the candidate motifs are fireworks that generate additional sparks (i.e. derived motifs) in their neighborhood. During the iterations, better sparks can replace the fireworks, as the Fireworks Motif Finder (FW-MF) assumes a one-occurrence-per-sequence mode. The results obtained on a standard benchmark for promoter analysis show that our proof of concept is promising.

A Fast Cluster Motif Finding Algorithm for ChIP-Seq Data Sets

BioMed Research International ◽ 10.1155/2015/218068 ◽ 2015 ◽ Vol 2015 ◽ pp. 1-10 ◽ Cited By ~ 4

Author(s): Yipu Zhang ◽ Ping Wang

Keyword(s): High Throughput ◽ Motif Discovery ◽ Large Scale ◽ High Throughput Sequencing ◽ ES Cells ◽ Motif Finding ◽ Data Sets ◽ Data Set ◽ Binding Motifs ◽ Motif Finding Algorithm

The high-throughput technique ChIP-seq, which couples chromatin immunoprecipitation with high-throughput sequencing, has extended the identification of transcription factor binding locations to genome-wide regions. However, most existing motif discovery algorithms are time-consuming and limited in their ability to identify binding motifs in ChIP-seq data, which are typically large in scale. To improve efficiency, we propose a fast cluster motif finding algorithm, named FCmotif, to identify (l, d) motifs in large-scale ChIP-seq data sets. It is inspired by the emerging substrings mining strategy: it finds the enriched substrings and then searches the neighborhood instances to construct PWMs and cluster motifs of different lengths. FCmotif does not follow the OOPS model constraint and can find long motifs. The effectiveness of the proposed algorithm has been shown by experiments on ChIP-seq data sets from mouse ES cells. The detection of the real binding motifs and the processing of full-size data of several megabytes finished in a few minutes. The experimental results show that FCmotif is advantageous for (l, d) motif finding in ChIP-seq data; it also demonstrates better performance than other widely used algorithms such as MEME, Weeder, ChIPMunk, and DREME.
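
The (l, d) formulation used in this abstract treats a motif as a pattern of length l whose instances differ from it in at most d positions. The sketch below shows only that matching step and the construction of a position count matrix from the matched instances, the raw counts from which a PWM is usually derived. It is not FCmotif: the function names are hypothetical, the seed pattern is given rather than mined, and no clustering is performed. Like the description above, it does not impose one occurrence per sequence.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def find_instances(sequences, seed, d):
    """Collect every window within d mismatches of the seed pattern.

    A sequence may contribute zero, one, or several windows.
    """
    l = len(seed)
    hits = []
    for seq in sequences:
        for i in range(len(seq) - l + 1):
            window = seq[i:i + l]
            if hamming(window, seed) <= d:
                hits.append(window)
    return hits

def count_matrix(instances, alphabet="ACGT"):
    """Position count matrix: base -> list of per-position counts."""
    l = len(instances[0])
    matrix = {base: [0] * l for base in alphabet}
    for instance in instances:
        for pos, base in enumerate(instance):
            matrix[base][pos] += 1
    return matrix
```

For example, with the seed `ACGT` and d = 1, the window `ACCT` counts as an instance while `AAAA` (three mismatches) does not.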

A multi-class discriminative motif finding algorithm for autosomal genomic data. (c2015)

10.26756/th.2015.49 ◽ 2015

Author(s): Gioia Wahib Wehbe

Keyword(s): Genomic Data ◽ Motif Finding ◽ Motif Finding Algorithm

motif finding algorithm
Recently Published Documents

Developing a motif finding algorithm using Suffix Tree and Hash Table

An Identical String Motif Finding Algorithm Through Dynamic Programming

ProSampler: an ultrafast and accurate motif finder in large ChIP-seq datasets for combinatory motif discovery

De Novo Motif Prediction Using the Fireworks Algorithm

IndeCut evaluates performance of network motif discovery algorithms

Motif-Based Text Mining of Microbial Metagenome Redundancy Profiling Data for Disease Classification

Comparison of result differences in multiple implementations of a stochastic motif finding algorithm

De novo Motif Prediction using the Fireworks Algorithm

A Fast Cluster Motif Finding Algorithm for ChIP-Seq Data Sets

A multi-class discriminative motif finding algorithm for autosomal genomic data. (c2015)
