Mining Sequence Patterns in Evolving Databases

Temporal and Spatio-Temporal Data Mining

DOI: 10.4018/978-1-59904-387-6.ch004

2008, pp. 63-86

Author(s): Wynne Hsu, Mong Li Lee, Junmei Wang

Keyword(s): Incremental Maintenance, Sequence Patterns, Mining Sequence

In this chapter, we analyze and improve the I/O performance of the GSP algorithm (Agrawal & Srikant, 1996). We also study the problem of incremental maintenance of frequent sequences.

Mining Sequence Patterns from Wind Tunnel Experimental Data for Flight Control

Advances in Knowledge Discovery and Data Mining - Lecture Notes in Computer Science

DOI: 10.1007/3-540-45357-1_30

2001, pp. 270-281

Author(s): Zhenyu Liu, Wesley W. Chu, Adam Huang, Chris Folk, Chih-Ming Ho

Keyword(s): Experimental Data, Wind Tunnel, Flight Control, Sequence Patterns, Mining Sequence

Analyzing the intervening sequences of pluripotency associated genes to identify conserved sequence patterns in human using integrative bioinformatics

International Journal of Pharma and Bio Sciences

DOI: 10.22376/ijpbs.2017.8.1.b582-587

2017, Vol 8 (1)

Author(s): Priyanka Narad, Gulshan Wadhwa, Kailash C Upadhyaya

Keyword(s): Conserved Sequence, Intervening Sequences, Sequence Patterns

Flexible k-mers with variable-length indels for identifying binding sequences of protein dimers

Briefings in Bioinformatics

DOI: 10.1093/bib/bbz101

2019, Vol 21 (5), pp. 1787-1797

Author(s): Chenyang Hong, Kevin Y Yip

Keyword(s): Binding Proteins, Affinity Purification, Classification Performance, Variable Regions, Binding Motifs, New Class, Protein Dimers, Sequence Patterns, Standard Chip, Exponential Enrichment

Abstract: Many DNA-binding proteins interact with partner proteins. Recently, based on the high-throughput consecutive affinity-purification systematic evolution of ligands by exponential enrichment (CAP-SELEX) method, many such protein pairs have been found to bind DNA with flexible spacing between their individual binding motifs. Most existing motif representations were not designed to capture such flexibly spaced regions. In order to computationally discover more co-binding events without prior knowledge about the identities of the co-binding proteins, a new representation is needed. We propose a new class of sequence patterns that flexibly model such variable regions and corresponding algorithms that identify co-bound sequences using these patterns. Based on both simulated and CAP-SELEX data, features derived from our sequence patterns lead to better classification performance than patterns that do not explicitly model the variable regions. We also show that even for standard ChIP-seq data, this new class of sequence patterns can help discover co-bound events in a subset of sequences in an unsupervised manner. The open-source software is available at https://github.com/kevingroup/glk-SVM.
