Toward Optimizing the Cache Performance of Suffix Trees for Sequence Analysis Algorithms Suffix Tree Cache Performance Optimization

Suffix Tree Data Structures for Matrices

We discuss the suffix tree generalization to matrices in this chapter. We extend the suffix tree notion (described in Chapter 3) from text strings to text matrices whose entries are taken from an ordered alphabet with the aim of solving pattern-matching problems. This suffix tree generalization can be efficiently used to implement low-level routines for Computer Vision, Data Compression, Geographic Information Systems and Visual Databases. We examine the submatrices in the form of the text’s contiguous parts that still have a matrix shape. Representing these text submatrices as “suitably formatted” strings stored in a compacted trie is the rationale behind suffix trees for matrices. The choice of the format inevitably influences suffix tree construction time and space complexity.

We first deal with square matrices and show that many suffix tree families can be defined for the same input matrix according to the matrix’s string representations. We can store each suffix tree in linear space and give an efficient construction algorithm whose input is both the matrix and the string representation chosen. We then treat rectangular matrices and define their corresponding suffix trees by means of some general rules which we list formally. We show that there is a super-linear lower bound to the space required (in contrast with the linear space required by suffix trees for square matrices). We give a simple example of one of these suffix trees. The last part of the chapter illustrates some technical results regarding suffix trees for square matrices: we show how to achieve an expected linear-time suffix tree construction for a constant-size alphabet under some mild probabilistic assumptions about the input distribution.

We begin by defining a wide class of string representations for square matrices. We let Σ denote an ordered alphabet of characters and introduce another alphabet of five special characters, called shapes. A shape is one of the special characters taken from the set {IN, SW, NW, SE, NE}. Shape IN encodes the 1×1 matrix generated from the empty matrix by creating a square.

From Suffix Trees to Suffix Vectors

International Journal of Foundations of Computer Science ◽ 10.1142/s0129054106004479 ◽ 2006 ◽ Vol 17 (06) ◽ pp. 1385-1402 ◽ Cited By ~ 1

Author(s): Élise Prieur ◽ Thierry Lecroq

Keyword(s): Data Structures ◽ Suffix Tree ◽ Suffix Trees ◽ Linear Algorithms ◽ Economical Alternative

We present a first formal setting for suffix vectors that are space economical alternative data structures to suffix trees. We give two linear algorithms for converting a suffix tree into a suffix vector and conversely. We enrich suffix vectors with formulas for counting the number of occurrences of repeated substrings. We also propose an alternative implementation for suffix vectors that should outperform the existing one.

OPTIMAL PARALLEL CONSTRUCTION OF MINIMAL SUFFIX AND FACTOR AUTOMATA

Parallel Processing Letters ◽ 10.1142/s0129626496000054 ◽ 1996 ◽ Vol 06 (01) ◽ pp. 35-44 ◽ Cited By ~ 5

Author(s): DANY BRESLAUER ◽ RAMESH HARIHARAN

Keyword(s): Parallel Algorithms ◽ Data Structures ◽ Suffix Tree ◽ Finite Automata ◽ Suffix Trees ◽ Deterministic Finite Automata ◽ Tree Construction ◽ Parallel Construction ◽ Construction Algorithms

This paper gives optimal parallel algorithms for the construction of the smallest deterministic finite automata recognizing all the suffixes and the factors of a string. The algorithms use recently discovered optimal parallel suffix tree construction algorithms together with data structures for the efficient manipulation of trees, exploiting the well known relation between suffix and factor automata and suffix trees.
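
As a point of reference for the object being built, here is a minimal sketch of the classic sequential, online construction of the suffix automaton of a string. It is not the parallel algorithm the abstract describes; the class and method names (`SuffixAutomaton`, `is_factor`, `is_suffix`) are hypothetical and chosen only for illustration.

```python
class SuffixAutomaton:
    """Smallest DFA recognizing the suffixes of `text` (sequential sketch)."""

    def __init__(self, text):
        self.next = [{}]     # outgoing transitions of each state
        self.link = [-1]     # suffix link of each state
        self.length = [0]    # length of the longest string reaching each state
        self.last = 0        # state reached by the whole text read so far
        for ch in text:
            self._extend(ch)

    def _new_state(self, length):
        self.next.append({})
        self.link.append(-1)
        self.length.append(length)
        return len(self.length) - 1

    def _extend(self, ch):
        cur = self._new_state(self.length[self.last] + 1)
        p = self.last
        # Add the new transition along the suffix-link path while it is missing.
        while p != -1 and ch not in self.next[p]:
            self.next[p][ch] = cur
            p = self.link[p]
        if p == -1:
            self.link[cur] = 0
        else:
            q = self.next[p][ch]
            if self.length[p] + 1 == self.length[q]:
                self.link[cur] = q
            else:
                # Split q: the clone keeps only the shorter strings.
                clone = self._new_state(self.length[p] + 1)
                self.next[clone] = dict(self.next[q])
                self.link[clone] = self.link[q]
                while p != -1 and self.next[p].get(ch) == q:
                    self.next[p][ch] = clone
                    p = self.link[p]
                self.link[q] = clone
                self.link[cur] = clone
        self.last = cur

    def _walk(self, pattern):
        """Return the state reached by reading `pattern`, or None."""
        state = 0
        for ch in pattern:
            state = self.next[state].get(ch)
            if state is None:
                return None
        return state

    def terminal_states(self):
        """States on the suffix-link path from `last` accept suffixes."""
        terminals, state = set(), self.last
        while state != -1:
            terminals.add(state)
            state = self.link[state]
        return terminals

    def is_factor(self, pattern):
        # Every state is accepting when the automaton is read as a factor automaton.
        return self._walk(pattern) is not None

    def is_suffix(self, pattern):
        return self._walk(pattern) in self.terminal_states()


automaton = SuffixAutomaton("abcbc")
print(automaton.is_factor("bcb"), automaton.is_suffix("abc"))  # True False
```

Treating every state as accepting turns the same structure into a recognizer for factors; the minimal factor automaton may be smaller, so this sketch shows only the relation between the two, not a minimal factor automaton.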

Reversed Lempel–Ziv Factorization with Suffix Trees

Algorithms ◽ 10.3390/a14060161 ◽ 2021 ◽ Vol 14 (6) ◽ pp. 161

Author(s): Dominik Köppl

Keyword(s): Suffix Tree ◽ Linear Time ◽ Suffix Trees ◽ Tree Representations ◽ Linear Time Algorithms

We present linear-time algorithms computing the reversed Lempel–Ziv factorization [Kolpakov and Kucherov, TCS’09] within the space bounds of two different suffix tree representations. We can adapt these algorithms to compute the longest previous non-overlapping reverse factor table [Crochemore et al., JDA’12] within the same space but pay a multiplicative logarithmic time penalty.
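
To make the factorization concrete, the following is a naive sketch, not the linear-time suffix tree algorithm of the abstract. It assumes the definition in which each factor is either a single character or the longest prefix of the remaining text whose reverse occurs in the already factorized prefix; this definition is an assumption here, since the abstract does not spell it out. The function name `reversed_lz` is hypothetical.

```python
def reversed_lz(text):
    """Naive reversed Lempel-Ziv factorization (polynomial time, for illustration).

    Assumed definition: each factor is the longest prefix of the remaining
    text whose reverse occurs in the part already factorized, or a single
    character when no such non-empty prefix exists.
    """
    factors = []
    i = 0
    n = len(text)
    while i < n:
        # reverse(P) occurs in text[:i]  <=>  P occurs in reverse(text[:i])
        reversed_prefix = text[:i][::-1]
        length = 0
        while i + length < n and text[i:i + length + 1] in reversed_prefix:
            length += 1
        if length == 0:
            length = 1  # no reversed occurrence: emit a single character
        factors.append(text[i:i + length])
        i += length
    return factors


print(reversed_lz("abba"))  # ['a', 'b', 'ba']
```

In the example, the last factor `ba` is chosen because its reverse `ab` occurs in the already processed prefix `ab`.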

Research on the Cache Performance Optimization Technology of Multi-Core Processor Chip

Proceedings of the 2016 4th International Conference on Electrical & Electronics Engineering and Computer Science (ICEEECS 2016) ◽ 10.2991/iceeecs-16.2016.49 ◽ 2016

Author(s): Su Zhang

Keyword(s): Performance Optimization ◽ Cache Performance ◽ Multi Core Processor

Cache Performance Optimization for SoC Vedio Applications

Journal of Multimedia ◽ 10.4304/jmm.9.7.926-933 ◽ 2014 ◽ Vol 9 (7) ◽ Cited By ~ 1

Author(s): Lei Li ◽ Wei Zhang ◽ HuiYao An ◽ Xing Zhang ◽ HuaiQi Zhu

Keyword(s): Performance Optimization ◽ Cache Performance

Efficient Data Transfer in a Heterogeneous Multicore-Based CE Systems using Cache Performance Optimization

IEEE Consumer Electronics Magazine ◽ 10.1109/mce.2019.2923928 ◽ 2019 ◽ Vol 8 (5) ◽ pp. 46-50 ◽ Cited By ~ 1

Author(s): Juan Fang ◽ Xiaoting Hao ◽ Qingwen Fan ◽ Kai Li ◽ Hui Zhao

Keyword(s): Performance Optimization ◽ Data Transfer ◽ Cache Performance ◽ Heterogeneous Multicore ◽ Efficient Data

Linearized Suffix Tree: an Efficient Index Data Structure with the Capabilities of Suffix Trees and Suffix Arrays

Algorithmica ◽ 10.1007/s00453-007-9061-2 ◽ 2007 ◽ Vol 52 (3) ◽ pp. 350-377 ◽ Cited By ~ 8

Author(s): Dong Kyue Kim ◽ Minhwan Kim ◽ Heejin Park

Keyword(s): Data Structure ◽ Suffix Tree ◽ Suffix Trees ◽ Suffix Arrays ◽ Index Data

THE VIRTUAL SUFFIX TREE

International Journal of Foundations of Computer Science ◽ 10.1142/s0129054109007066 ◽ 2009 ◽ Vol 20 (06) ◽ pp. 1109-1133 ◽ Cited By ~ 2

Author(s): JIE LIN ◽ YUE JIANG ◽ DON ADJEROH

Keyword(s): Suffix Tree ◽ Linear Time ◽ Suffix Array ◽ Intermediate Step ◽ Suffix Trees ◽ String Length ◽ Space Requirement ◽ Suffix Arrays ◽ Tree Construction ◽ Efficient Data

We introduce the VST (virtual suffix tree), an efficient data structure for suffix trees and suffix arrays. Starting from the suffix array, we construct the suffix tree, from which we derive the virtual suffix tree. Later, we remove the intermediate step of suffix tree construction, and build the VST directly from the suffix array. The VST provides the same functionality as the suffix tree, including suffix links, but at a much smaller space requirement. It has the same linear time construction even for large alphabets, Σ, requires O(n) space to store (n is the string length), and allows searching for a pattern of length m to be performed in O(m log |Σ|) time, the same time needed for a suffix tree. Given the VST, we show an algorithm that computes all the suffix links in linear time, independent of Σ. The VST requires less space than other recently proposed data structures for suffix trees and suffix arrays, such as the enhanced suffix array [1], and the linearized suffix tree [17]. On average, the space requirement (including that for suffix arrays and suffix links) is 13.8n bytes for the regular VST, and 12.05n bytes in its compact form.
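
For contrast with the O(m log |Σ|) search bound quoted above, the sketch below shows pattern search on a plain suffix array, the structure the VST starts from. It is not the VST: construction here is a naive sort rather than a linear-time algorithm, and search is a binary search over the sorted suffixes, which makes O(log n) comparisons of up to m characters each. The function names are hypothetical.

```python
def build_suffix_array(text):
    """Naive suffix array: starting positions sorted by the suffix they begin."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return all start positions of `pattern` in `text`, in increasing order."""
    m = len(pattern)

    def prefix(rank):
        # First m characters of the suffix with the given rank.
        start = suffix_array[rank]
        return text[start:start + m]

    # Lower bound: first rank whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first rank whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[first:lo])


text = "banana"
suffix_array = build_suffix_array(text)
print(suffix_array)                                 # [5, 3, 1, 0, 4, 2]
print(find_occurrences(text, suffix_array, "ana"))  # [1, 3]
```

The number of occurrences is the width of the matched rank interval, so counting needs no extra pass over the text.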
