Text Indexing for Regular Expression Matching

Algorithms ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 133
Author(s):  
Daniel Gibney ◽  
Sharma V. Thankachan

Finding substrings of a text T that match a regular expression p is a fundamental problem. Despite being the subject of extensive research, no solution with a time complexity significantly better than O(|T||p|) has been found. Backurs and Indyk (FOCS 2016) established conditional lower bounds for the algorithmic problem, based on the Strong Exponential Time Hypothesis, that help explain this difficulty. A natural question is whether we can improve the time complexity for matching the regular expression by preprocessing the text T. We show that, conditioned on the Online Matrix–Vector Multiplication (OMv) conjecture, even with arbitrary polynomial preprocessing time, a regular expression query on a text cannot be answered in strongly sublinear time, i.e., O(|T|^(1−ε)) for any ε > 0. Furthermore, if we extend the OMv conjecture to a plausible conjecture regarding Boolean matrix multiplication with polynomial preprocessing time, which we call Online Matrix–Matrix Multiplication (OMM), we can strengthen this hardness result to rule out any solution with query time O(|T|^(3/2−ε)). These results hold for alphabet sizes three or greater. We then provide data structures that answer queries in O(|T||p|/τ) time, where τ ∈ [1, |T|] is fixed at construction. These include a solution that works for all regular expressions with exp(τ)·|T| preprocessing time and space. For patterns containing only 'concatenation' and 'or' operators (the same type used in the hardness result), we provide (1) a deterministic solution which requires exp(τ)·|T|·log²|T| preprocessing time and space, and (2) when |p| ≤ |T|^z for z = 2^(o(√(log|T|))), a randomized solution with amortized query time which answers queries correctly with high probability, requiring exp(τ)·|T|·2^(O(√(log|T|))) preprocessing time and space.
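The O(|T||p|)-style baseline for the 'concatenation'/'or' pattern class discussed above can be sketched as a set-of-positions simulation. The tuple-based AST encoding ('lit'/'cat'/'or') is an illustrative assumption, not the paper's data structure:

```python
def match_ends(node, text, starts):
    """Given a set of start positions, return the positions where `node` can end."""
    kind = node[0]
    if kind == 'lit':                       # single literal character
        ch = node[1]
        return {i + 1 for i in starts if i < len(text) and text[i] == ch}
    if kind == 'cat':                       # thread the position set through each factor
        for child in node[1]:
            starts = match_ends(child, text, starts)
        return starts
    if kind == 'or':                        # union of the alternatives' end sets
        out = set()
        for child in node[1]:
            out |= match_ends(child, text, starts)
        return out

def find_matches(pattern, text):
    """All (i, j) such that text[i:j] matches the pattern (naive baseline)."""
    hits = []
    for i in range(len(text) + 1):
        for j in match_ends(pattern, text, {i}):
            hits.append((i, j))
    return sorted(hits)
```

For example, the pattern (a|b)c encoded as `('cat', [('or', [('lit','a'), ('lit','b')]), ('lit','c')])` matches "acbc" at offsets (0, 2) and (2, 4).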

Algorithms ◽  
2021 ◽  
Vol 14 (12) ◽  
pp. 347
Author(s):  
Anne Berry ◽  
Geneviève Simonet

The atom graph of a graph is a graph whose vertices are the atoms obtained by clique minimal separator decomposition of this graph, and whose edges are the edges of all possible atom trees of this graph. We provide two efficient algorithms for computing this atom graph, running in O(min(n^ω·log n, nm, n(n+m̄))) time, where n is the number of vertices of G, m is the number of its edges, m̄ is the number of edges of the complement of G, and ω, also denoted by α in the literature, is a real number such that O(n^ω) is the best known time complexity for matrix multiplication, whose current value is 2.3728596. This time complexity is no more than the time complexity of computing the atoms in the general case. We extend our results to α-acyclic hypergraphs, which are hypergraphs having at least one join tree, a join tree of a hypergraph being defined by its hyperedges in the same way as an atom tree of a graph is defined by its atoms. We introduce the notion of union join graph, which is the union of all possible join trees; we apply our algorithms for atom graphs to efficiently compute union join graphs.


Mathematics ◽  
2019 ◽  
Vol 7 (9) ◽  
pp. 805 ◽  
Author(s):  
Monther Rashed Alfuraidan ◽  
Ibrahim Nabeel Joudah

In this work, we obtain a new formula for Fibonacci's family of m-step sequences. We use our formula to find the nth term with lower time complexity than the matrix multiplication method. We then extend our results to all linear homogeneous m-step recurrence relations with constant coefficients by using the last few terms of the corresponding Fibonacci family m-step sequence. As a computational number theory application, we develop a method to estimate square roots.
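The matrix-multiplication baseline that the new formula is compared against computes the nth m-step term by exponentiating the m×m companion matrix, using O(m³ log n) arithmetic operations; a minimal sketch (the initial conditions F_0 = … = F_{m−2} = 0, F_{m−1} = 1 are one common convention, assumed here):

```python
def mat_mult(A, B):
    """Plain cubic-time matrix product over Python integers."""
    n, k, m = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

def mat_pow(A, e):
    """Binary exponentiation: O(log e) matrix products."""
    n = len(A)
    R = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    while e:
        if e & 1:
            R = mat_mult(R, A)
        A = mat_mult(A, A)
        e >>= 1
    return R

def mstep_fib(n, m):
    """nth term of the m-step Fibonacci sequence (F_0..F_{m-2} = 0, F_{m-1} = 1)."""
    if n < m - 1:
        return 0
    if n == m - 1:
        return 1
    # Companion matrix: a top row of ones, then a shifted identity below.
    C = [[1] * m] + [[int(j == i) for j in range(m)] for i in range(m - 1)]
    P = mat_pow(C, n - (m - 1))
    # The state vector is (F_{m-1}, ..., F_0) = (1, 0, ..., 0), so the
    # answer is the top-left entry of C^(n-m+1).
    return P[0][0]
```

With m = 2 this reproduces the ordinary Fibonacci numbers (mstep_fib(10, 2) = 55); with m = 3 it gives the tribonacci sequence.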


PLoS ONE ◽  
2021 ◽  
Vol 16 (5) ◽  
pp. e0251297
Author(s):  
Pinaki Bhattacharya ◽  
Qiao Li ◽  
Damien Lacroix ◽  
Visakan Kadirkamanathan ◽  
Marco Viceconti

Throughout engineering there are problems where it is required to predict a quantity based on the measurement of another, but where the two quantities possess characteristic variations over vastly different ranges of time and space. Among the many challenges posed by such ‘multiscale’ problems, that of defining a ‘scale’ remains poorly addressed. This fundamental problem has led to much confusion in the field of biomedical engineering in particular. The present study proposes a definition of scale based on measurement limitations of existing instruments, available computational power, and on the ranges of time and space over which quantities of interest vary characteristically. The definition is used to construct a multiscale modelling methodology from start to finish, beginning with a description of the system (portion of reality of interest) and ending with an algorithmic orchestration of mathematical models at different scales within the system. The methodology is illustrated for a specific but well-researched problem. The concept of scale and the multiscale modelling approach introduced are shown to be easily adaptable to other closely related problems. Although out of the scope of this paper, we believe that the proposed methodology can be applied widely throughout engineering.


T-Comm ◽  
2021 ◽  
Vol 15 (1) ◽  
pp. 4-10
Author(s):  
Vitaly B. Kreyndelin ◽  
Elena D. Grigorieva

Algorithms for implementing vector–matrix multiplication in banks (sets) of digital filters are presented. These algorithms provide significant savings in computational cost over traditional algorithms, and the reduction in computational complexity is achieved without any loss in filter-bank performance. The proposed algorithms are built on the previously known Winograd method for multiplying real matrices and vectors and on two versions of the 3M method for multiplying complex matrices and vectors; ways of combining these known methods to build digital filter banks are considered. An analysis of their computational complexity shows that, compared with a traditional filter-bank algorithm, complexity can be reduced by a factor of about 2.66 on a processor without a hardware multiplier, and by a factor of 1.33 on a processor with a hardware multiplier; these figures are markedly better than those of known algorithms. The sensitivity of the proposed algorithms to the rounding errors arising in digital signal processing was also analysed. Based on this analysis, an algorithm is selected whose computational complexity is lower than that of the traditional algorithm while its sensitivity to rounding errors remains the same. Recommendations are given for its practical application in the development of a bank (set) of digital filters.
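The 3M trick mentioned above trades one real multiplication for three extra additions when multiplying complex numbers, which pays off on processors where multiplication is expensive; a minimal sketch of the scalar case (the filter-bank formulation in the article applies the same identity to matrices and vectors):

```python
def complex_mult_3m(a, b, c, d):
    """(a + bi)(c + di) with 3 real multiplications instead of 4.

    Schoolbook: real = ac - bd, imag = ad + bc  (4 multiplications).
    3M:         real = t1 - t2, imag = t3 - t1 - t2, where
                t1 = ac, t2 = bd, t3 = (a + b)(c + d).
    """
    t1 = a * c
    t2 = b * d
    t3 = (a + b) * (c + d)
    return t1 - t2, t3 - t1 - t2
```

For example, (1 + 2i)(3 + 4i) = −5 + 10i. Note that the extra additions slightly change the rounding-error profile in floating point, which is why the article analyses sensitivity to rounding separately.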


2016 ◽  
Author(s):  
Dogan Corus ◽  
Duc-Cuong Dang ◽  
Anton V. Eremeev ◽  
Per Kristian Lehre

Understanding how the time complexity of evolutionary algorithms (EAs) depends on their parameter settings and on characteristics of fitness landscapes is a fundamental problem in evolutionary computation. Most rigorous results were derived using a handful of key analytic techniques, including drift analysis. However, since few of these techniques apply effortlessly to population-based EAs, most time-complexity results concern simplified EAs, such as the (1+1) EA. This paper describes the level-based theorem, a new technique tailored to population-based processes. It applies to any non-elitist process where offspring are sampled independently from a distribution depending only on the current population. Given conditions on this distribution, our technique provides upper bounds on the expected time until the process reaches a target state. We demonstrate the technique on several pseudo-Boolean functions, the sorting problem, and approximation of optimal solutions in combinatorial optimisation. The conditions of the theorem are often straightforward to verify, even for Genetic Algorithms and Estimation of Distribution Algorithms, which were considered highly non-trivial to analyse. Finally, we prove that the theorem is nearly optimal for the processes considered: given the information the theorem requires about the process, a much tighter bound cannot be proved.
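As a point of reference for the simplified algorithms mentioned above, the (1+1) EA on the OneMax benchmark can be sketched in a few lines (the iteration cap and seed are illustrative assumptions; the expected optimisation time on OneMax is O(n log n)):

```python
import random

def one_plus_one_ea(n, fitness, max_iters=100_000, seed=0):
    """(1+1) EA: flip each bit independently with probability 1/n and
    keep the offspring if it is not worse. Assumes the optimum has
    fitness n (true for OneMax); stops there or after max_iters."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = fitness(x)
    for t in range(max_iters):
        if fx == n:                      # optimum reached
            return x, t
        y = [b ^ (rng.random() < 1 / n) for b in x]   # standard bit mutation
        fy = fitness(y)
        if fy >= fx:                     # elitist acceptance
            x, fx = y, fy
    return x, max_iters

# OneMax simply counts the one-bits.
best, iters = one_plus_one_ea(20, sum)
```

In contrast, the level-based theorem of the paper targets non-elitist, population-based processes, where this kind of direct drift argument is harder to apply.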


1997 ◽  
Vol 08 (04) ◽  
pp. 443-467 ◽  
Author(s):  
Glenn K. Manacher ◽  
Terrance A. Mankus

A maximum clique is sought in a set of n proper circular arcs (PCAS). By means of several passes, each O(n) in time and space, a PCAS is transformed initially into a set of circle chords and finally into a set of intervals. This interval model inherits a special property from the PCAS which ensures the discovery of a maximum overlap clique in time O(n). The one-to-one arc/interval correspondence guarantees the identification of the maximum clique in the PCAS in O(n) time and space. The present paper gives new, simpler proofs for the lemmas first outlined by us in Ref. [9], extending the methods outlined in that paper so that the time bound is improved from O(n log n) to O(n). The method depends only on certain interconnections between constructions related to the computation of longest increasing subsequences. Independently, Hell, Huang and Bhattacharya [5] recently discovered a completely different approach that achieves the same complexity and can moreover be applied to the weighted case and to the coloring problem on proper circular arcs. The previous best result, due to Apostolico and Hambrusch [2], applies to general circular arc models and has time complexity O(n² log log n) and space complexity O(n). As applications of the method, we show that a maximum weight clique of a set of weighted proper circular arcs can be found in time O(n²) and space O(n); the previous best result was O(n² log log n) for dense general circular arc graphs [13]. We also show that, for n chords with randomly placed endpoints, (1) the average cardinality of a maximum clique is cn^(1/2) ± o(n^(1/2)), where 2^(1/2) < c < e·2^(1/2), and (2) a maximum clique may be found in average time O(n^(3/2)) and space Θ(n). The previous best average time complexity, derived from Ref. [1], was O(n^(3/2) log n).
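The longest-increasing-subsequence computation that the method builds on can be done generically in O(n log n) with the standard patience-sorting technique (this is a generic sketch, not the paper's specialized O(n) passes over the interval model):

```python
from bisect import bisect_left

def lis_length(seq):
    """Length of a longest strictly increasing subsequence in O(n log n).

    tails[k] holds the smallest possible tail value of an increasing
    subsequence of length k + 1; each element either extends the longest
    subsequence found so far or improves an existing tail.
    """
    tails = []
    for x in seq:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)
```

For example, [3, 1, 4, 1, 5, 9, 2, 6] has LIS length 4 (e.g. 1, 4, 5, 6).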


2001 ◽  
Vol 11 (06) ◽  
pp. 707-735 ◽  
Author(s):  
J.-M. CHAMPARNAUD ◽  
D. ZIADI

Two classical non-deterministic automata recognize the language denoted by a regular expression: the position automaton, which is deduced from the position sets defined by Glushkov and McNaughton–Yamada, and the equation automaton, which can be computed via Mirkin's prebases or Antimirov's partial derivatives. Let |E| be the size of the expression and ‖E‖ be its alphabetic width, i.e. the number of symbol occurrences. The number of states in the equation automaton is less than or equal to the number of states in the position automaton, which is ‖E‖+1. On the other hand, the worst-case time complexity of Antimirov's algorithm is O(‖E‖³·|E|²), while it is only O(‖E‖·|E|) for the most efficient implementations yielding the position automaton (Brüggemann-Klein; Chang and Paige; Champarnaud et al.). We present an O(|E|²) space and time algorithm to compute the equation automaton. It is based on the notion of canonical derivative, which makes it possible to handle sets of word derivatives efficiently. Along the way, canonical derivatives also lead to a new O(|E|²) space and time algorithm to construct the position automaton.
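Antimirov's partial derivatives are the non-deterministic refinement of Brzozowski's word derivatives; the deterministic variant is compact enough to sketch here (the tuple-based AST is an illustrative encoding, and no simplification of derivatives is performed, so terms grow on long inputs):

```python
# Regex AST: ('empty',) denotes the empty language, ('eps',) the empty word,
# ('sym', c) a symbol, plus ('cat', r, s), ('or', r, s), ('star', r).

def nullable(r):
    """Does r accept the empty word?"""
    k = r[0]
    if k in ('eps', 'star'):
        return True
    if k in ('empty', 'sym'):
        return False
    if k == 'cat':
        return nullable(r[1]) and nullable(r[2])
    if k == 'or':
        return nullable(r[1]) or nullable(r[2])

def deriv(r, c):
    """Brzozowski derivative of r with respect to symbol c."""
    k = r[0]
    if k in ('eps', 'empty'):
        return ('empty',)
    if k == 'sym':
        return ('eps',) if r[1] == c else ('empty',)
    if k == 'or':
        return ('or', deriv(r[1], c), deriv(r[2], c))
    if k == 'star':
        return ('cat', deriv(r[1], c), r)
    if k == 'cat':
        d = ('cat', deriv(r[1], c), r[2])
        return ('or', d, deriv(r[2], c)) if nullable(r[1]) else d

def matches(r, word):
    """Accept iff the word is in the language of r."""
    for c in word:
        r = deriv(r, c)
    return nullable(r)
```

For example, (a|b)c accepts "ac" and "bc" but not "cc". Antimirov's construction keeps a *set* of such derivative terms instead of a single expression, which is what bounds the equation automaton's state count by ‖E‖+1.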


1978 ◽  
Vol 11 (1) ◽  
Author(s):  
Leonard Adleman ◽  
Kellogg S. Booth ◽  
Franco P. Preparata ◽  
Walter L. Ruzzo

2007 ◽  
Vol DMTCS Proceedings vol. AH,... (Proceedings) ◽  
Author(s):  
Maxime Crochemore ◽  
Costas S. Iliopoulos ◽  
M. Sohel Rahman

In this paper, we study a restricted version of the position-restricted pattern matching problem introduced and studied by Mäkinen and Navarro [Position-Restricted Substring Searching, LATIN 2006]. In the problem handled in this paper, we are interested in those occurrences of the pattern that lie in a suffix or in a prefix of the given text. We achieve optimal query time for our problem with a data structure that extends the classic suffix tree; its time and space complexity is dominated by that of the suffix tree. Notably, the (best) algorithm of Mäkinen and Navarro, if applied to our problem, gives sub-optimal query time, and the corresponding data structure also requires more time and space.
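The problem statement can be illustrated with a naive check (the paper's suffix-tree extension achieves optimal query time instead; the parameter k, the length of the prefix/suffix region, is an assumption of this sketch):

```python
def prefix_suffix_occurrences(text, pattern, k):
    """Occurrences of `pattern` lying entirely within the length-k prefix,
    and within the length-k suffix, of `text`. Naive O(k * |pattern|) scan,
    shown only to pin down the restricted problem being solved."""
    n, m = len(text), len(pattern)
    # Occurrences text[i:i+m] fully inside text[:k].
    in_prefix = [i for i in range(max(0, min(k, n) - m + 1))
                 if text[i:i + m] == pattern]
    # Occurrences fully inside text[n-k:].
    start = max(0, n - k)
    in_suffix = [i for i in range(start, n - m + 1)
                 if text[i:i + m] == pattern]
    return in_prefix, in_suffix
```

For example, in text "abcabcab" with pattern "ab" and k = 4, only the occurrence at offset 0 lies in the prefix and only the one at offset 6 lies in the suffix; the occurrence at offset 3 straddles neither region and is correctly excluded.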


2013 ◽  
Vol 347-350 ◽  
pp. 3094-3098 ◽  
Author(s):  
Jian Li

This paper puts forward an improved dynamic programming algorithm for the bitonic TSP and proves it correct. The whole loop is divided into left and right parts by analyzing the key point that connects directly to the last one; a new optimal substructure and recursion are then constructed. The time complexity of the new algorithm is O(n²) and its space complexity is O(n), while both the time and space complexities of the classical algorithm are O(n²). Experimental results show that the new algorithm not only greatly reduces the space requirement but also increases the computing speed 2–3 times compared with the classical algorithm.
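The classical O(n²)-time, O(n²)-space DP that the abstract compares against can be sketched as follows (CLRS-style formulation over points assumed sorted by x-coordinate; the paper's space reduction to O(n) is not reproduced here):

```python
from math import hypot

def bitonic_tour_length(points):
    """Length of the shortest bitonic tour: classical O(n^2) time and space DP.

    B[i][j] (i < j) is the minimum total length of two disjoint x-monotone
    paths that together cover points 0..j, one ending at i, the other at j.
    """
    pts = sorted(points)                  # left-to-right order
    n = len(pts)
    if n < 2:
        return 0.0
    d = lambda i, j: hypot(pts[i][0] - pts[j][0], pts[i][1] - pts[j][1])
    B = [[0.0] * n for _ in range(n)]
    B[0][1] = d(0, 1)
    for j in range(2, n):
        # Point j extends the path that ends at j-1 ...
        for i in range(j - 1):
            B[i][j] = B[i][j - 1] + d(j - 1, j)
        # ... or starts a new leg from some earlier endpoint i.
        B[j - 1][j] = min(B[i][j - 1] + d(i, j) for i in range(j - 1))
    # Close the tour by joining the two path ends n-2 and n-1.
    return B[n - 2][n - 1] + d(n - 2, n - 1)
```

On the four corners of the unit square this returns the perimeter, 4.0. The improvement described in the paper keeps only the column of B needed for the next j, which is what brings the space down to O(n).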

