On Special k-Spectra, k-Locality, and Collapsing Prefix Normal Words

2021 ◽  
Author(s):  
Pamela Fleischmann

The domain of Combinatorics on Words, first introduced by Axel Thue in 1906, now covers many subdomains. In this work we investigate scattered factors as a representation of incomplete information, and two measurements for words, namely the locality of a word and prefix normality, which have applications in pattern matching. In the first part of the thesis we investigate scattered factors: a word $u$ is a scattered factor of $w$ if $u$ can be obtained from $w$ by deleting some of its letters. That is, there exist (potentially empty) words $u_1, u_2, \ldots, u_n$ and $v_0, v_1, \ldots, v_n$ such that $u = u_1 u_2 \cdots u_n$ and $w = v_0 u_1 v_1 u_2 v_2 \cdots u_n v_n$. First, we consider the set of length-k scattered factors of a given word $w$, called the k-spectrum of $w$ and denoted by $\mathrm{ScatFact}_k(w)$. We prove a series of properties of the sets $\mathrm{ScatFact}_k(w)$ for binary weakly-0-balanced and, respectively, weakly-c-balanced words $w$, i.e., words over a two-letter alphabet where the number of occurrences of each letter is the same, or, respectively, one letter has $c$ occurrences more than the other. In particular, we consider the question of which cardinalities $n = |\mathrm{ScatFact}_k(w)|$ are obtainable, for a positive integer $k$, when $w$ is either a weakly-0-balanced binary word of length $2k$, or a weakly-c-balanced binary word of length $2k - c$. Second, we investigate k-spectra that contain all possible words of length $k$, i.e., k-spectra of so-called k-universal words. We present an algorithm that decides, in optimal time, whether the k-spectra of two words are equal for a given $k$. Moreover, we present several results regarding k-universal words and extend this notion to circular universality, which helps in investigating how the universality of repetitions of a given word can be determined.
We conclude the part about scattered factors with results on the reconstruction problem of words from scattered factors, which asks for the minimal information, like multisets of scattered factors of a given length or the number of occurrences of scattered factors from a given set, necessary to uniquely determine a word. We show that a word $w \in \{a, b\}^*$ can be reconstructed from the number of occurrences of at most $\min(|w|_a, |w|_b) + 1$ scattered factors of the form $a^i b$, where $|w|_a$ is the number of occurrences of the letter $a$ in $w$. Moreover, we generalise the result to alphabets of the form $\{1, \ldots, q\}$ by showing that at most $\sum_{i=1}^{q-1} |w|_i (q - i + 1)$ scattered factors suffice to reconstruct $w$. Both results improve on the upper bounds known so far. Complexity time bounds on reconstruction algorithms are also considered here. In the second part we consider patterns, i.e., words consisting of not only letters but also variables, and in particular their locality. A pattern is called k-local if, on marking the pattern in a given order, never more than $k$ marked blocks occur. We start with the proof that determining the minimal $k$ for a given pattern such that the pattern is k-local is NP-complete. Afterwards we present results on the behaviour of the locality of repetitions and palindromes. We end this part with the proof that the matching problem also becomes NP-hard if we consider not regular patterns, for which the matching problem is efficiently solvable, but repetitions of regular patterns. In the last part we investigate prefix normal words, which are binary words in which each prefix has at least the same number of 1s as any factor of the same length. Prefix normal words were first introduced in 2011 by Fici and Lipták, and the problem of determining the index (the number of equivalence classes for a given word length) of the prefix normal equivalence relation is still open.
In this paper, we investigate two aspects of the problem, namely prefix normal palindromes and so-called collapsing words (extending the notion of critical words). We prove characterizations for both the palindromes and the collapsing words and show their connection. Based on this, we show that the remaining open problems regarding prefix normal words can be split into certain subproblems.
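
The definitions used in this abstract (scattered factor, k-spectrum, k-universality, prefix normality) can be checked by brute force on short words. The following Python sketch is only an illustration of those definitions and is not taken from the thesis: the function names are hypothetical, the k-spectrum is enumerated exhaustively rather than with the optimal-time algorithm the abstract mentions, and `is_k_universal` assumes that `w` only uses letters from `alphabet`.

```python
from itertools import combinations


def is_scattered_factor(u: str, w: str) -> bool:
    """Return True if u can be obtained from w by deleting letters."""
    remaining = iter(w)
    # Greedy left-to-right embedding: each letter of u must be found
    # in the part of w that has not been consumed yet.
    return all(letter in remaining for letter in u)


def k_spectrum(w: str, k: int) -> set[str]:
    """Return the set of length-k scattered factors of w (exhaustive enumeration)."""
    return {"".join(w[i] for i in idx) for idx in combinations(range(len(w)), k)}


def is_k_universal(w: str, k: int, alphabet: str) -> bool:
    """Return True if the k-spectrum of w contains every word of length k.

    Assumption: w is a word over `alphabet`.
    """
    return len(k_spectrum(w, k)) == len(set(alphabet)) ** k


def is_prefix_normal(w: str) -> bool:
    """Return True if no factor of the binary word w has more 1s than the prefix of equal length."""
    n = len(w)
    ones = [0]  # ones[i] = number of 1s in w[:i]
    for letter in w:
        ones.append(ones[-1] + (letter == "1"))
    for length in range(1, n + 1):
        best = max(ones[i + length] - ones[i] for i in range(n - length + 1))
        if ones[length] < best:
            return False
    return True
```

For instance, under these definitions `abab` is 2-universal over `{a, b}` while `aabb` is not, because `ba` is not a scattered factor of `aabb`.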

Author(s):  
KATSUSHI INOUE ◽  
ITSUO TAKANAMI

This paper first shows that REC, the family of recognizable picture languages in Giammarresi and Restivo [3], is equal to the family of picture languages accepted by two-dimensional on-line tessellation acceptors in Inoue and Nakamura [5]. By using this result, we then solve open problems in Giammarresi and Restivo [3], and show that (i) REC is not closed under complementation, and (ii) REC properly contains the family of picture languages accepted by two-dimensional nondeterministic finite automata even over a one-letter alphabet.


2017 ◽  
Vol 27 (02) ◽  
pp. 1750004
Author(s):  
Brahim Neggazi ◽  
Volker Turau ◽  
Mohammed Haddad ◽  
Hamamache Kheddouci

The triangle partition problem is a generalization of the well-known graph matching problem, which consists of finding the maximum number of independent edges in a given graph, i.e., edges with no common node. Triangle partition instead aims to find the maximum number of disjoint triangles. The triangle partition problem is known to be NP-complete. Thus, in this paper, the focus is on the local maximization variant, called maximal triangle partition (MTP). The paper presents a new self-stabilizing algorithm for MTP that converges in O(m) moves under the unfair distributed daemon.
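
To make the difference between a maximum and a maximal triangle partition concrete, here is a minimal sequential sketch in Python. It is not the self-stabilizing distributed algorithm of the paper: it is a plain greedy procedure, and the function name and the adjacency-dictionary input format (a simple undirected graph without self-loops) are assumptions made for illustration.

```python
from itertools import combinations


def maximal_triangle_partition(adj: dict[int, set[int]]) -> list[tuple[int, int, int]]:
    """Greedily collect vertex-disjoint triangles until no further triangle can be added."""
    used: set[int] = set()
    triangles: list[tuple[int, int, int]] = []
    for u in sorted(adj):
        if u in used:
            continue
        # Look for two unused neighbours of u that are adjacent to each other.
        free = sorted(v for v in adj[u] if v not in used)
        for v, w in combinations(free, 2):
            if w in adj[v]:
                triangles.append((u, v, w))
                used.update((u, v, w))
                break
    return triangles
```

The result is maximal because any triangle whose three nodes are all still unused at the end would have been found when its first node was visited. It is not necessarily maximum, which is consistent with the NP-completeness of the maximum variant.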


Author(s):  
Amihood Amir ◽  
Ayelet Butman ◽  
Ely Porat

Histogram indexing, also known as jumbled pattern indexing and permutation indexing, is one of the important current open problems in pattern matching. It was introduced about 6 years ago and has seen active research since. Yet, to date there is no algorithm that can preprocess a text $T$ in time $o(|T|^2 / \mathrm{polylog}\,|T|)$ and achieve histogram indexing, even over a binary alphabet, in time independent of the text length. The pattern matching version of this problem has a simple linear-time solution. The block-mass pattern matching problem is a recently introduced problem, motivated by issues in mass spectrometry. It is also an example of a pattern matching problem that has an efficient, almost linear-time solution but whose indexing version is daunting. However, for fixed finite alphabets, progress has been made. In this paper, a strong connection between the histogram indexing problem and the block-mass pattern indexing problem is shown. The reduction we show between the two problems is remarkably simple. Its value lies in recognizing the connection between these two apparently disparate problems, rather than in the complexity of the reduction. In addition, we show that for both these problems, even over unbounded alphabets, there are algorithms that preprocess a text $T$ in time $o(|T|^2 / \mathrm{polylog}\,|T|)$ and enable answering indexing queries in time polynomial in the query length. The contributions of this paper are twofold: (i) we introduce the idea of allowing a trade-off between the preprocessing time and query time of various indexing problems that have been stumbling blocks in the literature; (ii) we take the first step in introducing a class of indexing problems that, we believe, cannot be preprocessed in time $o(|T|^2 / \mathrm{polylog}\,|T|)$ and enable linear-time query processing.
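
The linear-time solution for the pattern matching (non-indexing) version mentioned above is commonly realised with a sliding window over letter counts. The Python sketch below shows that standard technique under the assumption that an occurrence is a text window with exactly the same letter counts as the pattern. The function name is hypothetical and the code is not taken from the paper.

```python
from collections import Counter


def jumbled_occurrences(text: str, pattern: str) -> list[int]:
    """Return all start positions where a text window has the same letter counts as pattern."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # diff[c] = (count of c in the current window) - (count of c in pattern)
    diff = Counter(text[:m])
    diff.subtract(Counter(pattern))
    mismatched = sum(1 for c in diff.values() if c != 0)
    hits = [0] if mismatched == 0 else []
    for start in range(1, n - m + 1):
        # Slide the window: drop text[start - 1], take in text[start + m - 1].
        for letter, delta in ((text[start - 1], -1), (text[start + m - 1], +1)):
            if diff[letter] == 0:
                mismatched += 1
            diff[letter] += delta
            if diff[letter] == 0:
                mismatched -= 1
        if mismatched == 0:
            hits.append(start)
    return hits
```

Each shift updates two counters and one mismatch tally, so the whole scan takes time linear in the text length. The indexing version asks for this answer without rescanning the text per query, which is what makes it hard.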


1992 ◽  
Vol 42 (1) ◽  
pp. 55-60 ◽  
Author(s):  
Alok Aggarwal ◽  
Herbert Edelsbrunner ◽  
Prabhakar Raghavan ◽  
Prasoon Tiwari

2010 ◽  
Vol 21 (06) ◽  
pp. 905-924 ◽  
Author(s):  
MAREK KARPIŃSKI ◽  
ANDRZEJ RUCIŃSKI ◽  
EDYTA SZYMAŃSKA

In this paper we consider the computational complexity of deciding the existence of a perfect matching in certain classes of dense k-uniform hypergraphs. It has been known that the perfect matching problem for the classes of hypergraphs H with minimum ((k - 1)-wise) vertex degree δ(H) at least c|V(H)| is NP-complete for [Formula: see text] and trivial for c ≥ ½, leaving the status of the problem with c in the interval [Formula: see text] wide open. In this paper we show, somewhat surprisingly, that ½ is not the threshold for tractability of the perfect matching problem, and prove the existence of an ε > 0 such that the perfect matching problem for the class of hypergraphs H with δ(H) ≥ (½ - ε)|V(H)| is solvable in polynomial time. This seems to be the first polynomial-time algorithm for the perfect matching problem on hypergraphs for which the existence problem is nontrivial. In addition, we consider the parallel complexity of the problem, which could also be of independent interest.


2004 ◽  
Vol 56 (1-3) ◽  
pp. 35-60 ◽  
Author(s):  
Ramgopal R. Mettu ◽  
C. Greg Plaxton

Author(s):  
Frank Vega

P versus NP is considered one of the most important open problems in computer science. It asks the following question: Is P equal to NP? A precise statement of the P versus NP problem was introduced independently by Stephen Cook and Leonid Levin. Since that date, all efforts to find a proof for this problem have failed. NP is the complexity class of languages defined by polynomial time verifiers M such that when the input is an element of the language with its certificate, then M outputs a string which belongs to a single language in P. Other major complexity classes are L and NL. The certificate-based definition of NL is based on a logarithmic space Turing machine with an additional special read-once input tape: this is called a logarithmic space verifier. NL is the complexity class of languages defined by logarithmic space verifiers M such that when the input is an element of the language with its certificate, then M outputs 1. To attack the P versus NP problem, NP-completeness is a useful concept. We demonstrate that there is an NP-complete language defined by a logarithmic space verifier M such that when the input is an element of the language with its certificate, then M outputs a string which belongs to a single language in L. In this way, we obtain that if L is not equal to NL, then P = NP. In addition, we show that L is not equal to NL. Hence, we prove that the complexity class P is equal to NP.


Author(s):  
Pamela Fleischmann ◽  
Marie Lejeune ◽  
Florin Manea ◽  
Dirk Nowotka ◽  
Michel Rigo

A reconstruction problem of words from scattered factors asks for the minimal information, like multisets of scattered factors of a given length or the number of occurrences of scattered factors from a given set, necessary to uniquely determine a word. We show that a word [Formula: see text] can be reconstructed from the number of occurrences of at most [Formula: see text] scattered factors of the form [Formula: see text], where [Formula: see text] is the number of occurrences of the letter [Formula: see text] in [Formula: see text]. Moreover, we generalise the result to alphabets of the form [Formula: see text] by showing that at most [Formula: see text] scattered factors suffice to reconstruct [Formula: see text]. Both results improve on the upper bounds known so far. Complexity time bounds on reconstruction algorithms are also considered here.
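
The quantity this abstract relies on, the number of occurrences of a scattered factor in a word, can be computed with a standard dynamic programme. The Python sketch below illustrates only that counting step, not the reconstruction procedure of the paper, and the function name is hypothetical.

```python
def count_occurrences(w: str, u: str) -> int:
    """Count the position choices in w that spell u, i.e. occurrences of u as a scattered factor."""
    # ways[j] = number of embeddings of the prefix u[:j] into the part of w read so far
    ways = [1] + [0] * len(u)
    for letter in w:
        # Go right to left so the current letter of w extends each embedding at most once.
        for j in range(len(u), 0, -1):
            if u[j - 1] == letter:
                ways[j] += ways[j - 1]
    return ways[len(u)]
```

For example, `ab` occurs three times as a scattered factor of `abab` (at positions 1-2, 1-4, and 3-4), and four times in `aabb`, so this single count already distinguishes the two words.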


1980 ◽  
Vol 2 (1) ◽  
pp. 65-72 ◽  
Author(s):  
David A. Plaisted ◽  
Samuel Zaks
