ABELIAN PRIMITIVE WORDS

2012 ◽  
Vol 23 (05) ◽  
pp. 1021-1033 ◽  
Author(s):  
MICHAEL DOMARATZKI ◽  
NARAD RAMPERSAD

We investigate Abelian primitive words, which are words that are not Abelian powers. We show that the set of Abelian primitive words is not context-free. We can determine whether a word is Abelian primitive in linear time (for fixed alphabet size). In contrast to classical primitive words, we find that a word may have more than one Abelian root. We also consider enumeration of Abelian primitive words.
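To make the definition concrete, here is a minimal Python sketch that tests Abelian primitivity by brute force. It assumes the usual definition of an Abelian power (a concatenation of two or more equal-length blocks that are permutations of one another). It is not the linear-time algorithm from the paper, and the function names, as well as the choice to treat the empty word as not primitive, are illustrative assumptions.

```python
from collections import Counter


def is_abelian_power(word: str) -> bool:
    """Return True if word splits into k >= 2 equal-length blocks
    that are all permutations (anagrams) of the first block."""
    n = len(word)
    # Try every block length d that divides n and yields at least 2 blocks.
    for d in range(1, n // 2 + 1):
        if n % d != 0:
            continue
        first = Counter(word[:d])  # letter counts of the first block
        if all(Counter(word[i:i + d]) == first for i in range(d, n, d)):
            return True
    return False


def is_abelian_primitive(word: str) -> bool:
    """A nonempty word is Abelian primitive if it is not an Abelian power.
    Assumption of this sketch: the empty word is reported as not primitive."""
    return len(word) > 0 and not is_abelian_power(word)
```

For example, `abba` is an Abelian power (`ab` followed by `ba`) although it is primitive in the classical sense, while `aab` is Abelian primitive.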

2007 ◽  
Vol 18 (06) ◽  
pp. 1293-1302 ◽  
Author(s):  
MARTIN KUTRIB ◽  
ANDREAS MALCHER

We investigate the intersection of Church-Rosser languages and (strongly) context-free languages. The intersection is still a proper superset of the deterministic context-free languages as well as of their reversals, while its membership problem is solvable in linear time. For the problem of whether a given Church-Rosser or context-free language belongs to the intersection, we show completeness for the second level of the arithmetic hierarchy. The equivalence of Church-Rosser and context-free languages is Π1-complete. It is proved that all considered intersections are pairwise incomparable. Finally, closure properties under several operations are investigated.


2020 ◽  
Vol 21 (4) ◽  
Author(s):  
Nikolay Handzhiyski ◽  
Elena Somova

The article describes a new and efficient parsing algorithm, called Tunnel Parsing, that parses from left to right on the basis of a context-free grammar without left recursion and without rules that recognize empty words. The algorithm is applicable mostly to domain-specific languages. In the article, particular attention is paid to the parsing of grammar element repetitions. As a result of the parsing, a statically typed concrete syntax tree that accurately reflects the grammar is built from top to bottom. The parsing is done iteratively rather than recursively. The Tunnel Parsing algorithm uses the grammars directly, without prior refactoring, and has linear time complexity for deterministic context-free grammars.


10.1142/7265 ◽  
2011 ◽  
Author(s):  
Pál Dömösi ◽  
Masami Ito

2009 ◽  
Vol 35 (4) ◽  
pp. 559-595 ◽  
Author(s):  
Liang Huang ◽  
Hao Zhang ◽  
Daniel Gildea ◽  
Kevin Knight

Systems based on synchronous grammars and tree transducers promise to improve the quality of statistical machine translation output, but are often very computationally intensive. The complexity is exponential in the size of individual grammar rules due to arbitrary re-orderings between the two languages. We develop a theory of binarization for synchronous context-free grammars and present a linear-time algorithm for binarizing synchronous rules when possible. In our large-scale experiments, we found that almost all rules are binarizable and the resulting binarized rule set significantly improves the speed and accuracy of a state-of-the-art syntax-based machine translation system. We also discuss the more general, and computationally more difficult, problem of finding good parsing strategies for non-binarizable rules, and present an approximate polynomial-time algorithm for this problem.
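The binarizability test mentioned above can be illustrated with a small shift-reduce sketch in Python. It assumes a synchronous rule's reordering is given as a permutation of 1..n and that two neighbouring items may be combined when their target-side positions form one contiguous range. The function name and the representation are hypothetical, and the paper's own algorithm may differ in detail.

```python
def is_binarizable(perm: list[int]) -> bool:
    """Greedy shift-reduce check over a permutation of 1..n.

    Each stack item is a (lo, hi) range of target-side positions.
    After every shift, the top two items are merged for as long as
    their ranges are adjacent, i.e. together form a contiguous range.
    """
    stack: list[tuple[int, int]] = []
    for x in perm:
        stack.append((x, x))  # shift the next nonterminal
        while len(stack) >= 2:
            (lo1, hi1), (lo2, hi2) = stack[-2], stack[-1]
            if hi1 + 1 == lo2 or hi2 + 1 == lo1:
                # reduce: the two ranges combine into one contiguous range
                stack[-2:] = [(min(lo1, lo2), max(hi1, hi2))]
            else:
                break
    # Everything collapsed into a single range (or there was nothing to do).
    return len(stack) <= 1
```

Each element is pushed once and each merge removes one stack item, so the loop performs a linear number of steps overall. Under these assumptions, `[2, 1, 4, 3]` reduces to a single range, whereas `[2, 4, 1, 3]` never allows a merge.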


2016 ◽  
Vol 27 (04) ◽  
pp. 431-442 ◽  
Author(s):  
Michael Forsyth ◽  
Amlesh Jayakumar ◽  
Jarkko Peltomäki ◽  
Jeffrey Shallit

We discuss the notion of privileged word, recently introduced by Kellendonk, Lenz and Savinien. A word w is privileged if it is of length ≤ 1, or has a privileged border that occurs exactly twice in w. We prove the following results: (1) if w^k is privileged for some [Formula: see text], then w^j is privileged for all [Formula: see text]; (2) the language of privileged words is not context-free; (3) there is a linear-time algorithm to check if a given word is privileged; and (4) there are at least [Formula: see text] privileged binary words of length n.
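The recursive definition of a privileged word translates directly into code. The following Python sketch is a naive check, not the linear-time algorithm from result (3). It assumes that a border is a nonempty proper prefix that is also a suffix and that occurrences may overlap; the function names are illustrative.

```python
from functools import lru_cache


def count_occurrences(w: str, v: str) -> int:
    """Count occurrences of v in w, allowing overlaps."""
    return sum(1 for i in range(len(w) - len(v) + 1) if w.startswith(v, i))


@lru_cache(maxsize=None)
def is_privileged(w: str) -> bool:
    """A word is privileged if its length is <= 1, or if it has a
    privileged border that occurs exactly twice in it."""
    if len(w) <= 1:
        return True
    # Try every nonempty proper prefix that is also a suffix (a border).
    for k in range(1, len(w)):
        v = w[:k]
        if w.endswith(v) and count_occurrences(w, v) == 2 and is_privileged(v):
            return True
    return False
```

For instance, `aba` is privileged because its border `a` is privileged and occurs exactly twice, while `abaab` is not: its only nonempty border is `ab`, which is not itself privileged.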


2016 ◽  
Author(s):  
Hiroki Sudo ◽  
Masanobu Jimbo ◽  
Koji Nuida ◽  
Kana Shimizu

Motivation: Privacy-preserving substring matching is an important task for sensitive biological/biomedical sequence database searches. It enables a user to obtain only a substring match while his/her query is concealed from the server. The previous approach for this task is based on a linear-time algorithm in terms of alphabet size |Σ|. Therefore, a more efficient method is needed to deal with strings with large alphabet size such as a protein sequence, time-series data, and a clinical document.

Results: We present a novel algorithm that can search a string in logarithmic time of |Σ|. In our algorithm, named secure wavelet matrix (sWM), we use an additively homomorphic encryption to build an efficient data structure called a wavelet matrix. In an experiment using a simulated string of length 10,000 whose alphabet size ranges from 4 to 1024, the run time of the sWM was an order of magnitude faster than that of the previous method. We also tested the sWM on all sequences of one protein family in Pfam (9,826 residues in total) and clinical texts written in a natural language (77,712 letters in total). By using a laptop computer for the user and a desktop PC for the server, we found that its run time was ≈ 2.5 s (user) and ≈ 6.7 s (server) for the protein sequences and ≈ 10 s (user) and ≈ 60 s (server) for the clinical texts.

Availability: https://github.com/cBioLab/sWM

