input string
Recently Published Documents


TOTAL DOCUMENTS

32
(FIVE YEARS 10)

H-INDEX

4
(FIVE YEARS 1)

2021 ◽  
Vol 24 (3) ◽  
Author(s):  
Elton Cardoso ◽  
Maycon Amaro ◽  
Samuel Feitosa ◽  
Leonardo Reis ◽  
André Du Bois ◽  
...  

We describe the formalization of Brzozowski and Antimirov derivative-based algorithms for regular expression parsing in the dependently typed language Agda. The formalization produces a proof that either an input string matches a given regular expression or no match exists. A tool for regular-expression-based search, in the style of the well-known GNU grep, has been developed with the certified algorithms. Practical experiments conducted with this tool are reported.
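
The idea behind derivative-based matching can be illustrated with a small, unverified sketch. It is not the Agda formalization described above: it only decides whether a string matches and produces neither a parse tree nor a proof. The class names (`Empty`, `Eps`, `Chr`, `Alt`, `Cat`, `Star`) and the functions `nullable`, `deriv`, and `matches` are names assumed for this illustration.

```python
from dataclasses import dataclass


class Regex:
    """Base class for the regular expression syntax tree (illustrative names)."""


@dataclass(frozen=True)
class Empty(Regex):  # matches no string at all
    pass


@dataclass(frozen=True)
class Eps(Regex):  # matches only the empty string
    pass


@dataclass(frozen=True)
class Chr(Regex):  # matches exactly one character
    c: str


@dataclass(frozen=True)
class Alt(Regex):  # left | right
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Cat(Regex):  # left followed by right
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Star(Regex):  # zero or more repetitions of inner
    inner: Regex


def nullable(r):
    """Return True if r matches the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Chr)):
        return False
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    return nullable(r.left) and nullable(r.right)  # Cat


def deriv(r, a):
    """Brzozowski derivative: a regex for the suffixes w such that r matches a + w."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Chr):
        return Eps() if r.c == a else Empty()
    if isinstance(r, Alt):
        return Alt(deriv(r.left, a), deriv(r.right, a))
    if isinstance(r, Cat):
        first = Cat(deriv(r.left, a), r.right)
        # If the left part can be empty, the character may belong to the right part.
        return Alt(first, deriv(r.right, a)) if nullable(r.left) else first
    return Cat(deriv(r.inner, a), r)  # Star


def matches(r, s):
    """Take the derivative once per input character, then test nullability."""
    for ch in s:
        r = deriv(r, ch)
    return nullable(r)
```

For example, with `ab_star = Star(Cat(Chr("a"), Chr("b")))`, the call `matches(ab_star, "abab")` accepts while `matches(ab_star, "aba")` rejects. The sketch does not simplify intermediate expressions, so they can grow with the input length.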


Axioms ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 334
Author(s):  
Marius Zimand

It is impossible to effectively modify a string in order to increase its Kolmogorov complexity. However, is it possible to construct a few strings, no longer than the input string, so that most of them have larger complexity? We show that the answer is yes. We present an algorithm that takes as input a string x of length n and returns a list with O(n^2) strings, all of length n, such that 99% of them are more complex than x, provided the complexity of x is less than n − log log n − O(1). We also present an algorithm that obtains a list of quasi-polynomial size in which each element can be produced in polynomial time.


2021 ◽  
Vol 5 (OOPSLA) ◽  
pp. 1-24
Author(s):  
Xiaodong Jia ◽  
Ashish Kumar ◽  
Gang Tan

In this paper, we present a derivative-based, functional recognizer and parser generator for visibly pushdown grammars. The generated parser accepts ambiguous grammars and produces a parse forest containing all valid parse trees for an input string in linear time. Each parse tree in the forest can then also be extracted in linear time. Besides the parser generator, to allow more flexible forms of visibly pushdown grammars, we also present a translator that converts a tagged CFG to a visibly pushdown grammar in a sound way; the parse trees of the tagged CFG are then produced by running the semantic actions embedded in the parse trees of the translated visibly pushdown grammar. The performance of the parser is compared with the popular parsing tool ANTLR and other popular hand-crafted parsers. The correctness of the core parsing algorithm is formally verified in the proof assistant Coq.


Algorithmica ◽  
2021 ◽  
Author(s):  
Takuya Mieno ◽  
Yuta Fujishige ◽  
Yuto Nakashima ◽  
Shunsuke Inenaga ◽  
Hideo Bannai ◽  
...  

A substring u of a string T is called a minimal unique substring (MUS) of T if u occurs exactly once in T and any proper substring of u occurs at least twice in T. In this paper, we study the problem of computing MUSs for a sliding window over a given string T. We first show how the set of MUSs can change when the window slides over T. We then present an $O(n \log \sigma')$-time and O(d)-space algorithm to compute MUSs for a sliding window of size d over the input string T of length n, where $\sigma' \le d$ is the maximum number of distinct characters in every window.
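
The MUS definition can be made concrete with a brute-force sketch. This is not the sliding-window algorithm of the paper and comes nowhere near its time bound; it only checks the definition directly on one fixed string. The function names `count_occurrences` and `minimal_unique_substrings` are assumptions made for this illustration, and substrings are reported as half-open index pairs `(i, j)` meaning `text[i:j]`.

```python
def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    last_start = len(text) - len(pattern)
    return sum(1 for i in range(last_start + 1) if text.startswith(pattern, i))


def minimal_unique_substrings(text):
    """Return all MUSs of text as half-open index pairs (i, j)."""
    result = []
    n = len(text)
    for i in range(n):
        for j in range(i + 1, n + 1):
            u = text[i:j]
            if count_occurrences(text, u) != 1:
                continue  # u repeats, so try a longer substring starting at i
            # u is unique. Every proper substring of u lies inside u[1:] or u[:-1],
            # so it is enough to check that these two occur at least twice.
            shorter = [u[1:], u[:-1]] if len(u) > 1 else []
            if all(count_occurrences(text, v) >= 2 for v in shorter):
                result.append((i, j))
            # Any longer substring starting at i contains the unique u,
            # so it cannot be minimal.
            break
    return result
```

For the string `"abaab"`, the sketch reports `(1, 3)` and `(2, 4)`, that is, the substrings `"ba"` and `"aa"`. To mimic a window of size d, one could call the function on `text[k:k + d]` for each window start k, at a cost far above the bound stated in the abstract.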


2021 ◽  
Vol 3 ◽  
pp. 73-77
Author(s):  
Rustam Khamdamov ◽  
Komil Kerimov ◽  

Attacks on web applications, such as SQL injection and cross-site scripting (XSS), have recently tended to increase. In this article, we propose a new algorithm for detecting XSS attacks on a web application based on analysis of the frequency of occurrence of special characters. Mathematical modeling and identification of information objects play an important role in solving pattern recognition problems. One such task is to distinguish attacks from normal requests to web applications. Research on this task began relatively recently; nevertheless, there is already a lot of work in this direction. We propose mathematical modeling and a method for identifying XSS attacks using a function, bounded below, that depends on the input string. To build this function, we used special characters and keywords that are often found in the construction of XSS attacks. With the proposed method, it is possible to detect XSS attacks using a single special character or a single keyword. Nevertheless, it can be shown experimentally that using a set of numerous characters and words allows us to identify XSS-type vulnerabilities more accurately. The aim of this work is to develop an algorithm for detecting XSS attacks. To achieve this, we focused on the characters that are often included in the XSS attack string.
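
The general shape of such a detector can be sketched as a score over the input string. The abstract does not give the actual function, character set, keyword list, weights, or threshold, so everything below (`SPECIAL_CHARS`, `KEYWORDS`, the weight 5, the threshold 10, and the function names) is a hypothetical stand-in chosen only for illustration. The score is a sum of counts, so it is bounded below by zero.

```python
# Hypothetical character set and keyword list; the paper's actual choices are not given.
SPECIAL_CHARS = set("<>\"'/()=;&")
KEYWORDS = ("script", "onerror", "onload", "alert", "javascript:")


def xss_score(request_string):
    """Score a string by counting special characters and keywords (never below 0)."""
    lowered = request_string.lower()
    char_hits = sum(1 for ch in lowered if ch in SPECIAL_CHARS)
    keyword_hits = sum(lowered.count(keyword) for keyword in KEYWORDS)
    # Assumed weighting: a keyword counts as much as five special characters.
    return char_hits + 5 * keyword_hits


def looks_like_xss(request_string, threshold=10):
    """Flag the string when its score reaches an assumed threshold."""
    return xss_score(request_string) >= threshold
```

With these assumed settings, `looks_like_xss("<script>alert(1)</script>")` flags the string, while `looks_like_xss("q=hello world")` does not. A real detector would need weights and a threshold fitted to labeled traffic.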


Algorithms ◽  
2020 ◽  
Vol 13 (11) ◽  
pp. 294
Author(s):  
Frantisek Franek ◽  
Michael Liut

There are two reasons to have an efficient algorithm for identifying all right-maximal Lyndon substrings of a string: firstly, Bannai et al. introduced in 2015 a linear algorithm to compute all runs of a string that relies on knowing all right-maximal Lyndon substrings of the input string, and secondly, Franek et al. showed in 2017 a linear equivalence of sorting suffixes and sorting right-maximal Lyndon substrings of a string, inspired by a novel suffix sorting algorithm of Baier. In 2016, Franek et al. presented a brief overview of algorithms for computing the Lyndon array that encodes the knowledge of right-maximal Lyndon substrings of the input string. Among those presented were two well-known algorithms for computing the Lyndon array: a quadratic in-place algorithm based on the iterated Duval algorithm for Lyndon factorization and a linear algorithmic scheme based on linear suffix sorting, computing the inverse suffix array, and applying to it the next smaller value algorithm. Duval’s algorithm works for strings over any ordered alphabet, while for linear suffix sorting, a constant or an integer alphabet is required. The authors at that time were not aware of Baier’s algorithm. In 2017, our research group proposed a novel algorithm for the Lyndon array. Though the proposed algorithm is linear in the average case and has O(n log n) worst-case complexity, it is interesting as it emulates the fast Fourier algorithm’s recursive approach and introduces τ-reduction, which might be of independent interest. In 2018, we presented a linear algorithm to compute the Lyndon array of a string inspired by Phase I of Baier’s algorithm for suffix sorting. This paper presents the theoretical analysis of these two algorithms and provides empirical comparisons of both of their C++ implementations with respect to the iterated Duval algorithm.
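
The suffix-sorting scheme mentioned above (sort the suffixes, invert the suffix array, then apply next smaller value) can be sketched as follows. This is neither of the two algorithms analyzed in the paper, and it is not linear: the suffix array is built with a naive comparison sort purely for brevity. The function name `lyndon_array` and the variable names are assumptions for this illustration. Entry i of the result is the length of the longest Lyndon substring starting at position i.

```python
def lyndon_array(s):
    """Lyndon array via inverse suffix array plus next smaller value."""
    n = len(s)
    # Naive suffix sorting; a linear-time construction would replace this line.
    suffix_array = sorted(range(n), key=lambda i: s[i:])

    # Inverse suffix array: rank[i] is the position of suffix i in sorted order.
    rank = [0] * n
    for r, start in enumerate(suffix_array):
        rank[start] = r

    # Next smaller value on rank, using a stack of unresolved positions.
    result = [0] * n
    stack = []
    for i in range(n):
        while stack and rank[stack[-1]] > rank[i]:
            j = stack.pop()
            result[j] = i - j  # suffix i is the next smaller suffix after j
        stack.append(i)
    for j in stack:
        result[j] = n - j  # no smaller suffix to the right
    return result
```

For `"banana"` the sketch returns `[1, 2, 1, 2, 1, 1]`: the longest Lyndon substrings starting at positions 1 and 3 are both `"an"`, and every other position starts only a single-character Lyndon substring.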


Author(s):  
Frantisek Franek ◽  
Michael Liut

There are two reasons to have an efficient algorithm for identifying all maximal Lyndon substrings of a string: firstly, Bannai et al. introduced in 2015 a linear algorithm to compute all runs of a string that relies on knowing all maximal Lyndon substrings of the input string, and secondly, Franek et al. showed in 2017 a linear equivalence of sorting suffixes and sorting maximal Lyndon substrings of a string, inspired by a novel suffix sorting algorithm of Baier. In 2016, Franek et al. presented a brief overview of algorithms for computing the Lyndon array that encodes the knowledge of maximal Lyndon substrings of the input string. Among those presented were two well-known algorithms for computing the Lyndon array: a quadratic in-place algorithm based on the iterated Duval algorithm for Lyndon factorization, and a linear algorithmic scheme based on linear suffix sorting, computing the inverse suffix array, and applying to it the Next Smaller Value algorithm. Duval's algorithm works for strings over any ordered alphabet, while for linear suffix sorting, a constant or an integer alphabet is required. The authors at that time were not aware of Baier's algorithm. In 2017, our research group proposed a novel algorithm for the Lyndon array. Though the proposed algorithm is linear in the average case and has O(n log n) worst-case complexity, it is interesting as it emulates the fast Fourier algorithm's recursive approach and introduces tau-reduction, which might be of independent interest. In 2018, we presented a linear algorithm to compute the Lyndon array of a string inspired by Phase I of Baier's algorithm for suffix sorting. This paper presents a theoretical analysis of these two algorithms and provides empirical comparisons of both of their C++ implementations with respect to the iterated Duval algorithm.


2020 ◽  
Vol 21 (2) ◽  
pp. 153-163
Author(s):  
Nor Farahidah Za'bah ◽  
Ahmad Amierul Ashraf Muhammad Nazmi ◽  
Amelia Wong Azman

Segmentation is an important aspect of translating finger spelling of sign language into the Latin alphabet. Although currently available sign language devices can translate finger spelling into letters, the output is stored in a long continuous string without spaces between words. The system proposed in this work is meant to be used together with a text-generating glove device. The system takes a text input string, which is fed in one character at a time and then segmented into words that are semantically correct. The proposed text segmentation method uses dynamic programming and a back-off algorithm, together with a probability score based on word matching against an English-language text corpus. Based on the results, the system is able to properly segment words with acceptable accuracy. ABSTRACT (translated from the Malay): Segmentation is an important aspect of translating sign language finger spelling into Latin letters. Although there are sign language devices that translate finger spelling into letters, the letters produced are stored in a long continuous string with no spaces between words. The system proposed in this paper is to be used together with a sign language glove that can generate text. The system takes a text input string in which letters are entered one at a time, and the letters are segmented into semantically correct words. The proposed method is segmentation using dynamic programming and a probabilistic method that segments the letters based on word matching with an English-language database. Based on the results obtained, the system succeeds in segmenting the letters effectively and accurately.
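
A dynamic-programming segmenter of this general kind can be sketched as follows. The abstract does not specify the scoring model or the back-off rule, so the unigram word probabilities, the per-character penalty for unknown words (`unknown_char_prob`), the length cap (`max_word_len`), and the function name `segment` are all assumptions made for this illustration. The sketch also processes the whole string at once rather than one character at a time.

```python
import math


def segment(text, word_prob, max_word_len=20, unknown_char_prob=1e-10):
    """Split text into the word sequence with the highest total log-probability."""
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[k]: best score for the prefix text[:k]
    back = [0] * (n + 1)          # back[k]: start index of the last word in that split
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            word = text[start:end]
            if word in word_prob:
                log_p = math.log(word_prob[word])
            else:
                # Assumed back-off: unseen words pay a penalty per character.
                log_p = len(word) * math.log(unknown_char_prob)
            score = best[start] + log_p
            if score > best[end]:
                best[end] = score
                back[end] = start

    # Follow the back-pointers from the end of the string to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return words[::-1]
```

With a toy table such as `{"the": 0.05, "cat": 0.01, "sat": 0.01, "a": 0.02, "at": 0.01}`, the call `segment("thecatsat", table)` yields `["the", "cat", "sat"]`. In practice the probabilities would be estimated from a text corpus, as the abstract describes.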


2019 ◽  
Vol 56 (1) ◽  
pp. 3-43
Author(s):  
KAROLINA BROŚ

This paper examines opaque examples of phrase-level phonology taken from Chilean Spanish under the framework of Stratal Optimality Theory (OT) (Rubach 1997; Bermúdez-Otero 2003, 2019) and Harmonic Serialism (HS) (McCarthy 2008a, b, 2016). The data show an interesting double repair of the coda /s/ taking place at word edges. It is argued that Stratal OT is superior in modelling phonological processes that take place at the interface between morphology and phonology because it embraces cyclicity. Under this model, prosodic structure is built serially, level by level, and in accordance with the morphological structure of the input string. In this way, opacity at constituent edges can be solved. Stratal OT also provides insight into word-internal morphological structure and the domain-specificity of phonological processes. It is demonstrated that a distinction in this model is necessary between the word and the phrase levels, and between the stem and the word levels. As illustrated by the behaviour of Spanish nouns, affixation and the resultant alternations inform us about the domains to which both morphological and phonological processes should be assigned. Against this background, Harmonic Serialism embraces an apparently simpler recursive mechanism in which stepwise prosodic parsing can be incorporated. What is more, it offers insight into the nature of operations in OT, as well as into such problematic issues as structure building and directionality. Nevertheless, despite the model’s ability to solve various cases of opacity, the need to distinguish between two competing repairs makes HS fail when confronted with the Chilean data under examination.


2019 ◽  
Vol 6 (1) ◽  
pp. 181198 ◽  
Author(s):  
Andrew Adamatzky

We simulate an actin filament as an automaton network. Every atom takes two or three states and updates its state, in discrete time, depending on the ratio of its neighbours in some selected state. All atoms/automata simultaneously update their states by the same rule. Two state transition rules are considered. In the semi-totalistic, Game of Life-like actin filament automaton, atoms take binary states ‘0’ and ‘1’ and update their states depending on the ratio of neighbours in the state ‘1’. In the excitable actin filament automaton, atoms take three states: resting, excited and refractory. A resting atom excites if the ratio of its excited neighbours belongs to some specified interval; transitions from the excited state to the refractory state and from the refractory state to the resting state are unconditional. In computational experiments, we implement mappings of an 8-bit input string to an 8-bit output string via dynamics of perturbation/excitation on actin filament automata. We assign eight domains in an actin filament as I/O ports. To write True to a port, we perturb/excite a certain percentage of the nodes in the domain corresponding to the port. We read outputs at the ports after some time interval. A port is considered to be in the state True if the number of excited nodes in the port's domain exceeds a certain threshold. A range of eight-argument Boolean functions is uncovered in a series of computational trials in which all possible configurations of eight-element binary strings were mapped onto excitation outputs of the I/O domains.
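
The excitable three-state rule can be sketched as one synchronous update step on an arbitrary graph. The actual actin filament graph, the interval bounds, and the port thresholds are not given in the abstract, so the neighbour lists, the closed interval `[low, high]`, and the names `step`, `RESTING`, `EXCITED`, and `REFRACTORY` are assumptions for this illustration.

```python
RESTING, EXCITED, REFRACTORY = 0, 1, 2


def step(states, neighbours, low, high):
    """Apply one synchronous update of the excitable rule to every node.

    states: list of node states; neighbours: list of neighbour index lists.
    A resting node excites when the ratio of its excited neighbours lies in
    the closed interval [low, high] (closedness is an assumption here).
    """
    new_states = []
    for node, state in enumerate(states):
        if state == EXCITED:
            new_states.append(REFRACTORY)  # unconditional transition
        elif state == REFRACTORY:
            new_states.append(RESTING)     # unconditional transition
        else:
            nbrs = neighbours[node]
            excited = sum(1 for m in nbrs if states[m] == EXCITED)
            ratio = excited / len(nbrs) if nbrs else 0.0
            new_states.append(EXCITED if low <= ratio <= high else RESTING)
    return new_states
```

On a hypothetical ring of six nodes with a single excited node and the interval `[0.5, 1.0]`, repeated calls to `step` send two excitation fronts travelling in opposite directions, each followed by a refractory node. Reading a port would then amount to counting excited nodes in a chosen subset and comparing the count with a threshold.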

