subword complexity
Recently Published Documents

Author(s):  
Damien Jamet ◽  
Pierre Popoli ◽  
Thomas Stoll

Abstract: Automatic sequences are unsuitable for cryptographic applications because both their subword complexity and their expansion complexity are small, and their correlation measure of order 2 is large. These sequences are highly predictable despite having a large maximum order complexity. However, recent results show that polynomial subsequences of automatic sequences, such as the Thue–Morse sequence, are better candidates for pseudorandom sequences. Morphic sequences, which are given by a fixed point of a prolongeable morphism that is not necessarily uniform, are a natural generalization of automatic sequences. In this paper we prove a lower bound for the maximum order complexity of the sum of digits function in Zeckendorf base, which is an example of a morphic sequence. We also prove that the polynomial subsequences of this sequence keep a large maximum order complexity, as is the case for the Thue–Morse sequence.
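As an illustration of the sequence named in this abstract, here is a minimal sketch of the Zeckendorf sum of digits function, assuming the standard greedy decomposition of n into non-consecutive Fibonacci numbers 1, 2, 3, 5, 8, ... The function names `zeckendorf_digit_sum` and `polynomial_subsequence` are hypothetical, and the abstract does not say whether the paper reduces the digit sum modulo some integer, so no reduction is applied here.

```python
def zeckendorf_digit_sum(n: int) -> int:
    """Number of Fibonacci terms in the Zeckendorf representation of n."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Fibonacci numbers 1, 2, 3, 5, ... up to the first one exceeding n.
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    # Greedy choice: always subtract the largest Fibonacci number that fits.
    # This never selects two consecutive Fibonacci numbers.
    count = 0
    for f in reversed(fibs):
        if f <= n:
            n -= f
            count += 1
    return count


def polynomial_subsequence(length: int, degree: int = 2) -> list[int]:
    """Digit sums along n**degree (an illustrative polynomial subsequence)."""
    return [zeckendorf_digit_sum(n ** degree) for n in range(length)]


if __name__ == "__main__":
    print([zeckendorf_digit_sum(n) for n in range(11)])
    print(polynomial_subsequence(8))
```

For example, 12 = 8 + 3 + 1, so its Zeckendorf digit sum is 3.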


Entropy ◽  
2020 ◽  
Vol 22 (2) ◽  
pp. 207 ◽  
Author(s):  
Lida Ahmadi ◽  
Mark Daniel Ward

Patterns within strings enable us to extract vital information regarding a string’s randomness. Whether a string is random (showing little to no repetition in patterns) or periodic (showing repetitions in patterns) is described by a value called the kth Subword Complexity of the character string. By definition, the kth Subword Complexity is the number of distinct substrings of length k that appear in a given string. In this paper, we evaluate the expected value and the second factorial moment (followed by a corollary on the second moment) of the kth Subword Complexity for binary strings over memoryless sources. We first take a combinatorial approach to derive a probability generating function for the number of occurrences of patterns in strings of finite length. This gives an exact expression for the two moments in terms of the patterns’ autocorrelation and correlation polynomials. We then investigate the asymptotic behavior for values of k = Θ(log n). In the proof, we compare the distribution of the kth Subword Complexity of binary strings to the distribution of distinct prefixes of independent strings stored in a trie. The methodology that we use involves complex analysis, analytical poissonization and depoissonization, the Mellin transform, and saddle point analysis.
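The definition in this abstract can be sketched directly in code. The function name `subword_complexity` is hypothetical, and the sketch only counts distinct substrings of one given string; it does not reproduce the paper's moment analysis.

```python
def subword_complexity(s: str, k: int) -> int:
    """Number of distinct substrings of length k that appear in s."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # A string shorter than k has no substring of length k.
    if k > len(s):
        return 0
    # Slide a window of width k over s and collect the distinct windows.
    return len({s[i:i + k] for i in range(len(s) - k + 1)})


if __name__ == "__main__":
    for word in ("00000000", "01010101", "01101001"):
        print(word, [subword_complexity(word, k) for k in range(1, 4)])
```

A constant string such as "00000000" has complexity 1 for every k up to its length, while less repetitive strings produce larger counts.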


2019 ◽  
Vol 792 ◽  
pp. 96-116 ◽  
Author(s):  
Jeffrey Shallit ◽  
Arseny Shur

2019 ◽  
Vol 20 (7) ◽  
pp. 1704 ◽  
Author(s):  
Chengchao Wu ◽  
Jin Chen ◽  
Yunxia Liu ◽  
Xuehai Hu

Abstract: Deciphering the code of cis-regulatory elements (CREs) is one of the core issues of current biology. As an important category of CRE, enhancers play crucial roles in regulating gene transcription at a distance. Further, the disruption of an enhancer can cause abnormal transcription and thus trigger human diseases, which means that its accurate identification is currently of broad interest. Here, we introduce an innovative concept, the abelian complexity function (ACF), which is a more complex extension of the classic subword complexity function, for a new coding of DNA sequences. After feature selection by an upper bound estimation and integration with DNA composition features, we developed an enhancer prediction model with hybrid abelian complexity features (HACF). Compared with existing methods, HACF shows consistently superior performance on three sources of enhancer datasets. We tested the generalization ability of HACF by scanning human chromosome 22 to validate previously reported super-enhancers. We also identified novel candidate enhancers that are supported by enhancer-related ENCODE ChIP-seq signals. In summary, HACF improves current enhancer prediction and may be beneficial for further prioritization of functional noncoding variants.
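For orientation, here is a minimal sketch of abelian complexity in its standard combinatorial sense: the number of distinct letter-count (Parikh) vectors among the substrings of length n. The function name `abelian_complexity` is hypothetical, and the abstract does not specify how the paper turns this quantity into HACF features, so the sketch should not be read as the authors' encoding.

```python
from collections import Counter


def abelian_complexity(s: str, n: int) -> int:
    """Number of distinct Parikh vectors among the length-n substrings of s."""
    if n < 1:
        raise ValueError("n must be at least 1")
    seen = set()
    for i in range(len(s) - n + 1):
        # Two substrings are abelian-equivalent when they contain each
        # letter the same number of times, regardless of order.
        counts = Counter(s[i:i + n])
        seen.add(tuple(sorted(counts.items())))
    return len(seen)


if __name__ == "__main__":
    dna = "ACCAGT"  # made-up toy sequence, not data from the paper
    print([abelian_complexity(dna, n) for n in range(1, 4)])
```

In "ACCA" the substrings "AC" and "CA" count as the same abelian class, so the abelian complexity for length 2 is 2 while the ordinary subword complexity is 3.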


2015 ◽  
Vol 25 (1) ◽  
pp. 96-106
Author(s):  
Jeffrey Shallit

Abstract: A sequence (a_n)_{n≥0} is k-automatic if there is a finite automaton that, on input n expressed in base k, reaches a state with output a_n. In this paper I will survey some recent advances concerning enumeration of various aspects of these sequences, such as the recurrence function, and the subword complexity (which counts the number of distinct blocks of length n).
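The definition of a k-automatic sequence can be made concrete with a small sketch. It uses the Thue–Morse sequence as the example (a standard 2-automatic sequence), reads the base-k digits most significant first, and counts distinct blocks in a finite prefix. The names `automatic_term`, `TM_DELTA`, `TM_OUTPUT` and `count_blocks` are hypothetical, and counting blocks in a prefix only gives a lower bound on the subword complexity of the infinite sequence.

```python
def automatic_term(n: int, k: int, delta: dict, output: dict, start=0):
    """Run the automaton on the base-k digits of n and return the output."""
    digits = []
    while n > 0:
        digits.append(n % k)
        n //= k
    state = start
    for d in reversed(digits):  # most significant digit first
        state = delta[state][d]
    return output[state]


# Thue-Morse automaton: the state tracks the parity of the number of 1 bits.
TM_DELTA = {0: {0: 0, 1: 1}, 1: {0: 1, 1: 0}}
TM_OUTPUT = {0: 0, 1: 1}


def count_blocks(seq: list, n: int) -> int:
    """Number of distinct blocks of length n occurring in the finite list seq."""
    return len({tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)})


if __name__ == "__main__":
    prefix = [automatic_term(n, 2, TM_DELTA, TM_OUTPUT) for n in range(1024)]
    print(prefix[:16])
    print([count_blocks(prefix, n) for n in range(1, 5)])
```

Any other k-automatic sequence can be plugged in by supplying its own transition table and output map.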


Author(s):  
Julien Cassaigne ◽  
Anna E. Frid ◽  
Svetlana Puzynina ◽  
Luca Q. Zamboni

2013 ◽  
Vol 35 (2) ◽  
pp. 461-481 ◽  
Author(s):  
DONG HAN KIM ◽  
SEONHEE LIM

Abstract: In this article, we discuss subword complexity of colorings of regular trees. We characterize colorings of bounded subword complexity and study Sturmian colorings, which are colorings of minimal unbounded subword complexity. We classify Sturmian colorings using their type sets. We show that any Sturmian coloring is a lifting of a coloring on a quotient graph of the tree which is a geodesic or a ray, with loops possibly attached, thus a lifting of an ‘infinite word’. We further give a complete characterization of the quotient graph for eventually periodic colorings.

