symbolic sequence
Recently Published Documents


Total documents: 46 (five years: 6)
H-index: 9 (five years: 0)

2021 ◽  
pp. 1-43
Author(s):  
GUILHEM BRUNET

Let $m_1 \geq m_2 \geq 2$ be integers. We consider subsets of the product symbolic sequence space $(\{0,\ldots,m_1-1\} \times \{0,\ldots,m_2-1\})^{\mathbb{N}^*}$ that are invariant under the action of the semigroup of multiplicative integers. These sets are defined following Kenyon, Peres, and Solomyak, using a fixed integer $q \geq 2$. We compute the Hausdorff and Minkowski dimensions of the projection of these sets onto an affine grid of the unit square. The proof of our Hausdorff dimension formula proceeds via a variational principle over some class of Borel probability measures on the studied sets. This extends well-known results on self-affine Sierpiński carpets. However, the combinatorial arguments we use in our proofs are more elaborate than in the self-similar case and involve a new parameter, namely $j = \lfloor \log_q(\log(m_1)/\log(m_2)) \rfloor$. We then generalize our results to the same subsets defined in dimension $d \geq 2$. There, the situation is even more delicate and our formulas involve a collection of $2d-3$ parameters.
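As a small illustration of the parameter $j$ defined in the abstract above, the following sketch evaluates $j = \lfloor \log_q(\log(m_1)/\log(m_2)) \rfloor$ numerically. The function name `j_parameter` and the input checks are assumptions made for this example and do not come from the paper.

```python
import math


def j_parameter(m1: int, m2: int, q: int) -> int:
    """Evaluate j = floor(log_q(log(m1) / log(m2))) for m1 >= m2 >= 2, q >= 2.

    Hypothetical helper for illustration only. Floating-point rounding can
    matter when log(m1)/log(m2) is an exact power of q, so treat such
    boundary cases with care.
    """
    if not (m1 >= m2 >= 2) or q < 2:
        raise ValueError("expected m1 >= m2 >= 2 and q >= 2")
    ratio = math.log(m1) / math.log(m2)  # always >= 1 under the constraints
    return math.floor(math.log(ratio) / math.log(q))


if __name__ == "__main__":
    for m1, m2, q in [(2, 2, 2), (3, 2, 2), (100, 2, 2)]:
        print(m1, m2, q, j_parameter(m1, m2, q))
```

When $m_1 = m_2$ the ratio of logarithms is 1, so $j = 0$; $j$ grows only when $\log(m_1)/\log(m_2)$ reaches successive powers of $q$.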


Author(s):  
Vladimir D. Gusev ◽  
Liubov A. Miroshnichenko

An important quantitative characteristic of symbolic sequences (texts, strings) is complexity, which intuitively reflects the degree of their "non-randomness". A.N. Kolmogorov formulated the most general definition of complexity: he proposed measuring the complexity of an object (a symbolic sequence) by the length of the shortest description from which the object can be uniquely reconstructed. Since no program is guaranteed to find the shortest description, in practice various algorithmic approximations are used for this purpose, and these are considered in this paper. Along with definitions of complexity that assume a sequence can be reconstructed from its "description", a number of measures are considered that do not imply such a reconstruction; they are based on the calculation of certain quantitative characteristics. Of interest is not only a quantitative assessment of complexity, but also the identification and classification of the structural regularities that determine its specific value. In one form or another, these regularities manifest as repetition in the broadest sense. The measures of complexity considered are conventionally divided into "statistical" ones, which take into account the frequency of occurrence of symbols or short "words" in the text; "dictionary" ones, which estimate the number of different "subwords"; and "structural" ones, which are based on identifying long repeating fragments of text and determining the relationships between them. Most of the methods are designed for sequences of an arbitrary linguistic nature. The special attention paid to DNA sequences, reflected in the title of the article, is due to the importance of the object, the manifestations of repetition of different types, and the numerous examples of using the concept of complexity in solving problems of classification and evolution of various biological objects.
Local structural features detected in DNA sequences with a sliding window are of considerable interest, since zones of low complexity in the genomes of various organisms are often associated with the regulation of basic genetic processes.
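The idea of approximating complexity algorithmically and scanning a sequence with a sliding window can be sketched as follows. This is only one possible approximation, chosen for this example: it uses the compressed size produced by Python's standard `zlib` module as a stand-in for description length, which is not necessarily any of the measures surveyed in the paper. The function names, window size, and test sequence are all assumptions.

```python
import random
import zlib


def compression_complexity(text: str) -> float:
    """Ratio of zlib-compressed size to raw size (lower means more repetitive).

    A crude proxy for description length, not Kolmogorov complexity itself.
    """
    data = text.encode("ascii")
    if not data:
        raise ValueError("text must be non-empty")
    return len(zlib.compress(data, 9)) / len(data)


def sliding_window_complexity(seq: str, window: int, step: int) -> list[float]:
    """Complexity proxy for each window of length `window`, moved by `step`."""
    return [
        compression_complexity(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Synthetic example: a low-complexity run followed by a pseudo-random region.
rng = random.Random(0)
seq = "A" * 200 + "".join(rng.choice("ACGT") for _ in range(200))
scores = sliding_window_complexity(seq, window=200, step=200)
print(scores)
```

The first window (a homopolymer run) compresses far better than the pseudo-random window, so its score is lower. Short windows are dominated by the compressor's fixed overhead, so in practice the window length would need to be chosen with that in mind.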


2019 ◽  
Vol 30 (2) ◽  
pp. 304-313
Author(s):  
Dan King ◽  
Sumitra Auschaitrakul

Molecules ◽  
2019 ◽  
Vol 24 (2) ◽  
pp. 348
Author(s):  
Byoungsang Lee ◽  
So Yeon Ahn ◽  
Charles Park ◽  
James J. Moon ◽  
Jung Heon Lee ◽  
...  

In biological systems, a few sequence differences diversify the hybridization profile of nucleotides and enable the quantitative control of cellular metabolism in a cooperative manner. In this respect, the information required for a better understanding may lie not in each individual nucleotide sequence but in representative information shared among them. Existing methodologies for nucleotide sequence design have been optimized to track the function of the genetic molecule and predict its interaction with others. However, there has been no attempt to extract new sequence information to represent their inheritance function. Here, we tried to conceptually reveal the presence of a representative sequence from groups of nucleotides. The combined application of the K-means clustering algorithm and the social network analysis theorem enabled the effective calculation of the representative sequence. First, a "common sequence" is made that has the highest hybridization property to analog sequences. Next, the sequence complementary to the common sequence is designated as a "representative sequence". Based on this, we obtained a representative sequence from multiple analog sequences that are 8–10 bases long. Their hybridization was empirically tested, which confirmed that the common sequence had the highest hybridization tendency and that the representative sequence aligned better with the analogs than a merely complementary sequence.


2017 ◽  
Vol 27 (14) ◽  
pp. 1750217 ◽  
Author(s):  
Haiyun Xu ◽  
Fangyue Chen ◽  
Weifeng Jin

The topological conjugacy classification of elementary cellular automata with majority memory (ECAMs) is studied under the framework of symbolic dynamics. In the light of the conventional symbolic sequence space, the compact symbolic vector space is identified with a feasible metric and topology. A slight change is introduced to show that all global maps of ECAMs are continuous functions, thereby generating compact dynamical systems. By exploiting two fundamental homeomorphisms in symbolic vector space, all ECAMs are furthermore grouped into 88 equivalence classes, in the sense that different mappings in the same global equivalence class are mutually topologically conjugate.


2016 ◽  
Author(s):  
Daniel J Greenhoe

The spherical metric d_r operates on the surface of a sphere with radius r centered at the origin in a linear space R^n. Thus, for any pair of points (p,q) on the surface of this sphere, (p,q) is in the domain of d_r and d_r(p,q) is the "distance" between those points. However, if p and q are both in R^n but are not on the surface of a common sphere centered at the origin, then (p,q) is not in the domain of d_r and d_r(p,q) is simply undefined. In certain applications, however, it would be useful to have an extension d of d_r to the entire space R^n (rather than just on a surface in R^n). Real-world applications for such an extended metric include calculations involving near-earth objects and certain distance spaces useful in symbolic sequence processing. This paper introduces an extension to the spherical metric using a polar form of linear interpolation. The extension is herein called the "Lagrange arc distance". It has as its domain the entire space R^n, is homogeneous, and is continuous everywhere in R^n except at the origin. However, the extension does come at a cost: the Lagrange arc distance d(p,q), as its name suggests, is a distance function rather than a metric. In particular, the triangle inequality does not in general hold. Moreover, it is not translation invariant, does not induce a norm, and balls in the distance space (R^n,d) are not convex. On the other hand, empirical evidence suggests that the Lagrange arc distance results in structure similar to that of the Euclidean metric, in that balls in R^2 and R^3 generated by the two functions are in some regions of R^n very similar in form.
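The domain restriction described above can be made concrete with a short sketch. It assumes that d_r is the usual great-circle (arc-length) distance, r times the angle between p and q; the paper's own definition may be normalized differently, and the Lagrange arc distance itself is not reproduced here. The function name `spherical_distance` and the tolerance parameter are assumptions for this example.

```python
import math


def spherical_distance(p, q, rel_tol: float = 1e-9) -> float:
    """Great-circle distance between p and q on a common origin-centered sphere.

    Assumed form: r * arccos(<p, q> / r^2). Raises ValueError when p and q do
    not share a sphere, mirroring the fact that d_r is undefined there.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same dimension")
    r_p = math.sqrt(sum(x * x for x in p))
    r_q = math.sqrt(sum(x * x for x in q))
    if r_p == 0.0 or not math.isclose(r_p, r_q, rel_tol=rel_tol):
        raise ValueError("p and q are not on a common sphere centered at the origin")
    cos_angle = sum(a * b for a, b in zip(p, q)) / (r_p * r_q)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding drift
    return r_p * math.acos(cos_angle)


if __name__ == "__main__":
    print(spherical_distance((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
    try:
        spherical_distance((1.0, 0.0), (0.0, 2.0))
    except ValueError as exc:
        print("undefined:", exc)
```

An extension to all of R^n, as the abstract proposes, would replace the `ValueError` branch with a value obtained by interpolating between the two radii.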



