Counting Bordered Partial Words by Critical Positions

10.37236/625 ◽  
2011 ◽  
Vol 18 (1) ◽  
Author(s):  
Emily Allen ◽  
F. Blanchet-Sadri ◽  
Cameron Byrum ◽  
Mihai Cucuringu ◽  
Robert Mercaş

A partial word, a sequence over a finite alphabet that may have some undefined positions or holes, is bordered if one of its proper prefixes is compatible with one of its suffixes. The number-theoretic problem of enumerating all bordered full words (the ones without holes) of a fixed length $n$ over an alphabet of a fixed size $k$ is well known. It turns out that all borders of a full word are simple, and so every bordered full word has a unique minimal border no longer than half its length. Counting bordered partial words having $h$ holes with the parameters $k, n$ is made considerably more difficult by the failure of that combinatorial property, since a minimal border may now be nonsimple. Here, we give recursive formulas based on our approach of the so-called simple and nonsimple critical positions.
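To make the definitions concrete, here is a minimal brute-force sketch in Python. It is not the recursive counting method of the paper. The hole marker `?` and the function names `compatible`, `is_bordered`, and `count_bordered_full` are assumptions chosen for illustration, and a border is taken to be a nonempty proper prefix compatible with the suffix of the same length.

```python
from itertools import product

HOLE = "?"  # assumed hole marker, chosen for this sketch only


def compatible(x, y):
    """Two equal-length partial words agree wherever both are defined."""
    return len(x) == len(y) and all(
        a == b or a == HOLE or b == HOLE for a, b in zip(x, y)
    )


def is_bordered(w):
    """True if a nonempty proper prefix is compatible with the same-length suffix."""
    return any(compatible(w[:m], w[-m:]) for m in range(1, len(w)))


def count_bordered_full(n, k):
    """Brute-force count of bordered full words of length n over k letters."""
    alphabet = [chr(ord("a") + i) for i in range(k)]
    return sum(is_bordered("".join(t)) for t in product(alphabet, repeat=n))
```

For example, `is_bordered("aab")` is false, while `is_bordered("a?b")` is true because the prefix `a?` is compatible with the suffix `?b`. The enumeration is exponential in $n$, so it is only useful for checking small cases.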

2012 ◽  
Vol 23 (06) ◽  
pp. 1189-1206 ◽  
Author(s):  
F. BLANCHET-SADRI

Algorithmic combinatorics on partial words, or sequences of symbols over a finite alphabet that may have some do-not-know symbols or holes, has been developing in the past few years. Applications can be found, for instance, in molecular biology for the sequencing and analysis of DNA, in bio-inspired computing where partial words have been considered for identifying good encodings for DNA computations, and in data compression. In this paper, we focus on two areas of algorithmic combinatorics on partial words, namely, pattern avoidance and subword complexity. We discuss recent contributions as well as a number of open problems. In relation to pattern avoidance, we classify all binary patterns with respect to partial word avoidability, we classify all unary patterns with respect to hole sparsity, and we discuss avoiding abelian powers in partial words. In relation to subword complexity, we generate and count minimal Sturmian partial words, we construct de Bruijn partial words, and we construct partial words with subword complexities not achievable by full words (those without holes).


2020 ◽  
Vol 9 (11) ◽  
pp. 9219-9230
Author(s):  
R.K. Kumari ◽  
R. Arulprakasam ◽  
R. Perumal ◽  
V.R. Dare

Partial words are linear words with holes. A cyclic word is derived from a linear word by linking its first letter after its last one. Both partial words and cyclic words have wide applications in DNA sequencing. In this paper, we introduce cyclic partial words and discuss their periodicity and certain other properties. We also establish a representation of a cyclic partial word using trees.


Author(s):  
Y. Yao ◽  
H. Zhao ◽  
D. Huang ◽  
Q. Tan

Remote sensing image scene classification has gained remarkable attention due to its versatile use in applications such as geospatial object detection, ground object information extraction, and environment monitoring. A scene contains not only information about the ground objects but also the spatial relationships between the ground objects and the environment. With the rapid growth in the amount of remote sensing image data, the need for automatic annotation methods for image scenes is becoming more urgent. This paper proposes a new framework for high-resolution remote sensing image scene classification based on a convolutional neural network. To eliminate the requirement of a fixed-size input image, a multiple pyramid pooling strategy is placed between the convolutional layers and the fully connected layers. The fixed-size features generated by the multiple pyramid pooling layer are then unrolled into a one-dimensional fixed-length vector and fed into the fully connected layers. Our method generates a fixed-length representation regardless of image size and at the same time achieves higher classification accuracy. On the UC-Merced and NWPU-RESISC45 datasets, our framework achieved accuracies of 93.24% and 88.62%, respectively.
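The fixed-length property can be illustrated with a small pure-Python sketch of pyramid pooling. The abstract does not specify the pooling operator or the grid sizes, so max pooling and the levels `(1, 2, 4)` are assumptions, and `pyramid_pool` is a hypothetical name rather than the authors' implementation.

```python
def pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool each channel over an n x n grid per level, then concatenate.

    feature_map: list of C channels, each a list of H rows of W numbers.
    Returns a flat list of length C * sum(n * n for n in levels),
    independent of H and W.
    """
    pooled = []
    for channel in feature_map:
        h, w = len(channel), len(channel[0])
        for n in levels:
            for i in range(n):
                # Bin edges use floor for the start and ceil for the end,
                # so every bin is nonempty even when h < n.
                r0, r1 = (i * h) // n, ((i + 1) * h + n - 1) // n
                for j in range(n):
                    c0, c1 = (j * w) // n, ((j + 1) * w + n - 1) // n
                    pooled.append(
                        max(
                            channel[r][c]
                            for r in range(r0, r1)
                            for c in range(c0, c1)
                        )
                    )
    return pooled
```

With two channels and the default levels, a 5 x 7 map and a 13 x 9 map both yield a vector of length 2 * (1 + 4 + 16) = 42, which is what lets the fully connected layers accept inputs of any size.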


2018 ◽  
Vol 29 (05) ◽  
pp. 845-860
Author(s):  
Daniil Gasnikov ◽  
Arseny M. Shur

We contribute to the study of square-free words. The classical notion of a square-free word has a natural generalization to partial words, studied in several papers since 2008. We prove that the maximal density of wildcards in a ternary infinite square-free partial word is surprisingly big: [Formula: see text]. Further, we show that the density of wildcards in a finitary infinite square-free partial word is at most [Formula: see text] and that this bound is reached by a quaternary word. We demonstrate that partial square-free words can be viewed as “usual” square-free words with some letters replaced by wildcards and introduce the corresponding characteristic of infinite square-free words, called flexibility. The flexibility is estimated for some important words and classes of words; an interesting phenomenon is the existence of “rigid” square-free words, having no room for wildcards at all.


Entropy ◽  
2019 ◽  
Vol 21 (6) ◽  
pp. 580
Author(s):  
Albert No

We established a universality of logarithmic loss over a finite alphabet as a distortion criterion in fixed-length lossy compression. For any fixed-length lossy-compression problem under an arbitrary distortion criterion, we show that there is an equivalent lossy-compression problem under logarithmic loss. The equivalence is in the strong sense that we show that finding good schemes in corresponding lossy compression under logarithmic loss is essentially equivalent to finding good schemes in the original problem. This equivalence relation also provides an algebraic structure in the reconstruction alphabet, which allows us to use known techniques in the clustering literature. Furthermore, our result naturally suggests a new clustering algorithm in the categorical data-clustering problem.


2010 ◽  
Vol 21 (05) ◽  
pp. 705-722 ◽  
Author(s):  
F. BLANCHET-SADRI ◽  
TAKTIN OEY ◽  
TIMOTHY D. RANKIN

Fine and Wilf's well-known theorem states that any word having periods $p, q$ and length at least $p + q - \gcd(p, q)$ also has $\gcd(p, q)$ as a period. Moreover, the length $p + q - \gcd(p, q)$ is critical since counterexamples can be provided for shorter words. This result has since been extended to partial words, or finite sequences that may contain some "holes." More precisely, any partial word $u$ with $H$ holes having weak periods $p, q$ and length at least the so-denoted $l_H(p, q)$ also has strong period $\gcd(p, q)$ provided $u$ is not $(H, (p, q))$-special. This extension was done for one hole by Berstel and Boasson (where the class of $(1, (p, q))$-special partial words is empty), and for an arbitrary number of holes by Blanchet-Sadri. In this paper, we further extend these results, allowing an arbitrary number of weak periods. In addition to speciality, the concepts of intractable period sets and interference between periods play a role.
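The full-word statement of Fine and Wilf's theorem, and the criticality of the bound, can be checked by brute force on small cases. The sketch below covers full words only, not the partial-word extension, and the helper names `has_period` and `counterexamples` are assumptions for illustration. The search defaults to a binary alphabet.

```python
from itertools import product
from math import gcd


def has_period(w, p):
    """True if w[i] == w[i + p] for every valid index i."""
    return all(w[i] == w[i + p] for i in range(len(w) - p))


def counterexamples(p, q, length, alphabet="ab"):
    """Full words of the given length with periods p and q but not gcd(p, q)."""
    g = gcd(p, q)
    return [
        "".join(t)
        for t in product(alphabet, repeat=length)
        if has_period(t, p) and has_period(t, q) and not has_period(t, g)
    ]
```

For $p = 3$ and $q = 5$ the bound is $3 + 5 - 1 = 7$: `counterexamples(3, 5, 7)` is empty, whereas `counterexamples(3, 5, 6)` contains `abaaba`, which has periods 3 and 5 but not period 1.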


2004 ◽  
Vol 15 (02) ◽  
pp. 355-383 ◽  
Author(s):  
ARTURO CARPI ◽  
ALDO de LUCA

We consider some combinatorial properties of two-dimensional words (or pictures) over a given finite alphabet, which are related to the number of occurrences in them of words of a fixed size (m,n). In particular a two-dimensional word (briefly, 2D-word) is called (m,n)-full if it contains as factors (or subwords) all words of size (m,n). An (m,n)-full word such that any word of size (m,n) occurs in it exactly once is called a de Bruijn word of order (m,n). A 2D-word w is called (m,n)-uniform if the difference in the number of occurrences in w of any two words of size (m,n) is at most 1. A 2D-word is called uniform if it is (m,n)-uniform for all m,n>0. In this paper we extend to the two-dimensional case some results relating the notions above which were proved in the one-dimensional case in a preceding article. In this analysis the study of repeated factors in a 2D-word plays an essential role. Finally, some open problems and conjectures are discussed.


2016 ◽  
Vol 53 (2) ◽  
pp. 606-613 ◽  
Author(s):  
Joseba Dalmau

We study a classical multitype Galton–Watson process with mutation and selection. The individuals are sequences of fixed length over a finite alphabet. On the sharp peak fitness landscape together with independent mutations per locus, we show that, as the length of the sequences goes to $\infty$ and the mutation probability goes to 0, the asymptotic relative frequency of the sequences differing in $k$ digits from the master sequence approaches $(\sigma e^{-a} - 1)\,\frac{a^k}{k!}\sum_{i \geq 1} \frac{i^k}{\sigma^i}$, where $\sigma$ is the selective advantage of the master sequence and $a$ is the product of the length of the chains with the mutation probability. The probability distribution $Q(\sigma, a)$ on the nonnegative integers given by the above equation is the quasispecies distribution with parameters $\sigma$ and $a$.
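As a numerical sanity check on the formula above, the sketch below evaluates the quasispecies distribution by truncating the infinite sum. The function name `quasispecies_pmf` and the truncation parameter `i_max` are assumptions for illustration. The code also assumes $a > 0$ and $\sigma e^{-a} > 1$, the condition under which the weights are positive and sum to 1.

```python
from math import exp, lgamma, log


def quasispecies_pmf(k, sigma, a, i_max=200):
    """Approximate Q(sigma, a)(k) by truncating the sum over i at i_max.

    Terms are computed in log space to avoid overflow in i**k and k!.
    """
    if a <= 0 or sigma * exp(-a) <= 1:
        raise ValueError("requires a > 0 and sigma * exp(-a) > 1")
    log_prefix = k * log(a) - lgamma(k + 1)  # log(a^k / k!)
    total = sum(
        exp(log_prefix + k * log(i) - i * log(sigma))  # (a^k / k!) * i^k / sigma^i
        for i in range(1, i_max + 1)
    )
    return (sigma * exp(-a) - 1) * total
```

For $k = 0$ the sum is geometric, so $Q(\sigma, a)(0) = (\sigma e^{-a} - 1)/(\sigma - 1)$, which gives a closed form to compare against. Summing `quasispecies_pmf(k, 5.0, 1.0)` over a few hundred values of `k` should return a total very close to 1.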


1968 ◽  
Vol 07 (03) ◽  
pp. 156-158
Author(s):  
Th. R. Taylor

The technique, scope and limitations of a fixed field/fixed length case record utilising the IBM 1232 system are described. The principal problems lie with personnel rather than machinery and with programmes for analysis rather than clinical data.

