A New Code for Encoding All Monotone Sources With a Fixed Large Alphabet Size

This paper studies the relation between the Glushkov automaton [Formula: see text] and the partial derivative automaton [Formula: see text] of a given regular expression, in terms of transition complexity. Nicaud proved that the average transition complexity of [Formula: see text] is linear in the size of the corresponding expression. That result was obtained using an upper bound on the number of transitions of [Formula: see text]. Here we present a new quadratic construction of [Formula: see text] that leads to a more elegant and straightforward implementation and allows the number of transitions to be counted exactly. Based on that, a better estimate of the average size is presented. Asymptotically, as the alphabet size grows, the average number of transitions per state is 2. Broda et al. computed an upper bound for the ratio of the number of states of [Formula: see text] to the number of states of [Formula: see text], which is about ½ for large alphabet sizes. Here we show how to obtain an upper bound for the number of transitions in [Formula: see text], which we then use to get an average-case approximation. In conclusion, asymptotically and for large alphabets, the size of [Formula: see text] is half the size of [Formula: see text]. This is corroborated by some experiments, even for small alphabets and small regular expressions.

Counting Finite Languages by Total Word Length

Integers ◽

10.1515/integ.2011.068 ◽

2011 ◽

Vol 11 (6) ◽

Cited By ~ 3

Author(s):

Stefan Gerhold

Keyword(s):

Explicit Expression ◽

Generating Function ◽

Word Length ◽

Finite Alphabet ◽

Alphabet Size ◽

Total Length ◽

Large Alphabet ◽

Finite Language

We investigate the number of sets of words that can be formed from a finite alphabet, counted by the total length of the words in the set. An explicit expression for the counting sequence is derived from the generating function, and asymptotics for large alphabet size and large total word length are discussed. Moreover, we derive a Gaussian limit law for the number of words in a random finite language.
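As an illustration of the counting problem described in this abstract, the sketch below tabulates the number of finite languages by total word length. It assumes the standard product form of the generating function for sets of distinct nonempty words over an alphabet of size k, namely the product over n >= 1 of (1 + z^n)^(k^n); excluding the empty word is an assumption made here, and the paper's own explicit expression is not reproduced. The function name `count_finite_languages` is hypothetical.

```python
from math import comb


def count_finite_languages(k, max_len):
    """Return [c_0, ..., c_max_len], where c_t is the number of sets of
    distinct nonempty words over a k-letter alphabet whose lengths sum to t.

    Assumption: this expands prod_{n>=1} (1 + z^n)^(k^n), truncated at
    degree max_len, by multiplying in one word length at a time.
    """
    coeffs = [1] + [0] * max_len  # the empty language has total length 0
    for n in range(1, max_len + 1):
        words = k ** n  # number of distinct words of length n
        updated = [0] * (max_len + 1)
        for total, ways in enumerate(coeffs):
            if ways == 0:
                continue
            # Choose j distinct words of length n, adding n * j to the total.
            j = 0
            while j <= words and total + n * j <= max_len:
                updated[total + n * j] += ways * comb(words, j)
                j += 1
        coeffs = updated
    return coeffs


if __name__ == "__main__":
    print(count_finite_languages(2, 4))
```

For a binary alphabet the first values can be checked by hand: total length 2 admits four single words of length 2 plus the one pair of distinct one-letter words, giving 5. For a one-letter alphabet the count reduces to the number of partitions of the total length into distinct parts.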

Memory-assisted compression of seismic data: Tackling a large alphabet-size problem by statistical methods

10.1190/segam2017-17750316.1 ◽

2017 ◽

Cited By ~ 1

Author(s):

Ali Payani ◽

Afshin Abdi ◽

Faramarz Fekri

Keyword(s):

Statistical Methods ◽

Seismic Data ◽

Alphabet Size ◽

Size Problem ◽

Large Alphabet

Non-binary LDPC codes with large alphabet size

2014 IEEE International Symposium on Information Theory ◽

10.1109/isit.2014.6875273 ◽

2014 ◽

Author(s):

Koji Tazoe ◽

Kenta Kasai ◽

Kohichi Sakaniwa

Keyword(s):

Ldpc Codes ◽

Alphabet Size ◽

Large Alphabet

Coding over an erasure channel with a large alphabet size

2008 IEEE International Symposium on Information Theory ◽

10.1109/isit.2008.4595148 ◽

2008 ◽

Cited By ~ 10

Author(s):

Shervan Fashandi ◽

Shahab Oveis Gharan ◽

Amir K. Khandani

Keyword(s):

Alphabet Size ◽

Large Alphabet ◽

Erasure Channel

Large-Alphabet Semi-Static Entropy Coding Via Asymmetric Numeral Systems

ACM Transactions on Information Systems ◽

10.1145/3397175 ◽

2020 ◽

Vol 38 (4) ◽

pp. 1-33

Author(s):

Alistair Moffat ◽

Matthias Petri

Keyword(s):

Entropy Coding ◽

Large Alphabet ◽

Numeral Systems

Layered schemes for large-alphabet secret key distribution

2013 Information Theory and Applications Workshop (ITA) ◽

10.1109/ita.2013.6502993 ◽

2013 ◽

Author(s):

Hongchao Zhou ◽

Ligong Wang ◽

G. Wornell

Keyword(s):

Key Distribution ◽

Secret Key ◽

Large Alphabet

The Display Mode and the Combination of Sequence Length and Alphabet Size as Factors in Keying Speed and Accuracy

IEEE Transactions on Human Factors in Electronics ◽

10.1109/thfe.1966.232336 ◽

1966 ◽

Vol HFE-7 (3) ◽

pp. 110-115 ◽

Cited By ~ 3

Author(s):

R.L. Deininger ◽

M.J. Billington ◽

R.R. Riesz

Keyword(s):

Sequence Length ◽

Alphabet Size ◽

Display Mode ◽

Speed And Accuracy

Efficient Web Mining for Traversal Path Patterns

Web Mining ◽

10.4018/978-1-59140-414-9.ch015 ◽

2011 ◽

pp. 322-338 ◽

Cited By ~ 1

Author(s):

Zhixiang Chen ◽

Richard H. Fowler ◽

Ada Wai-Chee Fu ◽

Chunyue Wang

Keyword(s):

Web Mining ◽

Linear Time ◽

Fundamental Problem ◽

A Priori ◽

Web Pages ◽

Suffix Trees ◽

Web Logs ◽

Large Alphabet ◽

Optimal Linear ◽

Linear Time Algorithms

A maximal forward reference of a Web user is a longest consecutive sequence of Web pages visited by the user in a session without revisiting any previously visited page in the sequence. Efficient mining of frequent traversal path patterns, that is, large reference sequences of maximal forward references, from very large Web logs is a fundamental problem in Web mining. This chapter aims to design algorithms for this problem with the best possible efficiency. First, two optimal linear-time algorithms are designed for finding maximal forward references from Web logs. Second, two algorithms for mining frequent traversal path patterns are devised with the help of a fast construction of shallow generalized suffix trees over a very large alphabet. These two algorithms have provable linear and sublinear time complexity, respectively, and their performance is analyzed in comparison with the a priori-like algorithms and the Ukkonen algorithm. The two new algorithms are shown to be substantially more efficient than the a priori-like algorithms and the Ukkonen algorithm.
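The notion of a maximal forward reference can be made concrete with a small sketch. The code below is not the chapter's algorithm; it is a commonly described single-pass scan that keeps the current forward path and emits it whenever the user steps back to an earlier page, under the assumption that a session is given as an ordered sequence of page identifiers. The function name `maximal_forward_references` and the sample session are illustrative only.

```python
def maximal_forward_references(session):
    """Split one session (an ordered sequence of page ids) into its
    maximal forward references.

    Assumption: a revisit of a page already on the current path is a
    backward move; the path is emitted if it was just being extended,
    then truncated back to the revisited page.
    """
    path = []         # current forward path
    position = {}     # page id -> index of that page in `path`
    results = []
    extending = False  # True while the last move was a forward move

    for page in session:
        if page in position:
            idx = position[page]
            if idx == len(path) - 1:
                continue  # reload of the current page: nothing changes
            if extending:
                results.append(list(path))
            # Drop everything after the revisited page.
            for dropped in path[idx + 1:]:
                del position[dropped]
            del path[idx + 1:]
            extending = False
        else:
            position[page] = len(path)
            path.append(page)
            extending = True

    if extending and path:
        results.append(list(path))
    return results


if __name__ == "__main__":
    for ref in maximal_forward_references("ABCDCBEGHGWAOUOV"):
        print("".join(ref))
```

Each page is appended to the path once per forward move and removed at most once afterwards, so the scan takes time linear in the session length, assuming constant-time dictionary operations.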

Reducing Local Alphabet Size in Recognizable Picture Languages

Developments in Language Theory - Lecture Notes in Computer Science ◽

10.1007/978-3-030-81508-0_9 ◽

2021 ◽

pp. 103-116

Author(s):

Stefano Crespi Reghizzi ◽

Antonio Restivo ◽

Pierluigi San Pietro

Keyword(s):

Alphabet Size ◽

Picture Languages
