Stochastic model of Zipf's law and the universality of the power-law exponent

2014 ◽  
Vol 89 (4) ◽  
Author(s):  
Ken Yamamoto
2002 ◽  
Vol 05 (01) ◽  
pp. 1-6 ◽  
Author(s):  
RAMON FERRER i CANCHO ◽  
RICARD V. SOLÉ

Random-text models have been proposed as an explanation for the power law relationship between word frequency and rank, the so-called Zipf's law. They are generally regarded as null hypotheses rather than models in the strict sense. In this context, recent theories of language emergence and evolution assume this law as a priori information with no need of explanation. Here, random texts and real texts are compared through (a) the so-called lexical spectrum and (b) the distribution of words having the same length. It is shown that real texts fill the lexical spectrum much more efficiently and regardless of the word length, suggesting that the meaningfulness of Zipf's law is high.
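The comparison described above can be sketched with a toy random-text ("monkey typing") generator. This is only an illustration, not the authors' procedure. The alphabet, the space probability, and the function names are assumptions, and the lexical spectrum is taken here to mean the number of distinct words that occur exactly f times.

```python
import random
from collections import Counter

def random_text_words(n_chars, alphabet="abcd", p_space=0.2, seed=0):
    """Type n_chars random characters and return the resulting words.

    Hypothetical parameters: each character is a space with probability
    p_space, otherwise a letter drawn uniformly from alphabet.
    """
    rng = random.Random(seed)
    chars = []
    for _ in range(n_chars):
        if rng.random() < p_space:
            chars.append(" ")
        else:
            chars.append(rng.choice(alphabet))
    # split() with no argument drops the empty words made by repeated spaces
    return "".join(chars).split()

def lexical_spectrum(words):
    """Map each frequency f to the number of distinct words occurring f times."""
    word_counts = Counter(words)
    return dict(Counter(word_counts.values()))

words = random_text_words(10_000)
spectrum = lexical_spectrum(words)
```

Running `lexical_spectrum` on the tokens of a real text as well gives the two spectra that the abstract compares.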


1982 ◽  
Vol 14 (11) ◽  
pp. 1449-1467 ◽  
Author(s):  
B Roehner ◽  
K E Wiese

A dynamic deterministic model of urban growth is proposed, which in its most simple form yields Zipf's law for city-size distribution, and in its general form may account for distributions that deviate strongly from Zipf's law. The qualitative consequences of the model are examined, and a corresponding stochastic model is introduced, which permits, in particular, the study of zero-growth situations.


2020 ◽  
Vol 24 ◽  
pp. 275-293
Author(s):  
Aristides V. Doumas ◽  
Vassilis G. Papanicolaou

The origin of power-law behavior (also known variously as Zipf’s law) has been a topic of debate in the scientific community for more than a century. Power laws appear widely in physics, biology, earth and planetary sciences, economics and finance, computer science, demography and the social sciences. In a highly cited article, Mark Newman [Contemp. Phys. 46 (2005) 323–351] reviewed some of the empirical evidence for the existence of power-law forms; however, he underscored that even though many distributions do not follow a power law, quite often many of the quantities that scientists measure are close to a Zipf law, and hence are of importance. In this paper we engage a variant of Zipf’s law with a general urn problem. A collector wishes to collect m complete sets of N distinct coupons. The draws from the population are considered to be independent and identically distributed with replacement, and the probability that a type-j coupon is drawn is denoted by p_j, j = 1, 2, …, N. Let T_m(N) be the number of trials needed for this problem. We present the asymptotics for the expectation (five terms plus an error), the second rising moment (six terms plus an error), and the variance of T_m(N) (leading term) as N → ∞, when p_j = a_j / \sum_{j=2}^{N+1} a_j, where a_j = (\ln j)^{-p}, p > 0. Moreover, we prove that T_m(N) (appropriately normalized) converges in distribution to a Gumbel random variable. These “log-Zipf” classes of coupon probabilities are not covered by the existing literature and the present paper comes to fill this gap. In the spirit of a recent paper of ours [ESAIM: PS 20 (2016) 367–399] we enlarge the classes for which the Dixie cup problem is solved w.r.t. its moments, variance, distribution.
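The urn problem in this abstract can be explored numerically with a small Monte Carlo sketch. It does not reproduce the paper's asymptotic analysis. The function names and the sample parameters are assumptions, and coupon types are indexed from 0 in the code while their weights use j = 2, …, N + 1 as in the abstract.

```python
import itertools
import math
import random

def log_zipf_probs(n_types, p):
    """Return probabilities proportional to (ln j)^(-p) for j = 2..n_types+1."""
    weights = [math.log(j) ** (-p) for j in range(2, n_types + 2)]
    total = sum(weights)
    return [w / total for w in weights]

def trials_for_m_sets(probs, m, rng):
    """Simulate one run: draw coupons until every type has appeared m times."""
    if m < 1:
        raise ValueError("m must be at least 1")
    cum_weights = list(itertools.accumulate(probs))
    types = range(len(probs))
    counts = [0] * len(probs)
    remaining = len(probs)  # types that have appeared fewer than m times
    trials = 0
    while remaining > 0:
        j = rng.choices(types, cum_weights=cum_weights)[0]
        trials += 1
        counts[j] += 1
        if counts[j] == m:
            remaining -= 1
    return trials

# Illustrative values only: N = 20 coupon types, exponent p = 1, m = 2 sets.
rng = random.Random(0)
probs = log_zipf_probs(20, 1.0)
samples = [trials_for_m_sets(probs, 2, rng) for _ in range(200)]
mean_trials = sum(samples) / len(samples)
```

Averaging many runs gives an empirical estimate of the expectation of T_m(N) that can be set against the asymptotic terms for moderate N.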


2021 ◽  
Vol 145 ◽  
pp. 104324
Author(s):  
Juan C Quiroz ◽  
Liliana Laranjo ◽  
Catalin Tufanaru ◽  
Ahmet Baki Kocaballi ◽  
Dana Rezazadegan ◽  
...  

Fractals ◽  
2004 ◽  
Vol 12 (01) ◽  
pp. 49-53 ◽  
Author(s):  
TAISEI KAIZOJI ◽  
MASAHIDE NUKI

We show power-law scaling behavior for fluctuations in share volume, which no previous study has reported. After analyzing a database of the daily transactions for all securities listed on the Tokyo Stock Exchange, we selected 1050 large companies that each had an unbroken series of daily trading activity from January 1975 to January 2002. We found that the cumulative distributions of daily fluctuations in share volumes can be well described by a power-law decay, and that the cumulative distributions for almost all of the companies can be characterized by an exponent within the stable Lévy domain 0 < α < 2. Furthermore, more than 35% of the cumulative distributions can be well approximated by Zipf's law, i.e. the cumulative distributions have an exponent close to unity.


Author(s):  
Dariusz Skotarek

Zipf’s Law states that within a given text the frequency of any word is inversely proportional to its rank in the frequency table of the words used in that text. It is a power-law statistical regularity that occurs ubiquitously in language: every language tested so far has been found to display a Zipfian distribution. Toki Pona is an experimental artificial language spoken by hundreds of users. It is extremely minimalistic, with a vocabulary of a mere 120 words. A comparative statistical analysis of two parallel texts in French and Toki Pona showed that even a language with such a scarce vocabulary adheres to Zipf’s Law just like natural languages.
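The rank-frequency relation stated above can be checked on any tokenized text with a short sketch like the one below. It fits a straight line to log frequency against log rank by ordinary least squares. That is a crude diagnostic chosen here for brevity, not the method used in the study, and the function name is an assumption. A slope near -1 is what an ideal Zipfian text would give.

```python
import math
from collections import Counter

def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic tokens whose frequencies are exactly 120 / rank for ranks 1..6.
tokens = []
for rank in range(1, 7):
    tokens += [f"w{rank}"] * (120 // rank)
slope = zipf_slope(tokens)
```

For two parallel texts, calling `zipf_slope` on each token list gives a rough side-by-side comparison of how closely each follows the law.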


Glottotheory ◽  
2019 ◽  
Vol 9 (2) ◽  
pp. 113-129
Author(s):  
Victor Davis

Heaps’ Law [Heaps, H S 1978 Information Retrieval: Computational and Theoretical Aspects (Academic Press), https://dl.acm.org/citation.cfm?id=539986] states that in a large enough text corpus, the number of types as a function of tokens grows as N = K M^\beta for some free parameters K, \beta. Much has been written [Font-Clos, Francesc 2013 A scaling law beyond Zipf’s law and its relation to Heaps’ law (New Journal of Physics 15 093033), http://iopscience.iop.org/article/10.1088/1367-2630/15/9/093033 ; Bernhardsson S, da Rocha L E C and Minnhagen P 2009 The meta book and size-dependent properties of written language (New Journal of Physics 11 123015), http://iopscience.iop.org/article/10.1088/1367-2630/11/12/123015 ; Bernhardsson S, Ki Baek and Minnhagen 2011 A paradoxical property of the monkey book (Journal of Statistical Mechanics: Theory and Experiment, Volume 2011), http://iopscience.iop.org/article/10.1088/1742-5468/2011/07/P07013 ; Milička, Jiří 2009 Type-token & Hapax-token Relation: A Combinatorial Model (Glottotheory. International Journal of Theoretical Linguistics 2 (1), 99–110), http://milicka.cz/kestazeni/type-token_relation.pdf ; Petersen, Alexander 2012 Languages cool as they expand: Allometric scaling and the decreasing need for new words (Scientific Reports volume 2, Article number: 943), https://www.nature.com/articles/srep00943] about how this result and various generalizations can be derived from Zipf’s Law [Zipf, George 1949 Human behavior and the principle of least effort (Reading: Addison-Wesley), http://dx.doi.org/10.1037/h0052442]. Here we derive from first principles a completely novel expression of the type-token curve and prove its superior accuracy on real text. This expression naturally generalizes to equally accurate estimates for counting hapaxes and higher n-legomena.
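The quantities named in this abstract can be computed empirically with a few lines of code. The sketch below builds the type-token curve of a token list, evaluates the classical form N = K M^\beta for given parameters, and counts hapaxes. It does not implement the new expression derived in the paper, and the function names are assumptions.

```python
from collections import Counter

def type_token_curve(tokens):
    """Return the number of distinct types seen after each token."""
    seen = set()
    curve = []
    for token in tokens:
        seen.add(token)
        curve.append(len(seen))
    return curve

def heaps_estimate(n_tokens, k, beta):
    """Evaluate the classical Heaps form N = K * M**beta."""
    return k * n_tokens ** beta

def hapax_count(tokens):
    """Count the types that occur exactly once."""
    return sum(1 for count in Counter(tokens).values() if count == 1)

tokens = "a b a c b d".split()
curve = type_token_curve(tokens)   # [1, 2, 2, 3, 3, 4]
hapaxes = hapax_count(tokens)      # 2 (the types "c" and "d")
```

Comparing `curve` with `heaps_estimate` at each token count, for fitted values of K and \beta, shows how far the classical form departs from the observed type-token curve.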


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Giordano De Marzo ◽  
Andrea Gabrielli ◽  
Andrea Zaccaria ◽  
Luciano Pietronero
