The logarithmic Zipf law in a general urn problem

2020 ◽  
Vol 24 ◽  
pp. 275-293
Author(s):  
Aristides V. Doumas ◽  
Vassilis G. Papanicolaou

The origin of power-law behavior (also known variously as Zipf’s law) has been a topic of debate in the scientific community for more than a century. Power laws appear widely in physics, biology, earth and planetary sciences, economics and finance, computer science, demography and the social sciences. In a highly cited article, Mark Newman [Contemp. Phys. 46 (2005) 323–351] reviewed some of the empirical evidence for the existence of power-law forms, but underscored that, even though many distributions do not follow a power law, many of the quantities that scientists measure are often close to a Zipf law, and hence are of importance. In this paper we engage a variant of Zipf’s law with a general urn problem. A collector wishes to collect m complete sets of N distinct coupons. The draws from the population are considered to be independent and identically distributed with replacement, and the probability that a type-j coupon is drawn is denoted by $p_j$, j = 1, 2, …, N. Let $T_m(N)$ be the number of trials needed for this problem. We present the asymptotics for the expectation (five terms plus an error), the second rising moment (six terms plus an error), and the variance of $T_m(N)$ (leading term) as N → ∞, when $p_j = a_j / \sum_{k=2}^{N+1} a_k$, where $a_j = (\ln j)^{-p}$, p > 0. Moreover, we prove that $T_m(N)$ (appropriately normalized) converges in distribution to a Gumbel random variable. These “log-Zipf” classes of coupon probabilities are not covered by the existing literature, and the present paper fills this gap. In the spirit of a recent paper of ours [ESAIM: PS 20 (2016) 367–399], we enlarge the classes for which the Dixie cup problem is solved with respect to its moments, variance, and distribution.
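
As a rough numerical illustration of the urn problem described in this abstract, the following Python sketch simulates the number of trials needed to collect m complete sets under log-Zipf coupon probabilities. It is not the paper's method (the paper derives asymptotics analytically). The function names are hypothetical, and the coupon types are assumed to be indexed j = 2, …, N + 1 so that ln j > 0, matching the summation range in the abstract.

```python
import itertools
import math
import random


def log_zipf_probs(n, p):
    """Probabilities proportional to (ln j)^(-p) for j = 2, ..., n + 1."""
    weights = [math.log(j) ** (-p) for j in range(2, n + 2)]
    total = sum(weights)
    return [w / total for w in weights]


def trials_for_m_sets(probs, m, rng):
    """Draw i.i.d. coupons until every type has appeared at least m times."""
    cum = list(itertools.accumulate(probs))  # cumulative weights for sampling
    types = range(len(probs))
    counts = [0] * len(probs)
    remaining = len(probs)  # number of types seen fewer than m times
    trials = 0
    while remaining > 0:
        (j,) = rng.choices(types, cum_weights=cum)
        trials += 1
        counts[j] += 1
        if counts[j] == m:
            remaining -= 1
    return trials


def mean_trials(n, p, m, runs, seed=0):
    """Monte Carlo estimate of the expected number of trials."""
    rng = random.Random(seed)
    probs = log_zipf_probs(n, p)
    return sum(trials_for_m_sets(probs, m, rng) for _ in range(runs)) / runs


if __name__ == "__main__":
    # Illustrative parameters only; they are not taken from the paper.
    print(mean_trials(n=50, p=1.5, m=2, runs=200))
```

A simulation like this can only give a sanity check for small N; the asymptotic expansions in the paper concern the limit N → ∞.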


2002 ◽  
Vol 05 (01) ◽  
pp. 1-6 ◽  
Author(s):  
RAMON FERRER i CANCHO ◽  
RICARD V. SOLÉ

Random-text models have been proposed as an explanation for the power law relationship between word frequency and rank, the so-called Zipf's law. They are generally regarded as null hypotheses rather than models in the strict sense. In this context, recent theories of language emergence and evolution assume this law as a priori information with no need of explanation. Here, random texts and real texts are compared through (a) the so-called lexical spectrum and (b) the distribution of words having the same length. It is shown that real texts fill the lexical spectrum much more efficiently and regardless of the word length, suggesting that the meaningfulness of Zipf's law is high.



Author(s):  
Yizhen Wu ◽  
Mingyue Jiang ◽  
Zhijian Chang ◽  
Yuanqing Li ◽  
Kaifang Shi

Whether urban development in China satisfies Zipf’s law across different scales remains unclear. This study explored the question using the National Polar-Orbiting Partnership’s Visible Infrared Imaging Radiometer Suite (NPP-VIIRS) nighttime light data. First, the NPP-VIIRS data were corrected. Then, based on the Zipf law model, the corrected NPP-VIIRS data were used to evaluate China’s urban development at multiple scales. The results showed that the corrected NPP-VIIRS data could effectively reflect the state of urban development in China. Additionally, the Zipf index (q) values, which could express the degree of urban development, decreased overall from 2012 to 2018 in all provinces, prefectures, and counties. Since the value of q was relatively close to 1 with R² > 0.70, the development of the provinces and prefectures was close to the ideal Zipf’s law state. In all counties, q > 1 with R² > 0.70, which showed that the primate county had a relatively stronger monopoly capacity. When q < 1 with a continuous decline in the top 2000 counties, the top 250 prefectures, and the top 20 provinces in equilibrium, there was little difference in the scale of development at the multiscale level, with R² > 0.90. The results enrich our understanding of urban development in terms of Zipf’s law and have valuable implications for relevant decision-makers and stakeholders.
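
One common way to estimate a Zipf index q and its R² is an ordinary least-squares fit of log size against log rank, ln S_r = ln C − q ln r. The abstract does not state the study's exact fitting procedure, so the sketch below is an assumption, and the function name `zipf_fit` and the sample sizes are hypothetical.

```python
import math


def zipf_fit(sizes):
    """Fit ln(size) = ln(C) - q * ln(rank) by least squares.

    Returns (q, r_squared). Needs at least two distinct positive sizes.
    """
    ordered = sorted((s for s in sizes if s > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(s) for s in ordered]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return -slope, r_squared  # q is the negated slope


if __name__ == "__main__":
    # Synthetic sizes that follow an exact rank-size rule with q = 1.
    sizes = [1000.0 / rank for rank in range(1, 51)]
    q, r2 = zipf_fit(sizes)
    print(f"q = {q:.3f}, R^2 = {r2:.3f}")
```

Under this convention, q > 1 means the largest units dominate more strongly than the ideal rank-size rule predicts, and q < 1 means sizes are more even.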



2012 ◽  
Vol 2012 ◽  
pp. 1-21 ◽  
Author(s):  
Yanguang Chen

Hierarchy of cities reflects the ubiquitous structure frequently observed in the natural world and social institutions. Where there is a hierarchy with cascade structure, there is a Zipf's rank-size distribution, and vice versa. However, we have no theory to explain the spatial dynamics associated with Zipf's law of cities. In this paper, a new perspective is proposed to find the simple rules dominating complex systems and the regular patterns behind the random distribution of cities. The hierarchical structure can be described with a set of exponential functions that are identical in form to Horton-Strahler's laws on rivers and Gutenberg-Richter's laws on earthquake energy. From the exponential models, we can derive four power laws, including Zipf's law, indicative of fractals and scaling symmetry. A card-shuffling model is built to interpret the relation between Zipf's law and hierarchy of cities. This model can be expanded to illuminate the general empirical power-law distributions across the individual physical and social sciences, which are hard to comprehend within the specific scientific domains. This research is useful for understanding how complex systems such as networks of cities are self-organized.



2013 ◽  
Vol 15 (4) ◽  
pp. 043021 ◽  
Author(s):  
Matt Visser


2021 ◽  
Vol 145 ◽  
pp. 104324
Author(s):  
Juan C Quiroz ◽  
Liliana Laranjo ◽  
Catalin Tufanaru ◽  
Ahmet Baki Kocaballi ◽  
Dana Rezazadegan ◽  
...  


2014 ◽  
Vol 28 (11) ◽  
pp. 1450088 ◽  
Author(s):  
K. Lukierska-Walasek ◽  
K. Topolski

In this paper, we describe the link between the Zipf law and statistical distributions for the Fortuin–Kasteleyn clusters in Ising as well as Potts models. From these results, it is seen that Zipf's law can be a criterion of a phase transition, but it does not determine its order. We present the corresponding histograms for fixed domain configurations.



2020 ◽  
Author(s):  
Ciprian Florin Pater ◽  
Deni Mazrekaj

Many economic regularities have been found to adhere to power laws. In this paper, we apply Benford’s law to consumer price index data from Norway and Zipf’s law to a Norwegian report about the history of Norwegian national accounts. Norway is a particularly interesting country to study as it scores among the highest-ranked countries on data quality. We find that the consumer price index adheres to Benford’s law, showing high data quality. On the other hand, our results indicate that the report does not adhere to Zipf’s law.
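
Benford's law states that the leading digit d of a value occurs with probability log10(1 + 1/d). The sketch below computes these expected frequencies and the observed first-digit frequencies of a data set. It is a minimal illustration, not the authors' test procedure, and the function names and sample values are hypothetical.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Leading nonzero digit of a nonzero finite number."""
    for ch in repr(abs(x)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("zero or non-finite value has no leading digit")


def first_digit_freqs(values):
    """Observed relative frequency of each leading digit 1..9."""
    values = list(values)
    counts = Counter(first_digit(v) for v in values)
    return {d: counts.get(d, 0) / len(values) for d in range(1, 10)}


if __name__ == "__main__":
    # Made-up sample values, not consumer price index data.
    sample = [1, 10, 100, 2, 0.03]
    expected = benford_expected()
    observed = first_digit_freqs(sample)
    for d in range(1, 10):
        print(d, round(expected[d], 3), round(observed[d], 3))
```

A formal comparison of observed and expected frequencies would add a goodness-of-fit test, which is omitted here.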



2020 ◽  
Vol 11 ◽  
Author(s):  
Jayden L. Macklin-Cordes ◽  
Erich R. Round

Causal processes can give rise to distinctive distributions in the linguistic variables that they affect. Consequently, a secure understanding of a variable's distribution can hold a key to understanding the forces that have causally shaped it. A storied distribution in linguistics has been Zipf's law, a kind of power law. In the wake of a major debate in the sciences around power-law hypotheses and the unreliability of earlier methods of evaluating them, here we re-evaluate the distributions claimed to characterize phoneme frequencies. We infer the fit of power laws and three alternative distributions to 166 Australian languages, using a maximum likelihood framework. We find evidence supporting earlier results, but also nuancing them and increasing our understanding of them. Most notably, phonemic inventories appear to have a Zipfian-like frequency structure among their most-frequent members (though perhaps also a lognormal structure) but a geometric (or exponential) structure among the least-frequent. We compare these new insights to the kinds of causal processes that affect the evolution of phonemic inventories over time, and identify a potential account for why, despite there being an important role for phonetic substance in phonemic change, we could still expect inventories with highly diverse phonetic content to share similar distributions of phoneme frequencies. We conclude with priorities for future work in this promising program of research.
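
To give a sense of what a maximum likelihood fit of a power law involves, the sketch below uses the standard closed-form estimator for a continuous power law with a known lower cutoff, alpha = 1 + n / Σ ln(x_i / x_min). This is an assumption made for illustration: the abstract does not describe the authors' exact estimator, and frequency data may call for a discrete variant. The function name `power_law_mle` and the synthetic data are hypothetical.

```python
import math


def power_law_mle(values, x_min):
    """Maximum likelihood exponent for a continuous power law p(x) ~ x^(-alpha).

    Uses only values >= x_min. Returns (alpha, standard_error).
    """
    tail = [x for x in values if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("all observations equal x_min; alpha is undefined")
    alpha = 1.0 + len(tail) / log_sum
    std_err = (alpha - 1.0) / math.sqrt(len(tail))
    return alpha, std_err


if __name__ == "__main__":
    # Synthetic data: evenly spaced quantiles of a power law with alpha = 2.5.
    n, true_alpha = 2000, 2.5
    data = [(1 - (i + 0.5) / n) ** (-1 / (true_alpha - 1)) for i in range(n)]
    alpha, err = power_law_mle(data, x_min=1.0)
    print(f"alpha = {alpha:.3f} +/- {err:.3f}")
```

Comparing a power law against alternatives such as lognormal or exponential forms would additionally require fitting each candidate and comparing likelihoods, which this sketch does not attempt.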


