zipf law
Recently Published Documents


TOTAL DOCUMENTS

50
(FIVE YEARS 7)

H-INDEX

9
(FIVE YEARS 1)

Entropy ◽  
2021 ◽  
Vol 23 (9) ◽  
pp. 1148
Author(s):  
Łukasz Dębowski

We present a hypothetical argument against finite-state processes in statistical language modeling that is based on semantics rather than syntax. In this theoretical model, we suppose that the semantic properties of texts in a natural language could be approximately captured by a recently introduced concept of a perigraphic process. Perigraphic processes are a class of stochastic processes that satisfy a Zipf-law accumulation of a subset of factual knowledge, which is time-independent, compressed, and effectively inferrable from the process. We show that the classes of finite-state processes and of perigraphic processes are disjoint, and we present a new simple example of perigraphic processes over a finite alphabet called Oracle processes. The disjointness result makes use of the Hilberg condition, i.e., the almost sure power-law growth of algorithmic mutual information. Using a strongly consistent estimator of the number of hidden states, we show that finite-state processes do not satisfy the Hilberg condition whereas Oracle processes satisfy the Hilberg condition via the data-processing inequality. We discuss the relevance of these mathematical results for theoretical and computational linguistics.


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Iqrar Ansari ◽  
Khuram Ali Khan ◽  
Ammara Nosheen ◽  
Ðilda Pečarić ◽  
Josip Pečarić

The main purpose of this paper is to obtain some time-scale inequalities for different divergences and distances by using the weighted Jensen's inequality on time scales. These results offer new inequalities in h-discrete calculus and quantum calculus and extend some known results in the literature. Lower bounds for some divergence measures are also presented. Moreover, the obtained discrete results are given in the light of the Zipf–Mandelbrot law and the Zipf law.
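
For reference, the Zipf–Mandelbrot law is commonly written as the probability mass function below. The symbols $N$, $q$, $s$ and $H_{N,q,s}$ are notational assumptions made here; they are not taken from the abstract.

```latex
f(i; N, q, s) = \frac{1}{(i + q)^{s} \, H_{N,q,s}},
\qquad
H_{N,q,s} = \sum_{k=1}^{N} \frac{1}{(k + q)^{s}},
\qquad i = 1, \dots, N,
```

with $N \in \{1, 2, \dots\}$, $q \ge 0$ and $s > 0$. Setting $q = 0$ gives the Zipf law as a special case.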


Author(s):  
Yizhen Wu ◽  
Mingyue Jiang ◽  
Zhijian Chang ◽  
Yuanqing Li ◽  
Kaifang Shi

Whether urban development in China satisfies Zipf's law across different scales is still unclear. This study explores the question using nighttime light data from the National Polar-Orbiting Partnership's Visible Infrared Imaging Radiometer Suite (NPP-VIIRS). First, the NPP-VIIRS data were corrected. Then, based on the Zipf law model, the corrected NPP-VIIRS data were used to evaluate China's urban development at multiple scales. The results showed that the corrected NPP-VIIRS data could effectively reflect the state of urban development in China. Additionally, the Zipf index (q) values, which express the degree of urban development, decreased overall from 2012 to 2018 in all provinces, prefectures, and counties. Since the value of q was relatively close to 1 with an R² value > 0.70, the development of the provinces and prefectures was close to the ideal Zipf's law state. In all counties, q > 1 with an R² value > 0.70, which showed that the primate county had a relatively stronger monopoly capacity. Where q < 1 and declined continuously, in the top 2000 counties, the top 250 prefectures, and the top 20 provinces in equilibrium, there was little difference in the scale of development at the multiscale level, with an R² > 0.90. The results enrich our understanding of urban development in terms of Zipf's law and have valuable implications for relevant decision-makers and stakeholders.
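
A Zipf index and its R² are often estimated by an ordinary least-squares fit on the log-log rank-size plot. The sketch below illustrates that generic technique on synthetic data; the function name `zipf_index` and the sample sizes are hypothetical, and this is not claimed to be the authors' exact procedure.

```python
import math

def zipf_index(sizes):
    """OLS fit of ln(size) = ln(C) - q * ln(rank); returns (q, r_squared)."""
    ranked = sorted((s for s in sizes if s > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive sizes")
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(s) for s in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # For a simple linear fit, R^2 equals the squared correlation coefficient.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return -slope, r_squared

# Synthetic sizes that follow size = 1000 / rank, i.e. an ideal Zipf state.
sizes = [1000.0 / rank for rank in range(1, 51)]
q, r_squared = zipf_index(sizes)
print(f"q = {q:.3f}, R^2 = {r_squared:.3f}")
```

A value of q near 1 with a high R² corresponds to the "ideal Zipf's law state" mentioned in the abstract; q above 1 indicates that the largest units dominate.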


2020 ◽  
Vol 24 ◽  
pp. 275-293
Author(s):  
Aristides V. Doumas ◽  
Vassilis G. Papanicolaou

The origin of power-law behavior (also known variously as Zipf's law) has been a topic of debate in the scientific community for more than a century. Power laws appear widely in physics, biology, earth and planetary sciences, economics and finance, computer science, demography and the social sciences. In a highly cited article, Mark Newman [Contemp. Phys. 46 (2005) 323–351] reviewed some of the empirical evidence for the existence of power-law forms, but underscored that even though many distributions do not follow a power law, many of the quantities that scientists measure are often close to a Zipf law, and hence are of importance. In this paper we engage a variant of Zipf's law with a general urn problem. A collector wishes to collect $m$ complete sets of $N$ distinct coupons. The draws from the population are considered to be independent and identically distributed with replacement, and the probability that a type-$j$ coupon is drawn is denoted by $p_j$, $j = 1, 2, \ldots, N$. Let $T_m(N)$ be the number of trials needed for this problem. We present the asymptotics for the expectation (five terms plus an error), the second rising moment (six terms plus an error), and the variance of $T_m(N)$ (leading term) as $N \to \infty$, when $p_j = a_j / \sum_{j=2}^{N+1} a_j$, where $a_j = (\ln j)^{-p}$, $p > 0$. Moreover, we prove that $T_m(N)$ (appropriately normalized) converges in distribution to a Gumbel random variable. These "log-Zipf" classes of coupon probabilities are not covered by the existing literature, and the present paper fills this gap. In the spirit of a recent paper of ours [ESAIM: PS 20 (2016) 367–399], we enlarge the classes for which the Dixie cup problem is solved with respect to its moments, variance, and distribution.
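
The urn problem can be explored numerically with a small Monte Carlo sketch. It assumes the weights are indexed by j = 2, ..., N+1 (matching the bounds of the normalizing sum, so that ln j is never zero); the function names and the chosen values of N, m and p are hypothetical. It only estimates a sample mean and does not reproduce the paper's asymptotic results.

```python
import itertools
import math
import random

def log_zipf_probs(n, p):
    """Probabilities proportional to (ln j)^(-p) for j = 2..n+1 (n coupon types)."""
    weights = [math.log(j) ** (-p) for j in range(2, n + 2)]
    total = sum(weights)
    return [w / total for w in weights]

def trials_for_m_sets(probs, m, rng):
    """Draw i.i.d. coupons until every type has been drawn at least m times."""
    cum_weights = list(itertools.accumulate(probs))
    types = range(len(probs))
    counts = [0] * len(probs)
    remaining = len(probs)  # number of types drawn fewer than m times so far
    trials = 0
    while remaining > 0:
        j = rng.choices(types, cum_weights=cum_weights)[0]
        trials += 1
        counts[j] += 1
        if counts[j] == m:
            remaining -= 1
    return trials

# Estimate the mean number of trials for m = 2 sets of N = 20 coupon types.
rng = random.Random(0)
probs = log_zipf_probs(20, 1.0)
samples = [trials_for_m_sets(probs, 2, rng) for _ in range(200)]
print("sample mean of T_m(N):", sum(samples) / len(samples))
```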


2018 ◽  
Vol 7 (3) ◽  
pp. 1558
Author(s):  
S Lakshmisridevi ◽  
R Devanathan

The application of Zipf's law is universal, not only in linguistics but also in various other areas. Mandelbrot modified the Zipf law into the Zipf–Mandelbrot (ZM) law, and we propose a further modification of the ZM law for modeling rank-frequency data of linguistic text. Our model generalizes the ZM law into a linear regression model involving an arbitrary order of the Zipfian rank of words in a text. The performance of the proposed model is studied for an English text and is shown to compare favorably with that of the ZM law using the chi-square goodness-of-fit test. In this paper we also apply the model to a Tamil text, where its performance is likewise satisfactory, as confirmed by the chi-square test. Because the model mainly addresses the lower ranks, we propose to extend the work to higher-order ranks using the LNRE model in the future.
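
The baseline comparison described here, a ZM law checked with a chi-square goodness-of-fit statistic, can be sketched as follows. The parameter names `a` and `b`, the function names and the counts are hypothetical, and the authors' generalized regression model is not reproduced.

```python
def zipf_mandelbrot_pmf(n_ranks, a, b):
    """ZM probabilities proportional to (rank + b)^(-a) for rank = 1..n_ranks."""
    weights = [(rank + b) ** (-a) for rank in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def chi_square_statistic(observed, pmf):
    """Pearson statistic: sum of (observed - expected)^2 / expected over ranks."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, pmf))

# Hypothetical word counts for the five most frequent words of a text.
observed = [480, 260, 170, 120, 95]
pmf = zipf_mandelbrot_pmf(len(observed), a=1.0, b=0.2)
print("chi-square statistic:", chi_square_statistic(observed, pmf))
```

A smaller statistic indicates a closer fit. Turning it into a p-value additionally requires the degrees of freedom, which depend on how many parameters were estimated from the data.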


2018 ◽  
Vol 382 (22) ◽  
pp. 1456-1459 ◽  
Author(s):  
Toshiya Ohtsuki ◽  
Satoshi Tanimoto ◽  
Makoto Sekiyama ◽  
Akihiro Fujihara ◽  
Hiroshi Yamamoto

2018 ◽  
Vol 14 (1) ◽  
pp. 1-34 ◽  
Author(s):  
Alexander Koplenig

Using the Google Ngram Corpora for six different languages (including two varieties of English), a large-scale time series analysis is conducted. It is demonstrated that diachronic changes of the parameters of the Zipf–Mandelbrot law (and the parameter of the Zipf law, all estimated by maximum likelihood) can be used to quantify and visualize important aspects of linguistic change (as represented in the Google Ngram Corpora). The analysis also reveals that there are important cross-linguistic differences. It is argued that the Zipf–Mandelbrot parameters can be used as a first indicator of diachronic linguistic change, but more thorough analyses should make use of the full spectrum of different lexical, syntactical and stylometric measures to fully understand the factors that actually drive those changes.
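
Maximum-likelihood estimation of a Zipf parameter can be sketched for the simplest case, a Zipf law truncated to the observed ranks. The function name `zipf_mle`, the search interval and the synthetic counts are assumptions made for illustration; the paper's own estimation setup may differ.

```python
import math

def zipf_mle(counts, lo=0.0, hi=10.0, tol=1e-10):
    """MLE of s for P(rank = r) proportional to r^(-s); counts[r - 1] is the frequency of rank r."""
    n = sum(counts)
    logs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    mean_log = sum(c * l for c, l in zip(counts, logs)) / n

    def expected_log(s):
        # E[ln rank] under the truncated Zipf law with exponent s.
        weights = [math.exp(-s * l) for l in logs]
        return sum(w * l for w, l in zip(weights, logs)) / sum(weights)

    # The score is n * (expected_log(s) - mean_log) and decreases in s,
    # so its root can be found by bisection on [lo, hi].
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_log(mid) > mean_log:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Synthetic frequencies that follow rank^(-1.5) exactly.
counts = [1e6 * rank ** -1.5 for rank in range(1, 201)]
print("estimated exponent:", round(zipf_mle(counts), 4))
```

Running the estimator on the counts of each time slice of a corpus yields a parameter series over time, which is the kind of diachronic signal the abstract describes.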

