Benford's Law Geometry

Author(s):  
Lawrence Leemis

This chapter switches from the traditional analysis of Benford's law using data sets to a search for probability distributions that obey Benford's law. It begins by briefly discussing the origins of Benford's law through the independent efforts of Simon Newcomb (1835–1909) and Frank Benford, Jr. (1883–1948), both of whom made their discoveries through empirical data. Although Benford's law applies to a wide variety of data sets, none of the popular parametric distributions, such as the exponential and normal distributions, agree exactly with Benford's law. The chapter thus highlights the failures of several of these well-known probability distributions in conforming to Benford's law, considers what types of probability distributions might produce data that obey Benford's law, and looks at some of the geometry associated with these probability distributions.

2004 ◽  
Vol 33 (1) ◽  
pp. 229-246 ◽  
Author(s):  
Christina Lynn Geyer ◽  
Patricia Pepple Williamson

2021 ◽  
Vol 16 (1) ◽  
pp. 73-79
Author(s):  
Vitor Hugo Moreau

Reporting of daily new cases and deaths from COVID-19 is one of the main tools for understanding and managing the pandemic. However, governments and health authorities worldwide follow divergent procedures when registering and reporting their data. Most of the bias in those procedures is driven by economic and political pressures and may lead to intentional or unintentional data corruption, which can mask crucial information. Benford’s law is a statistical phenomenon that is extensively used to detect data corruption in large data sets. Here, we used Benford’s law to screen for and detect inconsistencies in data on daily new cases of COVID-19 reported by 80 countries. Data from 26 countries display severe nonconformity to Benford’s law (p < 0.01), which may suggest data corruption or manipulation.
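The abstract does not say which conformity test produced the reported p-values. As a minimal sketch of one common choice, assuming a chi-square goodness-of-fit test on first digits, the Python below compares observed first-digit counts with the Benford expectation. The function names and the two toy series are illustrative only and are not taken from the paper.

```python
import math
from collections import Counter

# Benford first-digit probabilities: P(d) = log10(1 + 1/d), d = 1..9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Chi-square critical value for 8 degrees of freedom at the 0.01 level.
CHI2_CRIT_DF8_P01 = 20.09


def first_digit(n):
    """Leading decimal digit of a positive integer count."""
    return int(str(n)[0])


def benford_chi2(counts):
    """Chi-square statistic of first digits against Benford's law.

    Zero and negative entries are skipped because they have no
    leading significant digit.
    """
    digits = [first_digit(c) for c in counts if c > 0]
    n = len(digits)
    if n == 0:
        raise ValueError("no positive values to test")
    observed = Counter(digits)
    return sum(
        (observed.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in BENFORD.items()
    )


def is_nonconforming(counts):
    """True when the statistic exceeds the p < 0.01 critical value."""
    return benford_chi2(counts) > CHI2_CRIT_DF8_P01


# Toy inputs (not real case counts): powers of two track Benford's law
# closely, while the integers 100..999 have uniform first digits.
benford_like = [2 ** k for k in range(100)]
uniform_like = list(range(100, 1000))
print(is_nonconforming(benford_like), is_nonconforming(uniform_like))
```

A chi-square test becomes very sensitive for long series, so a screening study may prefer a different statistic; the sketch only shows the mechanics of a first-digit comparison.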


Author(s):  
Arno Berger ◽  
Theodore P. Hill

In order to translate the informal versions of Benford's law into more precise formal statements, it is necessary to specify exactly what the Benford property means in various mathematical contexts. For the purpose of this book, the objects of interest fall mainly into three categories: sequences of real numbers, real-valued functions defined on [0, +∞), and probability distributions and random variables. This chapter defines Benford sequences, functions, and random variables, with examples of each.


2009 ◽  
Vol 28 (2) ◽  
pp. 305-324 ◽  
Author(s):  
Mark J. Nigrini ◽  
Steven J. Miller

SUMMARY: Auditors are required to use analytical procedures to identify the existence of unusual transactions, events, and trends. Benford's Law gives the expected patterns of the digits in numerical data, and has been advocated as a test for the authenticity and reliability of transaction level accounting data. This paper describes a new second-order test that calculates the digit frequencies of the differences between the ordered (ranked) values in a data set. These digit frequencies approximate the frequencies of Benford's Law for most data sets. The second-order test is applied to four sets of transactional data. The second-order test detected errors in data downloads, rounded data, data generated by statistical procedures, and the inaccurate ordering of data. The test can be applied to any data set and nonconformity usually signals an unusual issue related to data integrity that might not have been easily detectable using traditional analytical procedures.
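The core computation described above (sort the values, take the differences between adjacent ranked values, tabulate their leading digits) can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: it assumes first digits only, drops zero differences from ties, and uses hypothetical function names.

```python
from collections import Counter


def leading_digit(x):
    """Leading significant decimal digit of a positive number."""
    # Scientific notation with 17 significant digits always starts
    # with the first significant digit, e.g. 0.042 -> "4.2...e-02".
    return int(f"{x:.16e}"[0])


def second_order_frequencies(values):
    """First-digit frequencies of differences between ranked values.

    Tied neighbours produce a zero difference, which has no leading
    digit and is therefore skipped (an assumption of this sketch).
    """
    ordered = sorted(values)
    diffs = [b - a for a, b in zip(ordered, ordered[1:]) if b > a]
    if not diffs:
        raise ValueError("need at least two distinct values")
    counts = Counter(leading_digit(d) for d in diffs)
    return {d: counts.get(d, 0) / len(diffs) for d in range(1, 10)}


# Small worked example: sorted values 1, 2, 4, 7, 7, 20 give the
# differences 1, 2, 3, 0, 13; the zero is dropped.
freqs = second_order_frequencies([7, 1, 20, 4, 2, 7])
print(freqs[1], freqs[2], freqs[3])  # 0.5 0.25 0.25
```

The resulting frequencies would then be compared with the Benford proportions, for example with a goodness-of-fit statistic.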


2021 ◽  
Vol 3 ◽  
pp. 29
Author(s):  
Daniel McCarville

Benford’s Law is an empirical observation about the frequency of digits in a variety of naturally occurring data sets. Auditors and forensic scientists have used Benford’s Law to detect erroneous data in accounting and legal settings. One well-known limitation is that Benford’s Law fails when data have clear minimum and maximum values. Many kinds of education data, including assessment scores, typically include hard maximums and therefore do not meet the parametric assumptions of Benford’s Law. This paper implements a transformation procedure that allows assessment data to be compared to Benford’s Law. As a case study, a data quality assessment of oral language scores from the Early Childhood Longitudinal Study, Kindergarten (ECLS-K) is presented, and higher-risk data segments are detected. The same method could be used to evaluate other concerns, such as test fraud, or other bounded datasets.


Author(s):  
Susan D'Agostino

“Act natural, because of Benford’s Law” explains how and why large data sets generated as a result of human behavior concerning health records, population counts, tax returns, stock prices, national debts, election data, and more, have numbers whose first digits are unevenly distributed, with Benford’s Law offering percentages. When an individual tampers with a naturally generated data set, they often introduce fake numbers whose first digits are (more or less) evenly distributed from one to nine. Often, a subsequent investigation reveals that someone has tampered with the data set. Mathematics students and enthusiasts are encouraged to act natural so as to avoid looking like a fraudulent data set that does not observe Benford’s Law. At the chapter’s end, readers may check their understanding by working on a problem. A solution is provided.


Author(s):  
Arno Berger ◽  
Theodore P. Hill

This chapter establishes and illustrates three basic invariance properties of the Benford distribution that are instrumental in demonstrating whether or not certain datasets are Benford, and that also prove helpful for predicting which empirical data are likely to follow Benford's law closely. These are the scale-invariance property, base-invariance property, and sum-invariance property.
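Scale invariance can be illustrated numerically. The Python sketch below builds a sample whose base-10 logarithms are evenly spaced over one decade (a standard way to obtain Benford-distributed leading digits), multiplies it by an arbitrary constant, and compares the first-digit frequencies before and after. The grid size, the constant 7.3, and the function names are assumptions chosen for illustration, not taken from the chapter.

```python
import math


def leading_digit(x):
    """Leading significant decimal digit of a positive number."""
    return int(f"{x:.16e}"[0])


def digit_frequencies(values):
    """Relative frequency of each leading digit 1..9, as a list."""
    counts = [0] * 9
    for v in values:
        counts[leading_digit(v) - 1] += 1
    return [c / len(values) for c in counts]


def log_uniform_grid(n):
    """n points whose base-10 logarithms are evenly spaced on [0, 1)."""
    return [10 ** (k / n) for k in range(n)]


benford = [math.log10(1 + 1 / d) for d in range(1, 10)]
sample = log_uniform_grid(10_000)
rescaled = [7.3 * x for x in sample]  # arbitrary change of units

before = digit_frequencies(sample)
after = digit_frequencies(rescaled)

# Largest gap between the two empirical distributions, and between
# the rescaled sample and the exact Benford proportions.
print(max(abs(a - b) for a, b in zip(before, after)))
print(max(abs(a - p) for a, p in zip(after, benford)))
```

Both printed gaps should be small, reflecting that a change of scale only shifts the logarithms modulo 1 and leaves the leading-digit proportions essentially unchanged.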


Author(s):  
David Hoyle

This chapter focuses on the occurrence of Benford's law within the natural sciences, emphasizing that Benford's law is to be expected within many scientific data sets. This is a consequence of the reasonable assumption that a particular scientific process is scale invariant, or nearly scale invariant. The chapter reviews previous work from many fields showing a number of data sets that conform to Benford's law. In each case the underlying scale invariance, or mechanism that leads to scale invariance, is identified. Having established that Benford's law is to be expected for many data sets in the natural sciences, the second half of the chapter highlights generic potential applications of Benford's law. Finally, direct applications of Benford's law are highlighted, whereby the Benford distribution is used in a constructive way rather than simply assessing an already existing data set.

