Extended power-law scaling of heavy-tailed random air-permeability fields in fractured and sedimentary rocks

2012
Vol 16 (9)
pp. 3249-3260
Author(s):
A. Guadagnini
M. Riva
S. P. Neuman

Abstract. We analyze the scaling behaviors of two field-scale log permeability data sets showing heavy-tailed frequency distributions in three and two spatial dimensions, respectively. One set consists of 1 m scale pneumatic packer test data from six vertical and inclined boreholes spanning a decameter-scale block of unsaturated fractured tuffs near Superior, Arizona; the other consists of pneumatic minipermeameter data measured at a spacing of 15 cm along three horizontal transects on a 21 m long and 6 m high outcrop of the Upper Cretaceous Straight Cliffs Formation, including lower-shoreface bioturbated and cross-bedded sandstone, near Escalante, Utah. Order-q sample structure functions of each data set scale as a power ξ(q) of separation scale or lag, s, over limited ranges of s. A procedure known as extended self-similarity (ESS) extends this range to all lags and yields a nonlinear (concave) functional relationship between ξ(q) and q. Whereas the literature tends to associate extended and nonlinear power-law scaling with multifractals or fractional Laplace motions, we have shown elsewhere that (a) ESS of data having a normal frequency distribution is theoretically consistent with (Gaussian) truncated (additive, self-affine, monofractal) fractional Brownian motion (tfBm), the latter being unique in predicting a breakdown in power-law scaling at small and large lags, and (b) nonlinear power-law scaling of data having either normal or heavy-tailed frequency distributions is consistent with samples from sub-Gaussian random fields or processes subordinated to tfBm or truncated fractional Gaussian noise (tfGn), stemming from a lack of ergodicity which causes sample moments to scale differently than do their ensemble counterparts. Here we (i) demonstrate that the above two data sets are consistent with sub-Gaussian random fields subordinated to tfBm or tfGn and (ii) provide maximum likelihood estimates of parameters characterizing the corresponding Lévy stable subordinators and tfBm or tfGn functions.
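As an illustration of the structure-function analysis described in this abstract, the following Python sketch computes order-q sample structure functions on a synthetic stand-in transect and compares direct power-law fits of ξ(q) with ESS fits, whose slopes estimate ξ(q)/ξ(2). The data, lag range, and function names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def structure_function(y, q, lags):
    """Order-q sample structure function S(q, s) = mean(|y(x+s) - y(x)|^q)."""
    return np.array([np.mean(np.abs(y[s:] - y[:-s]) ** q) for s in lags])

rng = np.random.default_rng(0)
y = np.cumsum(rng.standard_normal(2000))   # synthetic stand-in "log k" transect
lags = np.arange(1, 64)
qs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
S = {q: structure_function(y, q, lags) for q in qs}

# Direct scaling: slope of log S(q, s) versus log s over the sampled lag range
xi_direct = {q: np.polyfit(np.log(lags), np.log(S[q]), 1)[0] for q in qs}

# ESS: regress log S(q, s) on log S(2, s); the slope estimates xi(q)/xi(2),
# typically yielding straight lines over a much wider range of lags
xi_ess = {q: np.polyfit(np.log(S[2.0]), np.log(S[q]), 1)[0] for q in qs}

for q in qs:
    print(f"q={q:3.1f}  xi(q)={xi_direct[q]:5.2f}  xi(q)/xi(2)={xi_ess[q]:5.2f}")
```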

2012
Vol 9 (6)
pp. 7379-7413
Author(s):
A. Guadagnini
M. Riva
S. P. Neuman

Abstract. We analyze the scaling behaviors of two log permeability data sets showing heavy-tailed frequency distributions in three and two spatial dimensions, respectively. One set consists of 1 m scale pneumatic packer test data from six vertical and inclined boreholes spanning a decameter-scale block of unsaturated fractured tuffs near Superior, Arizona; the other consists of pneumatic minipermeameter data measured at a spacing of 15 cm along two horizontal transects on a 21 m long outcrop of lower-shoreface bioturbated sandstone near Escalante, Utah. Order-q sample structure functions of each data set scale as a power ξ(q) of separation scale or lag, s, over limited ranges of s. A procedure known as extended self-similarity (ESS) extends this range to all lags and yields a nonlinear (concave) functional relationship between ξ(q) and q. Whereas the literature tends to associate extended and nonlinear power-law scaling with multifractals or fractional Laplace motions, we have shown elsewhere that (a) ESS of data having a normal frequency distribution is theoretically consistent with (Gaussian) truncated (additive, self-affine, monofractal) fractional Brownian motion (tfBm), the latter being unique in predicting a breakdown in power-law scaling at small and large lags, and (b) nonlinear power-law scaling of data having either normal or heavy-tailed frequency distributions is consistent with samples from sub-Gaussian random fields or processes subordinated to tfBm, stemming from a lack of ergodicity which causes sample moments to scale differently than do their ensemble counterparts. Here we (i) demonstrate that the above two data sets are consistent with sub-Gaussian random fields subordinated to tfBm and (ii) provide maximum likelihood estimates of parameters characterizing the corresponding Lévy stable subordinators and tfBm functions.
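The sub-Gaussian subordination invoked in both versions of this abstract can be illustrated with the classical construction Y = W^(1/2) G, where G is Gaussian and W is a totally skewed positive Lévy stable subordinator (Samorodnitsky and Taqqu). The sketch below is a minimal, assumption-laden illustration: ordinary Brownian-motion increments stand in for the truncated fBm/fGn of the paper, and scipy's levy_stable supplies the subordinator.

```python
import numpy as np
from scipy.stats import levy_stable

rng = np.random.default_rng(1)
alpha = 1.6   # Levy stable index of the target field, 0 < alpha < 2
n = 1000

# Totally skewed positive stable subordinator W; with this scale factor,
# sqrt(W) * G is symmetric alpha-stable in the classical construction.
scale = np.cos(np.pi * alpha / 4.0) ** (2.0 / alpha)
W = levy_stable.rvs(alpha / 2.0, 1.0, loc=0.0, scale=scale, random_state=rng)

# Stand-in Gaussian process: cumulative white noise (Brownian motion, H = 0.5)
# in place of the truncated fBm/fGn used in the paper.
G = np.cumsum(rng.standard_normal(n))

Y = np.sqrt(W) * G   # one sub-Gaussian sample sharing a single subordinator
```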


Plant Disease
2006
Vol 90 (11)
pp. 1433-1440
Author(s):
David H. Gent
Walter F. Mahaffee
William W. Turechek

The spatial heterogeneity of the incidence of hop cones with powdery mildew (Podosphaera macularis) was characterized from transect surveys of 41 commercial hop yards in Oregon and Washington from 2000 to 2005. The proportion of sampled cones with powdery mildew (p) was recorded for each of 221 transects, where N = 60 sampling units of n = 25 cones were assessed in each transect according to a cluster sampling strategy. Disease incidence ranged from 0 to 0.92 among all yards and dates. The binomial and beta-binomial frequency distributions were fit to the N sampling units in a transect using maximum likelihood. The estimation procedure converged for 74% of the data sets where p > 0, and a log-likelihood ratio test indicated that the beta-binomial distribution provided a better fit to the data than the binomial distribution for 46% of the data sets, indicating an aggregated pattern of disease. Similarly, the C(α) test indicated that 54% could be described by the beta-binomial distribution. The heterogeneity parameter of the beta-binomial distribution, θ, a measure of variation among sampling units, ranged from 0.01 to 0.20, with a mean of 0.037 and a median of 0.015. Estimates of the index of dispersion ranged from 0.79 to 7.78, with a mean of 1.81 and a median of 1.37, and were significantly greater than 1 for 54% of the data sets. The binary power law provided an excellent fit to the data, with slope and intercept parameters significantly greater than 1, which indicated that heterogeneity varied systematically with the incidence of infected cones. A covariance analysis indicated that the geographic location (region) of the yards and the type of hop cultivar had little effect on heterogeneity; however, the year of sampling significantly influenced the intercept and slope parameters of the binary power law. Significant spatial autocorrelation was detected in only 11% of the data sets, with estimates of first-order autocorrelation, r1, ranging from -0.30 to 0.70, with a mean of 0.06 and a median of 0.04; correlation was detected in only 20 and 16% of the data sets by median and ordinary runs analysis, respectively. Together, these analyses suggest that the incidence of powdery mildew on cones was slightly aggregated among plants, but patterns of aggregation larger than the sampling unit were rare (20% or less of data sets). Knowledge of the heterogeneity of diseased cones was used to construct fixed sampling curves to precisely estimate the incidence of powdery mildew on cones at varying disease intensities. Use of the sampling curves developed in this research should help to improve sampling methods for disease assessment and management decisions.
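A minimal sketch of the transect-level fitting step follows: beta-binomial maximum likelihood for N = 60 sampling units of n = 25 cones, then the heterogeneity parameter θ = 1/(a+b) and the index of dispersion. Synthetic counts stand in for the survey data, and the parameterization is one common convention rather than the authors' exact code.

```python
import numpy as np
from scipy import optimize, stats

# One synthetic transect: N = 60 sampling units of n = 25 cones each
# (stand-in counts, not the survey data)
rng = np.random.default_rng(2)
n, N = 25, 60
counts = stats.betabinom.rvs(n, 2.0, 18.0, size=N, random_state=rng)

def negloglik(params):
    a, b = np.exp(params)   # log-parameterization keeps a, b > 0
    return -np.sum(stats.betabinom.logpmf(counts, n, a, b))

res = optimize.minimize(negloglik, x0=np.log([1.0, 1.0]), method="Nelder-Mead")
a_hat, b_hat = np.exp(res.x)

theta = 1.0 / (a_hat + b_hat)   # beta-binomial heterogeneity parameter
p_hat = counts.mean() / n
D = counts.var(ddof=1) / (n * p_hat * (1.0 - p_hat))   # index of dispersion

print(f"theta = {theta:.3f}, index of dispersion D = {D:.2f}")
```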


1997
Vol 20 (3)
pp. 529-547
Author(s):
Bettina Hosenfeld
Han L.J. van der Maas
Dymphna C. van den Boom

This paper reports on modelling six frequency distributions representing the analogical reasoning performance of four different samples of elementary schoolchildren. A two-component model outperformed a one-component model in all investigated data sets, discriminating accurate performers with high success probabilities and inaccurate performers with low success probabilities, whereas for two data sets a three-component model provided the best fit. In a treatment-control group data set, the treatment group comprised a larger proportion of accurate performers than the control group, whereas the success probabilities of the two latent classes were nearly identical in both groups. In a repeated-measures data set, both the success probabilities of the two latent classes and the proportion of accurate performers increased from the first to the second test session. The results provided a first indication of a transition in the development of analogical reasoning in elementary schoolchildren.
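The latent-class comparison reported here can be illustrated with a two-component binomial mixture fitted by EM. The sketch below is a generic illustration under stated assumptions (synthetic item scores, a binomial success model per latent class), not the authors' estimation code.

```python
import numpy as np

def em_binomial_mixture(k, n, n_components=2, n_iter=200, seed=0):
    """EM for a mixture of binomials (k correct answers out of n items)."""
    rng = np.random.default_rng(seed)
    w = np.full(n_components, 1.0 / n_components)   # mixing proportions
    p = rng.uniform(0.2, 0.8, n_components)         # per-class success probabilities
    for _ in range(n_iter):
        # E-step: class responsibilities; the binomial coefficient is the
        # same for every class, so it cancels in the normalization
        logpmf = k[:, None] * np.log(p) + (n - k)[:, None] * np.log1p(-p)
        r = w * np.exp(logpmf - logpmf.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update proportions and success probabilities
        w = r.mean(axis=0)
        p = (r * k[:, None]).sum(axis=0) / (r.sum(axis=0) * n)
    return w, p

# Synthetic stand-in: 200 children, 20 items, two latent classes
rng = np.random.default_rng(1)
accurate = rng.random(200) < 0.6
k = np.where(accurate, rng.binomial(20, 0.85, 200), rng.binomial(20, 0.35, 200))
print(em_binomial_mixture(k, 20))
```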


1984
Vol 15 (3)
pp. 155-161
Author(s):  
C. Firer

In this article the concept of never-buyers of consumer non-durables is discussed. The traditional Negative Binomial Distribution approach of Ehrenberg to the question is presented. Previously unpublished work carried out at the Graduate School of Business Administration, University of the Witwatersrand, is reviewed, and hypotheses are put forward that the observed large zero cell in the purchase frequency distributions may be caused by the existence of a group of never-buyers of the product, or by the superimposition of at least two distinct buying populations, previously identified as brand-loyal and multibrand/brand-switching households. The results of the research aimed at testing the first hypothesis are presented here. Two carefully monitored data sets were modelled using zero-augmented Negative Binomial and Sichel distributions. The data were previously shown to exhibit the necessary stationarity of mean household purchase/consumption. Individual brands in one data set (purchases of toilet soap) were shown to follow the predictions of the traditional theory: the proportion of non-buyers decreased with time. In the second data set (consumption of packaged soup) the proportion of non-consumers of the brands fell towards zero as the length of the time period studied was increased, but at a rate faster than that predicted by the theory. The hypothesis of the existence of never-buyers/users of individual brands in these two product classes was therefore rejected.
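The zero-augmented fitting step can be sketched for the Negative Binomial case (the Sichel variant is omitted). The parameterization and the synthetic purchase frequencies below are illustrative assumptions; the model simply places extra probability mass π at zero on top of an ordinary NBD.

```python
import numpy as np
from scipy import optimize, stats

def zanb_negloglik(params, x):
    """Zero-augmented NBD: extra probability mass pi at zero on top of NB(r, q)."""
    pi = 1.0 / (1.0 + np.exp(-params[0]))   # logit keeps pi in (0, 1)
    r = np.exp(params[1])                   # log keeps r > 0
    q = 1.0 / (1.0 + np.exp(-params[2]))    # logit keeps q in (0, 1)
    ll = np.where(x == 0,
                  np.log(pi + (1.0 - pi) * stats.nbinom.pmf(0, r, q)),
                  np.log(1.0 - pi) + stats.nbinom.logpmf(x, r, q))
    return -ll.sum()

# Synthetic stand-in for one period of household purchase frequencies:
# 300 never-buyers plus 700 NBD buyers
rng = np.random.default_rng(3)
x = np.concatenate([np.zeros(300, dtype=int),
                    stats.nbinom.rvs(0.8, 0.3, size=700, random_state=rng)])

res = optimize.minimize(zanb_negloglik, x0=[0.0, 0.0, 0.0], args=(x,),
                        method="Nelder-Mead")
pi_hat = 1.0 / (1.0 + np.exp(-res.x[0]))
print(f"estimated never-buyer proportion: {pi_hat:.2f}")
```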


Author(s):  
Drew Levin
Patrick Finley

Objective: To develop a spatially accurate biosurveillance synthetic data generator for the testing, evaluation, and comparison of new outbreak detection techniques.

Introduction: Development of new methods for the rapid detection of emerging disease outbreaks is a research priority in the field of biosurveillance. Because real-world data are often proprietary in nature, scientists must utilize synthetic data generation methods to evaluate new detection methodologies. Colizza et al. have shown that epidemic spread is dependent on the airline transportation network [1], yet current data generators do not operate over network structures. Here we present a new spatial data generator that models the spread of contagion across a network of cities connected by airline routes. The generator is developed in the R programming language and produces data compatible with the popular `surveillance' software package.

Methods: Colizza et al. demonstrate the power-law relationships between city population, air traffic, and degree distribution [1]. We generate a transportation network as a Chung-Lu random graph [2] that preserves these scale-free relationships (Figure 1). First, given a power-law exponent and a desired number of cities, a probability mass function (PMF) is generated that mirrors the expected degree distribution for the given power-law relationship. Values are then sampled from this PMF to generate an expected degree (number of connected cities) for each city in the network. Edges (airline connections) are added to the network probabilistically as described in [2]. Unconnected graph components are each joined to the largest component using linear preferential attachment. Finally, city sizes are calculated based on an observed three-quarter power-law scaling relationship with the sampled degree distribution. Each city is represented as a customizable stochastic compartmental SIR model. Transportation between cities is modeled similarly to [2]. An infection is initialized in a single random city and infection counts are recorded in each city for a fixed period of time. A consistent fraction of the modeled infection cases are recorded as daily clinic visits. These counts are then added onto statically generated baseline data for each city to produce a full synthetic data set. Alternatively, data sets can be generated using real-world networks, such as the one maintained by the International Air Transport Association.

Results: Dynamics such as the number of cities, degree distribution power-law exponent, traffic flow, and disease kinetics can be customized. In the presented example (Figure 2) the outbreak spreads over a 20-city transportation network. Infection spreads rapidly once the more populated hub cities are infected. Cities that are multiple flights away from the initially infected city are infected late in the process. The generator is capable of creating data sets of arbitrary size, length, and connectivity to better mirror a diverse set of observed network types.

Conclusions: New computational methods for outbreak detection and surveillance must be compared to established approaches. Outbreak mitigation strategies require a realistic model of human transportation behavior to best evaluate impact. These actions require test data that accurately reflect the complexity of the real-world data they would be applied to. The outbreak data generated here represent the complexity of modern transportation networks and are made to be easily integrated with established software packages to allow for rapid testing and deployment.

Figure 1: Randomly generated scale-free transportation network with a power-law degree exponent of λ = 1.8. City and link sizes are scaled to reflect their weight.
Figure 2: An example of observed daily outbreak-related clinic visits across a randomly generated network of 20 cities. Each city is colored by the number of flights required to reach the city from the initial infection location. These generated counts are then added onto baseline data to create a synthetic data set for experimentation.

Keywords: Simulation; Network; Spatial; Synthetic; Data
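The paper's generator is written in R for the `surveillance' package; purely as an illustration of the network-construction step, here is a Python sketch that builds a Chung-Lu graph with power-law expected degrees via networkx, joins stray components by linear preferential attachment, and assigns populations by the three-quarter power scaling noted above. The function name and the population scale factor are assumptions.

```python
import numpy as np
import networkx as nx

def synthetic_network(n_cities=20, gamma=1.8, k_min=2, seed=0):
    """Chung-Lu graph with power-law expected degrees, one connected component."""
    rng = np.random.default_rng(seed)
    ks = np.arange(k_min, n_cities)
    pmf = ks ** (-float(gamma))
    pmf /= pmf.sum()
    w = rng.choice(ks, size=n_cities, p=pmf).astype(float)  # expected degrees
    g = nx.expected_degree_graph(w, selfloops=False)        # Chung-Lu model
    # Join stray components to the giant one by linear preferential attachment
    comps = sorted(nx.connected_components(g), key=len, reverse=True)
    giant = list(comps[0])
    deg = np.array([g.degree(v) + 1 for v in giant], dtype=float)
    for comp in comps[1:]:
        g.add_edge(next(iter(comp)), rng.choice(giant, p=deg / deg.sum()))
    # City populations from the observed three-quarter power-law scaling
    # with degree; the 1e4 scale factor is arbitrary
    pops = {v: 1e4 * (g.degree(v) + 1) ** 0.75 for v in g}
    nx.set_node_attributes(g, pops, "population")
    return g

g = synthetic_network()
print(g)
```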


2021
Author(s):
Petya Kindalova
Ioannis Kosmidis
Thomas E. Nichols

Abstract. Objectives: White matter lesions are a very common finding on MRI in older adults, and their presence increases the risk of stroke and dementia. Accurate and computationally efficient modelling methods are necessary to map the association of lesion incidence with risk factors, such as hypertension. However, there is no consensus in the brain mapping literature whether a voxel-wise modelling approach is better for binary lesion data than a more computationally intensive spatial modelling approach that accounts for voxel dependence.
Methods: We review three regression approaches for modelling binary lesion masks: mass-univariate probit regression with either maximum likelihood estimates or mean bias-reduced estimates, and spatial Bayesian modelling, where the regression coefficients have a conditional autoregressive model prior to account for local spatial dependence. We design a novel simulation framework of artificial lesion maps to compare the three alternative lesion mapping methods. The age effect on lesion probability estimated from a reference data set (13,680 individuals from the UK Biobank) is used to simulate a realistic voxel-wise distribution of lesions across age. To mimic the real features of lesion masks, we suggest matching brain lesion summaries (total lesion volume, average lesion size and lesion count) across the reference data set and the simulated data sets. Thus, we allow for a fair comparison between the modelling approaches under a realistic simulation setting.
Results: Our findings suggest that bias-reduced estimates for voxel-wise binary-response generalized linear models (GLMs) overcome the drawbacks of infinite and biased maximum likelihood estimates and scale well for large data sets because voxel-wise estimation can be performed in parallel across voxels. Contrary to the assumption of spatial dependence being key in lesion mapping, our results show that voxel-wise bias-reduction and spatial modelling result in largely similar estimates.
Conclusion: Bias-reduced estimates for voxel-wise GLMs are not only accurate but also computationally efficient, which will become increasingly important as more biobank-scale neuroimaging data sets become available.
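A sketch of the mass-univariate (voxel-wise) probit step on synthetic lesion data follows. One caveat: statsmodels provides plain maximum likelihood only, so the mean bias-reduced estimation studied in the paper (available, e.g., in the R package brglm2) is not reproduced here; the sketch mainly shows the embarrassingly parallel per-voxel structure and where ML can fail under separation.

```python
import numpy as np
from scipy.stats import norm
import statsmodels.api as sm

rng = np.random.default_rng(4)
n_subj, n_vox = 500, 200

age = rng.uniform(45, 80, n_subj)
X = sm.add_constant((age - age.mean()) / age.std())   # [intercept, age]

# Synthetic binary lesion indicators with a weak age effect at every voxel
beta = np.stack([rng.normal(-2.0, 0.3, n_vox), rng.normal(0.4, 0.1, n_vox)])
Y = (rng.random((n_subj, n_vox)) < norm.cdf(X @ beta)).astype(float)

def fit_voxel(y):
    """Plain ML probit for one voxel; NaN marks non-convergence/separation,
    the failure mode that bias-reduced estimation is designed to repair."""
    try:
        return sm.Probit(y, X).fit(disp=0).params
    except Exception:
        return np.full(X.shape[1], np.nan)

# Embarrassingly parallel across voxels (a plain loop here; joblib or
# multiprocessing would distribute the same call)
betas = np.array([fit_voxel(Y[:, v]) for v in range(n_vox)])
print("mean estimated age effect:", np.nanmean(betas[:, 1]).round(2))
```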


Fractals
2001
Vol 09 (02)
pp. 209-222
Author(s):
Stephen M. Burroughs
Sarah F. Tebbens

Power law cumulative number-size distributions are widely used to describe the scaling properties of data sets and to establish scale invariance. We derive the relationships between the scaling exponents of non-cumulative and cumulative number-size distributions for linearly binned and logarithmically binned data. Cumulative number-size distributions for data sets of many natural phenomena exhibit a "fall-off" from a power law at the largest object sizes. Previous work has often either ignored the fall-off region or described this region with a different function. We demonstrate that when a data set is abruptly truncated at large object size, fall-off from a power law is expected for the cumulative distribution. Functions to describe this fall-off are derived for both linearly and logarithmically binned data. These functions lead to a generalized function, the upper-truncated power law, that is independent of binning method. Fitting the upper-truncated power law to a cumulative number-size distribution determines the parameters of the power law, thus providing the scaling exponent of the data. Unlike previous approaches that employ alternate functions to describe the fall-off region, an upper-truncated power law describes the data set, including the fall-off, with a single function.
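To make the truncation effect concrete, the sketch below fits an upper-truncated power law of the form N(≥x) = C(x^(-α) - x_T^(-α)), which is my reading of the paper's generalized function, to cumulative counts of synthetic sizes abruptly truncated at x_T. The starting values and sample sizes are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

def upper_truncated_power_law(x, C, alpha, x_t):
    """Cumulative number of objects of size >= x under abrupt truncation at x_t."""
    return C * (x ** (-alpha) - x_t ** (-alpha))

# Synthetic sizes from a power law (alpha = 1.5), abruptly truncated at 50;
# the truncation produces the characteristic "fall-off" at large sizes
rng = np.random.default_rng(5)
sizes = (1.0 - rng.random(5000)) ** (-1.0 / 1.5)
sizes = sizes[sizes <= 50.0]

x = np.logspace(0.0, np.log10(sizes.max()), 40)
N = np.array([(sizes >= xi).sum() for xi in x])

popt, _ = curve_fit(upper_truncated_power_law, x, N,
                    p0=[sizes.size, 1.5, 60.0], maxfev=20000)
print("C = %.0f, alpha = %.2f, x_t = %.1f" % tuple(popt))
```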


Author(s):  
Samuel U. Enogwe
Chisimkwuo John
Happiness O. Obiora-Ilouno
Chrisogonus K. Onyekwere

In this paper, we propose a new lifetime distribution called the generalized weighted Rama (GWR) distribution, which extends the two-parameter Rama distribution and has the Rama distribution as a special case. The GWR distribution can model data sets that exhibit positive skewness and an upside-down bathtub-shaped hazard rate. Expressions for the mathematical and reliability properties of the GWR distribution are derived. Parameters are estimated by the method of maximum likelihood, and a simulation is performed to verify the stability of the maximum likelihood estimates of the model parameters. The asymptotic confidence intervals of the parameters of the proposed distribution are obtained. The applicability of the GWR distribution is illustrated with a real data set, and the results show that the GWR distribution is a better candidate for the data than the other competing distributions investigated.
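The simulation check on estimator stability can be sketched generically: simulate from assumed true parameters, re-fit by maximum likelihood, and summarize bias and RMSE across replicates and sample sizes. Because the abstract does not reproduce the GWR density, a Weibull stand-in is used below; only the procedure, not the distribution, mirrors the paper.

```python
import numpy as np
from scipy import optimize, stats

def mle_stability_check(true_params, n, n_reps=200, seed=6):
    """Simulate, re-fit by ML, and summarize bias and RMSE of the estimates."""
    rng = np.random.default_rng(seed)
    estimates = np.empty((n_reps, 2))
    for i in range(n_reps):
        x = stats.weibull_min.rvs(true_params[0], scale=true_params[1],
                                  size=n, random_state=rng)
        nll = lambda p: -stats.weibull_min.logpdf(
            x, np.exp(p[0]), scale=np.exp(p[1])).sum()
        estimates[i] = np.exp(optimize.minimize(
            nll, np.log(true_params), method="Nelder-Mead").x)
    bias = estimates.mean(axis=0) - true_params
    rmse = np.sqrt(((estimates - true_params) ** 2).mean(axis=0))
    return bias, rmse

true = np.array([1.5, 2.0])   # (shape, scale) of the stand-in distribution
for n in (25, 100, 400):
    bias, rmse = mle_stability_check(true, n)
    print(n, bias.round(3), rmse.round(3))   # both should shrink as n grows
```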


Complexity
2021
Vol 2021
pp. 1-14
Author(s):
Ibrahim Alkhairy
M. Nagy
Abdisalam Hassan Muse
Eslam Hussam

The purpose of this paper is to investigate a new family of distributions based on an inverse trigonometric function, the arctangent function. In actuarial science, heavy-tailed probability distributions are immensely beneficial and play an important role in modelling data sets. Actuaries seek such distributions in order to obtain an excellent fit to complex economic and actuarial data. The current research examines a popular method for generating new distributions that are excellent candidates for dealing with heavy-tailed data. The proposed family, known as the Arctan-X family of distributions, is introduced using an inverse trigonometric function. To demonstrate the strength of the approach, we study the Arctan-Weibull distribution as a special case of the developed family. To estimate the parameters of the Arctan-Weibull distribution, the frequentist approach, i.e., maximum likelihood estimation, is used. A rigorous Monte Carlo simulation analysis is used to determine the efficiency of the obtained estimators. The Arctan-Weibull model is demonstrated on a real-world insurance data set and compared to well-known two-, three-, and four-parameter competitors, among them the Weibull, Kappa, Burr-XII, and beta-Weibull distributions. Formal model-comparison criteria are used to determine whether the Arctan-Weibull distribution is more useful than the competing models.
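One common arctangent construction takes F(x) = (4/π) arctan(G(x)) with G a baseline CDF, which is a valid CDF because arctan(1) = π/4; the paper's exact generator may differ, so the Arctan-Weibull sketch below is an assumption-labeled illustration with maximum likelihood fitting on synthetic heavy-tailed "claims".

```python
import numpy as np
from scipy import optimize

# Hypothetical Arctan-Weibull, assuming F(x) = (4/pi) * arctan(G(x)) with G the
# Weibull CDF; the paper's exact construction may differ
def arctan_weibull_pdf(x, k, lam):
    G = 1.0 - np.exp(-(x / lam) ** k)                                 # Weibull CDF
    g = (k / lam) * (x / lam) ** (k - 1.0) * np.exp(-(x / lam) ** k)  # Weibull pdf
    return (4.0 / np.pi) * g / (1.0 + G ** 2)   # chain rule through arctan

def fit_mle(x):
    nll = lambda p: -np.log(arctan_weibull_pdf(x, np.exp(p[0]),
                                               np.exp(p[1]))).sum()
    res = optimize.minimize(nll, [0.0, 0.0], method="Nelder-Mead")
    return np.exp(res.x)   # (k, lambda)

# Heavy-tailed synthetic "claims" as a stand-in for the insurance data
rng = np.random.default_rng(7)
claims = rng.pareto(2.5, 500) + 1.0
print("fitted (k, lambda):", fit_mle(claims).round(2))
```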

