Parametric Families of Probability Distributions for Functional Data Using Quasi-Arithmetic Means with Archimedean Generators

Author(s):  
Etienne Cuvelier ◽  
Monique Noirhomme-Fraiture


Entropy ◽
2021 ◽  
Vol 23 (6) ◽  
pp. 662
Author(s):  
Mateu Sbert ◽  
Jordi Poch ◽  
Shuning Chen ◽  
Víctor Elvira

In this paper, we present order-invariance theoretical results for weighted quasi-arithmetic means of a monotonic series of numbers. The quasi-arithmetic mean, or Kolmogorov–Nagumo mean, generalizes the classical mean and appears in many disciplines, from information theory to physics, from economics to traffic flow. Stochastic orders are defined on weights (or, equivalently, discrete probability distributions). They were introduced to study risk in economics and decision theory, and have recently found utility in Monte Carlo techniques and in image processing. We show in this paper that, if two distributions of weights are ordered under first stochastic order, then for any monotonic series of numbers their weighted quasi-arithmetic means share the same order. This means, for instance, that the arithmetic and harmonic means for two stochastically ordered distributions of weights always move in the same direction: either both means increase or both decrease. We explore the invariance properties when convex (concave) functions define both the quasi-arithmetic mean and the series of numbers, show their relationship with the increasing concave and increasing convex orders, and observe the important role played by a newly defined mirror property of stochastic orders. We also give some applications to entropy and cross-entropy and present an example of the multiple importance sampling Monte Carlo technique that illustrates the usefulness and transversality of our approach. Invariance theorems are useful when a system is represented by a set of quasi-arithmetic means and we want to change the distribution of weights so that all means evolve in the same direction.
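The invariance result above can be checked with a minimal numerical sketch (ours, not the authors'; the weight vectors and the increasing series below are invented for illustration). Two weight distributions are compared under first stochastic order, and both the arithmetic mean (identity generator) and the harmonic mean (reciprocal generator) move in the same direction:

```python
import numpy as np

def quasi_arithmetic_mean(x, w, g, g_inv):
    """Weighted quasi-arithmetic (Kolmogorov-Nagumo) mean: g^{-1}(sum_i w_i g(x_i))."""
    return g_inv(np.dot(w, g(x)))

# An increasing series of numbers.
x = np.array([1.0, 2.0, 4.0, 8.0])

# Two weight distributions; v dominates w in first stochastic order
# (v shifts mass toward larger indices, hence toward larger x values).
w = np.array([0.4, 0.3, 0.2, 0.1])
v = np.array([0.1, 0.2, 0.3, 0.4])
assert np.all(np.cumsum(v) <= np.cumsum(w) + 1e-12)  # first stochastic order

# Arithmetic mean (g = identity) and harmonic mean (g = 1/t, self-inverse).
ident = (lambda t: t, lambda t: t)
recip = (lambda t: 1.0 / t, lambda t: 1.0 / t)

arith_w = quasi_arithmetic_mean(x, w, *ident)
arith_v = quasi_arithmetic_mean(x, v, *ident)
harm_w = quasi_arithmetic_mean(x, w, *recip)
harm_v = quasi_arithmetic_mean(x, v, *recip)

# Both means move in the same direction under the stochastically larger weights.
assert arith_v > arith_w and harm_v > harm_w
```

The same check passes for any other generator g, which is the content of the invariance theorem: the direction of change depends only on the stochastic order of the weights, not on the generator.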


Stats ◽  
2021 ◽  
Vol 4 (1) ◽  
pp. 184-204
Author(s):  
Carlos Barrera-Causil ◽  
Juan Carlos Correa ◽  
Andrew Zamecnik ◽  
Francisco Torres-Avilés ◽  
Fernando Marmolejo-Ramos

Expert knowledge elicitation (EKE) aims at obtaining individual representations of experts’ beliefs and rendering them in the form of probability distributions or functions. In many cases the elicited distributions differ, and the challenge in Bayesian inference is then to find ways to reconcile discrepant elicited prior distributions. This paper proposes the parallel analysis of clusters of prior distributions through a hierarchical method for clustering distributions that can be readily extended to functional data. The proposed method consists of (i) transforming the infinite-dimensional problem into a finite-dimensional one, (ii) using the Hellinger distance to compute the distances between curves, and thus (iii) obtaining a hierarchical clustering structure. In a simulation study the proposed method was compared to k-means and agglomerative nesting algorithms, and the results showed that the proposed method outperformed those algorithms. Finally, the proposed method is illustrated through an EKE experiment and other functional data sets.
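Steps (i)–(iii) can be sketched as follows (an illustrative reconstruction, not the authors' code; the normal densities stand in for hypothetical elicited priors, and SciPy's standard agglomerative routines play the role of the hierarchical step):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from scipy.stats import norm

# Step (i): reduce the infinite-dimensional problem to a finite grid.
grid = np.linspace(-8, 8, 400)
dx = grid[1] - grid[0]

# Hypothetical elicited priors: two groups of normal densities.
params = [(0.0, 1.0), (0.2, 1.1), (-0.1, 0.9), (4.0, 1.0), (4.3, 1.2)]
densities = np.array([norm.pdf(grid, m, s) for m, s in params])

# Step (ii): pairwise Hellinger distances between the discretized curves.
def hellinger(p, q):
    bc = np.sum(np.sqrt(p * q)) * dx          # Bhattacharyya coefficient
    return np.sqrt(max(0.0, 1.0 - bc))

n = len(densities)
D = np.array([[hellinger(densities[i], densities[j]) for j in range(n)]
              for i in range(n)])

# Step (iii): agglomerative hierarchy on the condensed distance matrix.
Z = linkage(squareform(D, checks=False), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
```

Cutting the dendrogram at two clusters recovers the two groups of priors, which is the kind of structure the parallel analysis then exploits.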


Entropy ◽  
2020 ◽  
Vol 22 (12) ◽  
pp. 1363
Author(s):  
Michael R. Lindstrom ◽  
Hyuntae Jung ◽  
Denis Larocque

We present an unsupervised method to detect anomalous time series among a collection of time series. To do so, we extend traditional Kernel Density Estimation for estimating probability distributions in Euclidean space to Hilbert spaces. The estimated probability densities we derive can be obtained formally through treating each series as a point in a Hilbert space, placing a kernel at those points, and summing the kernels (a “point approach”), or through using Kernel Density Estimation to approximate the distributions of Fourier mode coefficients to infer a probability density (a “Fourier approach”). We refer to these approaches as Functional Kernel Density Estimation for Anomaly Detection as they both yield functionals that can score a time series for how anomalous it is. Both methods naturally handle missing data and apply to a variety of settings, performing well when compared with an outlyingness score derived from a boxplot method for functional data, with a Principal Component Analysis approach for functional data, and with the Functional Isolation Forest method. We illustrate the use of the proposed methods with aviation safety report data from the International Air Transport Association (IATA).
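The "point approach" described above can be sketched in a few lines (our illustrative reconstruction under simplifying assumptions: a Gaussian kernel, a fixed bandwidth, complete data, and synthetic series in place of the IATA reports):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)

# A collection of noisy sinusoids, with the last series replaced by an anomaly.
series = np.array([np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal(t.size)
                   for _ in range(20)])
series[-1] = 5.0 * np.exp(-((t - 0.5) ** 2) / 0.001)   # a sharp spike

def fkde_scores(X, h=2.0):
    """Point approach: treat each (discretized) series as a point in Hilbert
    space, place a Gaussian kernel at every point, and sum the kernels.
    A low estimated density marks a series as anomalous."""
    n = X.shape[0]
    scores = np.empty(n)
    for i in range(n):
        d2 = np.sum((X - X[i]) ** 2, axis=1)   # squared L2 distances
        k = np.exp(-d2 / (2 * h ** 2))
        scores[i] = (k.sum() - 1.0) / (n - 1)  # leave out the self-kernel
    return scores

scores = fkde_scores(series)
assert np.argmin(scores) == len(series) - 1    # the spike has the lowest density
```

Missing observations would simply drop out of the squared-distance sum (with a suitable rescaling), which is why the approach handles incomplete series naturally.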


2012 ◽  
Vol 22 (09) ◽  
pp. 1250226 ◽  
Author(s):  
ANGELA DE SANCTIS ◽  
TONIO DI BATTISTA

Assuming a parametric family of functional data, the problem of computing summary statistics of the same functional form is investigated. The central idea is to compute the statistics on the parameters instead of on the functions themselves. Under the hypothesis of a monotonic dependence on the parameters, we highlight the special features of these statistics.
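The central idea can be made concrete with a small sketch (ours, with a hypothetical exponential family chosen for illustration): a statistic computed on the parameters stays inside the family, while the pointwise statistic on the curves generally does not, and monotone dependence transfers order statistics exactly.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 50)

def family(theta, t):
    """A hypothetical parametric family, monotonically decreasing in theta."""
    return np.exp(-theta * t)

thetas = np.array([0.5, 1.0, 2.0, 4.0])    # parameters of four observed curves
curves = family(thetas[:, None], t)        # shape (4, len(t))

# Computing the statistic on the parameters keeps the result in the family...
in_family_mean = family(thetas.mean(), t)

# ...whereas the pointwise mean of the curves generally leaves the family.
pointwise_mean = curves.mean(axis=0)
assert not np.allclose(pointwise_mean, in_family_mean)

# Monotone dependence transfers order statistics exactly: the curve of the
# extreme parameter is the pointwise envelope of the observed curves.
assert np.allclose(curves.max(axis=0), family(thetas.min(), t))
```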


Stats ◽  
2021 ◽  
Vol 4 (3) ◽  
pp. 762-775
Author(s):  
Anthony Ebert ◽  
Kerrie Mengersen ◽  
Fabrizio Ruggeri ◽  
Paul Wu

Approximate Bayesian computation is a likelihood-free inference method that relies on comparing model realisations to observed data with informative distance measures. We obtain functional data that are not only subject to noise along their y-axis but also to a random warping along their x-axis, which we refer to as the time axis. Conventional distances on functions, such as the L2 distance, are not informative under these conditions. The Fisher–Rao metric, previously generalised from the space of probability distributions to the space of functions, is an ideal objective function for aligning one function to another by warping the time axis. We assess the usefulness of alignment with the Fisher–Rao metric for approximate Bayesian computation with four examples: two simulation examples, an example about passenger flow at an international airport, and an example of hydrological flow modelling. We find that the Fisher–Rao metric works well as the objective function to minimise for alignment; however, once the functions are aligned, it is not necessarily the most informative distance for inference. This means that likelihood-free inference may require two distances: one for alignment and one for parameter inference.
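The property that makes the Fisher–Rao metric a sound alignment objective is its invariance to simultaneous time warping. A standard way to compute it is via the square-root slope function (SRSF), under which the metric becomes an ordinary L2 distance; the sketch below (ours, not the authors' code, with invented test functions and a quadratic warp) verifies that invariance numerically:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 2001)

def srsf(f, t):
    """Square-root slope function q = sign(f') * sqrt(|f'|); the Fisher-Rao
    metric on functions equals the L2 distance between SRSFs."""
    df = np.gradient(f, t)
    return np.sign(df) * np.sqrt(np.abs(df))

def l2(q1, q2, t):
    """Trapezoidal L2 distance on the grid t."""
    d2 = (q1 - q2) ** 2
    return np.sqrt(np.sum(0.5 * (d2[:-1] + d2[1:]) * np.diff(t)))

f1 = np.sin(2 * np.pi * t)
f2 = np.sin(2 * np.pi * t) + 0.3 * t

gamma = t ** 2                                 # a time warp of [0, 1]
warp = lambda f: np.interp(gamma, t, f)        # f composed with gamma

d_orig = l2(srsf(f1, t), srsf(f2, t), t)
d_warp = l2(srsf(warp(f1), t), srsf(warp(f2), t), t)

# Warping both functions by the same gamma leaves the distance (numerically)
# unchanged, unlike the plain L2 distance on f1, f2 themselves.
assert d_orig > 0.01
assert abs(d_orig - d_warp) < 0.05
```

The plain L2 distance between f1 and f2 has no such invariance, which is why alignment by minimising it over warps is uninformative in the setting described above.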


1997 ◽  
Vol 161 ◽  
pp. 197-201 ◽  
Author(s):  
Duncan Steel

Whilst lithopanspermia depends upon massive impacts occurring at a speed above some limit, the intact delivery of organic chemicals or other volatiles to a planet requires the impact speed to be below some other limit such that a significant fraction of that material escapes destruction. Thus the two opposite ends of the impact speed distributions are the regions of interest in the bioastronomical context, whereas much modelling work on impacts delivers, or makes use of, only the mean speed. Here the probability distributions of impact speeds upon Mars are calculated for (i) the orbital distribution of known asteroids; and (ii) the expected distribution of near-parabolic cometary orbits. It is found that cometary impacts are far more likely to eject rocks from Mars (over 99 percent of the cometary impacts are at speeds above 20 km/sec, but at most 5 percent of the asteroidal impacts); paradoxically, the objects impacting at speeds low enough to make organic/volatile survival possible (the asteroids) are those which are depleted in such species.

