Formation of sets of independent components of a multidimensional random variable based on a nonparametric pattern recognition algorithm

2021 ◽  
pp. 3-9
Author(s):  
Aleksandr V. Lapko ◽  
Vasiliy A. Lapko ◽  
Anna V. Bakhtina

The possibility of circumventing the problem of decomposing the range of values of random variables when testing various hypotheses is considered. A brief review of the literature on this problem is given. A method for forming sets of independent components of a multidimensional random variable is proposed, based on testing hypotheses about the independence of paired combinations of its components. The method uses a two-dimensional nonparametric kernel-type pattern recognition algorithm corresponding to the maximum likelihood criterion. In contrast to the traditional method based on the Pearson criterion, the proposed approach avoids the problem of decomposing the range of values of random variables into multidimensional intervals. The results of computational experiments performed according to the method of forming sets of independent random variables are presented. Using the information obtained, an information graph is constructed whose vertices correspond to the components of a multidimensional random variable and whose edges indicate their independence. The vertices of the complete subgraphs then correspond to groups of independent components of the random variable. The results obtained form the basis for the synthesis of a multi-level nonparametric system for processing large volumes of data, each level of which corresponds to a specific set of independent random variables.
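The graph step described in this abstract can be illustrated with a minimal sketch. It assumes the pairwise independence decisions are already available as a list of index pairs (the `pairs` data below is hypothetical, and the kernel-based test itself is not reproduced). The function names are illustrative, and the Bron-Kerbosch enumeration is a standard choice for listing complete subgraphs, not necessarily the procedure used by the authors.

```python
def independence_graph(n_components, independent_pairs):
    """Adjacency sets: an edge (i, j) means components i and j were judged independent."""
    adj = {i: set() for i in range(n_components)}
    for i, j in independent_pairs:
        adj[i].add(j)
        adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal complete subgraphs with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidate vertices, x: already processed vertices
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Hypothetical outcome of pairwise independence tests on 5 components
pairs = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
groups = maximal_cliques(independence_graph(5, pairs))
print(groups)
```

Each returned group is a set of components that are pairwise independent according to the supplied decisions; a component may appear in more than one group.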

2021 ◽  
Vol 5 (45) ◽  
pp. 767-772
Author(s):  
I.V. Zenkov ◽  
A.V. Lapko ◽  
V.A. Lapko ◽  
E.V. Kiryushina ◽  
V.N. Vokin

A new method for testing a hypothesis of the independence of multidimensional random variables is proposed. The technique under consideration is based on the use of a nonparametric pattern recognition algorithm that meets a maximum likelihood criterion. In contrast to the traditional formulation of the pattern recognition problem, there is no a priori training sample. The initial information is represented by statistical data, which are made up of the values of a multivariate random variable. The distribution laws of random variables in the classes are estimated according to the initial statistical data for the conditions of their dependence and independence. When selecting optimal bandwidths for nonparametric kernel-type probability density estimates, the minimum standard deviation is used as a criterion. Estimates of the probability of pattern recognition error in the classes are calculated. Based on the minimum value of the estimates of the probabilities of pattern recognition errors, a decision is made on the independence or dependence of the random variables. The technique developed is used in the spectral analysis of remote sensing data.
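The idea of comparing density estimates built under the dependence and independence hypotheses can be sketched as follows. This is a simplified illustration, not the authors' procedure: it assumes Gaussian kernels, a rule-of-thumb bandwidth instead of the bandwidth criterion mentioned in the abstract, and a leave-one-out vote count in place of the estimated recognition error probabilities. All function names are hypothetical.

```python
import math
import random


def gauss_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def rule_of_thumb_bandwidth(values):
    # Assumption: Silverman-style rule of thumb, not the paper's bandwidth criterion
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return 1.06 * std * n ** (-0.2)


def fraction_assigned_to_dependence(xs, ys):
    """Share of sample points where the joint kernel estimate exceeds
    the product of the marginal kernel estimates (leave-one-out)."""
    n = len(xs)
    hx, hy = rule_of_thumb_bandwidth(xs), rule_of_thumb_bandwidth(ys)
    votes = 0
    for i in range(n):
        joint = px = py = 0.0
        for j in range(n):
            if j == i:
                continue  # leave the evaluated point out of its own estimate
            kx = gauss_kernel((xs[i] - xs[j]) / hx) / hx
            ky = gauss_kernel((ys[i] - ys[j]) / hy) / hy
            joint += kx * ky
            px += kx
            py += ky
        m = n - 1
        if joint / m > (px / m) * (py / m):
            votes += 1
    return votes / n


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(200)]
ys_dep = [x + 0.1 * rng.gauss(0.0, 1.0) for x in xs]  # strongly dependent on xs
ys_ind = [rng.gauss(0.0, 1.0) for _ in range(200)]    # generated independently of xs

frac_dep = fraction_assigned_to_dependence(xs, ys_dep)
frac_ind = fraction_assigned_to_dependence(xs, ys_ind)
print(frac_dep, frac_ind)
```

A share close to 1 suggests that the joint density describes the data much better than the product of the marginals; a decision threshold would have to be chosen separately and is not specified here.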


1968 ◽  
Vol 64 (2) ◽  
pp. 485-488 ◽  
Author(s):  
V. K. Rohatgi

Let {Xn: n ≥ 1} be a sequence of independent random variables and write Suppose that the random variables Xn are uniformly bounded by a random variable X in the sense that Set qn(x) = Pr(|Xn| > x) and q(x) = Pr(|X| > x). If qn ≤ q and E|X|^r < ∞ with 0 < r < 2 then we have (see Loève (4), 242) where ak = 0, if 0 < r < 1, and = EXk if 1 ≤ r < 2 and ‘a.s.’ stands for almost sure convergence. The purpose of this paper is to study the rates of convergence of to zero for arbitrary ε > 0. We shall extend to the present context the results of (3), where the case of identically distributed random variables was treated. The techniques used here are strongly related to those of (3).


Author(s):  
SOLESNE BOURGUIN ◽  
JEAN-CHRISTOPHE BRETON

We investigate generalizations of the Cramér theorem. This theorem asserts that a Gaussian random variable can be decomposed into the sum of independent random variables if and only if they are Gaussian. We prove asymptotic counterparts of such decomposition results for multiple Wiener integrals and prove that similar results are true for the (asymptotic) decomposition of the semicircular distribution into free multiple Wigner integrals.


2009 ◽  
Vol 46 (3) ◽  
pp. 721-731 ◽  
Author(s):  
Shibin Zhang ◽  
Xinsheng Zhang

In this paper, a stochastic integral of Ornstein–Uhlenbeck type is represented to be the sum of two independent random variables: one has a tempered stable distribution and the other has a compound Poisson distribution. In distribution, the compound Poisson random variable is equal to the sum of a Poisson-distributed number of positive random variables, which are independent and identically distributed and have a common specified density function. Based on the representation of the stochastic integral, we prove that the transition distribution of the tempered stable Ornstein–Uhlenbeck process is self-decomposable and that the transition density is a C∞-function.


2017 ◽  
Vol 12 (2) ◽  
pp. 412-432 ◽  
Author(s):  
Leonardo Rojas-Nandayapa ◽  
Wangyue Xie

We consider phase-type scale mixture distributions, which correspond to distributions of a product of two independent random variables: a phase-type random variable Y and a non-negative but otherwise arbitrary random variable S called the scaling random variable. We investigate conditions for such a class of distributions to be either light- or heavy-tailed, explore subexponentiality and determine their maximum domains of attraction. Particular focus is given to phase-type scale mixture distributions where the scaling random variable S has discrete support – such a class of distributions has recently been used in risk applications to approximate heavy-tailed distributions. Our results are complemented with several examples.


1970 ◽  
Vol 7 (01) ◽  
pp. 89-98
Author(s):  
John Lamperti

In the first part of this paper, we will consider a class of Markov chains on the non-negative integers which resemble the Galton-Watson branching process, but with one major difference. If there are k individuals in the nth “generation”, and are independent random variables representing their respective numbers of offspring, then the (n + 1)th generation will contain max individuals rather than as in the branching case. Equivalently, the transition matrices Pij of the chains we will study are to be of the form where F(.) is the probability distribution function of a non-negative, integer-valued random variable. The right-hand side of (1) is thus the probability that the maximum of i independent random variables distributed by F has the value j. Such a chain will be called a “maximal branching process”.
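The probability that the maximum of i independent integer-valued draws with distribution function F equals j is F(j)^i − F(j − 1)^i, a standard identity. The sketch below computes one row of such a transition matrix. The uniform offspring law and the function names are assumptions made for illustration only, and the state i = 0 is not handled.

```python
def max_transition_prob(i, j, cdf):
    """P(max of i iid draws = j) = F(j)**i - F(j-1)**i for integer-valued draws, i >= 1."""
    lower = cdf(j - 1) if j > 0 else 0.0
    return cdf(j) ** i - lower ** i


def uniform_cdf(j):
    # Hypothetical offspring law: uniform on {0, 1, 2, 3}
    return min(max(j + 1, 0), 4) / 4.0


# Transition probabilities out of state i = 2
row = [max_transition_prob(2, j, uniform_cdf) for j in range(4)]
print(row)
```

The row sums to one because the differences F(j)^i − F(j − 1)^i telescope over the support of F.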


Author(s):  
Olesya Martyniuk ◽  
Stepan Popina ◽  
Serhii Martyniuk

Introduction. Mathematical modeling of economic processes is necessary for the unambiguous formulation and solution of a problem. In the economic sphere, this is the most important aspect of any enterprise's activity, and economic-mathematical modeling is the tool that allows adequate decisions to be made. However, the economic indicators that serve as factors of a model are usually random variables. An economic-mathematical model is proposed for calculating the probability distribution function of the result of economic activity from the known dependence of this result on the factors influencing it and the probability density of these factors. Methods. A formula is used to calculate the probability distribution function of a random variable that is a function of other independent random variables. A method is proposed for estimating the basic numerical characteristics of the investigated functions of random variables: the mathematical expectation, which in the probabilistic sense is the average value of the result of the functioning of the economic structure, and its variance. The upper bound of the variation of the effective feature is indicated. Results. The cases of linear and power functions of two independent variables are investigated. Different cases of the two-dimensional domain of possible values of the indicators, which are continuous random variables, are considered. The application of the research results to production functions is considered. Examples of estimating the probability distribution function of a random variable are offered. Conclusions. The research results make it possible to estimate, in the probabilistic sense, the result of the economic structure's activity on the basis of the probability distributions of the values of the dependent variables. A prospect for further research is the application of indirect control over economic performance based on economic and mathematical modeling.


2021 ◽  
Vol 45 (2) ◽  
pp. 253-260
Author(s):  
I.V. Zenkov ◽  
A.V. Lapko ◽  
V.A. Lapko ◽  
S.T. Im ◽  
V.P. Tuboltsev ◽  
...  

A nonparametric algorithm for automatic classification of large statistical data sets is proposed. The algorithm is based on a procedure for optimal discretization of the range of values of a random variable. A class is a compact group of observations of a random variable corresponding to a unimodal fragment of the probability density. The algorithm relies on "compressing" the initial information by decomposing the multidimensional space of attributes. As a result, a large statistical sample is transformed into a data array composed of the centers of multidimensional sampling intervals and the corresponding frequencies of random variables. To substantiate the optimal discretization procedure, we use the results of a study of the asymptotic properties of a kernel-type regression estimate of the probability density. An optimal number of sampling intervals for the range of values of one- and two-dimensional random variables is determined from the condition of the minimum root-mean-square deviation of the regression probability density estimate. The results obtained are generalized to the discretization of the range of values of a multidimensional random variable. The optimal discretization formula contains a component that is characterized by a nonlinear functional of the probability density. An analytical dependence of this component on the antikurtosis coefficient of a one-dimensional random variable is established. For independent components of a multidimensional random variable, a methodology is developed for calculating estimates of the optimal number of sampling intervals for random variables and their lengths. On this basis, a nonparametric algorithm for automatic classification is developed. It is based on a sequential procedure for checking the proximity of the centers of multidimensional sampling intervals and the relationships between the frequencies with which random variables from the original sample fall into these intervals. To further increase the computational efficiency of the proposed automatic classification algorithm, a multithreaded software implementation is used. The practical significance of the developed algorithms is confirmed by the results of their application in processing remote sensing data.


1997 ◽  
Vol 34 (02) ◽  
pp. 420-425 ◽  
Author(s):  
Moshe Shaked ◽  
Tityik Wong

Let X1, X2, … be a sequence of independent random variables and let N be a positive integer-valued random variable which is independent of the Xi. In this paper we obtain some stochastic comparison results involving min{X1, X2, …, XN} and max{X1, X2, …, XN}.
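For orientation, the distribution functions of such random minima and maxima have a simple form in the special case where the Xi are also identically distributed with distribution function F, which is an extra assumption not made in the abstract: P(max ≤ x) = Σ P(N = n) F(x)^n and P(min ≤ x) = 1 − Σ P(N = n) (1 − F(x))^n. The sketch below evaluates both; the function names and the distribution of N are hypothetical.

```python
def cdf_of_max(F_x, pmf_N):
    """P(max{X1..XN} <= x) given F_x = F(x), for iid Xi independent of N."""
    return sum(p * F_x ** n for n, p in pmf_N.items())


def cdf_of_min(F_x, pmf_N):
    """P(min{X1..XN} <= x) given F_x = F(x), for iid Xi independent of N."""
    return 1.0 - sum(p * (1.0 - F_x) ** n for n, p in pmf_N.items())


# Hypothetical law of N: one or two variables with equal probability
pmf_N = {1: 0.5, 2: 0.5}
max_cdf = cdf_of_max(0.5, pmf_N)
min_cdf = cdf_of_min(0.5, pmf_N)
print(max_cdf, min_cdf)
```

At any point x the minimum has the larger distribution function, which reflects that the minimum is stochastically smaller than the maximum.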

