Sign, Wilcoxon and Mann-Whitney Tests for Functional Data: An Approach Based on Random Projections

Mathematics, 2020, Vol 9 (1), pp. 44
Author(s): Rafael Meléndez, Ramón Giraldo, Víctor Leiva

The sign, Wilcoxon, and Mann-Whitney tests are nonparametric methods for one- or two-sample problems. Nonparametric methods are alternatives used for testing hypotheses when standard methods based on the Gaussianity assumption are not suitable. Recently, functional data analysis (FDA) has gained relevance in statistical modeling. In FDA, each observation is a curve or function, usually a realization of a stochastic process. In the FDA literature, several methods have been proposed for testing hypotheses with samples coming from Gaussian processes. However, when this assumption is not realistic, other approaches are needed. Clustering and regression methods, among others, for non-Gaussian functional data have been proposed recently. In this paper, we propose extensions of the sign, Wilcoxon, and Mann-Whitney tests to the functional data context as methods for testing hypotheses with one or two samples of non-Gaussian functional data. We use random projections to transform the functional problem into a scalar one and then proceed as in the standard case. Based on a simulation study, we show that the proposed tests perform well. We illustrate the methodology by applying it to a real data set.
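As a rough illustration of the random-projection idea (not the authors' full procedure, which typically combines several projections and adjusts for multiplicity), the following sketch projects two samples of simulated non-Gaussian curves onto one random direction and then applies the standard Mann-Whitney test to the resulting scalar scores. The grid, sample sizes, and choice of projection direction are assumptions made for the example.

```python
# Minimal sketch: reduce a two-sample functional testing problem to a scalar
# one with a single random projection, then apply the Mann-Whitney test.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)                      # common evaluation grid
dt = t[1] - t[0]

# Toy non-Gaussian functional samples (cumulative sums of exponential noise).
X = np.cumsum(rng.exponential(1.0, (30, t.size)), axis=1) / t.size
Y = np.cumsum(rng.exponential(1.2, (30, t.size)), axis=1) / t.size

# Random projection direction (white noise here; other random directions,
# e.g. Brownian trajectories, could be used instead).
v = rng.standard_normal(t.size)
v /= np.sqrt(np.sum(v**2) * dt)                 # normalise in L2

proj_X = np.sum(X * v, axis=1) * dt             # <X_i, v> for each curve
proj_Y = np.sum(Y * v, axis=1) * dt

stat, pval = mannwhitneyu(proj_X, proj_Y, alternative="two-sided")
print(f"Mann-Whitney on projected scores: U={stat:.1f}, p={pval:.3f}")
```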

2016, Vol 5 (4), pp. 1
Author(s): Bander Al-Zahrani

The paper describes estimation of the reliability function of the weighted Weibull distribution. The maximum likelihood estimators of the unknown parameters are obtained. Nonparametric methods, such as the empirical method, a kernel density estimator, and a modified shrinkage estimator, are also provided. The Markov chain Monte Carlo method is used to compute the Bayes estimators under gamma and Jeffreys priors. The performance of the maximum likelihood, nonparametric, and Bayesian estimators is assessed on a real data set.
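The weighted Weibull density itself is not reproduced in this summary, so the sketch below only illustrates the kind of comparison described: an empirical reliability estimate and a kernel-based one set against a parametric maximum likelihood fit, with the standard two-parameter Weibull standing in for the weighted Weibull model.

```python
# Sketch of nonparametric reliability (survival) estimation compared with a
# parametric ML fit. A standard two-parameter Weibull stands in for the
# weighted Weibull of the paper, whose density is not reproduced here.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.weibull(1.5, 200) * 2.0                 # synthetic lifetimes

grid = np.linspace(0.05, x.max(), 100)

# Empirical reliability: R_hat(t) = proportion of observations exceeding t.
R_emp = np.array([(x > t).mean() for t in grid])

# Kernel-type reliability: integrate a Gaussian KDE of the density beyond t.
kde = stats.gaussian_kde(x)
R_kde = np.array([kde.integrate_box_1d(t, np.inf) for t in grid])

# Parametric ML fit (location fixed at 0) and the implied reliability.
c, loc, scale = stats.weibull_min.fit(x, floc=0)
R_mle = stats.weibull_min.sf(grid, c, loc=loc, scale=scale)

print("max |empirical - ML| :", np.abs(R_emp - R_mle).max())
print("max |kernel    - ML| :", np.abs(R_kde - R_mle).max())
```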


Mathematics, 2021, Vol 9 (23), pp. 3074
Author(s): Cristian Preda, Quentin Grimonprez, Vincent Vandewalle

Categorical functional data, represented by paths of a continuous-time stochastic jump process with a finite set of states, are considered. As an extension of multiple correspondence analysis to an infinite set of variables, optimal encodings of the states over time are approximated using an arbitrary finite basis of functions. This allows dimension reduction, optimal representation, and visualisation of data in lower-dimensional spaces. The methodology is implemented in the cfda R package and is illustrated using a real data set in the clustering framework.
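The sketch below is not the cfda API; it only mimics the general idea of encoding categorical paths as state-indicator curves on a time grid and then reducing dimension, here with an ordinary PCA in place of the optimal encodings computed by the package. The state space, grid, and simulation mechanism are illustrative assumptions.

```python
# Toy illustration of the encoding idea for categorical functional data:
# each path is turned into state-indicator curves on a time grid, and a
# PCA on the stacked indicators provides a low-dimensional representation.
import numpy as np

rng = np.random.default_rng(2)
n_paths, n_states, n_times = 50, 3, 60

# Simulate jump paths: piecewise-constant state sequences on the grid.
paths = np.zeros((n_paths, n_times), dtype=int)
for i in range(n_paths):
    state = rng.integers(n_states)
    for j in range(n_times):
        if rng.random() < 0.05:                 # occasional jump
            state = rng.integers(n_states)
        paths[i, j] = state

# Indicator encoding: shape (n_paths, n_states * n_times).
indicators = np.stack([(paths == s).astype(float) for s in range(n_states)],
                      axis=1).reshape(n_paths, -1)

# PCA via SVD of the centred indicator matrix.
centred = indicators - indicators.mean(axis=0)
U, S, Vt = np.linalg.svd(centred, full_matrices=False)
scores = U[:, :2] * S[:2]                        # 2-D representation of the paths
print("explained variance (first two components):",
      np.round((S[:2] ** 2) / (S ** 2).sum(), 3))
```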


Author(s): Sara Leulmi, Fatiha Messaci

We introduce a local linear nonparametric estimator of the generalized regression function of a scalar response variable given a covariate taking values in a semi-metric space. We establish a rate of uniform consistency for the proposed estimators. Then, based on a real data set, we illustrate the performance of a particular estimator studied here with respect to other known estimators.
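As a hedged illustration of regression with a functional covariate in a semi-metric space, the sketch below implements a plain Nadaraya-Watson-type kernel estimator with an L2 semi-metric between discretised curves; the authors' local linear refinement, bandwidth selection, and data are not reproduced.

```python
# Simplified kernel (Nadaraya-Watson type) regression estimate for a scalar
# response given a functional covariate, using the L2 distance between
# discretised curves as the semi-metric.
import numpy as np

def l2_semimetric(x, y, dt):
    return np.sqrt(np.sum((x - y) ** 2) * dt)

def kernel_regression(X_train, y_train, x_new, dt, h):
    """Predict E[Y | X = x_new] with Gaussian kernel weights in the semi-metric."""
    d = np.array([l2_semimetric(xi, x_new, dt) for xi in X_train])
    w = np.exp(-0.5 * (d / h) ** 2)
    return np.sum(w * y_train) / np.sum(w)

rng = np.random.default_rng(3)
t = np.linspace(0, 1, 50)
dt = t[1] - t[0]
X = rng.standard_normal((100, 1)) * np.sin(2 * np.pi * t) + \
    0.1 * rng.standard_normal((100, t.size))
y = np.sum(X ** 2, axis=1) * dt + 0.05 * rng.standard_normal(100)  # Y = ||X||^2 + noise

x_new = 0.5 * np.sin(2 * np.pi * t)
print("prediction:", kernel_regression(X, y, x_new, dt, h=0.5),
      "  truth:", np.sum(x_new ** 2) * dt)
```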


2020, Vol 45 (6), pp. 719-749
Author(s): Eduardo Doval, Pedro Delicado

We propose new methods for identifying and classifying aberrant response patterns (ARPs) by means of functional data analysis. These methods take the person response function (PRF) of an individual and compare it with the pattern that would correspond to a generic individual of the same ability according to the item-person response surface. ARPs correspond to atypical difference functions. The ARP classification is done with functional data clustering applied to the PRFs identified as ARPs. We apply these methods to two sets of simulated data (the first is used to illustrate the ARP identification methods and the second demonstrates classification of the response patterns flagged as ARPs) and a real data set (a Grade 12 science assessment test, SAT, with 32 items answered by 600 examinees). For comparative purposes, ARPs are also identified with three nonparametric person-fit indices (Ht, Modified Caution Index, and ZU3). Our results indicate that the ARP detection ability of one of our proposed methods is comparable to that of person-fit indices. Moreover, the proposed classification methods enable ARPs associated with either spuriously low or spuriously high scores to be distinguished.
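A minimal sketch of the person response function ingredient is given below: an examinee's 0/1 item scores are kernel-smoothed against item difficulty, for data simulated under a simple Rasch-type model (an assumption of this sketch). The comparison with the item-person response surface and the functional clustering of the difference curves are not reproduced.

```python
# Toy sketch of a nonparametric person response function (PRF): the probability
# of a correct answer is smoothed against item difficulty for one examinee.
import numpy as np

rng = np.random.default_rng(4)
n_items = 32
difficulty = np.sort(rng.normal(0, 1, n_items))   # item difficulties
theta = 0.5                                        # examinee ability
p_correct = 1.0 / (1.0 + np.exp(-(theta - difficulty)))
responses = rng.binomial(1, p_correct)             # observed 0/1 pattern

def prf(b_grid, difficulty, responses, h=0.4):
    """Kernel-smoothed probability of success as a function of difficulty b."""
    out = []
    for b in b_grid:
        w = np.exp(-0.5 * ((difficulty - b) / h) ** 2)
        out.append(np.sum(w * responses) / np.sum(w))
    return np.array(out)

b_grid = np.linspace(-2, 2, 9)
print(np.round(prf(b_grid, difficulty, responses), 2))
# A typical (non-aberrant) pattern decreases in b; flat or increasing stretches
# would flag a candidate aberrant response pattern.
```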


2021, Vol 9 (1), pp. 62-81
Author(s): Kjersti Aas, Thomas Nagler, Martin Jullum, Anders Løland

In this paper the goal is to explain predictions from complex machine learning models. One method that has become very popular during the last few years is Shapley values. The original development of Shapley values for prediction explanation relied on the assumption that the features being described were independent. If the features are in fact dependent, this may lead to incorrect explanations. Hence, there have recently been attempts at appropriately modelling/estimating the dependence between the features. Although the previously proposed methods clearly outperform the traditional approach assuming independence, they have their weaknesses. In this paper we propose two new approaches for modelling the dependence between the features. Both approaches are based on vine copulas, which are flexible tools for modelling multivariate non-Gaussian distributions, able to characterise a wide range of complex dependencies. The performance of the proposed methods is evaluated on simulated data sets and a real data set. The experiments demonstrate that the vine copula approaches give more accurate approximations to the true Shapley values than their competitors.
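The following sketch computes exact Shapley values for a tiny three-feature model, approximating each value function v(S) by Monte Carlo sampling of the off-coalition features from the training marginals, i.e. under the independence assumption the paper criticises; the vine-copula approach would replace that step by conditional sampling. The model, covariance matrix, and sample sizes are invented for illustration.

```python
# Exact Shapley values for a 3-feature model, with the value function v(S)
# approximated by averaging the model over sampled values of the features
# outside S. Sampling from the marginals encodes the independence assumption;
# a dependence-aware approach would sample conditionally on the coalition.
import itertools
import math
import numpy as np

rng = np.random.default_rng(5)

def model(X):                                   # toy fitted model f(x1, x2, x3)
    return X[:, 0] + 2 * X[:, 1] * X[:, 2]

X_train = rng.multivariate_normal(np.zeros(3),
                                  [[1, .8, 0], [.8, 1, 0], [0, 0, 1]], 2000)
x = np.array([1.0, 1.0, -1.0])                  # instance to explain
M = 3

def value(S, n_mc=5000):
    """v(S): E[f(x_S, X_Sbar)] with the complement drawn from the training marginals."""
    Z = X_train[rng.integers(len(X_train), size=n_mc)]
    Z[:, list(S)] = x[list(S)]                  # fix the coalition features
    return model(Z).mean()

phi = np.zeros(M)
for j in range(M):
    for r in range(M):
        for S in itertools.combinations([k for k in range(M) if k != j], r):
            w = math.factorial(len(S)) * math.factorial(M - len(S) - 1) / math.factorial(M)
            phi[j] += w * (value(S + (j,)) - value(S))

print("Shapley values:", np.round(phi, 3), " sum:", round(phi.sum(), 3))
print("f(x) - E[f(X)]:", round(model(x[None, :])[0] - value(()), 3))
```

The final two lines check the efficiency property: the Shapley values should sum (up to Monte Carlo error) to the difference between the prediction at x and the average prediction.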


Geophysics, 1990, Vol 55 (12), pp. 1613-1624
Author(s): C. deGroot‐Hedlin, S. Constable

Magnetotelluric (MT) data are inverted for smooth 2-D models using an extension of the existing 1-D algorithm, Occam’s inversion. Since an MT data set consists of a finite number of imprecise data, an infinity of solutions to the inverse problem exists. Fitting field or synthetic electromagnetic data as closely as possible results in theoretical models with a maximum amount of roughness, or structure. However, by relaxing the misfit criterion only a small amount, models which are maximally smooth may be generated. Smooth models are less likely to result in overinterpretation of the data and reflect the true resolving power of the MT method. The models are composed of a large number of rectangular prisms, each having a constant conductivity. A priori information, in the form of boundary locations only or both boundary locations and conductivity, may be included, providing a powerful tool for improving the resolving power of the data. Joint inversion of TE and TM synthetic data generated from known models allows comparison of smooth models with the true structure. In most cases, smoothed versions of the true structure may be recovered in 12–16 iterations. However, resistive features with a size comparable to depth of burial are poorly resolved. Real MT data present problems of non‐Gaussian data errors, the breakdown of the two‐dimensionality assumption and the large number of data in broadband soundings; nevertheless, real data can be inverted using the algorithm.
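A toy, linear analogue of the Occam strategy is sketched below: sweep a regularisation parameter and, among the models whose data misfit meets a target value, keep the one with the smallest first-difference roughness. The actual algorithm repeatedly linearises the 2-D MT forward problem; here a fixed smoothing-kernel matrix stands in for the forward operator, and all sizes and noise levels are assumptions.

```python
# Occam-style inversion on a linear toy problem: among models m that fit the
# data to a target misfit, pick the one with minimal first-difference
# roughness ||R m||^2.
import numpy as np

rng = np.random.default_rng(6)
n_model, n_data = 60, 30
z = np.linspace(0, 1, n_model)

m_true = 1.0 + np.where((z > 0.3) & (z < 0.6), 1.5, 0.0)   # blocky "conductivity"
G = np.exp(-((np.linspace(0, 1, n_data)[:, None] - z[None, :]) ** 2) / 0.01)
sigma = 0.05
d = G @ m_true + sigma * rng.standard_normal(n_data)

R = np.diff(np.eye(n_model), axis=0)            # first-difference roughening matrix
target = n_data * sigma ** 2                    # expected squared misfit

best = None
for mu in np.logspace(-4, 4, 81):               # sweep the trade-off parameter
    m = np.linalg.solve(G.T @ G + mu * R.T @ R, G.T @ d)
    misfit = np.sum((G @ m - d) ** 2)
    rough = np.sum((R @ m) ** 2)
    if misfit <= target and (best is None or rough < best[1]):
        best = (mu, rough, m)

mu, rough, m = best
print(f"chosen mu={mu:.3g}, roughness={rough:.3g}, "
      f"misfit={np.sum((G @ m - d) ** 2):.3g} (target {target:.3g})")
```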


2008, Vol 20 (11), pp. 2696-2714
Author(s): Umberto Picchini, Susanne Ditlevsen, Andrea De Gaetano, Petr Lansky

Stochastic leaky integrate-and-fire (LIF) neuronal models are common theoretical tools for studying properties of real neuronal systems. Experimental data of frequently sampled membrane potential measurements between spikes show that the assumption of constant parameter values is not realistic and that some (random) fluctuations are occurring. In this article, we extend the stochastic LIF model, allowing a noise source determining slow fluctuations in the signal. This is achieved by adding a random variable to one of the parameters characterizing the neuronal input, considering each interspike interval (ISI) as an independent experimental unit with a different realization of this random variable. In this way, the variation of the neuronal input is split into fast (within-interval) and slow (between-intervals) components. A parameter estimation method is proposed, allowing the parameters to be estimated simultaneously over the entire data set. This increases the statistical power, and the average estimate over all ISIs will be improved in the sense of decreased variance of the estimator compared to previous approaches, where the estimation has been conducted separately on each individual ISI. The results obtained on real data show good agreement with classical regression methods.
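A minimal simulation sketch of the model structure (not the estimation method) is given below: an Ornstein-Uhlenbeck-type membrane equation is integrated by Euler-Maruyama, with the input parameter perturbed by an ISI-specific random effect so that the input variation splits into fast within-interval noise and a slow between-interval component. All parameter values are illustrative.

```python
# Euler-Maruyama simulation of a leaky integrate-and-fire membrane equation,
#   dV = (-V/tau + mu_i) dt + sigma dW,
# where the input mu_i = mu + eta_i carries an ISI-specific random effect.
import numpy as np

rng = np.random.default_rng(7)
tau, mu, sigma = 10.0, 2.0, 0.5       # illustrative values (ms, mV/ms, mV/sqrt(ms))
sd_eta = 0.2                           # sd of the slow (between-ISI) component
threshold, v_reset, dt = 12.0, 0.0, 0.1

isis = []
for _ in range(20):                    # each ISI: its own realisation of eta_i
    mu_i = mu + rng.normal(0.0, sd_eta)
    v, t = v_reset, 0.0
    while v < threshold and t < 2000.0:   # time cap as a safety stop
        v += (-v / tau + mu_i) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        t += dt
    isis.append(t)

print(f"mean ISI = {np.mean(isis):.1f} ms, sd = {np.std(isis):.1f} ms")
```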


Author(s): Jozef M. Zurada, Alan S. Levitan, Jian Guan

<p class="MsoNormal" style="text-align: justify; margin: 0in 0.5in 0pt;"><span style="font-size: 10pt;"><span style="font-family: Times New Roman;">Lack of precision is common in property value assessment. Recently non-conventional methods, such as neural networks based methods, have been introduced in property value assessment as an attempt to better address this lack of precision and uncertainty. Although fuzzy logic has been suggested as another possible solution, no other artificial intelligence methods have been applied to real estate value assessment other than neural network based methods. This paper presents the results of using two new non-conventional methods, fuzzy logic and memory-based reasoning, in evaluating residential property values for a real data set. The paper compares the results with those obtained using neural networks and multiple regression. Methods of feature reduction, such as principal component analysis and variable selection, have also been used for possible improvement of the final results.<span style="mso-spacerun: yes;">&nbsp; </span>The results indicate that no single one of the new methods is consistently superior for the given data set.</span></span></p>


2019, Vol XVI (2), pp. 1-11
Author(s): Farrukh Jamal, Hesham Mohammed Reyad, Soha Othman Ahmed, Muhammad Akbar Ali Shah, Emrah Altun

A new three-parameter continuous model, called the exponentiated half-logistic Lomax distribution, is introduced in this paper. Basic mathematical properties of the proposed model are investigated, including raw and incomplete moments, skewness, kurtosis, generating functions, Rényi entropy, Lorenz, Bonferroni, and Zenga curves, probability weighted moments, the stress-strength model, order statistics, and record statistics. The model parameters are estimated by the maximum likelihood criterion, and the behaviour of these estimates is examined through a simulation study. The applicability of the new model is illustrated by applying it to a real data set.
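The exponentiated half-logistic Lomax density is not reproduced in this summary, so the sketch below only shows the generic workflow implied: numerical maximisation of a log-likelihood followed by a small simulation study of the resulting estimates, carried out on the plain Lomax distribution available in SciPy as a stand-in.

```python
# Generic maximum-likelihood workflow: fit by numerical likelihood
# maximisation, then check bias/variance over repeated samples.
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(9)
true_shape, true_scale = 2.5, 1.0

def neg_loglik(params, x):
    shape, scale = params
    if shape <= 0 or scale <= 0:
        return np.inf
    return -np.sum(stats.lomax.logpdf(x, shape, loc=0, scale=scale))

estimates = []
for _ in range(200):                            # small simulation study
    x = stats.lomax.rvs(true_shape, scale=true_scale, size=100, random_state=rng)
    res = optimize.minimize(neg_loglik, x0=[1.0, 1.0], args=(x,),
                            method="Nelder-Mead")
    estimates.append(res.x)

estimates = np.array(estimates)
print("mean estimates:", np.round(estimates.mean(axis=0), 3),
      " (true:", [true_shape, true_scale], ")")
print("sd of estimates:", np.round(estimates.std(axis=0), 3))
```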


Author(s): Parisa Torkaman

The generalized inverted exponential distribution is introduced as a lifetime model with good statistical properties. In this paper, the estimation of its probability density function and cumulative distribution function is considered using five different estimation methods: uniformly minimum variance unbiased (UMVU), maximum likelihood (ML), least squares (LS), weighted least squares (WLS), and percentile (PC) estimators. The performance of these estimation procedures is compared by numerical simulations based on the mean squared error (MSE). The simulation studies show that the UMVU estimator performs better than the others; when the sample size is large enough, the ML and UMVU estimators are almost equivalent and more efficient than the LS, WLS, and PC estimators. Finally, a real data set is analyzed.
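A minimal sketch of the maximum likelihood plug-in estimators of the density and distribution function is given below, using the parametrisation F(x) = 1 - (1 - e^(-λ/x))^α commonly associated with the generalized inverted exponential distribution (an assumption of this sketch); the UMVU, LS, WLS, and PC competitors are not shown.

```python
# ML plug-in estimators of the density f and distribution function F for the
# generalized inverted exponential model, with a small MSE check at one point.
import numpy as np
from scipy import optimize

rng = np.random.default_rng(10)
alpha_true, lam_true = 2.0, 1.5
x0 = 1.0                                         # point at which f and F are estimated

def cdf(x, a, lam):
    return 1.0 - (1.0 - np.exp(-lam / x)) ** a

def pdf(x, a, lam):
    return a * lam / x**2 * np.exp(-lam / x) * (1.0 - np.exp(-lam / x)) ** (a - 1)

def sample(n):
    u = rng.random(n)                            # inverse-transform sampling
    return -lam_true / np.log(1.0 - (1.0 - u) ** (1.0 / alpha_true))

def neg_loglik(p, x):
    a, lam = p
    if a <= 0 or lam <= 0:
        return np.inf
    return -np.sum(np.log(pdf(x, a, lam)))

err_f, err_F = [], []
for _ in range(200):
    x = sample(80)
    a_hat, lam_hat = optimize.minimize(neg_loglik, [1.0, 1.0], args=(x,),
                                       method="Nelder-Mead").x
    err_f.append((pdf(x0, a_hat, lam_hat) - pdf(x0, alpha_true, lam_true)) ** 2)
    err_F.append((cdf(x0, a_hat, lam_hat) - cdf(x0, alpha_true, lam_true)) ** 2)

print("MSE of ML plug-in estimates at x0=1:",
      "f:", round(np.mean(err_f), 5), " F:", round(np.mean(err_F), 5))
```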

