Journal of Statistical Distributions and Applications
Latest Publications


Total documents: 126 (five years: 37)

H-index: 11 (five years: 1)

Published By Springer (Biomed Central Ltd.)

ISSN: 2195-5832

Author(s): C. Satheesh Kumar, Subha R. Nair

Abstract: In this paper we consider a generalization of a log-transformed version of the inverse Weibull distribution. Several theoretical properties of the distribution are studied in detail, including expressions for its probability density function, reliability function, hazard rate function, quantile function, characteristic function, raw moments, percentile measures, entropy measures, median and mode. Certain structural properties of the distribution, along with expressions for reliability measures and the distribution and moments of order statistics, are obtained. We also discuss the maximum likelihood estimation of the parameters of the proposed distribution and illustrate the usefulness of the model through real-life examples. In addition, the asymptotic behaviour of the maximum likelihood estimators is examined with the help of simulated data sets.


Author(s): Hien Duy Nguyen, TrungTin Nguyen, Faicel Chamroukhi, Geoffrey John McLachlan

Abstract: Mixture of experts (MoE) models are widely applied to conditional probability density estimation problems. We demonstrate the richness of the class of MoE models by proving denseness results in Lebesgue spaces when the input and output variables are both compactly supported. We further prove an almost uniform convergence result when the input is univariate. Auxiliary lemmas are proved regarding the richness of the soft-max gating function class and its relationship to the class of Gaussian gating functions.
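To make the model class concrete, the following is a minimal sketch of an MoE conditional density with soft-max gating and Gaussian experts, for a univariate input and output. The function names and the parameter values in `GATES` and `EXPERTS` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable soft-max of a list of real scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def moe_density(y, x, gates, experts):
    """Conditional density f(y | x) of a soft-max gated Gaussian MoE.

    gates:   list of (a_k, b_k); gate k has score a_k + b_k * x
    experts: list of (c_k, d_k, s_k); expert k is N(c_k + d_k * x, s_k^2)
    """
    weights = softmax([a + b * x for a, b in gates])
    return sum(
        w * normal_pdf(y, c + d * x, s)
        for w, (c, d, s) in zip(weights, experts)
    )


# Hypothetical two-expert model, for illustration only.
GATES = [(0.0, 2.0), (0.0, -2.0)]
EXPERTS = [(1.0, 0.5, 0.3), (-1.0, -0.5, 0.6)]
```

For any fixed input `x`, the gate weights sum to one, so `moe_density(y, x, GATES, EXPERTS)` is a proper density in `y`.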


Author(s): Anthony G. Pakes

Abstract: A family of generalised Planck (GP) laws is defined and its structural properties explored. Sometimes subject to parameter restrictions, a GP law is a randomly scaled gamma law; it arises as the equilibrium law of a perturbed version of the Feller mean reverting diffusion; the density functions can be decreasing, unimodal or bimodal; it is infinitely divisible. It is argued that the GP law is not a generalised gamma convolution. Characterisations are obtained in terms of invariance under random contraction of a weighted version of a related law. The GP law is a particular instance of equilibrium laws obtained from a recursion suggested by a genetic mutation-selection balance model. Some related infinitely divisible laws are exhibited.


Author(s): Duha Hamed, Ahmad Alzaghal

Abstract: A new generalized class of Lindley distributions is introduced in this paper. This new class is called the T-Lindley{Y} class of distributions, and it is generated by using the quantile functions of the uniform, exponential, Weibull, log-logistic, logistic and Cauchy distributions. The statistical properties, including the modes, moments and Shannon’s entropy, are discussed. Three new generalized Lindley distributions are investigated in more detail. Maximum likelihood estimation is used to estimate the unknown parameters, and a simulation study is carried out. Lastly, the usefulness of the proposed class in fitting lifetime data is illustrated using four different data sets. In the application section, the strength of members of the T-Lindley{Y} class in modeling both unimodal and bimodal data sets is presented. A member of the T-Lindley{Y} class of distributions outperformed other known distributions in modeling unimodal and bimodal lifetime data sets.
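As an illustration of how such a class can be generated from quantile functions, the sketch below assumes the usual T-R{Y} construction, in which the new CDF is F_T(Q_Y(F_R(x))) with R taken to be Lindley. The abstract does not spell out this formula, so treat it as an assumption. The choices of Y (standard exponential) and T (Weibull), and all function names, are hypothetical.

```python
import math


def lindley_cdf(x, theta):
    """CDF of the Lindley distribution with parameter theta > 0."""
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 + theta * x / (1.0 + theta)) * math.exp(-theta * x)


def exp_quantile(p):
    """Quantile function of the standard exponential distribution (Y)."""
    return -math.log1p(-p)


def weibull_cdf(t, shape, scale):
    """CDF of the Weibull distribution (T), supported on [0, inf)."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def t_lindley_exp_cdf(x, theta, shape, scale):
    """Assumed composite CDF F_T(Q_Y(F_R(x))) with R = Lindley."""
    p = lindley_cdf(x, theta)
    if p >= 1.0:  # guard against log(0) far in the upper tail
        return 1.0
    return weibull_cdf(exp_quantile(p), shape, scale)
```

With `shape = scale = 1`, T is itself standard exponential, so the composition collapses to the plain Lindley CDF. This is a convenient sanity check on the construction.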


Author(s): Kyung Serk Cho, Hon Keung Tony Ng

Abstract: A tolerance interval is a statistical interval that covers at least 100ρ% of the population of interest with 100(1−α)% confidence, where ρ and α are pre-specified values in (0, 1). In many scientific fields, such as pharmaceutical sciences, manufacturing processes, clinical sciences, and environmental sciences, tolerance intervals are used for statistical inference and quality control. Despite the usefulness of tolerance intervals, the procedures to compute them are not commonly implemented in statistical software packages. This paper provides a comparative study of the computational procedures for tolerance intervals in some commonly used statistical software packages, including JMP, Minitab, NCSS, Python, R, and SAS. We also investigate the effect of misspecifying the underlying probability model on the performance of tolerance intervals. Using a Monte Carlo simulation study, we examine the performance of tolerance intervals both when the assumed distribution matches the true underlying distribution and when it differs from it. We also propose a robust model selection approach to obtain tolerance intervals that are relatively insensitive to model misspecification. We show that the proposed approach performs well when the underlying distribution is unknown but candidate distributions are available.
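The definition above can be made concrete with the classical distribution-free tolerance interval built from the sample minimum and maximum (Wilks' method). This is a minimal sketch of one standard technique and is not one of the software procedures compared in the paper; the function names are hypothetical.

```python
def wilks_confidence(n, rho):
    """Confidence that [min, max] of n i.i.d. continuous observations
    covers at least a proportion rho of the population.

    The coverage of [X_(1), X_(n)] follows a Beta(n - 1, 2) law, which
    gives the closed form 1 - n*rho^(n-1) + (n-1)*rho^n.
    """
    return 1.0 - n * rho ** (n - 1) + (n - 1) * rho ** n


def min_sample_size(rho, conf):
    """Smallest n for which [min, max] is a (rho, conf) tolerance interval."""
    n = 2
    while wilks_confidence(n, rho) < conf:
        n += 1
    return n


def nonparametric_tolerance_interval(sample):
    """Two-sided distribution-free tolerance interval: (min, max)."""
    return min(sample), max(sample)
```

For example, `min_sample_size(0.95, 0.95)` gives the number of observations needed before the sample range can serve as a 95%-content, 95%-confidence interval without assuming any parametric model.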


Author(s): Demba Fofana, E. O. George, Dale Bowman

Abstract:
Background: Analyzing gene expression data rigorously requires taking assumptions into consideration, but also relies on using information about network relations that exist among genes. Combining these different elements can not only improve statistical power, but also provide a better framework through which gene expression can be properly analyzed.
Material and methods: We propose a novel statistical model that combines assumptions and gene network information into the analysis. Assumptions are important since every test statistic is valid only when the required assumptions hold. We therefore propose hybrid p-values and show that, under the null hypothesis of primary interest, these p-values are uniformly distributed. These hybrid p-values take assumptions into consideration. We incorporate gene network information into the analysis because neighboring genes share biological functions. This correlation is taken into account via similar prior probabilities for neighboring genes.
Results: Our approach is compared with other approaches in a series of simulations. Areas under the ROC curve (AUCs) are computed to compare the different methodologies; the AUC based on our methodology is larger than the others. For regression analysis, the AUC from our proposed method contains the AUCs of the Spearman test and of the Pearson test. In addition, true negative rates (TNRs), also known as specificities, are higher with our approach than with the other approaches. For two-group comparison analysis with a sample size of n=10, for instance, the specificity of our proposed methodology is 0.716146, while the specificities of the t-test and the rank-sum test are 0.689223 and 0.69797, respectively. Our method, which combines assumptions and network information into the analysis, is shown to be more powerful.
Conclusions: These proposed procedures are introduced as a general class of methods that can incorporate procedure selection, account for multiple testing, and incorporate graphical network information into the analysis. We obtain very good performance in simulations and in real data analysis.


Author(s): Cindy Xin Feng

Abstract: Count data with excessive zeros are frequently encountered in practice. For example, the number of health services visits often includes many zeros, representing patients with no utilization during the follow-up time. A common feature of this type of data is that the count measure has more zeros than a common count distribution, such as the Poisson or negative binomial, can accommodate. Zero-inflated (ZI) or hurdle models are often used to fit such data. Despite the increasing popularity of ZI and hurdle models, there is still a lack of investigation of the fundamental differences between these two types of models. In this article, we review the zero-inflated and hurdle models and highlight their differences in terms of their data generating processes. We also conduct simulation studies to evaluate the performance of both types of models. The final choice of regression model should be made after a careful assessment of goodness of fit and should be tailored to the particular data in question.
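The difference between the two data generating processes can be sketched in a few lines. The example below assumes a Poisson count component with no covariates; the function names and parameters are hypothetical, and this is an illustration rather than the simulation design used in the article.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw from Poisson(lam) with Knuth's multiplication method
    (adequate for small lam)."""
    limit = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def zip_draw(rng, pi, lam):
    """Zero-inflated Poisson: a zero is either structural (probability pi)
    or a sampling zero from the Poisson component."""
    if rng.random() < pi:
        return 0
    return poisson_draw(rng, lam)


def hurdle_draw(rng, p_zero, lam):
    """Poisson hurdle: all zeros come from one binary process; positive
    counts come from a zero-truncated Poisson (rejection sampling)."""
    if rng.random() < p_zero:
        return 0
    while True:
        y = poisson_draw(rng, lam)
        if y > 0:
            return y
```

Under these assumptions the zero probability is `pi + (1 - pi) * exp(-lam)` for the ZI model but exactly `p_zero` for the hurdle model. This is the structural difference the two model families encode.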


Author(s): Charles K. Amponsah, Tomasz J. Kozubowski, Anna K. Panorska

Abstract: We propose a new stochastic model describing the joint distribution of (X,N), where N is a counting variable and X is the sum of N independent gamma random variables. We present the main properties of this general model, which include marginal and conditional distributions, integral transforms, moments and parameter estimation. We also discuss in more detail a special case where N has a heavy-tailed discrete Pareto distribution. An example from finance illustrates the modeling potential of this new mixed bivariate distribution.
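A pair (X,N) of this kind is straightforward to simulate. The sketch below assumes i.i.d. gamma summands and, purely as a stand-in, a geometric law for N on {1, 2, ...}. The paper's special case uses a discrete Pareto law, whose parameterization is not given in the abstract, so it is not reproduced here. Function and parameter names are hypothetical.

```python
import random


def geometric_draw(rng, p):
    """Number of Bernoulli(p) trials up to and including the first success."""
    n = 1
    while rng.random() >= p:
        n += 1
    return n


def draw_x_n(rng, p, shape, scale):
    """Draw (X, N): N is geometric (an assumed stand-in), and X is the
    sum of N independent Gamma(shape, scale) random variables."""
    n = geometric_draw(rng, p)
    x = sum(rng.gammavariate(shape, scale) for _ in range(n))
    return x, n
```

Under these assumptions, conditional on N = n the sum X is Gamma(n * shape, scale), and E[X] = E[N] * shape * scale = shape * scale / p.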


Author(s): Alexander D. Knudson, Tomasz J. Kozubowski, Anna K. Panorska, A. Grant Schissler

Abstract: We propose a flexible multivariate stochastic model for over-dispersed count data. Our methodology is built upon mixed Poisson random vectors (Y1,…,Yd), where the {Yi} are conditionally independent Poisson random variables. The stochastic rates of the {Yi} follow a multivariate distribution with arbitrary non-negative margins linked by a copula function. We present basic properties of these mixed Poisson multivariate distributions and provide several examples. A particular case with geometric and negative binomial marginal distributions is studied in detail. We illustrate an application of our model by conducting a high-dimensional simulation motivated by RNA-sequencing data.
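The construction can be sketched in the bivariate case. The example below assumes a Gaussian copula and exponential margins for the rates (a Poisson variable with an exponential rate has a geometric marginal). The copula choice, function names and parameters are all assumptions made for illustration, not the paper's implementation.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def poisson_draw(rng, lam):
    """Draw from Poisson(lam) with Knuth's multiplication method
    (adequate for moderate lam)."""
    limit = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def mixed_poisson_pair(rng, rho, mean1, mean2):
    """Draw (Y1, Y2): conditionally independent Poisson counts whose
    rates are exponential and linked by a Gaussian copula."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    rates = []
    for z, mean in ((z1, mean1), (z2, mean2)):
        u = min(STD_NORMAL.cdf(z), 1.0 - 1e-12)  # keep log1p finite
        rates.append(-mean * math.log1p(-u))     # exponential quantile
    return poisson_draw(rng, rates[0]), poisson_draw(rng, rates[1])
```

Each margin here has mean `m` and variance `m + m**2`, so the counts are over-dispersed relative to a Poisson law, while `rho` controls the dependence induced through the rates.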


Author(s): Yixuan Zou, Jan Hannig, Derek S. Young

Abstract: Zero-inflated and hurdle models are widely applied to count data possessing excess zeros, where they can simultaneously model the process by which the zeros were generated and potentially help mitigate the effects of overdispersion relative to the assumed count distribution. Which model to use depends on how the zeros are generated: zero-inflated models add an additional probability mass at zero, while hurdle models are two-part models composed of a degenerate distribution for the zeros and a zero-truncated distribution. Developing confidence intervals for such models is challenging since no closed-form function is available to calculate the mean. In this study, generalized fiducial inference is used to construct confidence intervals for the means of zero-inflated Poisson and Poisson hurdle models. The proposed methods are assessed by an intensive simulation study. An illustrative example demonstrates the inference methods.

