The Relationship between Parsimony and Maximum-Likelihood Analyses: Tree Scores and Confidence Estimates for Three Real Data Sets

2021 ◽  
Author(s):  
Jakob Raymaekers ◽  
Peter J. Rousseeuw

Many real data sets contain numerical features (variables) whose distribution is far from normal (Gaussian). Instead, their distribution is often skewed. In order to handle such data it is customary to preprocess the variables to make them more normal. The Box–Cox and Yeo–Johnson transformations are well-known tools for this. However, the standard maximum likelihood estimator of their transformation parameter is highly sensitive to outliers, and will often try to move outliers inward at the expense of the normality of the central part of the data. We propose a modification of these transformations as well as an estimator of the transformation parameter that is robust to outliers, so the transformed data can be approximately normal in the center and a few outliers may deviate from it. The proposed method compares favorably to existing techniques in an extensive simulation study and on real data.
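For orientation, the classical Box–Cox and Yeo–Johnson transformations mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard textbook formulas only; it is not the robust modification or the robust estimator that the authors propose, and the function names `box_cox` and `yeo_johnson` are chosen here for illustration.

```python
import math


def box_cox(x, lam):
    """Classical Box-Cox transform; defined only for strictly positive x."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def yeo_johnson(x, lam):
    """Classical Yeo-Johnson transform; defined for any real x."""
    if x >= 0:
        if lam == 0:
            return math.log1p(x)
        return ((x + 1.0) ** lam - 1.0) / lam
    # Negative branch uses the mirrored power 2 - lam.
    if lam == 2:
        return -math.log1p(-x)
    return -(((1.0 - x) ** (2.0 - lam)) - 1.0) / (2.0 - lam)


if __name__ == "__main__":
    # Illustrative values only: a right-skewed sample with one large point.
    sample = [0.5, 1.0, 2.0, 4.0, 50.0]
    print([round(box_cox(v, 0.0), 3) for v in sample])
    print([round(yeo_johnson(v, 0.5), 3) for v in sample])
```

With `lam = 1` the Yeo–Johnson transform leaves every value unchanged, and the Box–Cox transform only shifts it by one, which is a convenient sanity check on either implementation.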


2020 ◽  
Vol 9 (1) ◽  
pp. 61-81
Author(s):  
Lazhar BENKHELIFA

A new lifetime model with four positive parameters, called the Weibull Birnbaum-Saunders distribution, is proposed. The proposed model extends the Birnbaum-Saunders distribution and provides great flexibility in modeling data in practice. Some mathematical properties of the new distribution are obtained, including expansions for the cumulative and density functions, moments, generating function, mean deviations, order statistics and reliability. The model parameters are estimated by maximum likelihood. A simulation study is presented to show the performance of the maximum likelihood estimates of the model parameters. The flexibility of the new model is examined by applying it to two real data sets.


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Sandeep Kumar Maurya ◽  
Sanjay K Singh ◽  
Umesh Singh

A one-parameter, right-skewed, heavy-tailed distribution with an upside-down bathtub shape is derived. Various statistical properties are studied, along with maximum likelihood approaches for estimation. Five different real data sets with four different models are considered to illustrate the suitability of the proposed model.


Author(s):  
Fiaz Ahmad Bhatti ◽  
G. G. Hamedani ◽  
Haitham M. Yousof ◽  
Azeem Ali ◽  
Munir Ahmad

A flexible lifetime distribution with increasing, decreasing, inverted bathtub and modified bathtub hazard rate, called Modified Burr XII-Inverse Weibull (MBXII-IW), is introduced and studied. The density function of MBXII-IW can be exponential, left-skewed, right-skewed or symmetrical in shape. Descriptive measures on the basis of quantiles, moments, order statistics and reliability measures are theoretically established. The MBXII-IW distribution is characterized via different techniques. Parameters of the MBXII-IW distribution are estimated using the maximum likelihood method. A simulation study is performed to illustrate the performance of the maximum likelihood estimates (MLEs). The potential of the MBXII-IW distribution is demonstrated by its application to real data sets: serum-reversal times and quarterly earnings.


Symmetry ◽  
2020 ◽  
Vol 12 (3) ◽  
pp. 440 ◽  
Author(s):  
Abdulhakim A. Al-babtain ◽  
I. Elbatal ◽  
Haitham M. Yousof

In this article, we introduced a new extension of the binomial-exponential 2 distribution. We discussed some of its structural mathematical properties. A simple copula-based construction is also presented to construct the bivariate- and multivariate-type distributions. We estimated the model parameters via the maximum likelihood method. Finally, we illustrated the flexibility and potential of the new model in modeling skewed and symmetric data sets through two real data applications.


Symmetry ◽  
2019 ◽  
Vol 11 (11) ◽  
pp. 1361
Author(s):  
Héctor J. Gómez ◽  
Diego I. Gallardo ◽  
Osvaldo Venegas

In this article we study the properties, inference, and statistical applications of a parametric generalization of the truncation positive normal distribution, introducing a new parameter so as to increase the flexibility of the new model. For certain combinations of parameters, the model includes both symmetric and asymmetric shapes. We study the model’s basic properties, maximum likelihood estimators and Fisher information matrix. Finally, we apply it to two real data sets to show the model’s good performance compared to other models with positive support: the first related to the height of the drum of the roller, and the second related to daily cholesterol consumption.


Mathematics ◽  
2021 ◽  
Vol 9 (19) ◽  
pp. 2477
Author(s):  
Seitebaleng Makgai ◽  
Andriette Bekker ◽  
Mohammad Arashi

The Dirichlet distribution is a well-known candidate in modeling compositional data sets. However, in the presence of outliers, the Dirichlet distribution fails to model such data sets, making other model extensions necessary. In this paper, the Kummer–Dirichlet distribution and the gamma distribution are coupled, using the beta-generating technique. This development results in the proposal of the Kummer–Dirichlet gamma distribution, which presents greater flexibility in modeling compositional data sets. Some general properties, such as the probability density functions and the moments, are presented for this new candidate. The method of maximum likelihood is applied in the estimation of the parameters. The usefulness of this model is demonstrated through its application to synthetic and real data sets in which outliers are present.


2021 ◽  
Author(s):  
Karin Schork ◽  
Michael Turewicz ◽  
Julian Uszkoreit ◽  
Jörg Rahnenführer ◽  
Martin Eisenacher

Motivation: In bottom-up proteomics, proteins are enzymatically digested before measurement with mass spectrometry. The relationship between proteins and peptides can be represented by bipartite graphs. This representation is useful to aid protein inference and quantification, which is complex due to the occurrence of shared peptides. We conducted a comprehensive analysis of bipartite graphs using theoretical peptides from in silico digestion of protein databases as well as quantified peptides from real data sets. Results: The graphs based on quantified peptides are smaller and have less complex structures compared to graphs using theoretical peptides. The proportions of protein nodes without unique peptides and of graphs that contain such proteins are considerably greater for real data. Large differences between the two analyzed organisms (mouse and yeast) have been observed at both the database and the quantitative level. Insights from this analysis may be useful for the development of protein inference and quantification algorithms.
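The bipartite protein-peptide representation described in this abstract can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the peptide and protein names are made up, a peptide counts as "unique" when it maps to exactly one protein, and the helper names are hypothetical rather than taken from the authors' software.

```python
from collections import defaultdict, deque


def invert(peptide_to_proteins):
    """Build the protein -> peptides side of the bipartite graph."""
    protein_to_peptides = defaultdict(set)
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            protein_to_peptides[protein].add(peptide)
    return dict(protein_to_peptides)


def connected_components(peptide_to_proteins):
    """Return a list of (proteins, peptides) pairs, one per connected graph."""
    protein_to_peptides = invert(peptide_to_proteins)
    seen = set()
    components = []
    for start in protein_to_peptides:
        if start in seen:
            continue
        proteins, peptides = set(), set()
        queue = deque([start])
        seen.add(start)
        while queue:
            protein = queue.popleft()
            proteins.add(protein)
            for peptide in protein_to_peptides[protein]:
                peptides.add(peptide)
                for neighbour in peptide_to_proteins[peptide]:
                    if neighbour not in seen:
                        seen.add(neighbour)
                        queue.append(neighbour)
        components.append((proteins, peptides))
    return components


def proteins_without_unique_peptides(peptide_to_proteins):
    """Proteins whose peptides are all shared with at least one other protein."""
    protein_to_peptides = invert(peptide_to_proteins)
    return {
        protein
        for protein, peptides in protein_to_peptides.items()
        if all(len(peptide_to_proteins[p]) > 1 for p in peptides)
    }


# Hypothetical toy data: four peptides mapped to four proteins.
toy = {
    "PEP1": {"A"},
    "PEP2": {"A", "B"},
    "PEP3": {"B", "C"},
    "PEP4": {"D"},
}

if __name__ == "__main__":
    print(connected_components(toy))
    print(proteins_without_unique_peptides(toy))
```

In this toy input, proteins B and C are supported only by shared peptides, which is the situation the abstract identifies as making protein inference and quantification difficult.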


2020 ◽  
Vol 57 (4) ◽  
pp. 444-464
Author(s):  
Gauss M. Cordeiro ◽  
Thiago G. Ramires ◽  
Edwin M. M. Ortega ◽  
Rodrigo R. Pescim

We define the extended beta family of distributions to generalize the beta generator pioneered by Eugene et al. [10]. This paper is cited in at least 970 scientific articles and extends more than fifty well-known distributions. Any continuous distribution can be generalized by means of this family. The proposed family can present greater flexibility to model skewed data. Some of its mathematical properties are investigated and maximum likelihood is adopted to estimate its parameters. Further, for different parameter settings and sample sizes, some simulations are conducted. The superiority of the proposed family is illustrated by means of two real data sets.
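As background for this abstract, the classical beta generator takes a baseline distribution with CDF G and density g and produces the density f(x) = g(x) G(x)^(a-1) (1 - G(x))^(b-1) / B(a, b). The sketch below illustrates that classical construction only, not the extended family the authors define; the exponential baseline and all function names are assumptions made for the example.

```python
import math


def beta_function(a, b):
    """Beta function B(a, b) computed via gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def exp_pdf(x, rate=1.0):
    """Baseline density: exponential distribution (an assumed example)."""
    return rate * math.exp(-rate * x)


def exp_cdf(x, rate=1.0):
    """Baseline CDF: exponential distribution (an assumed example)."""
    return 1.0 - math.exp(-rate * x)


def beta_g_pdf(x, a, b, base_pdf=exp_pdf, base_cdf=exp_cdf):
    """Density of the classical beta-generated family for a given baseline."""
    g = base_pdf(x)
    big_g = base_cdf(x)
    return g * big_g ** (a - 1.0) * (1.0 - big_g) ** (b - 1.0) / beta_function(a, b)


def midpoint_integral(func, lower, upper, steps=20000):
    """Simple midpoint rule, used only to check that the density integrates to 1."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


if __name__ == "__main__":
    # With a = b = 1 the generated density reduces to the baseline density.
    print(beta_g_pdf(0.7, 1.0, 1.0), exp_pdf(0.7))
    # The two extra shape parameters a and b change skewness and tail weight.
    print(midpoint_integral(lambda x: beta_g_pdf(x, 2.0, 3.0), 0.0, 40.0))
```

Setting a = b = 1 recovers the baseline, which is why any continuous distribution can serve as the starting point for a generated family of this kind.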


Mathematics ◽  
2020 ◽  
Vol 8 (9) ◽  
pp. 1537
Author(s):  
Juan M. Astorga ◽  
Jimmy Reyes ◽  
Karol I. Santoro ◽  
Osvaldo Venegas ◽  
Héctor W. Gómez

This article introduces an extension of the Power Muth (PM) distribution for modeling positive data sets with a high coefficient of kurtosis. The resulting distribution has greater kurtosis than the PM distribution. We show that the density can be represented based on the incomplete generalized integro-exponential function. We study some of its properties and moments, and its coefficients of asymmetry and kurtosis. We estimate the parameters using the method of moments and maximum likelihood, and present a simulation study to illustrate parameter recovery. The results of application to two real data sets indicate that the new model performs very well in the presence of outliers.

