Modelling complex geological circular data with the projected normal distribution and mixtures of von Mises distributions

Solid Earth ◽  
2014 ◽  
Vol 5 (2) ◽  
pp. 631-639 ◽  
Author(s):  
R. M. Lark ◽  
D. Clifford ◽  
C. N. Waters

Abstract. Circular data are commonly encountered in the earth sciences, and statistical descriptions and inferences about such data are necessary in structural geology. In this paper we compare two statistical distributions appropriate for complex circular data sets: the mixture of von Mises distributions and the projected normal distribution. We show how the number of components in a mixture of von Mises distributions may be chosen, and how one may choose between the projected normal distribution and the mixture of von Mises distributions for a particular data set. We illustrate these methods with some structural geological data, showing how the fitted models can complement geological interpretation and permit statistical inference. One of our data sets suggests a special case of the projected normal distribution, which we discuss briefly.
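As a rough illustration of the mixture of von Mises model named in this abstract, the sketch below evaluates the mixture density and an information criterion for a given number of components. The abstract does not say which criterion the authors use, so AIC is an assumption here, as are all function names; the Bessel function is computed from its power series to keep the example free of third-party dependencies.

```python
import math

def bessel_i0(x):
    """Modified Bessel function of the first kind, order 0, from its power series."""
    total, term, k = 1.0, 1.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total

def von_mises_pdf(theta, mu, kappa):
    """Density of a von Mises distribution with mean direction mu and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))

def mixture_pdf(theta, weights, mus, kappas):
    """Weighted sum of von Mises densities; weights are assumed to sum to 1."""
    return sum(w * von_mises_pdf(theta, m, k) for w, m, k in zip(weights, mus, kappas))

def log_likelihood(angles, weights, mus, kappas):
    return sum(math.log(mixture_pdf(t, weights, mus, kappas)) for t in angles)

def aic(angles, weights, mus, kappas):
    """AIC for a k-component mixture: k means, k concentrations, k - 1 free weights.
    AIC is an assumed criterion, not necessarily the one used in the paper."""
    n_params = 3 * len(weights) - 1
    return 2 * n_params - 2 * log_likelihood(angles, weights, mus, kappas)
```

Under this sketch, one would fit mixtures with one, two, three or more components, compute the criterion for each fit, and prefer the model with the smallest value.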

2013 ◽  
Vol 5 (2) ◽  
pp. 2181-2202
Author(s):  
R. M. Lark ◽  
D. Clifford ◽  
C. N. Waters

Abstract. Angular data are commonly encountered in the earth sciences and statistical descriptions and inferences about such data are necessary in structural geology. In this paper we compare two statistical distributions appropriate for complex angular data sets: the mixture of von Mises and the projected normal distribution. We show how the number of components in a mixture of von Mises distribution may be chosen, and how one may choose between the projected normal distribution and the mixture of von Mises for a particular data set. We illustrate these methods with some structural geological data, showing how the fitted models can complement geological interpretation and permit statistical inference. One of our data sets suggests a special case of the projected normal distribution which we discuss briefly.


2011 ◽  
Vol 11 (3) ◽  
pp. 185-201 ◽  
Author(s):  
Gabriel Nuñez-Antonio ◽  
Eduardo Gutiérrez-Peña ◽  
Gabriel Escarela

2017 ◽  
Vol 5 (4) ◽  
pp. 1
Author(s):  
I. E. Okorie ◽  
A. C. Akpanta ◽  
J. Ohakwe ◽  
D. C. Chikezie ◽  
C. U. Onyemachi ◽  
...  

This paper introduces a new generator of probability distributions, the adjusted log-logistic generalized (ALLoG) distribution, and a new extension of the standard one-parameter exponential distribution called the adjusted log-logistic generalized exponential (ALLoGExp) distribution. The ALLoGExp distribution is a special case of the ALLoG distribution, and we provide some of its statistical and reliability properties. Notably, the failure rate can be monotonically decreasing, increasing, or upside-down bathtub shaped, depending on the values of the parameters $\delta$ and $\theta$. Maximum likelihood estimation is proposed for estimating the model parameters. The importance and flexibility of the ALLoGExp distribution are demonstrated with a real, uncensored lifetime data set, and its fit is compared with that of five other exponential-related distributions. The results of the model fittings show that the ALLoGExp distribution provides a reasonably better fit than the other fitted distributions. The ALLoGExp distribution is therefore recommended for effective modelling of lifetime data sets.


2020 ◽  
Author(s):  
Michał Ciach ◽  
Błażej Miasojedow ◽  
Grzegorz Skoraczyński ◽  
Szymon Majewski ◽  
Michał Startek ◽  
...  

Abstract. A common theme in many applications of computational mass spectrometry is fitting a linear combination of reference spectra to an experimental one in order to estimate the quantities of different ions, potentially with overlapping isotopic envelopes. In this work, we study this procedure in an abstract setting, in order to develop new approaches applicable to a diverse range of experiments. We introduce an application of a new spectral dissimilarity measure, known in other fields as the Wasserstein or the Earth Mover’s distance, in order to overcome the sensitivity of ordinary linear regression to measurement inaccuracies. Using a data set of 200 mass spectra, we demonstrate that our approach is capable of accurate estimation of ion proportions without the extensive pre-processing required for state-of-the-art methods. The conclusions are further substantiated using data sets simulated in a way that mimics most of the measurement inaccuracies occurring in real experiments. We have implemented our methods in a Python 3 package, freely available at https://github.com/mciach/masserstein.
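To make the dissimilarity measure in this abstract concrete, the sketch below computes the first-order Wasserstein (Earth Mover's) distance between two one-dimensional spectra as the area between their cumulative distributions. It shows only the distance itself, not the fitting procedure described in the abstract, and it does not use or reproduce the masserstein package API; the function name and the representation of a spectrum as a list of `(position, intensity)` pairs are assumptions made for this example.

```python
def wasserstein_1d(spec_a, spec_b):
    """First-order Wasserstein distance between two spectra.

    Each spectrum is a list of (position, intensity) pairs (a hypothetical
    representation). Intensities are normalised to sum to 1 before comparison.
    """
    def normalise(spec):
        total = sum(intensity for _, intensity in spec)
        return [(x, intensity / total) for x, intensity in spec]

    a, b = normalise(spec_a), normalise(spec_b)
    # Signed mass at each position: positive for spectrum a, negative for b.
    events = sorted([(x, w) for x, w in a] + [(x, -w) for x, w in b])

    distance, cdf_diff, prev_x = 0.0, 0.0, events[0][0]
    for x, w in events:
        # Area between the two cumulative distributions since the last position.
        distance += abs(cdf_diff) * (x - prev_x)
        cdf_diff += w
        prev_x = x
    return distance
```

Unlike a pointwise comparison of intensities, this distance grows smoothly as a peak is shifted along the axis, which is the property that makes it less sensitive to small measurement inaccuracies in peak position.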


2020 ◽  
Vol 15 ◽  
pp. 42-51
Author(s):  
Shou-Jen Chang-Chien ◽  
Wajid Ali ◽  
Miin-Shen Yang

Clustering is a method for analyzing grouped data. Circular data arise in many applications, such as wind directions and the departure directions of migrating birds or animals. The expectation-maximization (EM) algorithm for mixtures of von Mises distributions is widely used for clustering circular data. In general, however, the EM algorithm is sensitive to initial values, is not robust to outliers, and requires the number of clusters to be specified a priori. In this paper, we consider a learning-based schema for EM and propose a learning-based EM algorithm on mixtures of von Mises distributions for clustering grouped circular data. The proposed method requires no initialization, is robust to outliers, and finds the number of clusters automatically. Numerical and real data sets are used to compare the proposed algorithm with existing methods. The experimental results and comparisons demonstrate the effectiveness and superiority of the proposed learning-based EM algorithm.
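For orientation, the sketch below shows the standard EM iteration for a mixture of von Mises distributions, that is, the baseline this abstract describes as sensitive to initial values, not the learning-based variant the authors propose. The function names are hypothetical, and the concentration update uses a common closed-form approximation rather than an exact solution; both are assumptions of this example.

```python
import math

def bessel_i0(x):
    """Modified Bessel function of the first kind, order 0, from its power series."""
    total, term, k = 1.0, 1.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total

def vm_pdf(theta, mu, kappa):
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))

def em_step(angles, weights, mus, kappas):
    """One EM iteration: responsibilities (E-step), then parameter updates (M-step)."""
    k = len(weights)
    sum_r, sum_c, sum_s = [0.0] * k, [0.0] * k, [0.0] * k
    for theta in angles:
        dens = [weights[j] * vm_pdf(theta, mus[j], kappas[j]) for j in range(k)]
        norm = sum(dens)
        for j in range(k):
            r = dens[j] / norm  # responsibility of component j for this angle
            sum_r[j] += r
            sum_c[j] += r * math.cos(theta)
            sum_s[j] += r * math.sin(theta)
    new_w, new_mu, new_kappa = [], [], []
    for j in range(k):
        new_w.append(sum_r[j] / len(angles))
        new_mu.append(math.atan2(sum_s[j], sum_c[j]) % (2.0 * math.pi))
        rbar = math.hypot(sum_c[j], sum_s[j]) / sum_r[j]  # weighted mean resultant length
        # Closed-form approximation to the concentration (an assumed shortcut).
        new_kappa.append(rbar * (2.0 - rbar ** 2) / (1.0 - rbar ** 2))
    return new_w, new_mu, new_kappa

def fit(angles, weights, mus, kappas, iterations=50):
    """Run a fixed number of EM iterations from the supplied starting values."""
    for _ in range(iterations):
        weights, mus, kappas = em_step(angles, weights, mus, kappas)
    return weights, mus, kappas
```

Note that `fit` needs both the number of components and starting values to be supplied by the caller, which is exactly the dependence the abstract's proposed method aims to remove.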


2021 ◽  
Vol 23 (5) ◽  
Author(s):  
Gregor Jordan ◽  
Roland F. Staack

Abstract. The testing of protein drug candidates for inducing the generation of anti-drug antibodies (ADA) plays a fundamental role in drug development. The basis of the testing strategy includes a screening assay followed by a confirmatory test. Screening assay cut points (CP) are calculated mainly based on two approaches, either non-parametric, when the data set does not appear normally distributed, or parametric, in the case of a normal distribution. A normal distribution of data is preferred and may be achieved after outlier exclusion and, if necessary, transformation of the data. The authors present a Weibull transformation and a comparison with a decision tree-based approach that was tested on 10 data sets (healthy human volunteer matrix, different projects). Emphasis is placed on a transformation calculation that can be easily reproduced to make it accessible to non-mathematicians. The cut point value and the effect on the false positive rate as well as the number of excluded samples of both methods are compared.
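The two families of cut point calculation mentioned in this abstract can be sketched as follows. The abstract gives no formulas, so the specific forms here are assumptions: a parametric cut point taken as the mean plus a normal quantile times the standard deviation, a non-parametric cut point taken as an empirical percentile, and a 5% target false positive rate. The Weibull transformation and the decision tree-based approach from the paper are not reproduced.

```python
import statistics

def parametric_cut_point(values, false_positive_rate=0.05):
    """Mean + z * SD, assuming the (possibly transformed) data are normally distributed.
    The 5% default false positive rate is an assumption, not taken from the abstract."""
    z = statistics.NormalDist().inv_cdf(1.0 - false_positive_rate)
    return statistics.mean(values) + z * statistics.stdev(values)

def nonparametric_cut_point(values, percentile=95):
    """Empirical percentile of the data, for use when normality cannot be assumed."""
    cut_points = statistics.quantiles(values, n=100, method="inclusive")
    return cut_points[percentile - 1]
```

In practice both functions would be applied after outlier exclusion, and the parametric version only after any transformation needed to make the data look normal.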


Author(s):  
Fatin Najihah Badarisam ◽  
Adzhar Rambli ◽  
Mohammad Illyas Sidik

This paper compares two discordancy tests, one based on a robust statistic and one on a non-robust statistic, for detecting a single outlier in univariate circular data. To the best of the authors' knowledge, no previous study has compared the RCDu statistic and the $G_1$ statistic. The test statistics are based on the circular median and spacing theory. In addition, both statistics can detect multiple and patch outliers. The performance of the RCDu and $G_1$ statistics is assessed by the proportion of correct outlier detection and by masking and swamping effects. First, cut-off points for both statistics were obtained by Monte Carlo simulation, using samples generated from the von Mises (VM) distribution for combinations of sample size and concentration parameter. The estimation of cut-off points for both statistics was repeated 3000 times at the 10%, 5% and 1% upper percentiles. The RCDu statistic performs well in correctly detecting a single outlier. Moreover, it has a lower masking rate than the $G_1$ statistic. However, the $G_1$ statistic has a lower swamping rate than the RCDu statistic. Overall, the RCDu statistic performs better than the $G_1$ statistic in detecting a single outlier in von Mises (VM) samples. As an illustration, both statistics were applied to a real data set from a series of experiments investigating the homing ability of northern cricket frogs.


2010 ◽  
Vol 3 (1) ◽  
pp. 293-307 ◽  
Author(s):  
P. J. Applegate ◽  
N. M. Urban ◽  
B. J. C. Laabs ◽  
K. Keller ◽  
R. B. Alley

Abstract. Geomorphic process modeling allows us to evaluate different methods for estimating moraine ages from cosmogenic exposure dates, and may provide a means to identify the processes responsible for the excess scatter among exposure dates on individual moraines. Cosmogenic exposure dating is an elegant method for estimating the ages of moraines, but individual exposure dates are sometimes biased by geomorphic processes. Because exposure dates may be either "too young" or "too old," there are a variety of methods for estimating the ages of moraines from exposure dates. In this paper, we present Monte Carlo-based models of moraine degradation and inheritance of cosmogenic nuclides, and we use the models to examine the effectiveness of these methods. The models estimate the statistical distributions of exposure dates that we would expect to obtain from single moraines, given reasonable geomorphic assumptions. The model of moraine degradation is based on prior examples, but the inheritance model is novel. The statistical distributions of exposure dates from the moraine degradation model are skewed toward young values; in contrast, the statistical distributions of exposure dates from the inheritance model are skewed toward old values. Sensitivity analysis shows that this difference is robust for reasonable parameter choices. Thus, the skewness can help indicate whether a particular data set has problems with inheritance or moraine degradation. Given representative distributions from these two models, we can determine which methods of estimating moraine ages are most successful in recovering the correct age for test cases where this value is known. The mean is a poor estimator of moraine age for data sets drawn from skewed parent distributions, and excluding outliers before calculating the mean does not improve this mismatch. The extreme estimators (youngest date and oldest date) perform well under specific circumstances, but fail in other cases. 
We suggest a simple estimator that uses the skewnesses of individual data sets to determine whether the youngest date, mean, or oldest date will provide the best estimate of moraine age. Although this method is perhaps the most globally robust of the estimators we tested, it sometimes fails spectacularly. The failure of simple methods to provide accurate estimates of moraine age points toward a need for more sophisticated statistical treatments.
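The skewness-based estimator described in this abstract can be sketched as below. The abstract does not give the decision rule in detail, so several things are assumptions: that "skewed toward young values" means a tail toward young ages (negative skewness), that degradation-type data are best summarised by the oldest date and inheritance-type data by the youngest, and the threshold of 0.5, which is a purely hypothetical value.

```python
import statistics

def sample_skewness(values):
    """Moment-based sample skewness: m3 / m2^(3/2)."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def estimate_moraine_age(dates, threshold=0.5):
    """Choose youngest date, mean, or oldest date from the sample skewness.
    The threshold value is hypothetical, not taken from the paper."""
    skew = sample_skewness(dates)
    if skew < -threshold:
        # Tail toward young ages: assumed to indicate moraine degradation.
        return max(dates)
    if skew > threshold:
        # Tail toward old ages: assumed to indicate inherited nuclides.
        return min(dates)
    return statistics.fmean(dates)
```

With small samples the skewness estimate is itself noisy, which is one plausible reason a rule of this kind can fail badly on individual data sets, as the abstract reports.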


1977 ◽  
Vol 7 (3) ◽  
pp. 481-487 ◽  
Author(s):  
W. L. Hafley ◽  
H. T. Schreuder

The beta, Johnson's SB, Weibull, lognormal, gamma, and normal distributions are discussed in terms of their flexibility in the skewness squared (β1) − kurtosis (β2) plane. The SB and the beta are clearly the most flexible distributions since they represent surfaces in the plane, whereas the Weibull, lognormal, and gamma are represented by lines, and the normal is represented by a single point. The six distributions are fit to 21 data sets for which both diameters and heights are available. The log likelihood criterion is used to rank the six distributions in regard to their fit to each data set. Overall, Johnson's SB distribution gave the best performance in terms of quality of fit to the variety of sample distributions.
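To place a sample in the (β1, β2) plane discussed in this abstract, one can compute the squared skewness and the kurtosis from central moments, as in the sketch below. The function name is hypothetical and the plain moment estimators are an assumption; the abstract does not say how the authors estimated these quantities.

```python
def beta1_beta2(values):
    """Return (beta1, beta2): squared skewness and kurtosis from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    beta1 = m3 ** 2 / m2 ** 3   # skewness squared
    beta2 = m4 / m2 ** 2        # kurtosis (not excess kurtosis)
    return beta1, beta2
```

The normal distribution corresponds to the single point (0, 3) in this plane, so a sample whose (β1, β2) pair lies far from that point calls for one of the more flexible families.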

