A comparison of two discordancy tests to detect outliers in a von Mises (VM) sample

Author(s):  
Fatin Najihah Badarisam ◽  
Adzhar Rambli ◽  
Mohammad Illyas Sidik

This paper compares two discordancy tests, one robust and one non-robust, for detecting a single outlier in univariate circular data. To the best of the authors' knowledge, no previous work has compared the RCDu statistic and the G1 statistic. The test statistics are based on the circular median and on spacing theory, and both can also detect multiple outliers and patches of outliers. The performance of the RCDu and G1 statistics is assessed in terms of the proportion of correct outlier detection and the masking and swamping rates. First, we obtained cut-off points for both statistics through Monte Carlo simulation, generating samples from the von Mises (VM) distribution over combinations of sample size and concentration parameter. The estimation of the cut-off points for both statistics was repeated 3000 times at the 10%, 5% and 1% upper percentiles. The results show that the RCDu statistic performs well in correctly detecting a single outlier and has a lower masking rate than the G1 statistic. However, the G1 statistic is better than the RCDu statistic with respect to swamping, owing to its lower swamping rate. Overall, the RCDu statistic outperforms the G1 statistic in detecting a single outlier in a von Mises (VM) sample. As an illustration, both statistics were applied to a real data set from a series of experiments investigating the homing ability of northern cricket frogs.
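The cut-off-point procedure described above can be sketched in a few lines. The abstract does not give the formulas for the RCDu or G1 statistics, so the discordancy statistic below is a simplified, hypothetical median-based stand-in (largest circular distance from the circular median, scaled by the mean distance); only the Monte Carlo percentile machinery mirrors the paper's procedure.

```python
import numpy as np

def circ_dist(a, b):
    # circular distance between angles (in radians)
    return np.pi - np.abs(np.pi - np.abs(a - b))

def circular_median(theta):
    # pick the sample point minimizing the total circular distance
    costs = [np.sum(circ_dist(theta, m)) for m in theta]
    return theta[int(np.argmin(costs))]

def discordancy_stat(theta):
    # illustrative statistic: max distance from the circular median,
    # scaled by the mean distance (NOT the actual RCDu or G1 formula)
    d = circ_dist(theta, circular_median(theta))
    return d.max() / d.mean()

def mc_cutoff(n, kappa, level=0.05, reps=3000, seed=0):
    # Monte Carlo cut-off: upper percentile of the null distribution
    # under von Mises samples of size n with concentration kappa
    rng = np.random.default_rng(seed)
    stats = [discordancy_stat(rng.vonmises(0.0, kappa, n) % (2 * np.pi))
             for _ in range(reps)]
    return float(np.quantile(stats, 1.0 - level))
```

With a fixed seed, the cut-offs are nested across the 10%, 5% and 1% levels by construction, since they are percentiles of the same simulated null distribution.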

Author(s):  
Parisa Torkaman

The generalized inverted exponential distribution is introduced as a lifetime model with good statistical properties. In this paper, estimation of its probability density function and cumulative distribution function is considered using five different estimation methods: the uniformly minimum variance unbiased (UMVU), maximum likelihood (ML), least squares (LS), weighted least squares (WLS) and percentile (PC) estimators. The performance of these estimation procedures is compared by numerical simulation in terms of the mean squared error (MSE). The simulation studies show that the UMVU estimator performs better than the others, and that when the sample size is large enough the ML and UMVU estimators are almost equivalent and more efficient than the LS, WLS and PC estimators. Finally, the results are illustrated using a real data set.
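The MSE-based comparison of estimators can be illustrated with a minimal simulation. The generalized inverted exponential density is not reproduced in the abstract, so this sketch uses the ordinary exponential distribution and compares only an ML estimator against a percentile (median-based) estimator; the function name and setup are illustrative, not the paper's.

```python
import numpy as np

def mse_comparison(true_rate=1.5, n=50, reps=2000, seed=1):
    # simulate repeated samples and compare estimators by mean squared error
    rng = np.random.default_rng(seed)
    ml_err, pc_err = [], []
    for _ in range(reps):
        x = rng.exponential(1.0 / true_rate, n)
        ml = 1.0 / x.mean()               # maximum likelihood estimator
        pc = np.log(2.0) / np.median(x)   # percentile (median) estimator
        ml_err.append((ml - true_rate) ** 2)
        pc_err.append((pc - true_rate) ** 2)
    return float(np.mean(ml_err)), float(np.mean(pc_err))
```

For the exponential model the ML estimator is asymptotically efficient, so its simulated MSE comes out below that of the percentile estimator, mirroring the kind of ranking the paper reports for ML versus PC.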


F1000Research ◽  
2020 ◽  
Vol 8 ◽  
pp. 2024
Author(s):  
Joshua P. Zitovsky ◽  
Michael I. Love

Allelic imbalance occurs when the two alleles of a gene are differentially expressed within a diploid organism and can indicate important differences in cis-regulation and epigenetic state across the two chromosomes. Because of this, the ability to accurately quantify the proportion at which each allele of a gene is expressed is of great interest to researchers. This becomes challenging in the presence of small read counts and/or sample sizes, which can cause estimators for allelic expression proportions to have high variance. Investigators have traditionally dealt with this problem by filtering out genes with small counts and samples. However, this may inadvertently remove important genes that have truly large allelic imbalances. Another option is to use pseudocounts or Bayesian estimators to reduce the variance. To this end, we evaluated the accuracy of four different estimators, the latter two of which are Bayesian shrinkage estimators: maximum likelihood, adding a pseudocount to each allele, approximate posterior estimation of GLM coefficients (apeglm) and adaptive shrinkage (ash). We also wrote C++ code to quickly calculate ML and apeglm estimates and integrated it into the apeglm package. The four methods were evaluated on two simulations and one real data set. Apeglm consistently performed better than ML according to a variety of criteria, and generally outperformed use of pseudocounts as well. Ash also performed better than ML in one of the simulations, but in the other performance was more mixed. Finally, when compared to five other packages that also fit beta-binomial models, the apeglm package was substantially faster and more numerically reliable, making our package useful for quick and reliable analyses of allelic imbalance. Apeglm is available as an R/Bioconductor package at http://bioconductor.org/packages/apeglm.
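The contrast between the plain ML estimator and the pseudocount approach mentioned above is easy to show concretely. This is only the baseline comparison from the abstract; apeglm's actual approximate-posterior shrinkage is considerably more involved and is implemented in the R/Bioconductor package, not here.

```python
def ml_proportion(alt_count, total_count):
    # maximum likelihood estimate of the allelic proportion
    return alt_count / total_count

def pseudocount_proportion(alt_count, total_count, pc=1.0):
    # shrink toward 0.5 by adding a pseudocount to each allele;
    # with small counts this tempers extreme estimates
    return (alt_count + pc) / (total_count + 2 * pc)
```

With only 2 reads, both from one allele, ML returns an extreme proportion of 1.0 while the pseudocount estimate is pulled back to 0.75; with large counts the two estimators nearly agree, which is why shrinkage mainly matters for low-coverage genes.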


Filomat ◽  
2018 ◽  
Vol 32 (17) ◽  
pp. 5931-5947
Author(s):  
Hatami Mojtaba ◽  
Alamatsaz Hossein

In this paper, we propose a new transformation of circular random variables based on circular distribution functions, which we shall call the inverse distribution function (idf) transformation. We show that the Möbius transformation is a special case of our idf transformation. Very general results are provided for the properties of the proposed family of idf transformations, including their trigonometric moments, maximum entropy, random variate generation, finite mixture and modality properties. In particular, we shall focus our attention on a subfamily of the general family when the idf transformation is based on the cardioid circular distribution function. Modality and shape properties are investigated for this subfamily. In addition, we obtain further statistical properties for the resulting distribution by applying the idf transformation to a random variable following a von Mises distribution. In fact, we shall introduce the Cardioid-von Mises (CvM) distribution and estimate its parameters by the maximum likelihood method. Finally, an application of the CvM family and its inferential methods is illustrated using a real data set containing times of gun crimes in Pittsburgh, Pennsylvania.
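One of the listed properties, random variate generation through a circular distribution function, can be sketched for the cardioid case. The code below assumes the standard cardioid CDF with mean direction 0 and |rho| < 0.5, and inverts it numerically by bisection; this illustrates the inverse-CDF mechanism only, not the paper's full idf construction.

```python
import numpy as np

def cardioid_cdf(theta, rho=0.3):
    # CDF of the cardioid distribution on [0, 2*pi) with mean direction 0;
    # its density is (1 + 2*rho*cos(theta)) / (2*pi), requiring |rho| < 0.5
    return theta / (2 * np.pi) + rho * np.sin(theta) / np.pi

def cardioid_invert(u, rho=0.3, tol=1e-10):
    # numerical inversion of the (strictly increasing) cardioid CDF
    lo, hi = 0.0, 2 * np.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cardioid_cdf(mid, rho) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_cardioid(n, rho=0.3, seed=0):
    # inverse-CDF sampling: push uniform variates through the inverse CDF
    rng = np.random.default_rng(seed)
    return np.array([cardioid_invert(u, rho) for u in rng.uniform(size=n)])
```

Because the cardioid density is strictly positive for |rho| < 0.5, the CDF is strictly increasing and the bisection always converges to the unique preimage.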


In this paper, we define a new two-parameter Lindley half Cauchy (NLHC) distribution using the Lindley-G family of distributions, which accommodates increasing, decreasing and a variety of monotone failure rates. Statistical properties of the proposed distribution, such as the probability density function, cumulative distribution function, quantiles, and measures of skewness and kurtosis, are presented. We briefly describe three well-known estimation methods, namely the maximum likelihood (MLE), least-squares (LSE) and Cramér-von Mises (CVM) methods. All computations are performed in R. Using the maximum likelihood method, we construct asymptotic confidence intervals for the model parameters. We verify empirically the potential of the new distribution for modeling a real data set.


2020 ◽  
Vol 44 (5) ◽  
pp. 362-375
Author(s):  
Tyler Strachan ◽  
Edward Ip ◽  
Yanyan Fu ◽  
Terry Ackerman ◽  
Shyh-Huei Chen ◽  
...  

As a method to derive a “purified” measure along a dimension of interest from response data that are potentially multidimensional in nature, the projective item response theory (PIRT) approach requires first fitting a multidimensional item response theory (MIRT) model to the data before projecting onto a dimension of interest. This study aims to explore how accurate the PIRT results are when the estimated MIRT model is misspecified. Specifically, we focus on using a (potentially misspecified) two-dimensional (2D)-MIRT for projection because of its advantages, including interpretability, identifiability, and computational stability, over higher dimensional models. Two large simulation studies (I and II) were conducted. Both studies examined whether the fitting of a 2D-MIRT is sufficient to recover the PIRT parameters when multiple nuisance dimensions exist in the test items, which were generated, respectively, under compensatory MIRT and bifactor models. Various factors were manipulated, including sample size, test length, latent factor correlation, and number of nuisance dimensions. The results from simulation studies I and II showed that the PIRT was overall robust to a misspecified 2D-MIRT. Smaller third and fourth simulation studies were done to evaluate recovery of the PIRT model parameters when the correctly specified higher dimensional MIRT or bifactor model was fitted with the response data. In addition, a real data set was used to illustrate the robustness of PIRT.
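The projection step at the heart of PIRT can be sketched for a single compensatory 2D-MIRT item. Following the general idea of integrating out the nuisance dimension over its conditional distribution given the dimension of interest, the code below uses Gauss-Hermite quadrature; it is a simplified illustration of the projection mechanism, not the estimation procedure used in the study.

```python
import numpy as np

def mirt2_prob(theta1, theta2, a1, a2, d):
    # compensatory 2D-MIRT item response probability (logistic form)
    return 1.0 / (1.0 + np.exp(-(a1 * theta1 + a2 * theta2 + d)))

def projected_prob(theta1, a1, a2, d, rho):
    # project onto dimension 1: average the 2D response probability over
    # the conditional normal distribution of theta2 given theta1
    # (standard bivariate normal abilities with correlation rho)
    nodes, w = np.polynomial.hermite_e.hermegauss(31)
    t2 = rho * theta1 + np.sqrt(1.0 - rho ** 2) * nodes
    return float(np.sum(w * mirt2_prob(theta1, t2, a1, a2, d)) / np.sum(w))
```

When the item loads only on the dimension of interest (a2 = 0), the projection leaves the item response function unchanged, which is a useful sanity check on the quadrature.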


Author(s):  
Arun Kumar Chaudhary ◽  
Vijay Kumar

In this study, we introduce a three-parameter probabilistic model derived from the type I half-logistic generating family, called the half-logistic modified exponential distribution. Its mathematical and statistical properties are explored, and the behavior of the probability density, hazard rate, and quantile functions is investigated. The model parameters are estimated using three well-known estimation methods, namely maximum likelihood estimation (MLE), least-squares estimation (LSE) and Cramér-von Mises estimation (CVME). Further, we apply the presented model to a real data set and verify that it is quite useful and flexible for dealing with real data. KEYWORDS: Half-logistic distribution, Estimation, CVME, LSE, MLE
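The three estimation criteria named above differ only in the objective being optimized. The half-logistic modified exponential CDF is not given in the abstract, so this sketch writes the LSE and CVME objectives for the plain exponential CDF and minimizes them with a simple golden-section search; the MLE for the exponential has the closed form 1/mean.

```python
import numpy as np

def exp_cdf(x, lam):
    return 1.0 - np.exp(-lam * x)

def lse_objective(lam, x):
    # least squares: distance between fitted CDF and plotting positions i/(n+1)
    xs = np.sort(x); n = len(xs)
    p = np.arange(1, n + 1) / (n + 1)
    return np.sum((exp_cdf(xs, lam) - p) ** 2)

def cvme_objective(lam, x):
    # Cramer-von Mises: plotting positions (2i-1)/(2n) plus the 1/(12n) constant
    xs = np.sort(x); n = len(xs)
    p = (2 * np.arange(1, n + 1) - 1) / (2 * n)
    return 1.0 / (12 * n) + np.sum((exp_cdf(xs, lam) - p) ** 2)

def minimize_scalar(f, lo, hi, iters=200):
    # golden-section search for a unimodal objective on [lo, hi]
    g = (np.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - g * (b - a), a + g * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)
```

On a simulated exponential sample, all three criteria recover roughly the same rate, which is the typical situation the comparison studies quantify more finely.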


2021 ◽  
Vol 36 ◽  
pp. 01006
Author(s):  
Kooi Huat Ng ◽  
Kok Haur Ng ◽  
Jeng Young Liew

It is crucial to recognize when a process has changed and to what extent it has changed. If practitioners can determine the time point of the change, they have a smaller search window in which to pursue the special cause; as a result, the special cause can be discovered more quickly and the actions needed to improve quality can be triggered sooner. In this paper, we demonstrate a robust modified individuals control chart, in the spirit of exploratory data analysis, that incorporates the M-scale estimator, and we compare it with existing charts. The proposed chart uses the M-scale estimator to compute the process standard deviation and offers substantial improvements over the existing median-absolute-deviation framework. When applied to a real data set, the proposed approach performs better than the typical robust control chart and outperforms other conventional charts, particularly in the presence of contamination. For these reasons, the proposed modified robust individuals control chart is preferred, especially when outliers may be present in the data-collection process.
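The advantage of robust limits under contamination can be shown with a toy example. The paper's M-scale estimator is not specified in the abstract, so this sketch substitutes the MAD (the baseline framework the paper improves upon) as the robust scale; the comparison with classical mean/standard-deviation limits illustrates the same resistance-to-outliers argument.

```python
import numpy as np

def robust_individuals_limits(x, k=3.0):
    # robust center and scale: median and MAD, with the MAD scaled by
    # 1.4826 to be consistent with the standard deviation under normality
    center = np.median(x)
    scale = 1.4826 * np.median(np.abs(x - center))
    return center - k * scale, center + k * scale

def classical_individuals_limits(x, k=3.0):
    # classical limits: sample mean plus/minus k sample standard deviations
    return x.mean() - k * x.std(ddof=1), x.mean() + k * x.std(ddof=1)
```

Adding a single gross outlier to otherwise clean data inflates the classical upper limit noticeably, while the median/MAD-based limit barely moves, so the outlier remains clearly flagged.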


2019 ◽  
Vol 9 (18) ◽  
pp. 3801 ◽  
Author(s):  
Hyuk-Yoon Kwon

In this paper, we propose a method to construct a lightweight key-value store based on Windows native features. The main idea is to provide a thin wrapper for the key-value store on top of a built-in storage facility in Windows, called the Windows registry. First, we define a mapping of the components of the key-value store onto the components of the Windows registry. Then, we present a hash-based multi-level registry index that distributes the key-value data evenly and accesses them efficiently. Third, we implement the basic operations of the key-value store (i.e., Get, Put, and Delete) by manipulating the Windows registry through the Windows native APIs. We call the proposed key-value store WR-Store. Finally, we propose an efficient ETL (Extract-Transform-Load) method to migrate data stored in WR-Store into any other environment that supports existing key-value stores. Because the performance of the Windows registry has not been studied much, we perform an empirical study to understand the characteristics of WR-Store, and then tune its performance to find the best parameter setting. Through extensive experiments using synthetic and real data sets, we show that the performance of WR-Store is comparable to or even better than state-of-the-art systems (i.e., RocksDB, BerkeleyDB, and LevelDB). In particular, we show the scalability of WR-Store: it becomes much more efficient than the other key-value stores as the size of the data set increases. In addition, we show that the performance of WR-Store is maintained even under intensive registry workloads in which 1000 processes actively accessing the registry run concurrently.
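The hash-based multi-level index idea can be sketched independently of the Windows registry itself. The function below derives a fixed-depth subtree path from a key's hash so that entries spread evenly across intermediate registry keys; the function name, path layout, and parameters are hypothetical stand-ins for whatever scheme WR-Store actually uses, and the real system would create these keys through the Windows native APIs.

```python
import hashlib

def registry_path(key, levels=2, fanout=256, root="HKCU\\Software\\WRStore"):
    # derive 'levels' intermediate bucket names from the key's hash so that
    # keys are distributed evenly across subtrees of the root registry key
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    parts = [format(digest[i] % fanout, "02x") for i in range(levels)]
    return "\\".join([root] + parts + [key])
```

Because the path is a pure function of the key, Get, Put, and Delete can each locate the right registry subtree directly, without scanning; the fanout bounds how many values accumulate under any one intermediate key.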


2021 ◽  
Vol 8 (1) ◽  
pp. 01-09
Author(s):  
Sanku Dey ◽  
Mahendra Saha ◽  
Sankar Goswami

This paper addresses different methods of estimating the unknown parameter of the one-parameter A(α) distribution from the frequentist point of view. We briefly describe several approaches, namely the maximum likelihood estimator, the least-squares and weighted least-squares estimators, the maximum product of spacings estimator and the Cramér-von Mises estimator, and compare them using extensive numerical simulations. Next, we obtain parametric bootstrap confidence intervals for the parameter using frequentist approaches. Finally, one real data set is analysed for illustrative purposes.
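The parametric bootstrap step can be sketched generically. The A(α) distribution's form is not given in the abstract, so this illustration uses the exponential distribution, whose ML estimator has the closed form 1/mean; the same refit-on-simulated-samples pattern applies to any parametric model.

```python
import numpy as np

def parametric_bootstrap_ci(x, level=0.95, B=1000, seed=0):
    # parametric bootstrap percentile interval: fit the model by ML,
    # resimulate B samples from the fitted model, refit on each, and
    # take percentiles of the refitted estimates
    rng = np.random.default_rng(seed)
    rate_hat = 1.0 / np.mean(x)
    n = len(x)
    boot = [1.0 / np.mean(rng.exponential(1.0 / rate_hat, n)) for _ in range(B)]
    lo, hi = np.quantile(boot, [(1 - level) / 2, (1 + level) / 2])
    return float(lo), float(hi)
```

The point estimate sits near the center of the bootstrap distribution, so the percentile interval brackets it; the interval's width reflects the sampling variability of the ML estimator at the fitted parameter value.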


2018 ◽  
Vol 33 (1) ◽  
pp. 31-43
Author(s):  
Bol A. M. Atem ◽  
Suleman Nasiru ◽  
Kwara Nantomah

Abstract This article studies the properties of the Topp–Leone linear exponential distribution. The parameters of the new model are estimated using maximum likelihood estimation, and simulation studies are performed to examine the finite sample properties of the parameters. An application of the model is demonstrated using a real data set. Finally, a bivariate extension of the model is proposed.

