Imputing missing distances in molecular phylogenetics

PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5321 ◽  
Author(s):  
Xuhua Xia

Missing data are frequently encountered in molecular phylogenetics, but no accurate distance imputation method has been available for distance-based phylogenetic reconstruction. The general framework for distance imputation is to explore tree space and distance values to find an optimal combination of output tree and imputed distances. Here I develop a least-squares method coupled with multivariate optimization to impute multiple missing distances in a distance matrix, or from a set of aligned sequences with missing genes in which some sequences share no homologous sites (so that their distances need to be imputed). I show that phylogenetic trees can be inferred from distance matrices with about 10% of distances missing, and that the accuracy of the resulting tree is almost as good as that of the tree from full information. The new method has two advantages over a recently published one: it does not assume a molecular clock, and it is more accurate (comparable to the maximum likelihood method based on simulated sequences). I have implemented the function in the DAMBE software, which is freely available at http://dambe.bio.uottawa.ca.
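As a toy illustration of how a least-squares fit on a tree can fill in a missing distance, the sketch below fixes a four-taxon unrooted topology ((A,B),(C,D)) and imputes the missing distance between A and C. With five observed distances and five branch lengths, the unconstrained least-squares fit is exact, so the imputed value reduces to d_AD + d_BC - d_BD. This is not the DAMBE implementation, which also searches tree space and handles several missing distances at once; the function name, the fixed topology, and the numbers are hypothetical.

```python
def impute_ac(d_ab, d_cd, d_ad, d_bc, d_bd):
    """Impute the missing distance d_AC on the fixed topology ((A,B),(C,D)).

    Branch lengths: a, b, c, d are the terminal branches leading to taxa
    A, B, C, D, and e is the internal branch. The five observed distances
    give five path equations that determine the five branches exactly,
    so the residual sum of squares of the fit is zero.
    """
    a = (d_ab + d_ad - d_bd) / 2.0   # from a + b = d_AB and a - b = d_AD - d_BD
    b = d_ab - a
    d = (d_cd + d_bd - d_bc) / 2.0   # from c + d = d_CD and d - c = d_BD - d_BC
    c = d_cd - d
    e = d_bd - b - d                 # internal branch from b + e + d = d_BD
    imputed = a + e + c              # path length from A to C on the tree
    return imputed, (a, b, c, d, e)


# Hypothetical distances generated from branches a=0.1, b=0.2, c=0.3, d=0.4, e=0.05.
imputed_ac, branches = impute_ac(d_ab=0.30, d_cd=0.70, d_ad=0.55, d_bc=0.55, d_bd=0.65)
```

When the topology is unknown, the same fit would have to be repeated for each candidate tree and the combination with the smallest residual kept, which is the kind of joint search over trees and distances the abstract describes.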

2020 ◽  
Vol 8 (2) ◽  
pp. 610-630 ◽  
Author(s):  
Mohamed Ibrahim ◽  
Emrah Altun ◽  
Haitham M. Yousof

In this paper, after introducing a new model along with its properties, we estimate the unknown parameter of the new model using the maximum likelihood method, the Cramér-von Mises method, the bootstrapping method, the least squares method, and the weighted least squares method. We assess the performance of all estimation methods using simulations. All methods perform well, but the bootstrapping method is the best in modeling relief times, whereas the maximum likelihood method is the best in modeling survival times. Censored data modeling with covariates is addressed, along with the index plot of the modified deviance residuals and its Q-Q plot.


2021 ◽  
Vol 26 (2) ◽  
Author(s):  
Ali Mutair ◽  
Nada Sabah Karam

In this paper, the reliability formula of the stress-strength model is derived for the probability P(T < X < Z) of a component having strength X falling between two stresses T and Z, based on the New Weibull-Pareto distribution with one parameter unknown and the remaining parameters known and common. Four methods for estimating the New Weibull-Pareto parameters are discussed: maximum likelihood, the method of moments, the least squares method, and the weighted least squares method. These estimators are compared in a simulation study by the mean square error criterion for small, medium, and large samples. The most important conclusion is that the maximum likelihood estimator performs best in all experiments studied.
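The probability that a strength X falls between two stresses T and Z can be checked by simulation. The sketch below is a generic Monte Carlo estimate under stated assumptions: X, T, and Z are independent, and exponential distributions are swapped in as stand-ins because the abstract does not give the New Weibull-Pareto density. The function names and the rates are hypothetical. For independent exponentials the probability has the closed form rate_x/(rate_x + rate_z) - rate_x/(rate_x + rate_t + rate_z), which the simulation can be compared against.

```python
import random


def stress_strength_mc(draw_x, draw_t, draw_z, n, seed=0):
    """Monte Carlo estimate of P(T < X < Z) for independent X, T, Z.

    Each draw_* argument is a function that takes a random.Random
    instance and returns one sample.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x, t, z = draw_x(rng), draw_t(rng), draw_z(rng)
        if t < x < z:
            hits += 1
    return hits / n


def exact_exponential(rate_x, rate_t, rate_z):
    """Closed form of P(T < X < Z) for independent exponentials.

    P(T < X < Z) = P(X < Z) - P(X < min(T, Z)).
    """
    return rate_x / (rate_x + rate_z) - rate_x / (rate_x + rate_t + rate_z)


# Hypothetical rates: strength X ~ Exp(1.0), stresses T ~ Exp(2.0), Z ~ Exp(0.5).
estimate = stress_strength_mc(
    lambda r: r.expovariate(1.0),
    lambda r: r.expovariate(2.0),
    lambda r: r.expovariate(0.5),
    n=200_000,
)
exact = exact_exponential(1.0, 2.0, 0.5)
```

Replacing the three samplers with draws from the distribution of interest would give a numerical check on any derived reliability formula.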


2021 ◽  
Vol 1 (4 (109)) ◽  
pp. 64-73
Author(s):  
Serhii Zabolotnii ◽  
Vladyslav Khotunov ◽  
Anatolii Chepynoha ◽  
Olexandr Tkachenko

This paper considers the application of the polynomial maximization method to estimating the parameters of a multifactorial linear regression when the random errors of the regression model follow an exponential power distribution. The method is conceptually close to the maximum likelihood method because it is based on maximizing sample statistics in the neighborhood of the true values of the estimated parameters. However, in contrast to the classical parametric approach, it employs a partial probabilistic description in the form of a limited number of higher-order statistics. An adaptive statistical estimation algorithm has been synthesized that takes into account the properties of the regression residuals and makes it possible to find refined values for the parameter estimates of a linear multifactorial regression using the numerical Newton-Raphson iterative procedure. Based on the apparatus of the quantity of extracted information, analytical expressions have been derived that make it possible to analyze the theoretical accuracy (asymptotic variances) of the polynomial maximization estimates depending on the magnitude of the exponential power distribution parameters. Statistical modeling was employed to compare the variance of the estimates obtained by the polynomial maximization method with the accuracy of the classical methods: least squares and maximum likelihood. Regions of greatest efficiency have been constructed for each studied method, depending on the shape parameter of the exponential power distribution and the sample size. It has been shown that estimates from the polynomial maximization method may have a much lower variance than estimates from the least squares method and, in some cases (for flat-topped distributions and in the absence of a priori information), may exceed the estimates from the maximum likelihood method in terms of accuracy.


2015 ◽  
Vol 2015 ◽  
pp. 1-8 ◽  
Author(s):  
Guodong Wang ◽  
Zhanwen Niu ◽  
Zhen He

Accelerated life testing is commonly used to estimate the reliability of high-reliability products. In this paper, we present a simple and efficient approach to estimating the coefficients of acceleration models. Assuming that both the scale and shape parameters of the Weibull lifetime distribution vary with stress factors, we estimate the parameters of the Weibull distribution using the maximum likelihood method and reduce the bias of the shape parameter estimator. To account for heteroscedasticity, we compute the estimates of the coefficients of the acceleration models through the weighted least squares method. Additionally, we obtain the confidence interval of a low percentile via bootstrapping. We compare the proposed method with other methods using a real lifetime example. Finally, we study the performance of the proposed method by simulation. The simulation results show that the proposed method is effective.


2012 ◽  
Vol 518-523 ◽  
pp. 5616-5622
Author(s):  
Xiao Ming Han ◽  
Ding Xue ◽  
Bo Hu

The Zhalantun district in the eastern part of Inner Mongolia is located in the northern section of the Greater Khingan seismic belt. It has a complicated geological structure, with relatively active moderate-to-strong earthquakes in both historical and modern times. The seismic activity of this district is selected as the research object. Based on a completeness analysis of the seismic sequence in the district, the least squares method is used for time scanning of the b-value, and the maximum likelihood method is used for space scanning of the b-value. The b-value in the time scan is the mean b-value of the research zone in each scanning window, so its variation is not large: the b-value stays basically within 0.78-1.13, with errors within 0.04-0.065. The space scanning results indicate that the spatial distribution of the b-value in the Zhalantun district stays basically within 0.4-1.6, with errors of 0.045-0.085. The low b-value zone is the north-central section of the Alun River fault, with b-values basically within 0.5-0.7, which indicates that the crustal medium of this zone is in a state of high horizontal stress accumulation and that it is a dangerous zone where moderately strong or larger earthquakes may occur in the future.
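The two b-value estimators named above can be sketched as follows. The abstract does not state which formulas were used, so this is an assumption: the maximum likelihood version is the commonly used Aki estimator for continuous magnitudes above a completeness magnitude m_c, and the least squares version is an ordinary regression of log10 cumulative counts on magnitude under the Gutenberg-Richter relation log10 N = a - b M. The function names and the example numbers are hypothetical.

```python
import math


def b_value_ml(magnitudes, m_c):
    """Aki-type maximum likelihood b-value for continuous magnitudes >= m_c.

    Returns (b, standard_error), with the first-order standard error
    b / sqrt(n). No correction for magnitude binning is applied.
    """
    selected = [m for m in magnitudes if m >= m_c]
    mean_m = sum(selected) / len(selected)
    b = math.log10(math.e) / (mean_m - m_c)
    return b, b / math.sqrt(len(selected))


def b_value_ls(thresholds, cumulative_counts):
    """Least squares b-value from the fit log10 N(>=M) = a - b * M.

    thresholds: magnitude thresholds M
    cumulative_counts: number of events with magnitude >= M (all > 0)
    """
    ys = [math.log10(n) for n in cumulative_counts]
    k = len(thresholds)
    mean_x = sum(thresholds) / k
    mean_y = sum(ys) / k
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(thresholds, ys))
    sxx = sum((x - mean_x) ** 2 for x in thresholds)
    return -sxy / sxx   # the regression slope is -b


# Hypothetical catalogue that follows log10 N = 5 - 1.0 * M exactly.
b_ls = b_value_ls([2.0, 3.0, 4.0, 5.0], [1000, 100, 10, 1])
```

In a time or space scan, either function would be applied to the events falling inside each window or grid cell, giving one b-value per window.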


2020 ◽  
Author(s):  
Jun Huang ◽  
Yuting Liu ◽  
Tianqi Zhu ◽  
Ziheng Yang

The phylogenetic bootstrap is the most commonly used method for assessing statistical confidence in estimated phylogenies by non-Bayesian methods such as maximum parsimony and maximum likelihood (ML). It is observed that bootstrap support tends to be high in large genomic datasets whether or not the inferred trees and clades are correct. Here we study the asymptotic behavior of bootstrap support for the ML tree in large datasets when the competing phylogenetic trees are equally right or equally wrong. We consider phylogenetic reconstruction as a problem of statistical model selection when the compared models are nonnested and misspecified. The bootstrap is found to have qualitatively different dynamics from Bayesian inference, and does not exhibit the polarized behavior of posterior model probabilities, consistent with the empirical observation that the bootstrap is more conservative than Bayesian probabilities. Nevertheless, bootstrap support similarly shows fluctuations among large datasets, with no convergence to a point value, when the compared models are equally right or equally wrong. Thus, in large datasets, strong support for wrong trees or models is likely to occur. Our analysis provides a partial explanation for the high bootstrap support values for incorrect clades observed in empirical data analysis.


Author(s):  
Maulida Nurhidayati

The autoregressive model is a univariate time series model for stationary data. Its parameters can be estimated by several methods, namely the Yule-Walker, least squares, and maximum likelihood methods. Each method uses a different principle for estimating the model parameters, so the results obtained also differ. In this study, the AR(1) model parameter was estimated on data simulated 1000 times to assess the performance of the Yule-Walker, least squares, and maximum likelihood methods. In addition, the three methods were compared on ROA BPRS data that follow the AR(1) model. The results showed that the maximum likelihood method provided the most suitable mode and comparison of estimation results for the simulated data, and produced the smallest MAE in the in-sample data and the smallest MAPE, MSE, and MAE in the out-of-sample data. These results show that the maximum likelihood method is the best method for modeling data that follow the AR(1) model.
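The three estimators can be compared on one simulated series with the minimal sketch below. It rests on assumptions not stated in the abstract: the series is demeaned first, the least squares fit is the conditional regression of x_t on x_{t-1} without an intercept, and the maximum likelihood estimate maximizes the exact Gaussian likelihood (with the innovation variance profiled out) over a grid of candidate coefficients. All function names and simulation settings are hypothetical.

```python
import math
import random


def simulate_ar1(phi, n, seed=0):
    """Simulate a zero-mean stationary AR(1) series with unit innovation variance."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0 / math.sqrt(1.0 - phi * phi))]  # stationary start
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


def _sums(x):
    """Demean x and return (x_1^2, sum x_t^2, sum x_t x_{t-1}, sum x_{t-1}^2) for t >= 2."""
    m = sum(x) / len(x)
    x = [v - m for v in x]
    s_cur = sum(v * v for v in x[1:])
    s_cross = sum(a * b for a, b in zip(x[1:], x[:-1]))
    s_lag = sum(v * v for v in x[:-1])
    return x[0] ** 2, s_cur, s_cross, s_lag


def yule_walker(x):
    """Lag-1 sample autocorrelation."""
    first_sq, s_cur, s_cross, _ = _sums(x)
    return s_cross / (first_sq + s_cur)


def least_squares(x):
    """Conditional least squares: regress x_t on x_{t-1}."""
    _, _, s_cross, s_lag = _sums(x)
    return s_cross / s_lag


def max_likelihood(x, grid=1999):
    """Exact Gaussian ML by grid search over phi in (-1, 1)."""
    n = len(x)
    first_sq, s_cur, s_cross, s_lag = _sums(x)
    best_phi, best_ll = 0.0, -math.inf
    for k in range(1, grid + 1):
        phi = -1.0 + 2.0 * k / (grid + 1)
        # Sum of squares including the stationary term for the first observation.
        s = (1.0 - phi * phi) * first_sq + s_cur - 2.0 * phi * s_cross + phi * phi * s_lag
        ll = -0.5 * n * math.log(s / n) + 0.5 * math.log(1.0 - phi * phi)
        if ll > best_ll:
            best_phi, best_ll = phi, ll
    return best_phi


series = simulate_ar1(phi=0.6, n=2000, seed=1)
estimates = {
    "yule_walker": yule_walker(series),
    "least_squares": least_squares(series),
    "max_likelihood": max_likelihood(series),
}
```

Repeating this over many simulated series and summarizing the errors of each estimator would mirror the kind of comparison the study reports.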

