Build a Better Bootstrap and the RAWR Shall Beat a Random Path to Your Door: Phylogenetic Support Estimation Revisited

2020 ◽  
Author(s):  
Wei Wang ◽  
Kevin J. Liu

Abstract

Motivation: The standard bootstrap method is used throughout science and engineering to perform general-purpose non-parametric resampling and re-estimation. Among the most widely cited and widely used such applications is the phylogenetic bootstrap method, which Felsenstein proposed in 1985 as a means to place statistical confidence intervals on an estimated phylogeny (or estimate “phylogenetic support”). A key simplifying assumption of the bootstrap method is that input data are independent and identically distributed (i.i.d.). However, the i.i.d. assumption is an over-simplification for biomolecular sequence analysis, as Felsenstein noted. Special-purpose fully parametric or semi-parametric methods for phylogenetic support estimation have since been introduced, some of which are intended to address this concern.

Results: In this study, we introduce a new sequence-aware non-parametric resampling technique, which we refer to as RAWR (“RAndom Walk Resampling”). RAWR consists of random walks that synthesize and extend the standard bootstrap method and the “mirrored inputs” idea of Landan and Graur. We apply RAWR to the task of phylogenetic support estimation. RAWR’s performance is compared to the state of the art using synthetic and empirical data that span a range of dataset sizes and evolutionary divergence. We show that RAWR support estimates offer comparable or typically superior type I and type II error compared to phylogenetic bootstrap support as well as GUIDANCE2, a state-of-the-art purpose-built fully parametric method. Additional simulation study experiments help to clarify practical considerations regarding RAWR support estimation. We conclude with thoughts on future research directions and the untapped potential for sequence-aware non-parametric resampling and re-estimation.

Availability: Data and software are publicly available under open-source software and open data licenses at: https://gitlab.msu.edu/liulab/[email protected]
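The abstract describes RAWR only at a high level, as random walks along the sequence data. The sketch below is one possible illustration of that idea and is not the authors' implementation: the function names `random_walk_columns` and `resample_alignment`, the direction-reversal probability `reverse_prob`, the reflection at the alignment ends, and the toy alignment are all assumptions made here for illustration.

```python
import random

def random_walk_columns(n_sites, reverse_prob, rng):
    """Visit n_sites column indices with a reflecting random walk.

    Hypothetical illustration: start at a random column, move one
    column per step, flip direction with probability reverse_prob,
    and bounce back at either end of the alignment.
    """
    pos = rng.randrange(n_sites)
    step = rng.choice((-1, 1))
    visited = [pos]
    while len(visited) < n_sites:
        if rng.random() < reverse_prob:
            step = -step              # random direction reversal
        nxt = pos + step
        if nxt < 0 or nxt >= n_sites:
            step = -step              # reflect at the alignment boundary
            nxt = pos + step
        pos = nxt
        visited.append(pos)
    return visited

def resample_alignment(alignment, cols):
    """Build a replicate alignment from the sampled column indices."""
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}

# Toy alignment (made-up data, equal-length rows).
alignment = {"taxonA": "ACGTACGT", "taxonB": "ACGTTCGT", "taxonC": "AGGTACGA"}
cols = random_walk_columns(8, reverse_prob=0.1, rng=random.Random(0))
replicate = resample_alignment(alignment, cols)
```

Unlike i.i.d. column sampling, consecutive columns in such a replicate are always neighbours in the original alignment, which is the sense in which the resampling is sequence-aware.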

2018 ◽  
Author(s):  
Wei Wang ◽  
Jack Smith ◽  
Hussein A. Hejase ◽  
Kevin J. Liu

Abstract: Non-parametric and semi-parametric resampling procedures are widely used to perform support estimation in computational biology and bioinformatics. Among the most widely used methods in this class is the standard bootstrap method, which consists of random sampling with replacement. While not requiring assumptions about any particular parametric model for resampling purposes, the bootstrap and related techniques assume that sites are independent and identically distributed (i.i.d.). The i.i.d. assumption can be an over-simplification for many problems in computational biology and bioinformatics. In particular, sequential dependence within biomolecular sequences is often an essential biological feature due to biochemical function, evolutionary processes such as recombination, and other factors.

To relax the simplifying i.i.d. assumption, we propose a new non-parametric/semi-parametric sequential resampling technique that generalizes “Heads-or-Tails” mirrored inputs, a simple but clever technique due to Landan and Graur. The generalized procedure takes the form of random walks along either aligned or unaligned biomolecular sequences. We refer to our new method as the SERES (or “SEquential RESampling”) method.

To demonstrate the flexibility of the new technique, we apply SERES to two different applications – one involving aligned inputs and the other involving unaligned inputs. Using simulated and empirical data, we show that SERES-based support estimation yields comparable or typically better performance compared to state-of-the-art methods for both applications.
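For contrast with sequential resampling, the standard bootstrap applied to an alignment draws sites uniformly at random with replacement, treating columns as i.i.d. A minimal sketch follows; the function name `bootstrap_alignment` and the toy alignment are illustrative assumptions, not taken from the paper.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns (sites) uniformly with replacement.

    `alignment` maps taxon name -> aligned sequence; all sequences
    are assumed to have the same length.
    """
    n_sites = len(next(iter(alignment.values())))
    # Each replicate site is an independent uniform draw: the i.i.d. assumption.
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}

# Toy alignment (made-up data).
alignment = {"taxonA": "ACGTAC", "taxonB": "ACGTTC", "taxonC": "AGGTAC"}
replicate = bootstrap_alignment(alignment, random.Random(0))
```

In support estimation, many such replicates would each be re-analysed, and the support for a feature is the fraction of replicates in which it reappears.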


Author(s):  
Gavin C. Hudson-Lamb ◽  
Johan P. Schoeman ◽  
Emma H. Hooijberg ◽  
Sonja K. Heinrich ◽  
Adrian S.W. Tordiffe

Published haematologic and serum biochemistry reference intervals are very scarce for captive cheetahs and even scarcer for free-ranging cheetahs. The current study was performed to establish reference intervals for selected serum biochemistry analytes in cheetahs. Baseline serum biochemistry analytes were analysed from 66 healthy Namibian cheetahs. Samples were collected from 30 captive cheetahs at the AfriCat Foundation and 36 free-ranging cheetahs from central Namibia. The effects of captivity status, age, sex and haemolysis score on the tested serum analytes were investigated. The biochemistry analytes that were measured were sodium, potassium, magnesium, chloride, urea and creatinine. The 90% confidence interval of the reference limits was obtained using the non-parametric bootstrap method. Reference intervals were preferentially determined by the non-parametric method and were as follows: sodium (128 mmol/L – 166 mmol/L), potassium (3.9 mmol/L – 5.2 mmol/L), magnesium (0.8 mmol/L – 1.2 mmol/L), chloride (97 mmol/L – 130 mmol/L), urea (8.2 mmol/L – 25.1 mmol/L) and creatinine (88 µmol/L – 288 µmol/L). Reference intervals from the current study were compared with International Species Information System values for cheetahs and found to be narrower. Moreover, age, sex and haemolysis score had no significant effect on the serum analytes in this study. Separate reference intervals for captive and free-ranging cheetahs were also determined. Captive cheetahs had higher urea values, most likely due to dietary factors. This study is the first to establish reference intervals for serum biochemistry analytes in cheetahs according to international guidelines. These results can be used for future health and disease assessments in both captive and free-ranging cheetahs.
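A non-parametric reference interval with bootstrap confidence intervals on its limits might be computed roughly as sketched below. The abstract does not state which percentiles define the limits; the central 95% range (2.5th to 97.5th percentiles) used here is an assumption, as are the function names, the number of bootstrap replicates, and the synthetic sodium-like values, which are not study data.

```python
import random

def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in [0, 100]) of pre-sorted values."""
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)

def reference_interval(values, lower=2.5, upper=97.5):
    """Non-parametric reference limits (assumed central 95% range)."""
    s = sorted(values)
    return percentile(s, lower), percentile(s, upper)

def bootstrap_limit_cis(values, n_boot, rng, level=90.0):
    """Percentile-bootstrap confidence intervals for both reference limits."""
    lows, highs = [], []
    for _ in range(n_boot):
        # Resample individuals with replacement, then recompute the limits.
        resample = [rng.choice(values) for _ in values]
        lo, hi = reference_interval(resample)
        lows.append(lo)
        highs.append(hi)
    lows.sort()
    highs.sort()
    tail = (100.0 - level) / 2.0
    return ((percentile(lows, tail), percentile(lows, 100.0 - tail)),
            (percentile(highs, tail), percentile(highs, 100.0 - tail)))

rng = random.Random(1)
# Synthetic values for 66 hypothetical animals (not data from the study).
values = [rng.gauss(150.0, 8.0) for _ in range(66)]
ri = reference_interval(values)
lower_ci, upper_ci = bootstrap_limit_cis(values, n_boot=500, rng=rng)
```

Here `ri` holds the lower and upper reference limits, while `lower_ci` and `upper_ci` give the 90% confidence interval around each limit.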


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Jiaqiang Zhu ◽  
Shiquan Sun ◽  
Xiang Zhou

Abstract: Spatial transcriptomic studies are becoming increasingly common and large, posing important statistical and computational challenges for many analytic tasks. Here, we present SPARK-X, a non-parametric method for rapid and effective detection of spatially expressed genes in large spatial transcriptomic studies. SPARK-X not only produces effective type I error control and high power but also brings orders of magnitude computational savings. We apply SPARK-X to analyze three large datasets, one of which is only analyzable by SPARK-X. In these data, SPARK-X identifies many spatially expressed genes including those that are spatially expressed within the same cell type, revealing new biological insights.


2017 ◽  
Vol 27 (9) ◽  
pp. 2775-2794 ◽  
Author(s):  
Yan Zhuang ◽  
Ying Guan ◽  
Libin Qiu ◽  
Meisheng Lai ◽  
Ming T Tan ◽  
...  

Longitudinal ordinal data are common in biomedical research. Although various methods for the analysis of such data have been proposed in the past few decades, they are limited in several ways. For instance, the constraints on parameters in the proportional odds model may result in convergence problems; the rank-based aligned rank transform method imposes constraints on other parameters and carries the distributional assumptions of a parametric model. We propose a novel rank-based non-parametric method that models the profile rather than the distribution of the data to make effective statistical inference without these constraints. We first construct the test statistic for the interaction, and then construct the test statistics for the main effects separately, with or without the interaction; an “adjusted coefficient” is derived for the case of ties. A simulation study compares the rank-based non-parametric method with rank-transformed analysis of variance. The results show that the type I errors of both methods are maintained close to the a priori level, but the statistical power of the rank-based non-parametric method is greater than that of rank-transformed analysis of variance, suggesting higher efficiency of the former. We then apply the rank-based non-parametric method to two real studies on acne and osteoporosis, and the results also illustrate its effectiveness, particularly when the distribution is skewed.


Author(s):  
Muhammad Yousaf ◽  
Petr Bris

This paper presents a systematic literature review (SLR), covering 1991 to 2019, of the EFQM (European Foundation for Quality Management) excellence model. The aim of the paper is to present the state of the art in quantitative research on the EFQM excellence model in order to guide future research in this field. Articles were searched using six search strings, which were executed in three popular databases: Scopus, Web of Science, and Science Direct. Around 584 peer-reviewed articles directly linked to quantitative research on the EFQM excellence model were examined. About 108 papers were finally chosen, and the purpose, data collection, conclusions, contributions, and type of quantitative analysis of the selected papers are briefly discussed and analyzed in this study. The study thus identifies the focus areas of researchers and the knowledge gaps in the empirical quantitative literature on the EFQM excellence model. The article also presents lines of future research.


2019 ◽  
Vol 14 (2) ◽  
pp. 146-151 ◽  
Author(s):  
Junaid Khan ◽  
Amit Alexander ◽  
Mukta Agrawal ◽  
Ajazuddin ◽  
Sunil Kumar Dubey ◽  
...  

Diabetes and its complications are a significant health concern throughout the globe. There are physiological differences in the mechanisms of type-I and type-II diabetes, and conventional drug therapy as well as insulin administration seem insufficient to address the problem successfully at large. Hypoglycemic swings, frequent dose adjustments and resistance to the drug are major problems associated with drug therapy. Cellular approaches through stem cell based therapeutic interventions offer a promising solution to the problem. The need for pancreatic transplants in the case of type-I diabetes can also be bypassed or reduced through the formation of insulin-producing β cells from stem cells. Embryonic Stem Cells (ESCs) and induced Pluripotent Stem Cells (iPSCs) have been successfully used for generating insulin-producing β cells. Although many experiments have shown promising results with stem cells in vitro, their clinical testing still needs more exploration. The review attempts to bring to light the clinical studies favoring the transplantation of stem cells in diabetic patients, with the objective of improving insulin secretion and ameliorating the degeneration of different tissues in response to diabetes. It also focuses on the problems associated with successful implementation of the technique and possible directions for future research.


Mathematics ◽  
2021 ◽  
Vol 9 (11) ◽  
pp. 1169
Author(s):  
Juan Bógalo ◽  
Pilar Poncela ◽  
Eva Senra

Real-time monitoring of the economy is based on activity indicators that show regular patterns such as trends, seasonality and business cycles. However, parametric and non-parametric methods for signal extraction produce revisions at the end of the sample, and the arrival of new data makes it difficult to assess the state of the economy. In this paper, we compare two signal extraction procedures: Circulant Singular Spectral Analysis (CiSSA), a non-parametric technique in which we can extract components associated with desired frequencies, and a parametric method based on ARIMA modelling. Through a set of simulations, we show that the magnitude of the revisions produced by CiSSA converges to zero more quickly and is smaller than that of the alternative procedure.


Author(s):  
Judith H. Parkinson-Schwarz ◽  
Arne C. Bathke

Abstract: In this paper, we propose a new non-parametric test for equality of distributions. The test is based on the recently introduced measure of (niche) overlap and its rank-based estimator. As the estimator makes only one basic assumption on the underlying distribution, namely continuity, the test is universally applicable, in contrast to many tests that are restricted to specific scenarios. By construction, the new test is capable of detecting differences in location and scale. It thus complements the large class of rank-based tests that are constructed based on the non-parametric relative effect. In simulations, this new test procedure obtained higher power and lower type I error compared to two common tests in several settings. The new procedure shows overall good performance. Together with its simplicity, this means the test can be used broadly.

