A novel rank-based non-parametric method for longitudinal ordinal data

2017 ◽  
Vol 27 (9) ◽  
pp. 2775-2794 ◽  
Author(s):  
Yan Zhuang ◽  
Ying Guan ◽  
Libin Qiu ◽  
Meisheng Lai ◽  
Ming T Tan ◽  
...  

Longitudinal ordinal data are common in biomedical research. Although various methods for analyzing such data have been proposed over the past few decades, they are limited in several ways. For instance, the parameter constraints in the proportional odds model may cause convergence problems, and the rank-based aligned rank transform method imposes constraints on other parameters and carries the distributional assumptions of a parametric model. We propose a novel rank-based non-parametric method that models the profile rather than the distribution of the data, allowing effective statistical inference without these constraints. We first construct the test statistic for the interaction, then construct the test statistics for the main effects separately, with or without the interaction, and derive an “adjusted coefficient” for the case of ties. A simulation study compares the rank-based non-parametric method with rank-transformed analysis of variance. The results show that both methods keep the type I error close to the a priori level, but the rank-based non-parametric method has greater statistical power than rank-transformed analysis of variance, suggesting higher efficiency. We then apply the rank-based non-parametric method to two real studies on acne and osteoporosis; the results also illustrate its effectiveness, particularly when the distribution is skewed.

Horticulturae ◽  
2019 ◽  
Vol 5 (3) ◽  
pp. 57 ◽  
Author(s):  
Edward Durner

Most statistical techniques commonly used in horticultural research are parametric tests that are valid only for normal data with homogeneous variances. While parametric tests are robust when the data deviate ‘slightly’ from normality, a significant departure from normality reduces power and increases the probability of a type I error. Transformations often used to normalize non-normal data can be time consuming, cumbersome and confusing, and common non-parametric tests are not appropriate for evaluating the interactive effects common in horticultural research. The aligned rank transformation allows non-parametric testing for interactions and main effects using standard ANOVA techniques. It has not been widely adopted because of its rigorous mathematical nature; however, a downloadable tool (ARTool) is now available that performs the math needed for the transformation. This study provides step-by-step instructions for integrating ARTool with the free edition of SAS (SAS University Edition) in an easily employed method for testing normality, transforming data with aligned ranks, and analysing data using standard ANOVAs.
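The align-then-rank idea behind the aligned rank transformation can be sketched in a few lines of Python. This is an illustrative sketch only, not ARTool's implementation: it assumes a balanced two-factor design, aligns the response for the interaction effect alone, and uses hypothetical function names (`midranks`, `align_and_rank_interaction`). The resulting ranks would then be passed to a standard two-way ANOVA, where only the interaction term is interpreted.

```python
from collections import defaultdict


def midranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mean(xs):
    return sum(xs) / len(xs)


def align_and_rank_interaction(a, b, y):
    """Align y for the A x B interaction, then rank (balanced design assumed).

    Aligned value = residual + estimated interaction effect
                  = y - mean(A level) - mean(B level) + grand mean.
    """
    by_a, by_b = defaultdict(list), defaultdict(list)
    for ai, bi, yi in zip(a, b, y):
        by_a[ai].append(yi)
        by_b[bi].append(yi)
    grand = mean(y)
    a_mean = {level: mean(v) for level, v in by_a.items()}
    b_mean = {level: mean(v) for level, v in by_b.items()}
    aligned = [yi - a_mean[ai] - b_mean[bi] + grand
               for ai, bi, yi in zip(a, b, y)]
    # Round so floating-point noise does not break genuine ties.
    aligned = [round(v, 12) for v in aligned]
    return midranks(aligned)
```

Aligning removes the main effects before ranking, which is why the ranked data can carry interaction information that a plain rank transform tends to distort.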


2020 ◽  
Author(s):  
Wei Wang ◽  
Kevin J. Liu

Motivation: The standard bootstrap method is used throughout science and engineering to perform general-purpose non-parametric resampling and re-estimation. Among the most widely cited and widely used such applications is the phylogenetic bootstrap method, which Felsenstein proposed in 1985 as a means to place statistical confidence intervals on an estimated phylogeny (or estimate “phylogenetic support”). A key simplifying assumption of the bootstrap method is that input data are independent and identically distributed (i.i.d.). However, the i.i.d. assumption is an over-simplification for biomolecular sequence analysis, as Felsenstein noted. Special-purpose fully parametric or semi-parametric methods for phylogenetic support estimation have since been introduced, some of which are intended to address this concern.

Results: In this study, we introduce a new sequence-aware non-parametric resampling technique, which we refer to as RAWR (“RAndom Walk Resampling”). RAWR consists of random walks that synthesize and extend the standard bootstrap method and the “mirrored inputs” idea of Landan and Graur. We apply RAWR to the task of phylogenetic support estimation. RAWR’s performance is compared to the state of the art using synthetic and empirical data that span a range of dataset sizes and evolutionary divergence. We show that RAWR support estimates offer comparable or typically superior type I and type II error compared to phylogenetic bootstrap support as well as GUIDANCE2, a state-of-the-art purpose-built fully parametric method. Additional simulation study experiments help to clarify practical considerations regarding RAWR support estimation. We conclude with thoughts on future research directions and the untapped potential for sequence-aware non-parametric resampling and re-estimation.

Availability: Data and software are publicly available under open-source software and open data licenses at: https://gitlab.msu.edu/liulab/[email protected]
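For context, the standard phylogenetic bootstrap that this abstract takes as its baseline can be sketched as column resampling with replacement. This is a minimal illustration of the standard bootstrap only, not of RAWR, whose random-walk procedure is not specified in the abstract; the function name `bootstrap_alignment` and the toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of a multiple sequence alignment.

    `alignment` is a list of equal-length strings (one per sequence).
    Columns (sites) are drawn independently and uniformly with replacement,
    which is exactly where the i.i.d. assumption enters.
    """
    n_sites = len(alignment[0])
    if any(len(row) != n_sites for row in alignment):
        raise ValueError("all sequences must have the same aligned length")
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(row[c] for c in columns) for row in alignment]


if __name__ == "__main__":
    toy = ["ACGT", "ACGA", "TCGA"]  # hypothetical toy alignment
    print(bootstrap_alignment(toy, random.Random(0)))
```

Because each column is drawn independently, any dependence between neighbouring sites is discarded, which is the over-simplification that sequence-aware resampling aims to address.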


1982 ◽  
Vol 7 (3) ◽  
pp. 207-214 ◽  
Author(s):  
Jennifer J. Clinch ◽  
H. J. Keselman

The ANOVA, Welch, and Brown and Forsythe tests for mean equality were compared using Monte Carlo methods. The tests’ rates of Type I error and power were examined when populations were non-normal, variances were heterogeneous, and group sizes were unequal. The ANOVA F test was most affected by the assumption violations. The test proposed by Brown and Forsythe appeared, on average, to be the “best” test statistic for testing an omnibus hypothesis of mean equality.
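To make the comparison concrete, the classical ANOVA F statistic and the Brown and Forsythe F* statistic can be computed side by side. The sketch below uses the textbook definitions and is not taken from the study itself; the function names are hypothetical, and the p-value step (which for F* needs Satterthwaite-type denominator degrees of freedom) is omitted.

```python
from statistics import mean, variance


def between_group_ss(groups):
    """Sum over groups of n_i * (group mean - grand mean)^2."""
    grand = mean([y for g in groups for y in g])
    return sum(len(g) * (mean(g) - grand) ** 2 for g in groups)


def anova_f(groups):
    """Classical one-way ANOVA F: pooled within-group variance in the denominator."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    within_ss = sum((len(g) - 1) * variance(g) for g in groups)
    return (between_group_ss(groups) / (k - 1)) / (within_ss / (n_total - k))


def brown_forsythe_f_star(groups):
    """Brown-Forsythe F*: each group variance weighted by (1 - n_i / N)."""
    n_total = sum(len(g) for g in groups)
    denominator = sum((1 - len(g) / n_total) * variance(g) for g in groups)
    return between_group_ss(groups) / denominator


if __name__ == "__main__":
    # Unequal sizes and unequal variances: the two statistics diverge.
    unbalanced = [[1, 2, 3], [2, 4, 6, 8, 10]]
    print(anova_f(unbalanced), brown_forsythe_f_star(unbalanced))
```

With equal group sizes the two statistics coincide; they differ only when group sizes are unequal, which is the setting in which pooling heterogeneous variances distorts the ANOVA F test.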


2019 ◽  
Vol 2019 (3) ◽  
pp. 310-330 ◽  
Author(s):  
Marika Swanberg ◽  
Ira Globus-Harris ◽  
Iris Griffith ◽  
Anna Ritz ◽  
Adam Groce ◽  
...  

Hypothesis testing is one of the most common types of data analysis and forms the backbone of scientific research in many disciplines. Analysis of variance (ANOVA) in particular is used to detect dependence between a categorical and a numerical variable. Here we show how one can carry out this hypothesis test under the restrictions of differential privacy. We show that the F-statistic, the optimal test statistic in the public setting, is no longer optimal in the private setting, and we develop a new test statistic F1 with much higher statistical power. We show how to rigorously compute a reference distribution for the F1 statistic and give an algorithm that outputs accurate p-values. We implement our test and experimentally optimize several parameters. We then compare our test to the only previous work on private ANOVA testing, using the same effect size as that work. We see an order of magnitude improvement, with our test requiring only 7% as much data to detect the effect.


2019 ◽  
Vol 17 (2) ◽  
Author(s):  
Yan Wang ◽  
Thanh Pham ◽  
Diep Nguyen ◽  
Eun Sook Kim ◽  
Yi-Hsin Chen ◽  
...  

A simulation study was conducted to examine the efficacy of conditional analysis of variance (ANOVA) methods where the initial homogeneity of variance screening leads to the choice between the ANOVA F test and robust ANOVA methods. Type I error control and statistical power were investigated under various conditions.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Jiaqiang Zhu ◽  
Shiquan Sun ◽  
Xiang Zhou

Spatial transcriptomic studies are becoming increasingly common and large, posing important statistical and computational challenges for many analytic tasks. Here, we present SPARK-X, a non-parametric method for rapid and effective detection of spatially expressed genes in large spatial transcriptomic studies. SPARK-X not only produces effective type I error control and high power but also brings orders of magnitude computational savings. We apply SPARK-X to analyze three large datasets, one of which is only analyzable by SPARK-X. In these data, SPARK-X identifies many spatially expressed genes including those that are spatially expressed within the same cell type, revealing new biological insights.


1984 ◽  
Vol 9 (2) ◽  
pp. 129-149 ◽  
Author(s):  
Stephen F. Olejnik ◽  
James Algina

Parametric analysis of covariance was compared to analysis of covariance with data transformed using ranks. Using a computer simulation approach, the two strategies were compared in terms of the proportion of Type I errors made and statistical power when the conditional distribution of errors was normal and homoscedastic, normal and heteroscedastic, non-normal and homoscedastic, and non-normal and heteroscedastic. The results indicated that parametric ANCOVA was robust to violations of either normality or homoscedasticity. However, when both assumptions were violated, the observed α levels underestimated the nominal α level when sample sizes were small and α = .05. Rank ANCOVA led to a slightly liberal test of the hypothesis when the covariate was non-normal, the sample size was small, and the errors were heteroscedastic. Practically significant power differences favoring the rank ANCOVA procedures were observed with moderate sample sizes and a variety of conditional distributions.


2001 ◽  
Vol 78 (3) ◽  
pp. 303-316 ◽  
Author(s):  
P. TILQUIN ◽  
W. COPPIETERS ◽  
J. M. ELSEN ◽  
F. LANTIER ◽  
C. MORENO ◽  
...  

Most QTL mapping methods assume that phenotypes follow a normal distribution, but many phenotypes of interest are not normally distributed, e.g. bacteria counts (or colony-forming units, CFU). Such data are extremely skewed to the right and can present a high amount of zero values, which are ties from a statistical point of view. Our objective is therefore to assess the efficiency of four QTL mapping methods applied to bacteria counts: (1) least-squares (LS) analysis, (2) maximum-likelihood (ML) analysis, (3) non-parametric (NP) mapping and (4) nested ANOVA (AN). A transformation based on quantiles is used to mimic observed distributions of bacteria counts. Single positions (1 marker, 1 QTL) as well as chromosome scans (11 markers, 1 QTL) are simulated. When compared with the analysis of a normally distributed phenotype, the analysis of raw bacteria counts leads to a strong decrease in power for parametric methods, but no decrease is observed for NP. However, when a mathematical transformation (MT) is applied to bacteria counts prior to analysis, parametric methods have the same power as NP. Furthermore, parametric methods, when coupled with MT, outperform NP when bacteria counts have a very high proportion of zeros (70·8%). Our results show that the loss of power is mainly explained by the asymmetry of the phenotypic distribution, for parametric methods, and by the existence of ties, for the non-parametric method. Therefore, mapping of QTL for bacterial diseases, as well as for other diseases assessed by a counting process, should focus on the occurrence of ties in phenotypes before choosing the appropriate QTL mapping method.

