Equivalence Testing for Multiple Regression

2021 ◽  
Author(s):  
Udi Alter ◽  
Alyssa Counsell

Abstract: Psychological research is rife with inappropriate conclusions of no association or no effect between a predictor and the outcome in regression models following statistically nonsignificant results. This approach is methodologically flawed because failing to reject the null hypothesis using traditional, difference-based tests does not mean the null is true (i.e., that there is no relationship). This flawed methodology leads to high rates of incorrect conclusions that flood the literature. This thesis introduces a novel, methodologically sound alternative. I demonstrate how equivalence testing can be applied to evaluate whether a predictor has negligible effects on the outcome variable in multiple regression. I constructed a simulation study to evaluate the performance (i.e., power and error rates) of two equivalence-based tests and compared them to the common, but inappropriate, method of concluding no effect by failing to reject the null hypothesis of the traditional test. I further propose two R functions to accompany this thesis and supply researchers with open-access and easy-to-use tools that they can flexibly adopt in their own research. The use of the proposed equivalence-based methods and R functions is then illustrated using examples from the literature, and recommendations for results reporting and interpretation are discussed. My results demonstrate that using tests of equivalence instead of the traditional test is the appropriate statistical choice: tests of equivalence show high rates of correct conclusions, especially with larger sample sizes, and low rates of incorrect conclusions, whereas the traditional method demonstrates unacceptably high incorrect conclusion rates.
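The abstract does not spell out the test statistics or the interface of the accompanying R functions, so the following is only a minimal Python sketch of one standard equivalence procedure, two one-sided tests (TOST), applied to a single regression coefficient. Everything in it is an assumption made for illustration: the function name `tost_coefficient`, the equivalence bounds, the example numbers, and the large-sample normal approximation (a t distribution with the model's residual degrees of freedom would be the usual choice in small samples).

```python
from statistics import NormalDist


def tost_coefficient(b, se, lower, upper, alpha=0.05):
    """Hypothetical helper: TOST for a regression coefficient.

    H0: beta <= lower  or  beta >= upper   (effect is NOT negligible)
    H1: lower < beta < upper               (effect is negligible)

    Uses a normal approximation (assumption); returns (p_value, equivalent).
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    if not lower < upper:
        raise ValueError("lower bound must be below upper bound")

    z = NormalDist()
    # One-sided test against the lower bound: is beta clearly above it?
    p_lower = 1.0 - z.cdf((b - lower) / se)
    # One-sided test against the upper bound: is beta clearly below it?
    p_upper = z.cdf((b - upper) / se)
    # Equivalence is declared only if BOTH one-sided tests reject,
    # so the overall p-value is the larger of the two.
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


if __name__ == "__main__":
    # Made-up numbers: same estimate, two different standard errors.
    for se in (0.03, 0.08):
        p, ok = tost_coefficient(b=0.02, se=se, lower=-0.10, upper=0.10)
        print(f"se={se}: TOST p={p:.4f}, negligible effect supported: {ok}")
```

In the second made-up case the traditional test of the coefficient against zero would also be nonsignificant (the estimate is a quarter of its standard error), yet the equivalence test does not support a negligible effect. That is the distinction the abstract draws between failing to reject the null and demonstrating a lack of association.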

2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Runzhi Zhang ◽  
Alejandro R. Walker ◽  
Susmita Datta

Abstract Background: The composition of microbial communities can be location-specific, and the differing abundance of taxa by location could help us to unravel city-specific signatures and predict sample origins accurately. In this study, whole genome shotgun (WGS) metagenomics data from samples across 16 cities around the world and samples from another 8 cities were provided as the main and mystery datasets, respectively, as part of the CAMDA 2019 MetaSUB “Forensic Challenge”. Feature selection, normalization, three machine learning methods, PCoA (Principal Coordinates Analysis) and ANCOM (Analysis of Composition of Microbiomes) were applied to both the main and mystery datasets. Results: Feature selection, combined with the machine learning methods, revealed that the combination of the common features was effective for predicting the origin of the samples. Average error rates of 11.93% and 30.37% across the three machine learning methods were obtained for the main and mystery datasets, respectively. Using the samples from the main dataset to predict the labels of samples from the mystery dataset, nearly 89.98% of the test samples could be correctly labeled as “mystery” samples. PCoA showed that nearly 60% of the total variability of the data could be explained by the first two PCoA axes. Although many cities overlapped, some cities were separated in the PCoA. The results of ANCOM, combined with the importance score from the Random Forest (RF), indicated that the common “family” and “order” features of the main dataset and the common “order” features of the mystery dataset, respectively, provided the most efficient information for prediction. Conclusions: The classification results suggested that the composition of the microbiomes was distinctive across the cities, which could be used to identify the sample origins. This was also supported by the results from ANCOM and the importance score from the RF. In addition, the accuracy of the prediction could be improved by more samples and better sequencing depth.


2020 ◽  
Vol 30 (1) ◽  
pp. e38066
Author(s):  
Jimmie Leppink

Research in education is often associated with comparing group averages and linear relations in sufficiently large samples, and evidence-based practice is about using the outcomes of that research in the practice of education. However, there are questions that are important for the practice of education that cannot really be addressed by comparisons of group averages and linear relations, no matter how large the samples. Besides, different types of constraints, including logistical, financial, and ethical ones, may make larger-sample research unfeasible or at least questionable. What has remained less known in many fields is that there are study designs and statistical methods for research involving small samples or even individuals that allow us to address questions of importance for the practice of education. This article discusses one such type of situation and presents a simple, coherent statistical approach that provides point and interval estimates of differences of interest regardless of the type of outcome variable, and that is of use in other types of studies involving large samples, small samples, and single individuals.


2020 ◽  
Author(s):  
Runzhi Zhang ◽  
Alejandro R. Walker ◽  
Susmita Datta

Abstract Background: The composition of microbial communities can be location-specific, and the differing abundance of taxa by location could help us to unravel city-specific signatures and predict sample origins accurately. In this study, whole genome shotgun (WGS) metagenomics data from samples across 16 cities around the world and samples from another 8 cities were provided as the main and mystery datasets, respectively, as part of the CAMDA 2019 MetaSUB “Forensic Challenge”. Feature selection, normalization, three machine learning methods, PCoA (Principal Coordinates Analysis) and ANCOM (Analysis of Composition of Microbiomes) were applied to both the main and mystery datasets. Results: Feature selection, combined with the machine learning methods, revealed that the combination of the common features was effective for predicting the origin of the samples. Average error rates of 11.6% and 30.0% across the three machine learning methods were obtained for the main and mystery datasets, respectively. Using the samples from the main dataset to predict the labels of samples from the mystery dataset, nearly 89.98% of the test samples could be correctly labeled as “mystery” samples. PCoA showed that nearly 60% of the total variability of the data could be explained by the first two PCoA axes. Although many cities overlapped, some cities were separated in the PCoA. The results of ANCOM, combined with the importance score from the Random Forest (RF), indicated that the common “family” and “order” features of the main dataset and the common “order” features of the mystery dataset, respectively, provided the most efficient information for prediction. Conclusions: The classification results suggested that the composition of the microbiomes was distinctive across the cities, which was also supported by the results from ANCOM and the importance score from the RF. The analysis used in this study can be of great help in the field of forensic science for efficiently predicting the origin of samples. The accuracy of the prediction could be improved by more samples and better sequencing depth.


2020 ◽  
pp. 002076402093632 ◽  
Author(s):  
Kanika K Ahuja ◽  
Debanjan Banerjee ◽  
Kritika Chaudhary ◽  
Chehak Gidwani

Background: The coronavirus disease 2019 (COVID-19) has emerged as a global health threat. Biological disasters like this can generate immense prejudice, xenophobia, stigma and othering, all of which have adverse consequences for health and well-being. In a country as diverse and populous as India, such a crisis can trigger communalism and mutual blame. Keeping this in context, this study explored the relationship between well-being and xenophobic attitudes towards Muslims, collectivism and fear of COVID-19 in India. Methods: The study was carried out on 600 non-Islamic Indians (231 males, 366 females and 3 others; mean age: 38.76 years), using convenience sampling. An online survey containing the Fear of Coronavirus scale, the Warwick–Edinburgh Mental Well-Being Scale and the Collectivism Scale was used. Xenophobia was assessed using two scales: generalized prejudice towards Muslims and specific xenophobic tendencies towards Muslims during COVID-19. The data were analysed using correlational methods and multiple regression. Results: The findings showed a significant positive relationship between well-being and age as well as collectivism, and a significant inverse relationship between well-being and fear of COVID-19. The results of the multiple regression analysis show that fear of COVID-19, age, collectivism and generalized xenophobia, in order of their importance, together accounted for nearly 20% of the variance in well-being. Conclusion: The findings reflect the importance of collectivism in enhancing well-being in these times of uncertainty. Xenophobia, one of the common offshoots of pandemics, can also harm overall well-being. Implications are discussed in the light of India’s diverse socio-religious background and the global context.


CJEM ◽  
2017 ◽  
Vol 19 (S1) ◽  
pp. S101
Author(s):  
K. Johns ◽  
S. Smith ◽  
E. Karreman ◽  
A. Kastelic

Introduction: Extended length of stay (LOS) in emergency departments (EDs) and overcrowding are problems for the Canadian healthcare system, which can lead to the creation of a healthcare access block, reduced health outcomes for acute care patients, and decreased satisfaction with the health care system. The goal of this study is to identify and assess specific factors that predict length of stay in EDs for those patients who fall in the highest LOS category. Methods: A total of 130 patient charts from EDs in Regina were reviewed. Charts included in this study were from the 90th-100th percentile of time-users, who were registered during February 2016, and were admitted to hospital from the ED. Patient demographic data and ED visit data were collected. T-tests and multiple regression analyses were conducted to identify any significant predictors of our outcome variable, LOS. Results: None of the demographic variables showed a significant relationship with LOS (age: p=.36; sex: p=.92; CTAS: p=.48), nor did most of the included ED visit data such as door to doctor time (p=.34) and time for imaging studies (X-ray: p=.56; ultrasound: p=.50; CT: p=.45). However, the time between the request for consult and the decision to admit did show a significant relationship with LOS (p<.01). Potential confounding variables analyzed were social work consult requests (p=.14), number of emergency visits on day of registration (p=.62), and hour of registration (00-12 or 12-24; p<.01). After adjustment for time of registration, using hierarchical multiple regression, time from consult request to admit decision remained a significant predictor (p<.01) of LOS. Conclusion: After adjusting for the influence of confounding factors, “consult request to admit decision” was by far the strongest predictor of LOS of all included variables in our study. The results of this study were limited to some extent by inconsistencies in the documentation of some of the analyzed metrics. Establishing standardized documentation could reduce this issue in future studies of this nature. Future areas of interest include establishing a standard reference for our variables, a further analysis into why consult requests are a major predictor, and how to alleviate this in the future.


2016 ◽  
Vol 5 (5) ◽  
pp. 16 ◽  
Author(s):  
Guolong Zhao

To evaluate a drug, statistical significance alone is insufficient; clinical significance is also necessary. This paper explains how to analyze clinical data considering both statistical and clinical significance. The analysis is carried out by combining a confidence interval under the null hypothesis with one under a non-null hypothesis. The combination conveys one of four possible results: (i) both significant, (ii) only significant in the former, (iii) only significant in the latter or (iv) neither significant. The four results constitute a quadripartite procedure. Corresponding tests are mentioned for describing Type I error rates and power. The empirical coverage is exhibited by Monte Carlo simulations. In superiority trials, the four results are interpreted as clinical superiority, statistical superiority, non-superiority and indeterminate, respectively. The interpretation is the opposite in inferiority trials. The combination entails a deflated Type I error rate, decreased power and an increased sample size. The four results may be helpful for a meticulous evaluation of drugs. Of these, non-superiority is another profile of equivalence, and so it can also be used to interpret equivalence. This approach may make discordant cases easier to interpret. Nevertheless, a larger data set is usually needed. An example is taken from a real trial in naturally acquired influenza.


1989 ◽  
Vol 38 (1-2) ◽  
pp. 93-114
Author(s):  
Indranil Mukhopadhyay

In this paper, nonparametric tests for the multiple regression setup, which take into account the known permutation symmetry of the variates under the null hypothesis, are suggested. It has been shown that under permutation symmetry the proposed procedure is more efficient than the standard nonparametric procedure for the multiple regression problem (see e.g. Puri and Sen (1985)). A special situation where it is further known in advance that the regressions are identical is also considered briefly.


2017 ◽  
Vol 35 (6_suppl) ◽  
pp. 234-234
Author(s):  
Elisa M. Ledet ◽  
Joshua Schiff ◽  
Patrick Cotogno ◽  
Charlotte Manogue ◽  
Emma M. Ernst ◽  
...  

234 Background: Cell-free DNA (cfDNA) present in the plasma of cancer pts can reflect tumoral alterations. Genomic alterations in cfDNA alter prognosis and abiraterone/enzalutamide resistance in mCRPC. The goal of this evaluation was to characterize AR amplifications (Amps) and various somatic point mutations (Muts) detected in mCRPC cfDNA and to relate those changes to other common alterations in the cfDNA landscape. Methods: A heterogeneous group of 46 mCRPC patients (pts) with evidence of clinical progression from Tulane Cancer Center underwent cfDNA analysis using the Guardant360 test (Guardant Health, Redwood City, CA). This evaluation included full exonic coverage of 70 genes and amplifications in 18 genes. Mutations reported herein include both known pathogenic mutations and mutations uncharacterized for functional importance. Results: 69.5% (n = 32) of the mCRPC pts evaluated had an AR alteration. Of the pts with AR alterations, 46.8% (n = 16) had AR Amps, 43.7% (n = 14) had AR Muts, and only 6.25% (n = 2) had both. In this cohort, AR alterations were the most commonly observed aberration. In addition to amplifications, 12 different AR Muts were detected. AR Muts included: T878A (n = 9), H875Y (n = 5), W742C (n = 4), AR L702H (n = 3), and others. To better understand the relationship between AR alterations and other commonly detected cfDNA aberrations, associations between BRAF (35.5%), TP53 (46.7%), and MYC (22.2%) alterations and AR were assessed. Among these genes, TP53 alterations were all Muts and MYC alterations were all Amps. BRAF alterations were predominantly Amps (N = 15), though Muts were also detected (N = 6). Neither TP53 Muts nor MYC Amps were significantly associated with AR alterations. On the other hand, BRAF alterations were significantly associated with AR Amps (p = 0.041); 60% (9/15) of pts with AR Amps also had a BRAF alteration (odds ratio = 7.71, 95% CI 1.284-46.366). Conclusions: AR alterations in cfDNA impact both disease progression and response to therapy. Co-segregation of AR and BRAF alterations may have significant prognostic and therapeutic implications. Further research and a larger sample size are needed to further elucidate associations between the common somatic alterations detected in mCRPC.
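The abstract reports an odds ratio with a 95% confidence interval but not the full 2x2 table or the interval method behind it. As a hedged illustration of how such a figure is commonly obtained, the Python sketch below computes an odds ratio with a Wald (log-scale) interval. The function name `odds_ratio_wald` and the cell counts are hypothetical and are not the study's data, and the use of the Wald method is an assumption.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_wald(a, b, c, d, level=0.95):
    """Hypothetical helper: odds ratio and Wald CI for a 2x2 table.

                      outcome+   outcome-
        exposed          a          b
        unexposed        c          d

    The Wald interval is built on the log scale; it needs all cells > 0.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")

    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Made-up counts for illustration only (NOT taken from the abstract).
    or_, lo, hi = odds_ratio_wald(a=10, b=5, c=8, d=22)
    print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

With cells this small the interval is wide, which is consistent with the abstract's call for a larger sample. An exact method could be preferable to the Wald approximation in that setting.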


1974 ◽  
Vol 35 (3) ◽  
pp. 1271-1274
Author(s):  
Robert D. Abbott

The conclusions of Bernhardson and Fisher regarding the “direct contribution” of the social desirability scale value and the judged probability of occurrence in the population to the prediction of the proportion of respondents answering True to personality items were re-examined. New estimates of “direct contribution” were based upon multiple regression model comparisons, which emphasize the common “contribution” between the judged probability of occurrence and social desirability and are not influenced by the ordering of variables in the regression equation.

