The nuts and bolts of hypothesis testing

2015 ◽  
Vol 3 (3) ◽  
pp. 139-144 ◽  
Author(s):  
Stephanie L. Pugh ◽  
Annette Molinaro

Abstract When reading an article published in a medical journal, statistical tests are frequently mentioned and their results are often supported by a P value. What are these tests? What is a P value, and what does it mean? P values are used to interpret the result of a statistical test, and both are intrinsic parts of hypothesis testing, a decision-making tool based on probability. Most medical and epidemiological studies are designed around a hypothesis test, so understanding the key principles of hypothesis testing is crucial to interpreting the results of a study. From null and alternative hypotheses to the issue of multiple testing, this paper introduces the concepts that are crucial to the implementation and interpretation of hypothesis tests.
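A minimal sketch of the workflow this abstract outlines: state the null and alternative hypotheses, run a test, and compare the p-value with a pre-specified significance level. The two-sample t-test and the outcome data below are illustrative assumptions, not material from the paper.

```python
# Hedged sketch of the basic hypothesis-testing workflow:
# state H0 and HA, run a test, compare the p-value with alpha.
from scipy.stats import ttest_ind

control   = [5.1, 4.8, 5.5, 5.0, 4.9, 5.2, 5.3, 4.7]   # hypothetical outcome, group A
treatment = [5.9, 6.1, 5.7, 6.0, 5.8, 6.2, 5.6, 6.3]   # hypothetical outcome, group B

alpha = 0.05                        # pre-specified Type I error rate
stat, pvalue = ttest_ind(treatment, control)

# H0: equal means; HA: means differ. Reject H0 only if p < alpha.
print(f"t = {stat:.2f}, p = {pvalue:.4f}, reject H0: {pvalue < alpha}")
```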

2021 ◽  
Author(s):  
Huseyin Duman ◽  
Doğan Uğur Şanlı

In the analysis of GNSS time series, when the sampling frequency and time-series lengths are almost identical, a linear relationship can be established between the series repeatabilities (i.e. WRMS) and the noise magnitudes. In the literature, linear equations expressed as a function of the WRMS have allowed many researchers to estimate noise magnitudes. However, these equations were built on an assumption of homoskedasticity, and we found that the higher the WRMS, the more erroneous the analysis results obtained from the noise magnitudes given by such linear equations. We therefore studied whether homoskedasticity adequately describes the modeling errors. To test this, we used published GPS baseline-component results from previous work in the literature, noting that each component forms part of the totality, and introduced all baseline-component results as a whole into the statistical analysis to check for heteroskedasticity. We established the null and alternative hypotheses that the residuals are homoskedastic (H0) or heteroskedastic (HA). We applied both the Breusch-Pagan test and the Goldfeld-Quandt test and obtained p-values for both methods. The p-value is almost zero for both tests, so we reject the null hypothesis. Consequently, we can confidently state that the relationship between the WRMS and the noise magnitudes is heteroskedastic.

Keywords: noise magnitudes, repeatabilities, heteroskedasticity, time-series analysis
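A minimal sketch of how residuals of a WRMS-versus-noise-magnitude fit can be checked for heteroskedasticity with the Breusch-Pagan and Goldfeld-Quandt tests. The synthetic data below are placeholders, not the GPS baseline results analysed in the abstract.

```python
# Hedged sketch: Breusch-Pagan and Goldfeld-Quandt heteroskedasticity tests
# on the residuals of a simple linear fit (synthetic placeholder data).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan, het_goldfeldquandt

rng = np.random.default_rng(0)
wrms = rng.uniform(1.0, 10.0, 200)                 # repeatabilities (illustrative units)
noise = 0.8 * wrms + rng.normal(0, 0.3 * wrms)     # error magnitude grows with WRMS

X = sm.add_constant(wrms)
fit = sm.OLS(noise, X).fit()

lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(fit.resid, X)
gq_stat, gq_pvalue, _ = het_goldfeldquandt(noise, X)

print(f"Breusch-Pagan p-value : {lm_pvalue:.3g}")
print(f"Goldfeld-Quandt p-value: {gq_pvalue:.3g}")
# Small p-values -> reject H0 of homoskedastic residuals.
```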


Entropy ◽  
2020 ◽  
Vol 22 (6) ◽  
pp. 630 ◽  
Author(s):  
Boris Ryabko

The problem of constructing effective statistical tests for random number generators (RNGs) is considered. Currently there are hundreds of RNG statistical tests, which are often combined into so-called batteries, each containing from a dozen to more than one hundred tests. When a battery is used, it is applied to a sequence generated by the RNG, and the calculation time is determined by the length of the sequence and the number of tests. Generally speaking, the longer the sequence, the smaller the deviations from randomness that a given test can detect. Thus, when a battery is applied, on the one hand, the “better” the tests in the battery, the greater the chance of rejecting a “bad” RNG; on the other hand, the larger the battery, the less time can be spent on each test and, therefore, the shorter the test sequence, which in turn reduces the ability to find small deviations from randomness. To ease this trade-off, we propose an adaptive way to use batteries (and other sets) of tests that requires less time but, in a certain sense, preserves the power of the original battery. We call this method a time-adaptive battery of tests. The suggested method is based on a theorem describing the asymptotic properties of the so-called p-values of tests. Namely, the theorem claims that, if the RNG can be modeled by a stationary ergodic source, the value $-\log \pi(x_1 x_2 \ldots x_n)/n$ goes to $1 - h$ as $n$ grows, where $x_1 x_2 \ldots$ is the sequence, $\pi(\cdot)$ is the p-value of the most powerful test, and $h$ is the limit Shannon entropy of the source.
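A small numerical illustration of the theorem's statement, under two assumptions that are not from the paper: the source is a biased coin, and the exact binomial test stands in for the most powerful test.

```python
# Hedged sketch: for a Bernoulli(q) source, -log2(p-value)/n should approach
# 1 - h, where h is the source's Shannon entropy in bits.
import numpy as np
from scipy.stats import binomtest

q, n = 0.6, 10_000
rng = np.random.default_rng(1)
x = rng.random(n) < q                      # simulated "RNG" output: a biased coin
k = int(x.sum())

pvalue = binomtest(k, n, p=0.5).pvalue     # test against the fair-coin null
h = -(q * np.log2(q) + (1 - q) * np.log2(1 - q))

print(f"-log2(p-value)/n = {-np.log2(pvalue) / n:.4f}")
print(f"1 - h            = {1 - h:.4f}")   # the two values should be close
```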


2021 ◽  
Vol 8 (2) ◽  
pp. 136-144
Author(s):  
Dian Zuiatna ◽  
Elvi Era Liesmayani ◽  
Reni Julia Tan

Anemia is one of the threats that can harm pregnant women and their fetuses. In Indonesia, according to the 2013 Riskesdas results, the prevalence of anemia in pregnant women was 37.1%. The purpose of this study was to determine the effect of spinach juice on increasing hemoglobin levels in first- and second-trimester pregnant women at the Pratama Niar Clinic in 2020. The research design was a quasi-experiment using a one-group pretest-posttest approach. The study was conducted in September 2020 with a sample of 10 people. The statistical analysis used the paired-sample t-test, which yielded a p-value of 0.000 < 0.05, indicating that giving spinach juice has an effect on increasing hemoglobin levels in first- and second-trimester pregnant women. Based on these results, giving spinach juice at the Pratama Niar Clinic in 2020 increased hemoglobin levels in first- and second-trimester pregnant women.   Keywords: Spinach Juice, Hb, Pregnant Women
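A minimal sketch of a one-group pretest-posttest analysis with a paired-sample t-test, the test named in the abstract. The hemoglobin values below are hypothetical and are not the study's data.

```python
# Hedged sketch: paired t-test on before/after hemoglobin measurements
# (made-up values for illustration only).
from scipy.stats import ttest_rel

hb_before = [9.8, 10.1, 10.4, 9.6, 10.0, 10.2, 9.9, 10.3, 9.7, 10.5]   # g/dL
hb_after  = [10.6, 10.9, 11.0, 10.4, 10.8, 11.1, 10.5, 11.2, 10.3, 11.3]

result = ttest_rel(hb_after, hb_before)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
# p < 0.05 -> reject H0 of no mean change in hemoglobin after the intervention.
```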


2019 ◽  
Vol 72 (04) ◽  
pp. 931-947
Author(s):  
Priyanka L. Lineswala ◽  
Darshna D. Jagiwala ◽  
Shweta N. Shah

The Navigation with Indian Constellation (NavIC)/Indian Regional Navigation Satellite System (IRNSS) is an emerging satellite navigation system that provides independent positioning and timing services in India and up to 1,500 km from its borders. The dual-frequency NavIC system uses the L5 frequency and the S-band for navigation. These navigation signals are extremely weak and susceptible to interference when they are received at the Earth's surface. Moreover, the performance of these bands may be degraded by other in-band or out-of-band communication systems, which can become a major threat to the performance of a NavIC receiver. The main focus of this paper is the detection of real-time Wi-Fi interference in the S-band of a NavIC receiver. Results are presented for the power spectral density (PSD), the execution of the acquisition stage, and the detection of Wi-Fi interference with two-sample hypothesis testing methods, namely the Kolmogorov-Smirnov (KS) test, the t-test and the variance (var) test. The p-value is used to measure the evidence for the existence of interference, and the hypothesis-test decision and the probability of detection are evaluated for each method. The results show the severity of the Wi-Fi signal as a potential source of interference for future NavIC applications.
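A minimal sketch of the three two-sample tests named in the abstract, applied to synthetic receiver samples. The data, and the use of a simple F-test as the "var-test", are illustrative assumptions rather than the paper's actual processing chain.

```python
# Hedged sketch: KS test, t-test, and an F-test of equal variances applied to
# synthetic "clean" vs. "Wi-Fi-contaminated" receiver samples.
import numpy as np
from scipy.stats import ks_2samp, ttest_ind, f

rng = np.random.default_rng(2)
clean = rng.normal(0.0, 1.0, 2000)           # hypothetical interference-free samples
contaminated = rng.normal(0.2, 1.6, 2000)    # shifted mean, inflated variance

ks_stat, ks_p = ks_2samp(clean, contaminated)
t_stat, t_p = ttest_ind(clean, contaminated, equal_var=False)

# Two-sided F-test of equal variances, used here as a stand-in for the var-test.
f_stat = np.var(contaminated, ddof=1) / np.var(clean, ddof=1)
df = (len(contaminated) - 1, len(clean) - 1)
f_p = 2 * min(f.cdf(f_stat, *df), f.sf(f_stat, *df))

print(f"KS  p = {ks_p:.3g}")
print(f"t   p = {t_p:.3g}")
print(f"var p = {f_p:.3g}")   # small p-values flag the presence of interference
```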


2018 ◽  
Vol 1 (1) ◽  
pp. 20-30
Author(s):  
Vigih Hery Kristanto

The purpose of this research was to determine whether the use of Multiple Intelligences-based Lesson Plans can significantly improve students' mathematics learning achievement. This is quantitative research, so statistical analysis was used. The statistical test used for hypothesis testing was the non-parametric Wilcoxon method. The instrument was a mathematics achievement test whose validity was confirmed by a validator. The hypothesis test gave Zobs = 4.9961, with critical region DK = {Zobs | Zobs > 1.645}, leading to the decision to reject H0. It is therefore concluded that the use of Multiple Intelligences-based Lesson Plans can significantly improve students' mathematics achievement.
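A minimal sketch of a Wilcoxon signed-rank test on pre/post achievement scores, the test named in the abstract. The scores below are hypothetical; the study's Zobs = 4.9961 came from its own data.

```python
# Hedged sketch: one-sided Wilcoxon signed-rank test on paired achievement scores
# (hypothetical values for illustration only).
from scipy.stats import wilcoxon

pre  = [55, 60, 48, 62, 58, 50, 65, 59, 61, 52]
post = [65, 72, 66, 75, 69, 64, 81, 68, 78, 67]

stat, pvalue = wilcoxon(post, pre, alternative='greater')
print(f"W = {stat}, p = {pvalue:.4f}")
# p < 0.05 (equivalently, a Z statistic beyond the one-sided 1.645 critical
# value) -> reject H0 of no improvement.
```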


Author(s):  
Richard McCleary ◽  
David McDowall ◽  
Bradley J. Bartos

Chapter 6 addresses the sub-category of internal validity that Shadish et al. define as statistical conclusion validity, or “validity of inferences about the correlation (covariance) between treatment and outcome.” The common threats to statistical conclusion validity can arise, or become plausible, through either model misspecification or hypothesis testing. The risk of a serious model misspecification is inversely proportional to the length of the time series, for example, and so is the risk of misstating the Type I and Type II error rates. Threats to statistical conclusion validity arise from both the classical and the modern hybrid significance testing structures; the serious threats that weigh heavily on p-value tests are shown to be undefined in Bayesian tests. While the particularly vexing threats raised by modern null hypothesis testing are resolved by eliminating the modern null hypothesis test, threats to statistical conclusion validity would inevitably persist and new threats would arise.


2015 ◽  
Vol 4 (1) ◽  
Author(s):  
João M. C. Santos Silva ◽  
Silvana Tenreyro ◽  
Frank Windmeijer

Abstract In economic applications it is often the case that the variate of interest is non-negative and its distribution has a mass point at zero. Many regression strategies have been proposed to deal with data of this type but, although there has been a long debate in the literature on the appropriateness of different models, formal statistical tests to choose between the competing specifications are not often used in practice. We use the non-nested hypothesis testing framework of Davidson and MacKinnon (Davidson and MacKinnon 1981. “Several Tests for Model Specification in the Presence of Alternative Hypotheses.”
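A minimal sketch of the Davidson-MacKinnon non-nested (J-test) idea: augment each specification with the fitted values of its rival and test whether the added term is significant. The two linear models and the data below are placeholders, not the specifications compared in the paper.

```python
# Hedged sketch of the J-test idea for two non-nested linear specifications
# (illustrative data; not the paper's models for non-negative outcomes).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 500
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)        # data generated from "model A"

XA = sm.add_constant(np.column_stack([x1]))    # model A: y ~ x1
XB = sm.add_constant(np.column_stack([x2]))    # model B: y ~ x2

yhat_B = sm.OLS(y, XB).fit().fittedvalues
yhat_A = sm.OLS(y, XA).fit().fittedvalues

# J-test of A against B: add B's fitted values to A and test their coefficient
# (and vice versa). A small p-value is evidence against the maintained model.
fit_A_aug = sm.OLS(y, np.column_stack([XA, yhat_B])).fit()
fit_B_aug = sm.OLS(y, np.column_stack([XB, yhat_A])).fit()

print(f"A maintained, B's fit added: p = {fit_A_aug.pvalues[-1]:.3g}")
print(f"B maintained, A's fit added: p = {fit_B_aug.pvalues[-1]:.3g}")
```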


2019 ◽  
Vol 86 (12) ◽  
pp. 773-783 ◽  
Author(s):  
Katy Klauenberg ◽  
Clemens Elster

Abstract In metrology, the normal distribution is often taken for granted, e.g. when evaluating the result of a measurement and its uncertainty, or when establishing the equivalence of measurements in key or supplementary comparisons. The correctness of this inference and of subsequent conclusions depends on the normality assumption, so a validation of this assumption is essential. Hypothesis testing is the formal statistical framework for doing so, and this introduction describes how statistical tests detect violations of a distributional assumption. In the metrological context we advise on how to select such a hypothesis test, how to set it up, how to perform it, and which conclusion(s) can be drawn. In addition, we calculate the number of measurements needed to decide whether a process departs from a normal distribution and quantify how sure one can then be about this decision. These aspects are illustrated for the powerful Shapiro-Wilk test and by an example in legal metrology. For this application we recommend performing 330 measurements. We also briefly touch upon the issues of multiple testing and rounded measurements.
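A minimal sketch of a Shapiro-Wilk normality check on a sample of 330 measurements, the sample size recommended in the abstract. The measurement values are simulated here for illustration.

```python
# Hedged sketch: Shapiro-Wilk test for normality on n = 330 simulated readings.
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(4)
measurements = rng.normal(loc=100.0, scale=0.5, size=330)   # simulated measurements

stat, pvalue = shapiro(measurements)
print(f"W = {stat:.4f}, p = {pvalue:.3f}")
# A small p-value is evidence that the process departs from a normal distribution;
# a large p-value does not prove normality, it only fails to detect a departure.
```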


2020 ◽  
Author(s):  
Noah N'Djaye Nikolai van Dongen ◽  
Eric-Jan Wagenmakers ◽  
Jan Sprenger

A tradition that goes back to Karl R. Popper assesses the value of a statistical test primarily by its severity: was it an honest and stringent attempt to prove the theory wrong? For "error statisticians" such as Deborah Mayo (1996, 2018), and for frequentists more generally, severity is a key virtue in hypothesis tests. Conversely, failure to incorporate severity into statistical inference, as allegedly happens in Bayesian inference, counts as a major methodological shortcoming. Our paper pursues a double goal. First, we argue that the error-statistical explication of severity has substantive drawbacks (i.e., neglect of research context, lack of connection to the specificity of predictions, and problematic similarity of degrees of severity to one-sided p-values). Second, we argue that severity matters for Bayesian inference via the value of specific, risky predictions: severity boosts the expected evidential value of a Bayesian hypothesis test. We illustrate severity-based reasoning in Bayesian statistics by means of a practical example and discuss its advantages and potential drawbacks.
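A small illustration, under assumed normal data and priors that are not from the paper, of the point that a more specific, riskier prediction earns a larger Bayes factor when the data bear it out.

```python
# Hedged sketch: Bayes factors for a point null (mu = 0) against two rival priors
# on mu -- a specific "risky" prior near the true effect and a vague prior.
# All distributions and numbers are illustrative assumptions.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
sigma, n, true_mu = 1.0, 50, 0.5
se = sigma / np.sqrt(n)
xbar = rng.normal(true_mu, se)                 # observed sample mean

def bf10(xbar, prior_mean, prior_sd):
    """BF for H1: mu ~ N(prior_mean, prior_sd^2) vs H0: mu = 0 (normal mean, known sigma)."""
    m1 = norm.pdf(xbar, loc=prior_mean, scale=np.sqrt(se**2 + prior_sd**2))
    m0 = norm.pdf(xbar, loc=0.0, scale=se)
    return m1 / m0

print(f"specific prior N(0.5, 0.1^2): BF10 = {bf10(xbar, 0.5, 0.1):.1f}")
print(f"vague prior    N(0.0, 3.0^2): BF10 = {bf10(xbar, 0.0, 3.0):.1f}")
# The sharper prediction, when confirmed, yields stronger evidence against H0.
```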

