Use of two-point models in “Model choice in time-series studies of air pollution and mortality”

2020 ◽  
Vol 13 (2) ◽  
pp. 225-232 ◽  
Author(s):  
Mieczysław Szyszkowicz

Abstract In this work, a new technique is proposed to study short-term exposure and adverse health effects. The presented approach uses hierarchical clusters with the following structure: each pair of two consecutive days within a year is embedded in that year. This yields 183 clusters per year with the embedded structure <year:2 days>. Time-series analysis is conducted using a conditional Poisson regression with the constructed clusters as strata. Unmeasured confounders such as seasonal and long-term trends are not modelled but are controlled by the structure of the clusters. The proposed technique is illustrated using four freely accessible databases, which contain complex simulated data and are available as compressed R workspace files. With the presented methodology, results based on the simulated data were very close to the true values. In addition, the case-crossover method with 1-month and 2-week windows, and a conditional Poisson regression with 3-day clusters as strata, were also applied to the simulated data. Difficulties (a high type I error rate) were observed for the case-crossover method in the presence of high concurvity in the simulated data. The proposed methods, using various forms of strata, were further applied to the Chicago mortality data. The methods considered often yield different qualitative and quantitative estimates.
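
A minimal sketch of the clustering and model described above, not the paper's own code: it assumes a pandas DataFrame `df` with illustrative column names `date` (datetime), `deaths` (daily counts) and `pm25` (exposure), and uses stratum indicators in an ordinary Poisson GLM as a stand-in for the conditional Poisson likelihood.

```python
# Two-day clusters within each year used as strata for a Poisson time-series model.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

def fit_two_day_cluster_model(df: pd.DataFrame):
    df = df.sort_values("date").copy()
    df["year"] = df["date"].dt.year
    # Pair consecutive days within each year: days 1-2 -> pair 0, days 3-4 -> pair 1, ...
    df["pair"] = df.groupby("year").cumcount() // 2
    df["stratum"] = df["year"].astype(str) + ":" + df["pair"].astype(str)
    # Stratum indicators absorb seasonality and long-term trends instead of
    # modelling them explicitly with time splines.
    model = smf.glm("deaths ~ pm25 + C(stratum)", data=df,
                    family=sm.families.Poisson())
    return model.fit()
```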

2000 ◽  
Vol 14 (1) ◽  
pp. 1-10 ◽  
Author(s):  
Joni Kettunen ◽  
Niklas Ravaja ◽  
Liisa Keltikangas-Järvinen

Abstract We examined the use of smoothing to enhance the detection of response coupling from the activity of different response systems. Three different types of moving average smoothers were applied to both simulated interbeat interval (IBI) and electrodermal activity (EDA) time series and to empirical IBI, EDA, and facial electromyography time series. The results indicated that progressive smoothing increased the efficiency of the detection of response coupling but did not increase the probability of Type I error. The power of the smoothing methods depended on the response characteristics. The benefits and use of the smoothing methods to extract information from psychophysiological time series are discussed.
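
As a concrete illustration of the kind of smoother examined here, a minimal centered moving average in Python; the window length and the simulated interbeat-interval-like series are arbitrary choices, not values taken from the study.

```python
import numpy as np

def moving_average(x, window=5):
    """Centered moving-average smoother for an evenly sampled series.

    `window` is the number of samples averaged; larger windows give
    progressively heavier smoothing.
    """
    kernel = np.ones(window) / window
    # mode="same" keeps the output aligned with the input; edge values are attenuated.
    return np.convolve(x, kernel, mode="same")

# Example: smooth a noisy simulated IBI-like series.
rng = np.random.default_rng(0)
ibi = 800 + 50 * np.sin(np.linspace(0, 6 * np.pi, 300)) + rng.normal(0, 20, 300)
ibi_smooth = moving_average(ibi, window=9)
```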


Author(s):  
Luan L. Lee ◽  
Miguel G. Lizarraga ◽  
Natanael R. Gomes ◽  
Alessandro L. Koerich

This paper describes a prototype for Brazilian bankcheck recognition. The description is divided into three topics: bankcheck information extraction, digit amount recognition and signature verification. In bankcheck information extraction, our algorithms provide signature and digit amount images free of background patterns and bankcheck printed information. In digit amount recognition, we deal with digit amount segmentation and the implementation of a complete numeral character recognition system involving image processing, feature extraction and neural classification. In signature verification, we designed and implemented a static signature verification system suitable for banking and commercial applications. Our signature verification algorithm is capable of detecting simple, random and skilled forgeries. The proposed automatic bankcheck recognition prototype was intensively tested on real bankcheck data as well as simulated data, providing the following performance results: for skilled forgeries, a 4.7% equal error rate; for random forgeries, zero Type I error and a 7.3% Type II error; for bankcheck numerals, a 92.7% correct recognition rate.
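
The abstract reports an equal error rate for skilled forgeries; as a hedged illustration (not the authors' implementation), an equal error rate can be estimated from verification scores roughly as follows, assuming higher scores indicate a more likely genuine signature.

```python
import numpy as np

def equal_error_rate(genuine_scores, forgery_scores):
    """Threshold sweep to find where false rejection of genuine signatures
    (Type I) and false acceptance of forgeries (Type II) are closest."""
    genuine_scores = np.asarray(genuine_scores, dtype=float)
    forgery_scores = np.asarray(forgery_scores, dtype=float)
    thresholds = np.sort(np.concatenate([genuine_scores, forgery_scores]))
    best_gap, eer = np.inf, None
    for t in thresholds:
        frr = np.mean(genuine_scores < t)    # genuine rejected (Type I error)
        far = np.mean(forgery_scores >= t)   # forgery accepted (Type II error)
        if abs(frr - far) < best_gap:
            best_gap, eer = abs(frr - far), (frr + far) / 2
    return eer
```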


2017 ◽  
Vol 28 (4) ◽  
pp. 1157-1169 ◽  
Author(s):  
Hua He ◽  
Hui Zhang ◽  
Peng Ye ◽  
Wan Tang

Excessive zeros are common in practice and may cause overdispersion and invalidate inference when fitting Poisson regression models. There is a large body of literature on zero-inflated Poisson models. However, methods for testing whether there are excessive zeros are less well developed. The Vuong test comparing a Poisson and a zero-inflated Poisson model is commonly applied in practice. However, the type I error of the test often deviates seriously from the nominal level, casting serious doubt on the validity of the test in such applications. In this paper, we develop a new approach for testing for inflated zeros under the Poisson model. Unlike the Vuong test for inflated zeros, our method does not require fitting a zero-inflated Poisson model to perform the test. Simulation studies show that, compared with the Vuong test, our approach is not only better at controlling the type I error rate but also yields more power.
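
The abstract does not spell out the proposed test, so the sketch below is only a generic illustration of checking for inflated zeros without fitting a zero-inflated model: fit an ordinary Poisson regression and compare the observed number of zeros with the zero counts of data simulated from the fitted model (a parametric bootstrap). Function and variable names are illustrative.

```python
import numpy as np
import statsmodels.api as sm

def excess_zero_pvalue(y, X, n_sim=2000, seed=0):
    """Parametric-bootstrap check for inflated zeros under a Poisson regression."""
    rng = np.random.default_rng(seed)
    fit = sm.GLM(y, sm.add_constant(X), family=sm.families.Poisson()).fit()
    mu = fit.fittedvalues                       # fitted Poisson means
    observed = np.sum(np.asarray(y) == 0)
    simulated = np.array([np.sum(rng.poisson(mu) == 0) for _ in range(n_sim)])
    # One-sided p-value: chance of at least as many zeros under the Poisson fit.
    return (1 + np.sum(simulated >= observed)) / (n_sim + 1)
```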


Author(s):  
Mehdi Moradi ◽  
Manuel Montesino-SanMartin ◽  
M. Dolores Ugarte ◽  
Ana F. Militino

Abstract We propose an adaptive-sliding-window approach (LACPD) for the problem of change-point detection in a set of time-ordered observations. The proposed method is combined with sub-sampling techniques to compensate for the lack of enough data near the time series’ tails. Through a simulation study, we analyse its behaviour in the presence of an early/middle/late change-point in the mean, and compare its performance with some of the frequently used and recently developed change-point detection methods in terms of power, type I error probability, area under the ROC curve (AUC), absolute bias, variance, and root-mean-square error (RMSE). We conclude that LACPD outperforms other methods by maintaining a low type I error probability. Unlike some other methods, the performance of LACPD does not depend on the time index of change-points, and it generally has lower bias than the alternative methods. Moreover, in terms of variance and RMSE, it outperforms other methods when change-points are close to the time series’ tails, whereas it shows similar (sometimes slightly poorer) performance to other methods when change-points are close to the middle of the time series. Finally, we apply our proposal to two sets of real data: the well-known example of the annual flow of the Nile river at Aswan, Egypt, from 1871 to 1970, and a novel remote sensing application consisting of a 34-year time series of satellite images of the Normalised Difference Vegetation Index in Wadi As-Sirham valley, Saudi Arabia, from 1986 to 2019. We conclude that LACPD shows good performance in detecting the presence of a change as well as the time and magnitude of change under real conditions.
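
LACPD itself is not detailed in the abstract; for intuition only, a basic sliding-window detector for a single change in the mean might look like the following (the window size and simulated series are arbitrary choices).

```python
import numpy as np

def sliding_window_mean_change(x, window=20):
    """Return the index with the largest standardized before/after mean difference.

    A simplified sliding-window stand-in for intuition, not the LACPD procedure.
    """
    x = np.asarray(x, dtype=float)
    best_t, best_stat = None, -np.inf
    for t in range(window, len(x) - window):
        left, right = x[t - window:t], x[t:t + window]
        se = np.sqrt(left.var(ddof=1) / window + right.var(ddof=1) / window)
        stat = abs(right.mean() - left.mean()) / se
        if stat > best_stat:
            best_t, best_stat = t, stat
    return best_t, best_stat

# Example: a change in the mean two-thirds of the way through a simulated series.
rng = np.random.default_rng(1)
series = np.concatenate([rng.normal(0, 1, 200), rng.normal(1.5, 1, 100)])
print(sliding_window_mean_change(series, window=30))
```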


2020 ◽  
Author(s):  
Corey Peltier ◽  
Reem Muharib ◽  
April Haas ◽  
Art Dowdy

Single-case research designs (SCRDs) are used to evaluate functional relations between an independent variable and dependent variable(s). SCRDs are frequently used when analyzing data related to autism spectrum disorder (ASD); namely, they allow for empirical evidence in support of practices that improve socially significant outcomes for individuals diagnosed with ASD. To determine a functional relation in SCRDs, a time-series graph is constructed and visual analysts evaluate data patterns. Preliminary evidence suggests that the approach used to scale the ordinate (i.e., y-axis) and the proportion of the x-axis length to the y-axis height relative to the number of data points (i.e., the data points per x- to y-axis ratio, DPPXYR) affect visual analysts’ decisions regarding a functional relation and the magnitude of treatment effect, resulting in an increased likelihood of Type I errors. The purpose of this systematic review was to evaluate all time-series graphs published in the last decade (i.e., 2010-2020) in four premier journals in the field of ASD: Journal of Autism and Developmental Disorders, Research in Autism Spectrum Disorders, Autism, and Focus on Autism and Other Developmental Disabilities. The systematic search yielded 348 articles including 2,675 graphs. We identified large variation across and within types of SCRDs in the standardized X:Y ratio and the DPPXYR. In addition, 73% of graphs were below a DPPXYR of 0.14, suggesting an elevated risk of Type I errors. A majority of graphs used an appropriate ordinate scaling method that would not increase Type I error rates. Implications for future research and practice are provided.


Mathematics ◽  
2018 ◽  
Vol 6 (11) ◽  
pp. 269 ◽  
Author(s):  
Sergio Camiz ◽  
Valério Pillar

The identification of a reduced-dimensional representation of the data is among the main issues of exploratory multidimensional data analysis, and several solutions have been proposed in the literature, depending on the method. Principal Component Analysis (PCA) is the method that has received the largest attention thus far, and several identification methods, the so-called stopping rules, have been proposed, giving very different results in practice, and some comparative studies have been carried out. Some inconsistencies in the previous studies led us to try to clarify the distinction between signal and noise in PCA, and its limits, and to propose a new testing method. This consists in producing simulated data according to a predefined eigenvalue structure, including zero eigenvalues. From random populations built according to several such structures, reduced-size samples were extracted, and different levels of random normal noise were added to them. This controlled introduction of noise allows a clear distinction between expected signal and noise, the latter relegated to the non-zero sample eigenvalues corresponding to zero eigenvalues in the population. With this new method, we tested the performance of ten different stopping rules. For every method, structure and noise level, both power (the ability to correctly identify the expected dimension) and type-I error (the detection of a dimension composed only of noise) were measured, by counting the relative frequencies with which the smallest non-zero eigenvalue in the population was recognized as signal in the samples and with which the largest zero eigenvalue was recognized as noise, respectively. This way, the behaviour of the examined methods is clear and their comparison/evaluation is possible. The reported results show that both the generalization of Bartlett’s test by Rencher and the Bootstrap method by Pillar perform much better than all the others: both show reasonable power, decreasing with noise, and very good type-I error control. Thus, more than the others, these methods deserve to be adopted.
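
A minimal sketch of the simulation design described above, with numpy and illustrative parameter choices: build a population covariance from a prescribed list of eigenvalues (zeros included), draw a reduced-size sample, add normal noise, and inspect the sample eigenvalues against the known signal dimension; a stopping rule would then be scored on whether it keeps the smallest non-zero population dimension and discards the largest zero one.

```python
import numpy as np

def simulate_pca_sample(eigenvalues, n=40, noise_sd=0.2, seed=0):
    """Draw a sample from a population with a prescribed eigenvalue structure
    (zero eigenvalues allowed) and add independent normal noise.

    Returns the noisy sample and its sample-covariance eigenvalues, so that a
    stopping rule can be checked against the known signal dimension.
    """
    rng = np.random.default_rng(seed)
    eig = np.asarray(eigenvalues, dtype=float)
    p = eig.size
    # Random orthonormal basis for the population principal axes.
    q, _ = np.linalg.qr(rng.normal(size=(p, p)))
    sample = rng.normal(size=(n, p)) @ np.diag(np.sqrt(eig)) @ q.T
    noisy = sample + rng.normal(0.0, noise_sd, size=sample.shape)
    sample_eigs = np.linalg.eigvalsh(np.cov(noisy, rowvar=False))[::-1]
    return noisy, sample_eigs

# Example: four signal dimensions and four zero-eigenvalue (noise-only) dimensions.
_, eigs = simulate_pca_sample([4, 3, 2, 1, 0, 0, 0, 0])
print(np.round(eigs, 2))
```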


2019 ◽  
pp. 014544551986021 ◽  
Author(s):  
Antonia R. Giannakakos ◽  
Marc J. Lanovaz

Single-case experimental designs often require extended baselines or the withdrawal of treatment, which may not be feasible or ethical in some practical settings. The quasi-experimental AB design is a potential alternative, but more research is needed on its validity. The purpose of our study was to examine the validity of using nonoverlap measures of effect size to detect changes in AB designs using simulated data. In our analyses, we determined thresholds for three effect size measures beyond which the type I error rate would remain below 0.05 and then examined whether using these thresholds would provide sufficient power. Overall, our analyses show that some effect size measures may provide adequate control over type I error rate and sufficient power when analyzing data from AB designs. In sum, our results suggest that practitioners may use quasi-experimental AB designs in combination with effect size to rigorously assess progress in practice.
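
As one concrete example of a nonoverlap effect size of the kind evaluated here (the paper's specific three measures and thresholds are not reproduced), the Nonoverlap of All Pairs (NAP) for an AB design can be computed as follows.

```python
import numpy as np

def nonoverlap_of_all_pairs(baseline, treatment):
    """Nonoverlap of All Pairs (NAP) for an AB design where higher values mean improvement.

    Compares every baseline (A) observation with every treatment (B) observation
    and returns the proportion of pairs in which treatment exceeds baseline,
    with ties counting as half.
    """
    a = np.asarray(baseline, dtype=float)
    b = np.asarray(treatment, dtype=float)
    diffs = b[None, :] - a[:, None]            # all A-phase vs B-phase pairs
    wins = np.sum(diffs > 0) + 0.5 * np.sum(diffs == 0)
    return wins / diffs.size

# Example AB series: 5 baseline and 7 treatment observations.
print(nonoverlap_of_all_pairs([2, 3, 3, 4, 2], [4, 5, 6, 5, 7, 6, 8]))
```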


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Farhad Hormozdiari ◽  
Junghyun Jung ◽  
Eleazar Eskin ◽  
Jong Wha J. Joo

Abstract In genome-wide association studies (GWAS), the standard association test is underpowered to detect associations at loci harbouring multiple causal variants with small effect sizes. We propose a statistical method, Model-based Association test Reflecting causal Status (MARS), that finds associations between variants in risk loci and a phenotype while considering the causal status of variants, requiring only existing summary statistics to detect associated risk loci. Utilizing extensive simulated data and real data, we show that MARS increases the power of detecting truly associated risk loci compared to previous approaches that consider multiple variants, while controlling the type I error.


2016 ◽  
Author(s):  
Aaron T. L. Lun ◽  
John C. Marioni

Abstract An increasing number of studies are using single-cell RNA-sequencing (scRNA-seq) to characterize the gene expression profiles of individual cells. One common analysis applied to scRNA-seq data involves detecting differentially expressed (DE) genes between cells in different biological groups. However, many experiments are designed such that the cells to be compared are processed in separate plates or chips, meaning that the groupings are confounded with systematic plate effects. This confounding aspect is frequently ignored in DE analyses of scRNA-seq data. In this article, we demonstrate that failing to consider plate effects in the statistical model results in loss of type I error control. A solution is proposed whereby counts are summed from all cells in each plate and the count sums for all plates are used in the DE analysis. This restores type I error control in the presence of plate effects without compromising detection power in simulated data. Summation is also robust to varying numbers and library sizes of cells on each plate. Similar results are observed in DE analyses of real data where the use of count sums instead of single-cell counts improves specificity and the ranking of relevant genes. This suggests that summation can assist in maintaining statistical rigour in DE analyses of scRNA-seq data with plate effects.
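
A minimal sketch of the summation strategy, assuming a pandas cells-by-genes count matrix and illustrative plate labels; the downstream count-based DE analysis on the plate-level sums is omitted.

```python
import numpy as np
import pandas as pd

def sum_counts_by_plate(counts, plate_labels):
    """Collapse a cells x genes count matrix into plate-level count sums.

    `counts` is a DataFrame (rows = cells, columns = genes) and `plate_labels`
    gives the plate of origin for each cell. The returned plates x genes matrix
    of summed counts is what would be passed to a conventional DE analysis, so
    that the plate, not the cell, is the experimental unit.
    """
    groups = pd.Series(plate_labels, index=counts.index)
    return counts.groupby(groups).sum()

# Example: 6 cells on 2 plates, 3 genes.
cells = pd.DataFrame(np.arange(18).reshape(6, 3),
                     columns=["geneA", "geneB", "geneC"])
plates = ["plate1", "plate1", "plate1", "plate2", "plate2", "plate2"]
print(sum_counts_by_plate(cells, plates))
```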


2018 ◽  
Author(s):  
Guoshuai Cai ◽  
Jennifer M. Franks ◽  
Michael L. Whitfield

Abstract Motivation: Various methods have been proposed for detecting differential expression (DE) in RNA-seq data, each with its own limitations. Some naive normal-based tests have low testing power and rely on normal distribution assumptions that are invalid for RNA-seq read counts, whereas count-based methods lack a biologically meaningful interpretation and have limited capability for integration with other analysis packages for mRNA abundance. In this study, we propose an improved method, RoMA, to accurately detect differential expression and unlock integration with upstream and downstream analyses of mRNA abundance in RNA-seq studies. Results: RoMA incorporates information from both mRNA abundance and raw counts. Studies on simulated data and two real datasets showed that RoMA provides an accurate quantification of mRNA abundance and a data-adjustment-tolerant DE analysis with high AUC, low FDR, and efficient control of the type I error rate. This study provides a valid strategy for mRNA abundance modeling and data analysis integration in RNA-seq studies, which will greatly facilitate the identification and interpretation of DE genes. Availability and implementation: RoMA is available at https://github.com/GuoshuaiCai/ Contact: [email protected] or [email protected]

