A statistical correction is in order

2007 ◽  
Vol 88 (3) ◽  
pp. 768
Author(s):  
Taufiek Konrad Rajab ◽  
Christian Wallwiener ◽  
Markus Wallwiener ◽  
Bernhard Kraemer
GigaScience ◽  
2020 ◽  
Vol 9 (11) ◽  
Author(s):  
Alexandra J Lee ◽  
YoSon Park ◽  
Georgia Doing ◽  
Deborah A Hogan ◽  
Casey S Greene

Abstract

Motivation: In the past two decades, scientists in different laboratories have assayed gene expression from millions of samples. These experiments can be combined into compendia and analyzed collectively to extract novel biological patterns. Technical variability, or "batch effects," may result from combining samples collected and processed at different times and in different settings. Such variability may distort our ability to extract true underlying biological patterns. As more integrative analysis methods arise and data collections get bigger, we must determine how technical variability affects our ability to detect desired patterns when many experiments are combined.

Objective: We sought to determine the extent to which an underlying signal was masked by technical variability by simulating compendia comprising data aggregated across multiple experiments.

Method: We developed a generative multi-layer neural network to simulate compendia of gene expression experiments from large-scale microbial and human datasets. We compared simulated compendia before and after introducing varying numbers of sources of undesired variability.

Results: The signal from a baseline compendium was obscured when the number of added sources of variability was small. Applying statistical correction methods rescued the underlying signal in these cases. However, as the number of sources of variability increased, it became easier to detect the original signal even without correction. In fact, statistical correction reduced our power to detect the underlying signal.

Conclusion: When combining a modest number of experiments, it is best to correct for experiment-specific noise. However, when many experiments are combined, statistical correction reduces our ability to extract underlying patterns.
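The idea of adding sources of technical variability to a baseline compendium and then correcting for them can be illustrated with a toy sketch. This is not the authors' method: the abstract describes a generative multi-layer neural network and unnamed statistical correction methods, whereas the code below uses synthetic Gaussian data and plain per-batch mean centering as a stand-in. The function names `add_batch_effects` and `center_by_batch`, the round-robin batch assignment, and all sizes are assumptions made for illustration only.

```python
import random
from statistics import mean


def add_batch_effects(data, n_batches, scale=1.0, seed=0):
    """Assign samples to batches round-robin and add one random per-gene shift per batch."""
    rng = random.Random(seed)
    n_genes = len(data[0])
    # One additive shift vector per batch (hypothetical noise model).
    shifts = [[rng.gauss(0.0, scale) for _ in range(n_genes)]
              for _ in range(n_batches)]
    labels = [i % n_batches for i in range(len(data))]
    noisy = [[x + s for x, s in zip(row, shifts[b])]
             for row, b in zip(data, labels)]
    return noisy, labels


def center_by_batch(data, labels):
    """Subtract each batch's per-gene mean; a crude stand-in for batch correction."""
    n_genes = len(data[0])
    corrected = [row[:] for row in data]
    for b in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == b]
        for g in range(n_genes):
            m = mean(data[i][g] for i in idx)
            for i in idx:
                corrected[i][g] -= m
    return corrected


# Synthetic "baseline compendium": 60 samples x 20 genes.
rng = random.Random(42)
baseline = [[rng.gauss(0.0, 1.0) for _ in range(20)] for _ in range(60)]

noisy, labels = add_batch_effects(baseline, n_batches=3)
corrected = center_by_batch(noisy, labels)
```

Mean centering removes the additive shift from each batch, but it also removes any genuine difference in batch means, which is one simple way to see how a correction step can discard signal along with noise.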


2015 ◽  
Vol 80 ◽  
pp. 76-80 ◽  
Author(s):  
H.-P. Gänser ◽  
J. Maierhofer ◽  
T. Christiner

2020 ◽  
Vol 25 (9) ◽  
pp. 05020032
Author(s):  
Enrica Perra ◽  
Francesco Viola ◽  
Roberto Deidda ◽  
Domenico Caracciolo ◽  
Claudio Paniconi ◽  
...  

2018 ◽  
Vol 49 (1) ◽  
pp. 341-348 ◽  
Author(s):  
Benjamin F. Jarvis

This comment reconsiders advice offered by Bruch and Mare regarding sampling choice sets in conditional logistic regression models of residential mobility. Contradicting Bruch and Mare’s advice, past econometric research shows that no statistical correction is needed when using simple random sampling of unchosen alternatives to pare down respondents’ choice sets. Using data on stated residential preferences contained in the Los Angeles portion of the Multi-City Study of Urban Inequality, it is shown that following Bruch and Mare’s advice—to implement a statistical correction for simple random choice set sampling—leads to biased coefficient estimates. This bias is all but eliminated if the sampling correction is omitted.
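The econometric result referred to here is usually attributed to McFadden's work on sampling of alternatives in conditional logit models. A sketch of the argument follows; the notation ($V_j$ for the systematic utility of alternative $j$, $D$ for the sampled choice set, $\pi(D \mid j)$ for the probability of drawing $D$ given that $j$ was chosen, $N$ for the full number of alternatives, $J$ for the size of $D$) is assumed for illustration and is not taken from the comment itself.

\[
P(i \mid D) \;=\; \frac{\exp\bigl(V_i + \ln \pi(D \mid i)\bigr)}{\sum_{j \in D} \exp\bigl(V_j + \ln \pi(D \mid j)\bigr)}
\]

If the chosen alternative is always retained and the remaining $J-1$ members of $D$ are drawn by simple random sampling from the $N-1$ unchosen alternatives, then

\[
\pi(D \mid j) \;=\; \binom{N-1}{J-1}^{-1} \quad \text{for every } j \in D,
\]

so the term $\ln \pi(D \mid j)$ is the same constant in the numerator and in every summand of the denominator, and it cancels:

\[
P(i \mid D) \;=\; \frac{\exp(V_i)}{\sum_{j \in D} \exp(V_j)}.
\]

Under this sampling scheme the uncorrected conditional logit likelihood on the sampled choice sets therefore already yields consistent coefficient estimates.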


2015 ◽  
Vol 11 (8) ◽  
pp. 1027-1047 ◽  
Author(s):  
Y. Brugnara ◽  
R. Auchmann ◽  
S. Brönnimann ◽  
R. J. Allan ◽  
I. Auer ◽  
...  

Abstract. The eruption of Mount Tambora (Indonesia) in April 1815 is the largest documented volcanic eruption in history. It is associated with a large global cooling during the following year, felt particularly in parts of Europe and North America, where the year 1816 became known as the "year without a summer". This paper describes an effort made to collect surface meteorological observations from the early instrumental period, with a focus on the years of and immediately following the eruption (1815–1817). Although the collection aimed in particular at pressure observations, corresponding temperature observations were also recovered. Some of the series had already been described in the literature, but a large part of the data, recently digitised from original weather diaries and contemporary magazines and newspapers, is presented here for the first time. The collection puts together more than 50 sub-daily series from land observatories in Europe and North America and from ships in the tropics. The pressure observations have been corrected for temperature and gravity and reduced to mean sea level. Moreover, an additional statistical correction was applied to take into account common error sources in mercury barometers. To assess the reliability of the corrected data set, the variance in the pressure observations is compared with modern climatologies, and single observations are used for synoptic analyses of three case studies in Europe. All raw observations will be made available to the scientific community in the International Surface Pressure Databank.

