Objective algorithm to separate signal from noise in a Poisson-distributed pixel data set

2013 ◽  
Vol 46 (3) ◽  
pp. 663-671 ◽  
Author(s):  
Tine Straasø ◽  
Dirk Müter ◽  
Henning Osholm Sørensen ◽  
Jens Als-Nielsen

A statistical method to determine the background level and separate signal from background in a data set with a Poisson-distributed background is described. The algorithm eliminates the pixel with the highest intensity value in an iterative manner until the sample variance equals the sample mean within the estimated uncertainties. The eliminated pixels then contain signal superimposed on the background, so the integrated signal can be obtained by summation or, with a simple extension, by profile fitting, depending on the user's preferences. Two additional steps remove 'outliers' and correct for the underestimated extension of the peak area, respectively. The algorithm can easily be modified to specific needs, and an application to crystal truncation rods, which involves a sloping background, is presented.
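The core of the iteration described above can be sketched in a few lines of Python; the stopping tolerance below (an assumed estimate of the uncertainty of the sample variance) is illustrative and not the authors' exact criterion.

```python
import numpy as np

def separate_signal(pixels, k=1.0):
    """Iteratively remove the brightest pixel until the remaining values are
    consistent with a Poisson background (sample variance ~ sample mean within
    k times a rough uncertainty). Returns the background pixels, the removed
    (signal) pixels and the estimated background level."""
    data = np.sort(np.asarray(pixels, dtype=float))   # ascending order
    removed = []
    while data.size > 2:
        mean = data.mean()
        var = data.var(ddof=1)
        # Assumed form for the uncertainty of the variance estimate,
        # used here only as a stopping tolerance.
        sigma_var = mean * np.sqrt(2.0 / (data.size - 1))
        if abs(var - mean) <= k * sigma_var:
            break                     # remaining pixels look like pure background
        removed.append(data[-1])      # eliminate the highest-intensity pixel
        data = data[:-1]
    return data, np.array(removed), data.mean()
```

Following the abstract, the integrated signal would then be the sum of the removed pixels minus the background level times their number; the outlier-removal and peak-area-correction steps are not reproduced in this sketch.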

2021 ◽  
Vol 28 ◽  
pp. 146-150
Author(s):  
L. A. Atramentova

Using data obtained in a cytogenetic study as an example, we consider typical errors made when performing statistical analysis. Widespread but flawed statistical practice inevitably produces biased results and increases the likelihood of incorrect scientific conclusions. Errors occur when the study design and the structure of the analyzed data are not taken into account. The article shows how a numerical imbalance in the data set biases the result and, using the same data set, explains how to balance the complex. It demonstrates the advantage of presenting sample estimates with confidence intervals instead of standard errors, and draws attention to the need to consider the size of the analyzed proportions when choosing a statistical method. It also shows how the same data set can be analyzed in different ways depending on the purpose of the study. An algorithm for correct statistical analysis and a tabular format for presenting the results are described.
Keywords: data structure, numerically unbalanced complex, confidence interval.
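As a minimal illustration of the point about confidence intervals, the Python sketch below (with invented counts, not data from the study) contrasts the conventional standard error of a proportion with a Wilson confidence interval:

```python
import math

def wilson_ci(k, n, z=1.96):
    """Approximate 95% Wilson confidence interval for a proportion k/n."""
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Invented example counts: 12 aberrant cells out of 200 scored.
k, n = 12, 200
p = k / n
se = math.sqrt(p * (1 - p) / n)          # standard error of the proportion
lo, hi = wilson_ci(k, n)
print(f"p = {p:.3f} +/- {se:.3f} (SE);  95% CI = [{lo:.3f}, {hi:.3f}]")
```

Unlike p ± SE, the Wilson interval stays within [0, 1] and remains informative when the proportion is small, which is the situation the article warns about.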


Author(s):  
Wei Shen ◽  
Jianyong Wang ◽  
Ping Luo ◽  
Min Wang

Relation extraction from Web data has attracted a lot of attention recently. However, little work has been done on enterprise data, despite the urgent need for such work in real applications (e.g., E-discovery). One distinct characteristic of enterprise data (in comparison with Web data) is its low redundancy. Previous work on relation extraction from Web data largely relies on the data's high level of redundancy and thus cannot be applied effectively to enterprise data. This chapter reviews related work on relation extraction and introduces REACTOR, an unsupervised hybrid framework for semantic relation extraction over enterprise data. REACTOR combines a statistical method, classification, and clustering to automatically identify various types of relations among entities appearing in the enterprise data. REACTOR was evaluated on a real-world enterprise data set from HP that contains over three million pages, and the experimental results show its effectiveness.
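REACTOR's pipeline itself is not reproduced here, but the kind of clustering step it relies on can be illustrated generically: group the textual contexts shared by entity pairs and treat each cluster as a candidate relation type. The toy contexts and scikit-learn components below are assumptions of this illustration, not part of REACTOR.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Toy contexts between entity pairs (invented examples); in an enterprise
# setting these would come from sentences mentioning both entities.
contexts = [
    "X was appointed chief executive of Y",
    "X joined Y as senior engineer",
    "X acquired Y for an undisclosed sum",
    "X completed the takeover of Y",
]

vectors = TfidfVectorizer().fit_transform(contexts)          # bag-of-words features
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
for text, label in zip(contexts, labels):
    print(label, text)   # each cluster id stands for a candidate relation type
```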


1985 ◽  
Vol 15 (2) ◽  
pp. 103-121 ◽  
Author(s):  
William S. Jewell ◽  
Rene Schnieper

Credibility theory refers to the use of linear least-squares theory to approximate the Bayesian forecast of the mean of a future observation; families are known where the credibility formula is exact Bayesian. Second-moment forecasts are also of interest, for example, in assessing the precision of the mean estimate. For some of these same families, the second-moment forecast is exact in linear and quadratic functions of the sample mean. On the other hand, for the normal distribution with a normal-gamma prior on the mean and variance, the exact forecast of the variance is a linear function of the sample variance and the squared deviation of the sample mean from the prior mean. Bühlmann has given a credibility approximation to the variance in terms of the sample mean and sample variance.

In this paper, we present a unified approach to estimating both first and second moments of future observations using linear functions of the sample mean and two sample second moments; the resulting least-squares analysis requires the solution of a 3 × 3 linear system, using 11 prior moments from the collective, and gives joint predictions of all moments of interest. Previously developed special cases follow immediately. For many analytic models of interest, three-dimensional joint prediction is significantly better than independent forecasts using the “natural” statistics for each moment when the number of samples is small. However, the expected squared errors of the forecasts become comparable as the sample size increases.
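A minimal numerical sketch of the linear least-squares step described above: the forecast is a linear function of the three sample statistics, and its coefficients solve a 3 × 3 system built from prior (collective) moments. All numbers below are placeholders, not values from the paper.

```python
import numpy as np

# Placeholder prior quantities (illustrative values only). In the paper these
# are assembled from 11 prior moments of the collective.
E_S    = np.array([1.0, 2.5, 1.8])        # prior means of the three statistics
Cov_S  = np.array([[0.20, 0.05, 0.03],    # prior covariance of the statistics
                   [0.05, 0.60, 0.10],
                   [0.03, 0.10, 0.40]])
E_Y    = 2.5                              # prior mean of the target moment
Cov_SY = np.array([0.04, 0.50, 0.08])     # prior covariance between S and Y

# Linear least-squares (credibility) forecast  Y_hat = a0 + a @ S
a  = np.linalg.solve(Cov_S, Cov_SY)       # solve the 3 x 3 normal equations
a0 = E_Y - a @ E_S

S_observed = np.array([1.1, 2.9, 2.0])    # sample mean and two second moments
print("forecast:", a0 + a @ S_observed)
```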


2019 ◽  
Vol 629 ◽  
pp. A143 ◽  
Author(s):  
Nicolas Clerc ◽  
Edoardo Cucchetti ◽  
Etienne Pointecouteau ◽  
Philippe Peille

Context. X-ray observations of galaxy clusters provide insights into the nature of gaseous turbulent motions, their physical scales, and the fundamental processes to which they are related. Spatially resolved, high-resolution spectral measurements of X-ray emission lines provide diagnostics on the nature of turbulent motions in emitting atmospheres. Since these motions act on scales comparable to the size of the objects, the uncertainty on the derived physical parameters is limited by the number of observational measurements, through sample variance. Aims. We propose an approach that is different from, and complementary to, repeated numerical simulations for the computation of sample variance (i.e. Monte-Carlo sampling), by introducing new analytical developments for line diagnostics. Methods. We considered the model of a “turbulent gas cloud”, consisting of isotropic and uniform turbulence described by a universal Kolmogorov power spectrum with random amplitudes and phases in an optically thin medium. Following a simple prescription for the four-term correlation of Fourier coefficients, we derived generic expressions for the sample mean and variance of the line centroid shift, line broadening, and projected velocity structure function. We performed a numerical validation based on Monte-Carlo simulations for two popular models of gas emissivity based on the β-model. Results. The generic expressions for the sample variance of line centroid shifts and line broadening in arbitrary apertures match the simulations within their range of applicability. Generic expressions for the mean and variance of the structure function are also provided and verified against simulations. An application to the Athena/X-IFU (Advanced Telescope for High-ENergy Astrophysics/X-ray Integral Field Unit) and XRISM/Resolve (X-ray Imaging and Spectroscopy Mission) instruments forecasts the potential of sensitive, spatially resolved spectroscopy to probe the inertial range of turbulent velocity cascades in a Coma-like galaxy cluster. Conclusions. The formulas provided are of generic relevance and can be implemented in forecasts for upcoming or current X-ray instrumentation and observing programmes.
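A compact sketch of the Monte-Carlo sampling that the analytical formulas are designed to replace: a one-dimensional line-of-sight velocity field with a Kolmogorov-like spectrum and uniform emissivity (both simplifications assumed here, not the β-model emissivity used in the paper), from which the sample variance of the line centroid shift is estimated over many realisations.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_real = 256, 2000          # cells along the sight line, realisations
k = np.fft.rfftfreq(n_cells)[1:]     # positive wavenumbers (arbitrary units)
power = k ** (-11.0 / 3.0)           # assumed Kolmogorov-like power spectrum

centroids = np.empty(n_real)
for i in range(n_real):
    # Random complex amplitudes (random phases) drawn for each Fourier mode
    amp = rng.normal(size=k.size) + 1j * rng.normal(size=k.size)
    modes = np.concatenate(([0.0], amp * np.sqrt(power / 2.0)))
    v = np.fft.irfft(modes, n=n_cells)       # velocity field along the sight line
    w = np.ones(n_cells) / n_cells           # uniform emissivity weights (assumption)
    centroids[i] = np.sum(w * v)             # emission-weighted line centroid shift

print("sample variance of the centroid shift:", centroids.var(ddof=1))
```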


1982 ◽  
Vol 36 (3) ◽  
pp. 176 ◽  
Author(s):  
Jonathan J. Shuster

2020 ◽  
Author(s):  
Camilla Marie Jensen ◽  
Tobias Erhardt ◽  
Giulia Sinnl ◽  
Hubertus Fischer

Ice sheets are reliable archives of atmospheric impurities such as aerosols and gases of both natural and anthropogenic origin. Impurity records from Greenland ice cores reveal much about past atmospheric conditions and long-range transport in the Northern Hemisphere going back more than a hundred thousand years.

Here we present data from the upper 1,411 m of the EGRIP ice core, measuring conductivity, dust, sodium, calcium, ammonium, and nitrate. These records contain information about ocean sources, transport of terrestrial dust, soil and vegetation emissions, biomass burning, volcanic eruptions, etc., covering approximately the past 15,000 years. This newly obtained data set is unique in that it provides the first high-resolution record of several thousand years of the mid-Holocene in Greenland, a period that none of the previous impurity records from the other deep Greenland ice cores had managed to cover because of brittle ice. It will contribute to a better understanding of atmospheric conditions during the pre-industrial period.

The ammonium record contains peaks significantly higher than the background level. These peaks are caused by biomass burning or forest fires emitting plumes of ammonia large enough to reach the free troposphere and be efficiently transported all the way to the Greenland ice sheet. Here we present preliminary results on the wildfire frequency covering the entire Holocene, where wildfires are defined as outliers in the annual-mean ammonium record.
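A possible outlier criterion is sketched below; the window length and MAD-based threshold are illustrative choices, not the definition used for the EGRIP record. It flags annual means that stand out above a running robust background.

```python
import numpy as np

def fire_years(years, nh4_annual, window=51, k=3.0):
    """Flag annual-mean ammonium values that exceed a running robust
    background by more than k robust standard deviations (MAD-based)."""
    years = np.asarray(years)
    x = np.asarray(nh4_annual, dtype=float)
    half = window // 2
    flags = np.zeros(x.size, dtype=bool)
    for i in range(x.size):
        lo, hi = max(0, i - half), min(x.size, i + half + 1)
        local = x[lo:hi]
        med = np.median(local)                       # local background level
        mad = np.median(np.abs(local - med))         # local scatter
        flags[i] = x[i] > med + k * 1.4826 * mad     # robust sigma from the MAD
    return years[flags]
```

Counting the flagged years in, for example, 500-year bins would then give a first-order fire-frequency curve over the Holocene.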

