Objective algorithm to separate signal from noise in a Poisson-distributed pixel data set

2013 ◽  
Vol 46 (3) ◽  
pp. 663-671 ◽  
Author(s):  
Tine Straasø ◽  
Dirk Müter ◽  
Henning Osholm Sørensen ◽  
Jens Als-Nielsen

A statistical method to determine the background level and separate signal from background in a data set with a Poisson-distributed background is described. The algorithm eliminates the pixel with the highest intensity value in an iterative manner until the sample variance equals the sample mean within the estimated uncertainties. The eliminated pixels then contain signal superimposed on the background, so the integrated signal can be obtained by summation or, with a simple extension, by profile fitting, depending on the user's preferences. Two additional steps remove 'outliers' and correct for the underestimated extension of the peak area, respectively. The algorithm can easily be modified to specific needs, and an application to crystal truncation rods, which involves a sloping background, is presented.
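The core of the iteration described above can be sketched in a few lines of Python; the stopping tolerance below (an assumed estimate of the uncertainty of the sample variance) is illustrative and not the authors' exact criterion.

```python
import numpy as np

def separate_signal(pixels, k=1.0):
    """Iteratively remove the brightest pixel until the remaining values are
    consistent with a Poisson background (sample variance ~ sample mean within
    k times a rough uncertainty). Returns the background pixels, the removed
    (signal) pixels and the estimated background level."""
    data = np.sort(np.asarray(pixels, dtype=float))   # ascending order
    removed = []
    while data.size > 2:
        mean = data.mean()
        var = data.var(ddof=1)
        # Assumed form for the uncertainty of the variance estimate,
        # used here only as a stopping tolerance.
        sigma_var = mean * np.sqrt(2.0 / (data.size - 1))
        if abs(var - mean) <= k * sigma_var:
            break                     # remaining pixels look like pure background
        removed.append(data[-1])      # eliminate the highest-intensity pixel
        data = data[:-1]
    return data, np.array(removed), data.mean()
```

Following the abstract, the integrated signal would then be the sum of the removed pixels minus the background level times their number; the outlier-removal and peak-area-correction steps are not reproduced in this sketch.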

2021 ◽  
Vol 28 ◽  
pp. 146-150
Author(s):  
L. A. Atramentova

Using data obtained in a cytogenetic study as an example, we consider typical errors made when performing statistical analysis. Widespread but flawed statistical practice inevitably produces biased results and increases the likelihood of incorrect scientific conclusions. Errors occur when the study design and the structure of the analyzed data are not taken into account. The article shows how a numerical imbalance in the data set biases the result and, using the same data set, explains how to balance the complex. It demonstrates the advantage of presenting sample estimates with confidence intervals instead of standard errors, and draws attention to the need to consider the size of the analyzed proportions when choosing a statistical method. It also shows how the same data set can be analyzed in different ways depending on the purpose of the study. An algorithm for correct statistical analysis and a tabular format for presenting the results are described.
Keywords: data structure, numerically unbalanced complex, confidence interval.
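As a minimal illustration of the point about confidence intervals, the Python sketch below (with invented counts, not data from the study) contrasts the conventional standard error of a proportion with a Wilson confidence interval:

```python
import math

def wilson_ci(k, n, z=1.96):
    """Approximate 95% Wilson confidence interval for a proportion k/n."""
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Invented example counts: 12 aberrant cells out of 200 scored.
k, n = 12, 200
p = k / n
se = math.sqrt(p * (1 - p) / n)          # standard error of the proportion
lo, hi = wilson_ci(k, n)
print(f"p = {p:.3f} +/- {se:.3f} (SE);  95% CI = [{lo:.3f}, {hi:.3f}]")
```

Unlike p ± SE, the Wilson interval stays within [0, 1] and remains informative when the proportion is small, which is the situation the article warns about.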


Author(s):  
Wei Shen ◽  
Jianyong Wang ◽  
Ping Luo ◽  
Min Wang

Relation extraction from Web data has attracted a lot of attention recently. However, little work has been done on enterprise data, despite the urgent need for such work in real applications (e.g., E-discovery). One distinct characteristic of enterprise data (in comparison with Web data) is its low redundancy. Previous work on relation extraction from Web data largely relies on the data's high level of redundancy and thus cannot be applied effectively to enterprise data. This chapter reviews related work on relation extraction and introduces REACTOR, an unsupervised hybrid framework for semantic relation extraction over enterprise data. REACTOR combines a statistical method, classification, and clustering to automatically identify various types of relations among entities appearing in the enterprise data. REACTOR was evaluated on a real-world enterprise data set from HP that contains over three million pages, and the experimental results show its effectiveness.
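REACTOR's pipeline itself is not reproduced here, but the kind of clustering step it relies on can be illustrated generically: group the textual contexts shared by entity pairs and treat each cluster as a candidate relation type. The toy contexts and scikit-learn components below are assumptions of this illustration, not part of REACTOR.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Toy contexts between entity pairs (invented examples); in an enterprise
# setting these would come from sentences mentioning both entities.
contexts = [
    "X was appointed chief executive of Y",
    "X joined Y as senior engineer",
    "X acquired Y for an undisclosed sum",
    "X completed the takeover of Y",
]

vectors = TfidfVectorizer().fit_transform(contexts)          # bag-of-words features
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
for text, label in zip(contexts, labels):
    print(label, text)   # each cluster id stands for a candidate relation type
```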


1985 ◽  
Vol 15 (2) ◽  
pp. 103-121 ◽  
Author(s):  
William S. Jewell ◽  
Rene Schnieper

Credibility theory refers to the use of linear least-squares theory to approximate the Bayesian forecast of the mean of a future observation; families are known where the credibility formula is exact Bayesian. Second-moment forecasts are also of interest, for example, in assessing the precision of the mean estimate. For some of these same families, the second-moment forecast is exact in linear and quadratic functions of the sample mean. On the other hand, for the normal distribution with a normal-gamma prior on the mean and variance, the exact forecast of the variance is a linear function of the sample variance and the squared deviation of the sample mean from the prior mean. Bühlmann has given a credibility approximation to the variance in terms of the sample mean and sample variance.

In this paper, we present a unified approach to estimating both first and second moments of future observations using linear functions of the sample mean and two sample second moments; the resulting least-squares analysis requires the solution of a 3 × 3 linear system, using 11 prior moments from the collective, and gives joint predictions of all moments of interest. Previously developed special cases follow immediately. For many analytic models of interest, three-dimensional joint prediction is significantly better than independent forecasts using the “natural” statistics for each moment when the number of samples is small. However, the expected squared errors of the forecasts become comparable as the sample size increases.
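A minimal numerical sketch of the linear least-squares step described above: the forecast is a linear function of the three sample statistics, and its coefficients solve a 3 × 3 system built from prior (collective) moments. All numbers below are placeholders, not values from the paper.

```python
import numpy as np

# Placeholder prior quantities (illustrative values only). In the paper these
# are assembled from 11 prior moments of the collective.
E_S    = np.array([1.0, 2.5, 1.8])        # prior means of the three statistics
Cov_S  = np.array([[0.20, 0.05, 0.03],    # prior covariance of the statistics
                   [0.05, 0.60, 0.10],
                   [0.03, 0.10, 0.40]])
E_Y    = 2.5                              # prior mean of the target moment
Cov_SY = np.array([0.04, 0.50, 0.08])     # prior covariance between S and Y

# Linear least-squares (credibility) forecast  Y_hat = a0 + a @ S
a  = np.linalg.solve(Cov_S, Cov_SY)       # solve the 3 x 3 normal equations
a0 = E_Y - a @ E_S

S_observed = np.array([1.1, 2.9, 2.0])    # sample mean and two second moments
print("forecast:", a0 + a @ S_observed)
```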


2019 ◽  
Vol 629 ◽  
pp. A143 ◽  
Author(s):  
Nicolas Clerc ◽  
Edoardo Cucchetti ◽  
Etienne Pointecouteau ◽  
Philippe Peille

Context. X-ray observations of galaxy clusters provide insights into the nature of gaseous turbulent motions, their physical scales, and the fundamental processes to which they are related. Spatially resolved, high-resolution spectral measurements of X-ray emission lines provide diagnostics on the nature of turbulent motions in emitting atmospheres. Since these motions act on scales comparable to the size of the objects, the uncertainty on the derived physical parameters is limited by the number of observational measurements, through sample variance. Aims. We propose an approach that is different from, and complementary to, repeated numerical simulations for the computation of sample variance (i.e. Monte-Carlo sampling), by introducing new analytical developments for line diagnostics. Methods. We considered the model of a “turbulent gas cloud”, consisting of isotropic and uniform turbulence described by a universal Kolmogorov power spectrum with random amplitudes and phases in an optically thin medium. Following a simple prescription for the four-term correlation of Fourier coefficients, we derived generic expressions for the sample mean and variance of the line centroid shift, line broadening, and projected velocity structure function. We performed a numerical validation based on Monte-Carlo simulations for two popular models of gas emissivity based on the β-model. Results. The generic expressions for the sample variance of line centroid shifts and line broadening in arbitrary apertures match the simulations within their range of applicability. Generic expressions for the mean and variance of the structure function are also provided and verified against simulations. An application to the Athena/X-IFU (Advanced Telescope for High-ENergy Astrophysics/X-ray Integral Field Unit) and XRISM/Resolve (X-ray Imaging and Spectroscopy Mission) instruments forecasts the potential of sensitive, spatially resolved spectroscopy to probe the inertial range of turbulent velocity cascades in a Coma-like galaxy cluster. Conclusions. The formulas provided are of generic relevance and can be implemented in forecasts for upcoming or current X-ray instrumentation and observing programmes.
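A compact sketch of the Monte-Carlo sampling that the analytical formulas are designed to replace: a one-dimensional line-of-sight velocity field with a Kolmogorov-like spectrum and uniform emissivity (both simplifications assumed here, not the β-model emissivity used in the paper), from which the sample variance of the line centroid shift is estimated over many realisations.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_real = 256, 2000          # cells along the sight line, realisations
k = np.fft.rfftfreq(n_cells)[1:]     # positive wavenumbers (arbitrary units)
power = k ** (-11.0 / 3.0)           # assumed Kolmogorov-like power spectrum

centroids = np.empty(n_real)
for i in range(n_real):
    # Random complex amplitudes (random phases) drawn for each Fourier mode
    amp = rng.normal(size=k.size) + 1j * rng.normal(size=k.size)
    modes = np.concatenate(([0.0], amp * np.sqrt(power / 2.0)))
    v = np.fft.irfft(modes, n=n_cells)       # velocity field along the sight line
    w = np.ones(n_cells) / n_cells           # uniform emissivity weights (assumption)
    centroids[i] = np.sum(w * v)             # emission-weighted line centroid shift

print("sample variance of the centroid shift:", centroids.var(ddof=1))
```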


1982 ◽  
Vol 36 (3) ◽  
pp. 176 ◽  
Author(s):  
Jonathan J. Shuster

2020 ◽  
Author(s):  
Camilla Marie Jensen ◽  
Tobias Erhardt ◽  
Giulia Sinnl ◽  
Hubertus Fischer

Ice sheets are reliable archives of atmospheric impurities such as aerosols and gases of both natural and anthropogenic origin. Impurity records from Greenland ice cores reveal much about past atmospheric conditions and long-range transport in the Northern Hemisphere going back more than a hundred thousand years.

Here we present data from the upper 1,411 m of the EGRIP ice core, measuring conductivity, dust, sodium, calcium, ammonium, and nitrate. These records contain information about ocean sources, transport of terrestrial dust, soil and vegetation emissions, biomass burning, volcanic eruptions, etc., covering approximately the past 15,000 years. This newly obtained data set is unique in that it provides the first high-resolution record of several thousand years of the mid-Holocene in Greenland, a period that none of the previous impurity records from the other deep Greenland ice cores had managed to cover because of brittle ice. It will contribute to a better understanding of atmospheric conditions during the pre-industrial period.

The ammonium record contains peaks significantly higher than the background level. These peaks are caused by biomass burning or forest fires emitting plumes of ammonia large enough to reach the free troposphere and be efficiently transported all the way to the Greenland ice sheet. Here we present preliminary results on the wildfire frequency covering the entire Holocene, where wildfires are defined as outliers in the annual-mean ammonium record.
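A possible outlier criterion is sketched below; the window length and MAD-based threshold are illustrative choices, not the definition used for the EGRIP record. It flags annual means that stand out above a running robust background.

```python
import numpy as np

def fire_years(years, nh4_annual, window=51, k=3.0):
    """Flag annual-mean ammonium values that exceed a running robust
    background by more than k robust standard deviations (MAD-based)."""
    years = np.asarray(years)
    x = np.asarray(nh4_annual, dtype=float)
    half = window // 2
    flags = np.zeros(x.size, dtype=bool)
    for i in range(x.size):
        lo, hi = max(0, i - half), min(x.size, i + half + 1)
        local = x[lo:hi]
        med = np.median(local)                       # local background level
        mad = np.median(np.abs(local - med))         # local scatter
        flags[i] = x[i] > med + k * 1.4826 * mad     # robust sigma from the MAD
    return years[flags]
```

Counting the flagged years in, for example, 500-year bins would then give a first-order fire-frequency curve over the Holocene.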

