Properties of Normalization Methods Used in the Construction of Aggregate Measures

Abstract: This article studies how missing objects, atypical objects, and displaced object coordinates affect the results of normalization in the construction of aggregate measures for ranking socio-economic objects. The study was conducted on simulated data sets generated in order to investigate the properties of normalization methods. This article focuses on the standardization formula responsible for moving the set of objects


The Impact of Site Sample Size on the Reconstruction of Culture Histories

American Antiquity, 2011, Vol 76 (3), pp. 547-572

DOI: 10.7183/0002-7316.76.3.547

Cited By ~ 3

Author(s): Charles Perreault

Keyword(s): Simulated Data, Archaeological Sites, Cultural Tradition, Data Sets, Data Set, Rate Of Spread, Accuracy And Precision, Historical Processes, The Impact, Simulated Data Sets

I examine how our capacity to produce accurate culture-historical reconstructions changes as more archaeological sites are discovered, dated, and added to a data set. More precisely, I describe, using simulated data sets, how increases in the number of known sites impact the accuracy and precision of our estimations of (1) the earliest and (2) latest date of a cultural tradition, (3) the date and (4) magnitude of its peak popularity, as well as (5) its rate of spread and (6) disappearance in a population. I show that the accuracy and precision of inferences about these six historical processes are not affected in the same fashion by changes in the number of known sites. I also consider the impact of two simple taphonomic site destruction scenarios on the results. Overall, the results presented in this paper indicate that unless we are in possession of near-total samples of sites, and can be certain that there are no taphonomic biases in the universe of sites to be sampled, we will make inferences of varying precision and accuracy depending on the aspect of a cultural trait’s history in question.
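
The sample-size effect described above can be illustrated with a toy simulation. This is only a minimal sketch, not the author's model: it assumes site dates are spread uniformly over a tradition's lifespan (the function name, the 1000-year span, and the uniform distribution are all hypothetical choices), and it measures how far the oldest sampled site falls from the true start date.

```python
import random
import statistics

def earliest_date_error(n_sites, true_start=0.0, true_end=1000.0,
                        trials=2000, seed=0):
    """Mean gap between the oldest sampled site date and the true start.

    Assumption (not from the paper): site dates are uniform over the
    interval [true_start, true_end].
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        dates = [rng.uniform(true_start, true_end) for _ in range(n_sites)]
        # The oldest known site is the estimate of the tradition's start.
        gaps.append(min(dates) - true_start)
    return statistics.mean(gaps)

# Average error of the earliest-date estimate for growing samples of sites.
errors = {n: earliest_date_error(n) for n in (5, 20, 100)}
```

Under the uniform assumption the expected gap is span / (n + 1), so the estimate of the earliest date improves quickly at first and then more slowly as sites are added. Other quantities, such as the rate of spread, need not follow the same curve.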


The Three-Cornered Hat Method for Estimating Error Variances of Three or More Atmospheric Data Sets – Part I: Overview and Evaluation

Journal of Atmospheric and Oceanic Technology, 2020

DOI: 10.1175/jtech-d-19-0217.1

Author(s): Jeremiah P. Sjoberg, Richard A. Anthes, Therese Rieckh

Keyword(s): Sample Size, Historical Development, Simulated Data, Data Sets, Vertical Resolution, Random Errors, Atmospheric Data, Similarities And Differences, The Impact, Simulated Data Sets

Abstract: The three-cornered hat (3CH) method, which was originally developed to assess the random errors of atomic clocks, is a means for estimating the error variances of three different data sets. Here we give an overview of the historical development of the 3CH and select other methods for estimating error variances that use either two or three data sets. We discuss similarities and differences between these methods and the 3CH method. This study assesses the sensitivity of the 3CH method to the factors that limit its accuracy, including sample size, outliers, different magnitudes of errors between the data sets, biases, and unknown error correlations. Using simulated data sets for which the errors and their correlations among the data sets are known, this analysis shows the conditions under which the 3CH method provides the most and least accurate estimates. The effect of representativeness errors caused by differences in vertical resolution of data sets is investigated. These representativeness errors are generally small relative to the magnitude of the random errors in the data sets, and the impact of this source of errors can be reduced by appropriate filtering.
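
As a rough illustration of the idea, the textbook form of the 3CH estimate can be sketched in a few lines. The sketch assumes the errors of the three data sets are mutually uncorrelated, which is exactly the condition the abstract lists as a limiting factor; the function name and the simulated noise levels are hypothetical and are not taken from the paper.

```python
import random
import statistics

def three_cornered_hat(x, y, z):
    """Estimate the error variance of each of three collocated data sets.

    Assumes the three error series are mutually uncorrelated, so that
    Var(x - y) = err_x + err_y, and likewise for the other two pairs.
    """
    var_xy = statistics.variance([a - b for a, b in zip(x, y)])
    var_xz = statistics.variance([a - c for a, c in zip(x, z)])
    var_yz = statistics.variance([b - c for b, c in zip(y, z)])
    # Solve the three pairwise equations for the individual variances.
    err_x = 0.5 * (var_xy + var_xz - var_yz)
    err_y = 0.5 * (var_xy + var_yz - var_xz)
    err_z = 0.5 * (var_xz + var_yz - var_xy)
    return err_x, err_y, err_z

# Simulated check: one shared "truth" plus independent noise per data set.
rng = random.Random(0)
truth = [rng.gauss(0.0, 1.0) for _ in range(50_000)]
sigmas = (0.5, 1.0, 1.5)  # hypothetical error standard deviations
x, y, z = ([t + rng.gauss(0.0, s) for t in truth] for s in sigmas)
estimates = three_cornered_hat(x, y, z)
```

With a large sample and independent errors, the three estimates should land near the squared values in `sigmas`. Because the variances of the differences are used rather than their mean squares, constant biases between the data sets drop out; correlated errors or small samples would degrade the estimates, and can even produce negative values.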


Spectral Convolution Feature-Based SPD Matrix Representation for Signal Detection Using a Deep Neural Network

Entropy, 2020, Vol 22 (9), pp. 949

DOI: 10.3390/e22090949

Author(s): Jiangyi Wang, Min Liu, Xinwu Zeng, Xiaoqiang Hua

Keyword(s): Neural Network, Signal Detection, Convolutional Neural Network, Deep Neural Network, Detection Method, Learning Algorithm, Simulated Data, Data Sets, Feature Maps, Simulated Data Sets

Convolutional neural networks perform strongly in many visual tasks because of their hierarchical structures and powerful feature extraction capabilities. The SPD (symmetric positive definite) matrix has attracted attention in visual classification because of its ability to learn a suitable statistical representation and to distinguish samples carrying different information. In this paper, a deep neural network signal detection method based on spectral convolution features is proposed. In this method, local features extracted from a convolutional neural network are used to construct the SPD matrix, and a deep learning algorithm for the SPD matrix is used to detect target signals. Feature maps extracted by two kinds of convolutional neural network models are applied in this study. With this method, signal detection becomes a binary classification problem over the signals in the samples. To demonstrate the validity and superiority of this method, simulated and semi-physical simulated data sets are used. The results show that, under low SCR (signal-to-clutter ratio), compared with the spectral signal detection method based on the deep neural network, this method can obtain a gain of 0.5–2 dB on simulated data sets and semi-physical simulated data sets.
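
The abstract does not say how the SPD matrix is built from the local features. One common choice, shown here purely as an assumption, is a regularized covariance matrix of the feature vectors. The sketch below uses random vectors as a stand-in for CNN features, and the names `covariance_spd` and `cholesky` are hypothetical helpers; a successful Cholesky factorization confirms positive definiteness.

```python
import math
import random

def covariance_spd(features, eps=1e-6):
    """Build a d x d SPD matrix from n feature vectors of dimension d.

    Assumption: a sample covariance plus a small diagonal term `eps`,
    which keeps the matrix strictly positive definite.
    """
    n, d = len(features), len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in features:
        centered = [f[j] - mean[j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += centered[i] * centered[j] / (n - 1)
    for i in range(d):
        cov[i][i] += eps
    return cov

def cholesky(a):
    """Return lower-triangular L with L L^T = a; raise if a is not SPD."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low

# Stand-in for local CNN features: 50 random vectors of dimension 4.
rng = random.Random(0)
features = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(50)]
spd = covariance_spd(features)
factor = cholesky(spd)  # succeeds only for a positive definite matrix
```

A classifier that operates on SPD matrices would then take `spd` as its input; that part is outside the scope of this sketch.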


Benchmarking Statistical Multiple Sequence Alignment

2018

DOI: 10.1101/304659

Cited By ~ 1

Author(s): Michael Nute, Ehsan Saleh, Tandy Warnow

Keyword(s): Sequence Alignment, Multiple Sequence Alignment, Structural Alignment, Estimation Method, Simulated Data, Protein Sequences, Data Sets, Sequence Alignments, Multiple Sequence, Simulated Data Sets

Abstract: The estimation of multiple sequence alignments of protein sequences is a basic step in many bioinformatics pipelines, including protein structure prediction, protein family identification, and phylogeny estimation. Statistical co-estimation of alignments and trees under stochastic models of sequence evolution has long been considered the most rigorous technique for estimating alignments and trees, but little is known about the accuracy of such methods on biological benchmarks. We report the results of an extensive study evaluating the most popular protein alignment methods as well as the statistical co-estimation method BAli-Phy on 1192 protein data sets from established benchmarks as well as on 120 simulated data sets. Our study (which used more than 230 CPU years for the BAli-Phy analyses alone) shows that BAli-Phy is dramatically more accurate than the other alignment methods on the simulated data sets, but is among the least accurate on the biological benchmarks. There are several potential causes for this discordance, including model misspecification, errors in the reference alignments, and conflicts between structural alignment and evolutionary alignments; future research is needed to understand the most likely explanation for our observations.

Keywords: multiple sequence alignment, BAli-Phy, protein sequences, structural alignment, homology


Bayesian Planet Searches for the 10 cm/s Radial Velocity Era

Proceedings of the International Astronomical Union, 2015, Vol 11 (A29A), pp. 205-207

DOI: 10.1017/s1743921316002817

Author(s): Philip C. Gregory

Keyword(s): Radial Velocity, State Of The Art, Simulated Data, Model Parameters, Data Sets, Stellar Activity, Bayesian Fusion, Multiple State, Simulated Data Sets, Apodization Function

Abstract: A new apodized Keplerian model is proposed for the analysis of precision radial velocity (RV) data to model both planetary and stellar activity (SA) induced RV signals. A symmetrical Gaussian apodization function with unknown width and center can distinguish planetary signals from SA signals on the basis of the width of the apodization function. The general model for m apodized Keplerian signals also includes a linear regression term between RV and the stellar activity diagnostic ln(R'hk), as well as an extra Gaussian noise term with unknown standard deviation. The model parameters are explored using a Bayesian fusion MCMC code. A differential version of the Generalized Lomb-Scargle periodogram provides an additional way of distinguishing SA signals and helps guide the choice of new periods. Sample results are reported for a recent international RV blind challenge which included multiple state-of-the-art simulated data sets supported by a variety of stellar activity diagnostics.
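
The role of the apodization width can be illustrated with a simplified stand-in. The sketch below multiplies a plain sinusoid (a circular orbit, not a full Keplerian) by a Gaussian window; the exact parametrization in the paper may differ, and the function name and all numerical values are hypothetical.

```python
import math

def apodized_sinusoid(t, amplitude, period, phase, center, width):
    """Sinusoidal RV signal multiplied by a Gaussian apodization window.

    Assumption: a circular-orbit sinusoid replaces the Keplerian term, and
    the window is exp(-(t - center)^2 / (2 * width^2)).
    """
    envelope = math.exp(-((t - center) ** 2) / (2.0 * width ** 2))
    return envelope * amplitude * math.cos(2.0 * math.pi * t / period + phase)

times = [float(day) for day in range(0, 400, 10)]

# A window much wider than the data span leaves the signal nearly
# unattenuated, as expected for a persistent planetary signal.
planet_like = [apodized_sinusoid(t, 3.0, 50.0, 0.0, center=200.0, width=5000.0)
               for t in times]

# A narrow window confines the signal to a short interval, as might be
# expected for a transient stellar activity signal.
activity_like = [apodized_sinusoid(t, 3.0, 25.0, 0.0, center=200.0, width=40.0)
                 for t in times]
```

In a fit, a width that comes out large relative to the span of the observations would point toward a planetary signal, and a small width toward stellar activity, which is the distinction the abstract describes.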
