The simulation extrapolation technique meets ecology and evolution: A general and intuitive method to account for measurement error

Mapping Intimacies ◽

10.1101/535054 ◽

2019 ◽

Author(s):

Erica Ponzi ◽

Lukas F. Keller ◽

Stefanie Muff

Keyword(s):

Measurement Error ◽

Inbreeding Depression ◽

R Package ◽

Free Variable ◽

Heuristic Approach ◽

Error Modeling ◽

Real Field ◽

List Type ◽

Simulation Extrapolation ◽

Pedigree Errors

Abstract

Measurement error and other forms of uncertainty are commonplace in ecology and evolution and may bias estimates of parameters of interest. Although a variety of approaches to obtain unbiased estimators are available, these often require that errors are explicitly modeled and that a latent model for the unobserved error-free variable can be specified, which in practice is often difficult.

Here we propose to generalize a heuristic approach to correct for measurement error, denoted as simulation extrapolation (SIMEX), to situations where explicit error modeling fails. We illustrate the application of SIMEX using the example of estimates of quantitative genetic parameters, e.g. inbreeding depression and heritability, in the presence of pedigree errors. Following the original SIMEX idea, the error in the pedigree is progressively increased to determine how the estimated quantities are affected by error. The observed trend is then extrapolated back to a hypothetical error-free pedigree, yielding unbiased estimates of inbreeding depression and heritability. We term this application of the SIMEX idea to pedigrees “PSIMEX”. We tested the method with simulated pedigrees with different pedigree structures and initial error proportions, and with real field data from a free-living population of song sparrows.

The simulation study indicates that the accuracy and precision of the extrapolated error-free estimate for inbreeding depression and heritability are good. In the application to the song sparrow data, the error-corrected results could be validated against the actual values thanks to the availability of both an error-prone and an error-free pedigree, and the results indicate that the PSIMEX estimator is close to the actual value. For easy accessibility of the method, we provide the novel R-package PSIMEX.

By transferring the SIMEX philosophy to error in pedigrees, we have illustrated how this heuristic approach can be generalized to situations where explicit latent models for the unobserved variables or for the error of the variables of interest are difficult to formulate. Thanks to the simplicity of the idea, many other error problems in ecology and evolution might be amenable to SIMEX-like error correction methods.
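The simulate-then-extrapolate idea described in this abstract can be sketched for the simplest textbook case: a linear regression whose covariate carries additive error of known standard deviation. This is not the pedigree variant and not the API of the PSIMEX package. The function name `simex_slope`, its parameters, the lambda grid, and the quadratic extrapolant are all assumptions chosen for illustration.

```python
import numpy as np

def simex_slope(w, y, sigma_u, lambdas=(0.0, 0.5, 1.0, 1.5, 2.0), n_sim=200, seed=0):
    """Classic SIMEX sketch for a regression slope (hypothetical helper).

    w       : error-prone covariate, w = x + u with u ~ N(0, sigma_u^2)
    y       : response
    sigma_u : assumed-known standard deviation of the measurement error
    Returns (naive_slope, simex_slope).
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_slopes = []
    for lam in lambdas:
        # Simulation step: add extra error so the total error variance
        # becomes (1 + lam) * sigma_u^2, then refit the naive model.
        reps = 1 if lam == 0 else n_sim
        slopes = []
        for _ in range(reps):
            w_lam = w + np.sqrt(lam) * sigma_u * rng.standard_normal(w.shape)
            slopes.append(np.polyfit(w_lam, y, 1)[0])
        mean_slopes.append(np.mean(slopes))
    # Extrapolation step: fit a quadratic trend in lambda and evaluate it
    # at lambda = -1, the hypothetical error-free case.
    coeffs = np.polyfit(lambdas, mean_slopes, 2)
    return mean_slopes[0], float(np.polyval(coeffs, -1.0))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal(5000)                 # true covariate (unobserved in practice)
    y = 2.0 * x + 0.5 * rng.standard_normal(5000)
    w = x + 0.5 * rng.standard_normal(5000)       # observed, error-prone covariate
    naive, corrected = simex_slope(w, y, sigma_u=0.5)
    print("naive slope:", naive, "SIMEX slope:", corrected)
```

In this toy setting the naive slope is attenuated toward zero and the extrapolated value moves back toward the true slope of 2. A quadratic extrapolant is only an approximation of the true trend, so some residual bias is expected.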

ON THE UNIFORM CONVERGENCE OF DECONVOLUTION ESTIMATORS FROM REPEATED MEASUREMENTS

Econometric Theory ◽

10.1017/s0266466620000572 ◽

2021 ◽

pp. 1-22

Author(s):

Daisuke Kurisu ◽

Taisuke Otsu

Keyword(s):

Multivariate Analysis ◽

Measurement Error ◽

Uniform Convergence ◽

Measurement Errors ◽

Convergence Rates ◽

Free Variable ◽

Error Model ◽

Repeated Measurements ◽

Empirical Characteristic Function ◽

Measurement Error Model

This paper studies the uniform convergence rates of Li and Vuong’s (1998, Journal of Multivariate Analysis 65, 139–165; hereafter LV) nonparametric deconvolution estimator and its regularized version by Comte and Kappus (2015, Journal of Multivariate Analysis 140, 31–46) for the classical measurement error model, where repeated noisy measurements on the error-free variable of interest are available. In contrast to LV, our assumptions allow unbounded supports for the error-free variable and measurement errors. Compared to Bonhomme and Robin (2010, Review of Economic Studies 77, 491–533) specialized to the measurement error model, our assumptions do not require existence of the moment generating functions of the square and product of repeated measurements. Furthermore, by utilizing a maximal inequality for the multivariate normalized empirical characteristic function process, we derive uniform convergence rates that are faster than the ones derived in these papers under such weaker conditions.
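For orientation, the repeated-measurements setting in this abstract can be written out as follows. This is a sketch of the standard Kotlarski-type identity that deconvolution estimators of this kind build on, not a restatement of the paper's assumptions or results. The symbols $X^*$, $\varepsilon_j$, $\varphi$, and the truncation level $T_n$ are notation chosen here.

```latex
% Classical measurement error model with two repeated measurements
X_j = X^* + \varepsilon_j, \qquad j = 1, 2,
\qquad X^*, \varepsilon_1, \varepsilon_2 \text{ mutually independent}, \quad
\mathbb{E}[\varepsilon_1] = 0 .

% Kotlarski-type identity (assuming non-vanishing characteristic functions)
\varphi_{X^*}(t)
  = \exp\!\left( \int_0^t
      \frac{\mathbb{E}\!\left[\, i X_1 e^{i s X_2} \right]}
           {\mathbb{E}\!\left[\, e^{i s X_2} \right]} \, ds \right).

% Plug-in density estimator: replace expectations by sample means to get
% \hat{\varphi}_{X^*}, then invert over a truncated frequency range
\hat{f}_{X^*}(x)
  = \frac{1}{2\pi} \int_{-T_n}^{T_n} e^{-i t x}\, \hat{\varphi}_{X^*}(t)\, dt .
```

The identity follows because the joint characteristic function factorizes as $\varphi_{X^*}(t_1 + t_2)\,\varphi_{\varepsilon_1}(t_1)\,\varphi_{\varepsilon_2}(t_2)$, so the ratio inside the integral equals the derivative of $\log \varphi_{X^*}(s)$.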

Characterization of a PMU-based method for transmission line parameters estimation with systematic measurement error modeling

10.23919/aeit53387.2021.9626908 ◽

2021 ◽

Author(s):

Carlo Muscas ◽

Paolo Attilio Pegoraro ◽

Carlo Sitzia ◽

Antonio Vincenzo Solinas ◽

Sara Sulis ◽

...

Keyword(s):

Measurement Error ◽

Transmission Line ◽

Error Modeling ◽

Parameters Estimation ◽

Systematic Measurement ◽

Systematic Measurement Error

FIREcaller: Detecting Frequently Interacting Regions from Hi-C Data

10.1101/619288 ◽

2019 ◽

Cited By ~ 3

Author(s):

Cheynna Crowley ◽

Yuchen Yang ◽

Yunjiang Qiu ◽

Benxia Hu ◽

Armen Abnousi ◽

...

Keyword(s):

Gene Regulation ◽

Spatial Organization ◽

R Package ◽

Specific Gene ◽

List Type ◽

Cell Type ◽

R Software ◽

Computational Tools ◽

Cell Type Specific ◽

User Friendly

Abstract

Hi-C experiments have been widely adopted to study chromatin spatial organization, which plays an essential role in genome function. We have recently identified frequently interacting regions (FIREs) and found that they are closely associated with cell-type-specific gene regulation. However, computational tools for detecting FIREs from Hi-C data are still lacking. In this work, we present FIREcaller, a stand-alone, user-friendly R package for detecting FIREs from Hi-C data. FIREcaller takes raw Hi-C contact matrices as input, performs within-sample and cross-sample normalization, and outputs continuous FIRE scores, dichotomous FIREs, and super-FIREs. Applying FIREcaller to Hi-C data from various human tissues, we demonstrate that FIREs and super-FIREs identified, in a tissue-specific manner, are closely related to gene regulation, are enriched for enhancer-promoter (E-P) interactions, tend to overlap with regions exhibiting epigenomic signatures of cis-regulatory roles, and aid the interpretation of GWAS variants. The FIREcaller package is implemented in R and freely available at https://yunliweb.its.unc.edu/FIREcaller.

Highlights

– Frequently Interacting Regions (FIREs) can be used to identify tissue and cell-type-specific cis-regulatory regions.
– An R software, FIREcaller, has been developed to identify FIREs and cluster them into super-FIREs.

The Simulation Extrapolation Method for Fitting Generalized Linear Models with Additive Measurement Error

The Stata Journal Promoting communications on statistics and Stata ◽

10.1177/1536867x0400300407 ◽

2003 ◽

Vol 3 (4) ◽

pp. 373-385 ◽

Cited By ~ 16

Author(s):

James W. Hardin ◽

Henrik Schmiediche ◽

Raymond J. Carroll

Keyword(s):

Measurement Error ◽

Generalized Linear Models ◽

Linear Models ◽

Extrapolation Method ◽

Simulation Extrapolation

Measurement error modeling and nutritional epidemiology association analyses

Canadian Journal of Statistics ◽

10.1002/cjs.10116 ◽

2011 ◽

pp. n/a-n/a

Author(s):

Ross L. Prentice ◽

Ying Huang

Keyword(s):

Measurement Error ◽

Error Modeling ◽

Nutritional Epidemiology ◽

Association Analyses

Rarefaction, alpha diversity, and statistics

10.1101/231878 ◽

2017 ◽

Cited By ~ 5

Author(s):

Amy Willis

Keyword(s):

Measurement Error ◽

Microbial Diversity ◽

Microbial Ecology ◽

Ecosystem Function ◽

Unknown Parameter ◽

Alpha Diversity ◽

Fundamental Question ◽

Error Modeling ◽

Extensive Literature ◽

Statistical Methodology

Abstract

Understanding the drivers of microbial diversity is a fundamental question in microbial ecology. Extensive literature discusses different methods for describing microbial diversity and documenting its effects on ecosystem function. However, it is widely believed that diversity depends on the number of reads that are sequenced. I discuss a statistical perspective on diversity, framing the diversity of an environment as an unknown parameter, and discussing the bias and variance of plug-in and rarefied estimates. I argue that by failing to account for both bias and variance, we invalidate analysis of alpha diversity. I describe the state of the statistical literature for addressing these problems, and suggest that measurement error modeling can address issues with variance, but bias corrections need to be utilized as well. I encourage microbial ecologists to avoid motivating their investigations with alpha diversity analyses that do not use valid statistical methodology.
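The dependence of observed diversity on sequencing depth can be illustrated with a small simulation. This is a sketch under stated assumptions, not code from the paper. Reads are modelled as a multinomial draw from hypothetical true relative abundances, and the plug-in richness is the number of taxa seen at least once. The function name `observed_richness` and the 1/rank abundance profile are invented for illustration.

```python
import numpy as np

def observed_richness(abundances, depth, n_rep=100, seed=0):
    """Plug-in richness (taxa observed at least once) over repeated samples.

    abundances : non-negative weights for each taxon (normalised internally)
    depth      : number of reads drawn per simulated sample
    Returns an integer array of length n_rep.
    """
    rng = np.random.default_rng(seed)
    p = np.asarray(abundances, dtype=float)
    p = p / p.sum()
    # Each row is one simulated sample of `depth` reads across all taxa.
    counts = rng.multinomial(depth, p, size=n_rep)
    return (counts > 0).sum(axis=1)

if __name__ == "__main__":
    true_taxa = 200
    abund = 1.0 / np.arange(1, true_taxa + 1)   # assumed 1/rank abundance profile
    for depth in (100, 1000, 10000):
        r = observed_richness(abund, depth)
        # The mean shows the bias relative to the true richness of 200;
        # the standard deviation shows the sampling variance.
        print(depth, r.mean(), r.std())
```

The mean of the returned values falls short of the true number of taxa at low depth (bias), and the spread across replicates reflects the variance that the abstract argues must also be accounted for.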

blockCV: an R package for generating spatially or environmentally separated folds for k-fold cross-validation of species distribution models

10.1101/357798 ◽

2018 ◽

Cited By ~ 3

Author(s):

Roozbeh Valavi ◽

Jane Elith ◽

José J. Lahoz-Monfort ◽

Gurutzeta Guillera-Arroita

Keyword(s):

Species Distribution ◽

Cross Validation ◽

Species Distribution Models ◽

Predictive Performance ◽

R Package ◽

Species Distribution Modelling ◽

List Type ◽

Distribution Models ◽

Distribution Modelling ◽

Evaluation Approaches

Summary

When applied to structured data, conventional random cross-validation techniques can lead to underestimation of prediction error, and may result in inappropriate model selection.

We present the R package blockCV, a new toolbox for cross-validation of species distribution modelling.

The package can generate spatially or environmentally separated folds. It includes tools to measure spatial autocorrelation ranges in candidate covariates, providing the user with insights into the spatial structure in these data. It also offers interactive graphical capabilities for creating spatial blocks and exploring data folds.

Package blockCV enables modellers to more easily implement a range of evaluation approaches. It will help the modelling community learn more about the impacts of evaluation approaches on our understanding of predictive performance of species distribution models.
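The idea of spatially separated folds can be sketched as follows: points are binned into square blocks, and whole blocks (rather than individual points) are assigned to folds. This is a generic illustration in Python, not the blockCV API. The function `spatial_block_folds`, its parameters, and the random block-to-fold assignment are assumptions.

```python
import numpy as np

def spatial_block_folds(x, y, block_size, k=5, seed=0):
    """Assign each point to a CV fold via square spatial blocks (hypothetical helper).

    x, y       : point coordinates in the same units as block_size
    block_size : side length of each square block
    k          : number of folds
    Returns (fold_index_per_point, block_id_per_point).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Grid cell of each point, counted from the lower-left corner of the data.
    col = np.floor((x - x.min()) / block_size).astype(int)
    row = np.floor((y - y.min()) / block_size).astype(int)
    block_id = row * (col.max() + 1) + col
    # Shuffle the occupied blocks and deal them out to folds in turn, so
    # every point in a block lands in the same fold.
    shuffled = rng.permutation(np.unique(block_id))
    fold_of_block = {int(b): i % k for i, b in enumerate(shuffled)}
    folds = np.array([fold_of_block[int(b)] for b in block_id])
    return folds, block_id

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0, 100, 500)
    y = rng.uniform(0, 100, 500)
    folds, blocks = spatial_block_folds(x, y, block_size=20.0, k=5)
    print(np.bincount(folds))   # number of points per fold
```

Because neighbouring points share a block, held-out data are spatially separated from training data, which is the property that random point-wise cross-validation lacks. In practice the block size would be chosen relative to the spatial autocorrelation range of the covariates.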

Slow recovery from inbreeding depression generated by the complex genetic architecture of segregating deleterious mutations

10.1101/862631 ◽

2019 ◽

Author(s):

Paula E. Adams ◽

Anna L. Crist ◽

Ellen M. Young ◽

John H. Willis ◽

Patrick C. Phillips ◽

...

Keyword(s):

Population Size ◽

Inbreeding Depression ◽

Evolutionary Biology ◽

Large Population ◽

Genomic Diversity ◽

List Type ◽

Reproductive Systems ◽

Genomic Changes ◽

Deleterious Alleles ◽

Sister Mating

Abstract

The deleterious effects of inbreeding have been of extreme importance to evolutionary biology, but it has been difficult to characterize the complex interactions between genetic constraints and selection that lead to fitness loss and recovery after inbreeding. Viruses, bacteria, and the selfing nematode Caenorhabditis elegans have been shown to be capable of rapid recovery from the fixation of novel deleterious mutations; however, the potential for fitness recovery from fixation of segregating variation under inbreeding in outcrossing organisms is poorly understood. C. remanei is an outcrossing relative of C. elegans with high polymorphic variation and extreme inbreeding depression. Here we sought to characterize changes in patterns of genomic diversity in C. remanei after ∼30 generations of inbreeding via brother-sister mating followed by several hundred generations of recovery at large population size. As expected, inbreeding led to a large decline in reproductive fitness, but unlike results from mutation accumulation experiments, recovery from inbreeding at large population sizes generated only very moderate recovery in fitness after 300 generations. At the genomic level, we found that while 66% of ancestral segregating SNPs were fixed in the inbred population, this was far fewer than expected under neutral processes. Under recovery, 36 SNPs across 30 genes involved in alimentary, muscular, nervous and reproductive systems changed reproducibly across all replicates, indicating that strong selection for fitness recovery does exist but is likely mutationally limited due to the large number of potential targets. Our results indicate that recovery from inbreeding depression via new compensatory mutations is likely to be constrained by the large number of segregating deleterious variants present in natural populations, limiting the capacity for rapid evolutionary rescue of small populations.

Impact Summary

Inbreeding is defined as mating between close relatives and can have a large effect on the genetic diversity and fitness of populations. This has been recognized for over 100 years of study in evolutionary biology, but the specific genomic changes that accompany inbreeding and the loss of fitness are still not known. Evolutionary theory predicts that inbred populations lose fitness through the fixation of many deleterious alleles and it is not known if populations can recover fitness after prolonged periods of inbreeding and deleterious fixations, or how long recovery may take. These questions are particularly important for wild populations experiencing declines. In this study we use laboratory populations of the nematode worm Caenorhabditis remanei to analyze the loss of fitness and genomic changes that accompany inbreeding via brother-sister mating, and to track the populations as they recover from inbreeding at large population size over 300 generations. We find that:

– Total progeny decreased by 65% after inbreeding
– There were many nucleotides in the genome that remained heterozygous after inbreeding
– There was an excess of inbreeding-resistant nucleotides on the X chromosome
– The number of progeny remained low after 300 generations of recovery from inbreeding
– 30 genes changed significantly in allele frequency during recovery, including genes involved in the alimentary, muscular, nervous and reproductive systems

Together, our results demonstrate that recovery from inbreeding is difficult, likely due to the fixation of numerous deleterious alleles throughout the genome.

Genetic Allee effects and their interaction with ecological Allee effects

10.1101/061549 ◽

2016 ◽

Author(s):

Meike J. Wittmann ◽

Hanna Stuis ◽

Dirk Metzler

Keyword(s):

Population Size ◽

Inbreeding Depression ◽

Survival Probability ◽

Allee Effect ◽

Extinction Risk ◽

List Type ◽

Small Populations ◽

Allee Effects ◽

Deleterious Mutations ◽

Lethal Equivalents

Summary

It is now widely accepted that genetic processes such as inbreeding depression and loss of genetic variation can increase the extinction risk of small populations. However, it is generally unclear whether extinction risk from genetic causes gradually increases with decreasing population size or whether there is a sharp transition around a specific threshold population size. In the ecological literature, such threshold phenomena are called “strong Allee effects” and they can arise for example from mate limitation in small populations.

In this study, we aim to a) develop a meaningful notion of a “strong genetic Allee effect”, b) explore whether and under what conditions such an effect can arise from inbreeding depression due to recessive deleterious mutations, and c) quantify the interaction of potential genetic Allee effects with the well-known mate-finding Allee effect.

We define a strong genetic Allee effect as a genetic process that causes a population’s survival probability to be a sigmoid function of its initial size. The inflection point of this function defines the critical population size. To characterize survival-probability curves, we develop and analyze simple stochastic models for the ecology and genetics of small populations.

Our results indicate that inbreeding depression can indeed cause a strong genetic Allee effect, but only if individuals carry sufficiently many deleterious mutations (lethal equivalents) on average and if these mutations are spread across sufficiently many loci. Populations suffering from a genetic Allee effect often first grow, then decline as inbreeding depression sets in, and then potentially recover as deleterious mutations are purged. Critical population sizes of ecological and genetic Allee effects appear to be often additive, but even superadditive interactions are possible.

Many published estimates for the number of lethal equivalents in birds and mammals fall in the parameter range where strong genetic Allee effects are expected. Unfortunately, extinction risk due to genetic Allee effects can easily be underestimated as populations with genetic problems often grow initially, but then crash later. Also interactions between ecological and genetic Allee effects can be strong and should not be neglected when assessing the viability of endangered or introduced populations.

ShapeRotator: an R tool for standardised rigid rotations of articulated Three-Dimensional structures with application for geometric morphometrics

10.1101/159392 ◽

2017 ◽

Cited By ~ 2

Author(s):

Marta Vidal-García ◽

Lashi Bandara ◽

J. Scott Keogh

Keyword(s):

Geometric Morphometrics ◽

Phenotypic Diversity ◽

Three Dimensional ◽

R Package ◽

Morphological Data ◽

List Type ◽

Shape Variation ◽

Data Set ◽

Articulated Structures ◽

Shape And Size

Summary

The quantification of complex morphological patterns typically involves comprehensive shape and size analyses, usually obtained by gathering morphological data from all the structures that capture the phenotypic diversity of an organism or object. Articulated structures are a critical component of overall phenotypic diversity, but data gathered from these structures are difficult to incorporate into modern analyses because of the complexities associated with jointly quantifying 3D shape in multiple structures.

While there are existing methods for analysing shape variation in articulated structures in Two-Dimensional (2D) space, these methods do not work in 3D, a rapidly growing area of capability and research.

Here we describe a simple geometric rigid rotation approach that removes the effect of random translation and rotation, enabling the morphological analysis of 3D articulated structures. Our method is based on Cartesian coordinates in 3D space so it can be applied to any morphometric problem that also uses 3D coordinates (e.g. spherical harmonics). We demonstrate the method by applying it to a landmark-based data set for analysing shape variation using geometric morphometrics.

We have developed an R tool (ShapeRotator) so that the method can be easily implemented in the commonly used R package geomorph and MorphoJ software. This method will be a valuable tool for 3D morphological analyses in articulated structures by allowing an exhaustive examination of shape and size diversity.
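Removing arbitrary translation and rotation from a set of 3D landmarks can be sketched with a translation to a reference landmark followed by a Rodrigues rotation onto a fixed axis. This is a generic geometric illustration, not ShapeRotator's algorithm or interface. The function names `rotation_to_axis` and `standardise`, and the choice of the x axis as target, are assumptions.

```python
import numpy as np

def rotation_to_axis(u, target=(1.0, 0.0, 0.0)):
    """Rotation matrix taking the direction of u onto the direction of target."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(target, dtype=float)
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    axis = np.cross(u, v)
    s = np.linalg.norm(axis)       # sine of the rotation angle
    c = float(np.dot(u, v))        # cosine of the rotation angle
    if s < 1e-12:
        if c > 0:
            return np.eye(3)       # already aligned
        # Antiparallel: rotate by pi about any axis perpendicular to u.
        helper = np.array([1.0, 0.0, 0.0]) if abs(u[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
        perp = np.cross(u, helper)
        perp = perp / np.linalg.norm(perp)
        return 2.0 * np.outer(perp, perp) - np.eye(3)
    k = axis / s
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    # Rodrigues' formula: R = I + sin(theta) K + (1 - cos(theta)) K^2
    return np.eye(3) + s * K + (1.0 - c) * (K @ K)

def standardise(landmarks, origin_idx, axis_idx):
    """Move landmark origin_idx to the origin and landmark axis_idx onto +x."""
    pts = np.asarray(landmarks, dtype=float)
    shifted = pts - pts[origin_idx]            # remove translation
    R = rotation_to_axis(shifted[axis_idx])    # remove rotation about the origin
    return shifted @ R.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(6, 3))              # six hypothetical 3D landmarks
    print(standardise(pts, origin_idx=0, axis_idx=1))
```

Both steps are rigid, so all inter-landmark distances are preserved. Fixing one landmark and one axis still leaves a rotation about that axis free, so a full standardisation would constrain a third landmark as well.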