Crossover interference and sex-specific genetic maps shape identical by descent sharing in close relatives

Mapping Intimacies ◽

10.1101/527655 ◽

2019 ◽

Cited By ~ 2

Author(s):

Madison Caballero ◽

Daniel N. Seidman ◽

Jens Sannerud ◽

Thomas D. Dyer ◽

Donna M. Lehman ◽

...

Keyword(s):

Standard Deviation ◽

Real Data ◽

Genetic Maps ◽

Pedigree Data ◽

Crossover Interference ◽

Interference Modeling ◽

Identical By Descent ◽

Close Relatives ◽

The Impact ◽

Number Of Segments

Abstract Simulations of close relatives and identical by descent (IBD) segments are common in genetic studies, yet most past efforts have utilized sex averaged genetic maps and ignored crossover interference, thus omitting features known to affect the breakpoints of IBD segments. We developed Ped-sim, a method for simulating relatives that can utilize either sex-specific or sex averaged genetic maps and also either a model of crossover interference or the traditional Poisson model for inter-crossover distances. To characterize the impact of previously ignored mechanisms, we simulated data for all four combinations of these factors. We found that modeling crossover interference decreases the standard deviation of the IBD proportion by 10.4% on average in full siblings through second cousins. By contrast, sex-specific maps increase this standard deviation by 4.2% on average, and also impact the number of segments relatives share. Most notably, using sex-specific maps, the number of segments half-siblings share is bimodal; and when combined with interference modeling, the probability that sixth cousins have non-zero IBD ranges from 9.0 to 13.1%, depending on the sexes of the individuals through which they are related. We present new analytical results for the distributions of IBD segments under these models and show they match results from simulations. Finally, we compared IBD sharing rates between simulated and real relatives and found that the combination of sex-specific maps and interference modeling most accurately captures IBD rates in real data. Ped-sim is open source and available from https://github.com/williamslab/ped-sim.

Author summary Simulations are ubiquitous throughout statistical genetics in order to generate data with known properties, enabling tests of inference methods and analyses of real world processes in settings where experimental data are challenging to collect.
Simulating genetic data for relatives in a pedigree requires the synthesis of chromosomes parents transmit to their children. These chromosomes form as a mosaic of a given parent’s two chromosomes, with the locations of switches between the two parental chromosomes known as crossovers. Detailed information about crossover generation based on real data from humans now exists, including the fact that men and women have overall different rates (women produce ~1.6 times more crossovers) and that real crossovers are subject to interference, whereby crossovers are further apart from one another than expected under a model that selects their locations randomly. Our new method, Ped-sim, can simulate pedigree data using these less commonly modeled crossover features, and we used it to evaluate the importance of sex-specific rates and interference in real data. These comparisons show that both factors shape the amount of DNA two relatives share identically, and that including them in models of crossover gives a better fit to data from real relatives.
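The contrast between the Poisson model and an interference model can be illustrated with a minimal sketch. This is not Ped-sim's implementation: it assumes a simple gamma renewal process for inter-crossover distances (shape 1 reduces to the Poisson model, larger shapes space crossovers more evenly), and the function names, the burn-in length, and the shape value 4 are hypothetical choices made only for illustration.

```python
import random
from statistics import mean, pvariance


def simulate_crossovers(length_morgans, shape, rng, burn_in=10.0):
    """Return crossover positions (in Morgans) on one transmitted chromosome.

    Assumption: inter-crossover distances are gamma distributed with the
    given shape and a mean of 1 Morgan. shape == 1 gives exponential
    distances, i.e. the Poisson (no-interference) model. The process starts
    `burn_in` Morgans before the chromosome so that the first crossover is
    not tied to the chromosome start.
    """
    pos = -burn_in
    crossovers = []
    while True:
        pos += rng.gammavariate(shape, 1.0 / shape)  # mean distance = 1 Morgan
        if pos >= length_morgans:
            return crossovers
        if pos >= 0.0:
            crossovers.append(pos)


def count_stats(length_morgans, shape, n_meioses=5000, seed=1):
    """Mean and variance of the crossover count over many simulated meioses."""
    rng = random.Random(seed)
    counts = [
        len(simulate_crossovers(length_morgans, shape, rng))
        for _ in range(n_meioses)
    ]
    return mean(counts), pvariance(counts)


if __name__ == "__main__":
    for label, shape in (("Poisson", 1.0), ("interference", 4.0)):
        m, v = count_stats(2.0, shape)
        print(f"{label}: mean count = {m:.2f}, variance = {v:.2f}")
```

Under these assumptions both settings should give about the same mean number of crossovers per meiosis, while the interference setting should give a smaller variance in that count, which is the same direction as the reduced spread in IBD sharing reported in the abstract.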

Crossover interference and sex-specific genetic maps shape identical by descent sharing in close relatives

PLoS Genetics ◽

10.1371/journal.pgen.1007979 ◽

2019 ◽

Vol 15 (12) ◽

pp. e1007979 ◽

Cited By ~ 8

Author(s):

Madison Caballero ◽

Daniel N. Seidman ◽

Ying Qiao ◽

Jens Sannerud ◽

Thomas D. Dyer ◽

...

Keyword(s):

Genetic Maps ◽

Crossover Interference ◽

Identical By Descent ◽

Close Relatives

Distinguishing pedigree relationships using multi-way identical by descent sharing and sex-specific genetic maps

10.1101/753343 ◽

2019 ◽

Cited By ~ 2

Author(s):

Ying Qiao ◽

Jens Sannerud ◽

Sayantani Basu-Roy ◽

Caroline Hayward ◽

Amy L. Williams

Keyword(s):

Simulated Data ◽

Genetic Maps ◽

Fast Method ◽

Identical By Descent ◽

Generation Scotland ◽

Paternal Relationship ◽

Close Relatives ◽

Inference Methods ◽

Relationship Of ◽

The Relationship

Abstract The proportion of samples with one or more close relatives in a genetic dataset increases rapidly with sample size, necessitating relatedness modeling and enabling pedigree-based analyses. Despite this, relatives are generally unreported and current inference methods typically detect only the degree of relatedness of sample pairs and not pedigree relationships. We developed CREST, an accurate and fast method that identifies the pedigree relationships of close relatives. CREST utilizes identical by descent (IBD) segments shared between a pair of samples and their mutual relatives, leveraging the fact that sharing rates among these individuals differ across pedigree configurations. Furthermore, CREST exploits the profound differences in sex-specific genetic maps to classify pairs as maternally or paternally related (e.g., paternal half-siblings) using the locations of autosomal IBD segments shared between the pair. In simulated data, CREST correctly classifies 91.5-99.5% of grandparent-grandchild (GP) pairs, 70.5-97.0% of avuncular (AV) pairs, and 79.0-98.0% of half-sibling (HS) pairs, compared to PADRE’s rates of 38.5-76.0% of GP, 60.5-92.0% of AV, and 73.0-95.0% of HS pairs. Turning to the real 20,032 sample Generation Scotland (GS) dataset, CREST correctly determines the relationship of 99.0% of GP, 85.7% of AV, and 95.0% of HS pairs that have sufficient mutual relative data, completing this analysis in 10.1 CPU hours including IBD detection. CREST’s maternal and paternal relationship inference is also accurate, as it flagged five pairs as incorrectly labeled in the GS pedigrees (three of which we confirmed as mistakes, and two with an uncertain relationship), yielding 99.7% of HS and 93.5% of GP pairs correctly classified.

COVID-19: Time-Dependent Effective Reproduction Number and Sub-notification Effect Estimation Modeling (Preprint)

10.2196/preprints.23997 ◽

2020 ◽

Author(s):

Eduardo Atem De Carvalho ◽

Rogerio Atem De Carvalho

Keyword(s):

New York ◽

Reproduction Number ◽

Moving Average ◽

Real Data ◽

Time Dependent ◽

Initial Value ◽

Health Authorities ◽

Reproduction Numbers ◽

Effect Estimation ◽

The Impact

BACKGROUND Since the beginning of the COVID-19 pandemic, researchers and health authorities have sought to identify the different parameters that govern its infection and death cycles, in order to be able to make better decisions. In particular, a series of reproduction number estimation models have been presented, with different practical results. OBJECTIVE This article aims to present an effective and efficient model for estimating the Reproduction Number and to discuss the impacts of sub-notification on these calculations. METHODS The Moving Average Method with Initial value (MAMI) is used, and a model for Rt, the Reproduction Number, is derived from experimental data. The models are applied to real data and their performance is presented. RESULTS Analyses of Rt and sub-notification effects for Germany, Italy, Sweden, United Kingdom, South Korea, and the State of New York are presented to show the performance of the methods introduced here. CONCLUSIONS We show that, with relatively simple mathematical tools, it is possible to obtain reliable values for time-dependent, incubation period-independent Reproduction Numbers (Rt). We also demonstrate that the impact of sub-notification is relatively low after the initial phase of the epidemic cycle has passed.

Robust Filtering Techniques for RTK Positioning in Harsh Propagation Environments

Sensors ◽

10.3390/s21041250 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1250

Author(s):

Daniel Medina ◽

Haoqing Li ◽

Jordi Vilà-Valls ◽

Pau Closas

Keyword(s):

Intelligent Transportation Systems ◽

Robust Statistics ◽

Real Data ◽

Transportation Systems ◽

Global Navigation Satellite Systems ◽

Mixed Integer ◽

Robust Filtering ◽

Precise Positioning ◽

The Impact ◽

Filtering Techniques

Global navigation satellite systems (GNSSs) play a key role in intelligent transportation systems such as autonomous driving or unmanned systems navigation. In such applications, it is fundamental to ensure a reliable precise positioning solution able to operate in harsh propagation conditions such as urban environments and under multipath and other disturbances. Exploiting carrier phase observations allows for precise positioning solutions at the complexity cost of resolving integer phase ambiguities, a procedure that is particularly affected by non-nominal conditions. This limits the applicability of conventional filtering techniques in challenging scenarios, and new robust solutions must be accounted for. This contribution deals with real-time kinematic (RTK) positioning and the design of robust filtering solutions for the associated mixed integer- and real-valued estimation problem. Families of Kalman filter (KF) approaches based on robust statistics and variational inference are explored, such as the generalized M-based KF or the variational-based KF, aiming to mitigate the impact of outliers or non-nominal measurement behaviors. The performance assessment under harsh propagation conditions is realized using a simulated scenario and real data from a measurement campaign. The proposed robust filtering solutions are shown to offer excellent resilience against outlying observations, with the variational-based KF showcasing the overall best performance in terms of Gaussian efficiency and robustness.

An adaptive social distancing SIR model for COVID-19 disease spreading and forecasting

Epidemiologic Methods ◽

10.1515/em-2020-0044 ◽

2021 ◽

Vol 10 (s1) ◽

Author(s):

Said Gounane ◽

Yassir Barkouch ◽

Abdelghafour Atlas ◽

Mostafa Bendahmane ◽

Fahd Karami ◽

...

Keyword(s):

Mathematical Theory ◽

Real Data ◽

Sir Model ◽

Epidemic Threshold ◽

Social Distancing ◽

Sir Epidemic ◽

Proposed Model ◽

Disease Spreading ◽

The Government ◽

The Impact

Abstract Recently, various mathematical models have been proposed to model the COVID-19 outbreak. These models are an effective tool to study the mechanisms of coronavirus spreading and to predict the future course of COVID-19 disease. They are also used to evaluate strategies to control this pandemic. Generally, SIR compartmental models are appropriate for understanding and predicting the dynamics of infectious diseases like COVID-19. The classical SIR model was initially introduced by Kermack and McKendrick (cf. Anderson, R. M. 1991. “Discussion: the Kermack–McKendrick Epidemic Threshold Theorem.” Bulletin of Mathematical Biology 53 (1): 3–32; Kermack, W. O., and A. G. McKendrick. 1927. “A Contribution to the Mathematical Theory of Epidemics.” Proceedings of the Royal Society 115 (772): 700–21) to describe the evolution of the susceptible, infected and recovered compartments. Focusing on the impact of public policies designed to contain this pandemic, we develop a new nonlinear SIR epidemic problem modeling the spreading of coronavirus under the effect of social distancing induced by the government measures to stop coronavirus spreading. To find the parameters adopted for each country (e.g., Germany, Spain, Italy, France, Algeria and Morocco) we fit the proposed model to the real data. We also evaluate the government measures in each country with respect to the evolution of the pandemic. Our numerical simulations can be used to provide an effective tool for predicting the spread of the disease.
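For orientation, the classical SIR model that the abstract builds on can be sketched as follows. This is only the textbook Kermack–McKendrick system, dS/dt = -βSI/N, dI/dt = βSI/N - γI, dR/dt = γI, integrated with a plain forward Euler step. It does not reproduce the adaptive social distancing term proposed in the paper, and the function name and all parameter values are hypothetical.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the classical SIR equations with forward Euler.

    beta  : transmission rate per day (assumed constant here)
    gamma : recovery rate per day
    Returns a list of (day, S, I, R) tuples, one entry per whole day.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    trajectory = [(0.0, s, i, r)]
    steps_per_day = round(1.0 / dt)
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((float(day), s, i, r))
    return trajectory


if __name__ == "__main__":
    # Hypothetical values: basic reproduction number beta / gamma = 3.
    traj = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0, days=160)
    peak_day, _, peak_i, _ = max(traj, key=lambda row: row[2])
    print(f"peak infected = {peak_i:.0f} on day {peak_day:.0f}")
```

A social distancing effect of the kind the abstract describes could be explored in such a sketch by making `beta` vary over time, but the specific functional form used by the authors is not given here.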

Pedigree Data Analysis With Crossover Interference

Genetics ◽

10.1093/genetics/164.4.1561 ◽

2003 ◽

Vol 164 (4) ◽

pp. 1561-1566

Author(s):

Sharon Browning

Keyword(s):

Data Analysis ◽

Linkage Analysis ◽

Importance Sampling ◽

Genetic Data ◽

Genotype Data ◽

Pedigree Data ◽

Chi Square ◽

Crossover Interference ◽

Map Construction ◽

Interference Models

Abstract We propose a new method for calculating probabilities for pedigree genetic data that incorporates crossover interference using the chi-square models. Applications include relationship inference, genetic map construction, and linkage analysis. The method is based on importance sampling of unobserved inheritance patterns conditional on the observed genotype data and takes advantage of fast algorithms for no-interference models while using reweighting to allow for interference. We show that the method is effective for arbitrarily many markers with small pedigrees.

Constructing Large-Scale Genetic Maps Using an Evolutionary Strategy Algorithm

Genetics ◽

10.1093/genetics/165.4.2269 ◽

2003 ◽

Vol 165 (4) ◽

pp. 2269-2282

Author(s):

D Mester ◽

Y Ronin ◽

D Minkov ◽

E Nevo ◽

A Korol

Keyword(s):

Discrete Optimization ◽

High Performance ◽

Large Scale ◽

Simulated Data ◽

Real Data ◽

Genetic Maps ◽

Chromosome 1 ◽

Evolutionary Strategy ◽

Group A ◽

The One

Abstract This article is devoted to the problem of ordering in linkage groups with many dozens or even hundreds of markers. The ordering problem belongs to the field of discrete optimization on a set of all possible orders, amounting to n!/2 for n loci; hence it is considered an NP-hard problem. Several authors attempted to employ the methods developed in the well-known traveling salesman problem (TSP) for multilocus ordering, using the assumption that for a set of linked loci the true order will be the one that minimizes the total length of the linkage group. A novel, fast, and reliable algorithm developed for the TSP and based on evolution-strategy discrete optimization was applied in this study for multilocus ordering on the basis of pairwise recombination frequencies. The quality of derived maps under various complications (dominant vs. codominant markers, marker misclassification, negative and positive interference, and missing data) was analyzed using simulated data with ∼50-400 markers. High performance of the employed algorithm allows systematic treatment of the problem of verification of the obtained multilocus orders on the basis of computing-intensive bootstrap and/or jackknife approaches for detecting and removing questionable marker scores, thereby stabilizing the resulting maps. Parallel calculation technology can easily be adopted for further acceleration of the proposed algorithm. Real data analysis (on maize chromosome 1 with 230 markers) is provided to illustrate the proposed methodology.
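The size of the search space and the minimum-length criterion mentioned above can be made concrete with a small sketch. It is not the evolution-strategy algorithm of the paper: it is an exhaustive search that is only feasible for a handful of loci, it assumes additive pairwise distances rather than recombination frequencies, and the function names and marker positions are hypothetical.

```python
from itertools import permutations
from math import factorial


def n_distinct_orders(n_loci):
    """Number of locus orders when an order and its reverse are equivalent (n >= 2)."""
    return factorial(n_loci) // 2


def best_order(dist):
    """Exhaustively find the order minimizing the total length of the linkage group.

    dist is a symmetric matrix of pairwise distances between loci.
    Only one orientation of each order is examined.
    """
    n = len(dist)
    best, best_len = None, float("inf")
    for perm in permutations(range(n)):
        if perm[0] > perm[-1]:
            continue  # skip the mirror image of an order already covered
        total = sum(dist[a][b] for a, b in zip(perm, perm[1:]))
        if total < best_len:
            best, best_len = perm, total
    return best, best_len


if __name__ == "__main__":
    # Hypothetical true map positions, listed in scrambled marker order.
    positions = [30, 0, 50, 10, 25]
    dist = [[abs(p - q) for q in positions] for p in positions]
    print(n_distinct_orders(len(positions)))
    print(best_order(dist))
```

Because the number of orders grows as n!/2, this brute-force approach becomes impractical well before the dozens or hundreds of markers the abstract targets, which is the motivation for heuristic optimizers.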

An improved method to estimate Q based on the logarithmic spectrum of moving peak points

Interpretation ◽

10.1190/int-2017-0234.1 ◽

2019 ◽

Vol 7 (2) ◽

pp. T255-T263 ◽

Cited By ~ 4

Author(s):

Yanli Liu ◽

Zhenchun Li ◽

Guoquan Yang ◽

Qiang Liu

Keyword(s):

Seismic Waves ◽

Wave Attenuation ◽

Real Data ◽

Peak Frequency ◽

Seismic Reflection Data ◽

The Real ◽

Time Frequency ◽

Reflection Data ◽

Text Filtering ◽

The Impact

The quality factor (Q) is an important parameter for measuring the attenuation of seismic waves. Reliable Q estimation and stable inverse Q filtering are expected to improve the resolution of seismic data and deep-layer energy. Many methods of estimating Q are based on an individual wavelet. However, it is difficult to extract the individual wavelet precisely from seismic reflection data. To avoid this problem, we have developed a method of directly estimating Q from reflection data. The core of the methodology is selecting the peak-frequency points and linearly fitting their logarithmic spectrum against the time-frequency product. We then calculate Q according to the relationship between Q and the optimized slope. First, to get the peak frequency points at different times, we use the generalized S transform to produce the 2D high-precision time-frequency spectrum. According to the seismic wave attenuation mechanism, the logarithmic spectrum attenuates linearly with the product of frequency and time. Thus, the second step of the method is transforming the 2D spectrum into 1D by variable substitution. In the process of transformation, we select only the peak frequency points to participate in the fitting process, which can reduce the impact of interference on the spectrum. Third, we obtain the optimized slope by least-squares fitting. To demonstrate the reliability of our method, we applied it to a constant Q model and the real data of a work area. For the real data, we calculated the Q curve of the seismic trace near a well and obtained the high-resolution section by using stable inverse Q filtering. The model and real data indicate that our method is effective and reliable for estimating the Q value.
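The slope-to-Q step can be sketched under a common constant-Q assumption, in which the amplitude spectrum decays as A(f, t) = A0 · exp(-π f t / Q), so that ln A is linear in the product f·t with slope -π/Q. The abstract does not state the exact relationship the authors use, so this formula, the function names, and the synthetic values below are assumptions made for illustration only.

```python
import math


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def estimate_q(peak_freqs_hz, times_s, log_amplitudes):
    """Estimate Q from peak-frequency points.

    Assumes ln A = ln A0 - pi * f * t / Q, so Q = -pi / slope, where the
    slope comes from fitting ln A against the product f * t.
    """
    products = [f * t for f, t in zip(peak_freqs_hz, times_s)]
    slope = fit_slope(products, log_amplitudes)
    return -math.pi / slope


if __name__ == "__main__":
    # Synthetic, noise-free peak points generated with a hypothetical Q of 80.
    true_q = 80.0
    times = [0.2, 0.4, 0.6, 0.8, 1.0]
    freqs = [45.0, 42.0, 40.0, 37.0, 35.0]
    log_amps = [1.5 - math.pi * f * t / true_q for f, t in zip(freqs, times)]
    print(round(estimate_q(freqs, times, log_amps), 3))
```

With real data the picked points would be noisy, so the recovered value would depend on how well the peak-frequency selection suppresses interference, as the abstract notes.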

Container Throughput Forecasting Using Dynamic Factor Analysis and ARIMAX Model

PROMET - Traffic&Transportation ◽

10.7307/ptt.v29i5.2334 ◽

2017 ◽

Vol 29 (5) ◽

pp. 529-542 ◽

Cited By ~ 8

Author(s):

Marko Intihar ◽

Tomaž Kramberger ◽

Dejan Dragan

Keyword(s):

Factor Analysis ◽

Goodness Of Fit ◽

Moving Average ◽

Real Data ◽

Information Criteria ◽

Dynamic Factor Analysis ◽

Forecasting Model ◽

Dynamic Factor ◽

Macroeconomic Indicators ◽

The Impact

The paper examines the impact of integrating macroeconomic indicators on the accuracy of a container throughput time series forecasting model. For this purpose, Dynamic factor analysis and an AutoRegressive Integrated Moving-Average model with eXogenous inputs (ARIMAX) are used. Both methodologies are integrated into a novel four-stage heuristic procedure. Firstly, dynamic factors are extracted from external macroeconomic indicators influencing the observed throughput. Secondly, a family of ARIMAX models of different orders is generated based on the derived factors. In the third stage, diagnostic and goodness-of-fit testing is applied, which includes statistical criteria such as fit performance, information criteria, and parsimony. Finally, the best model is heuristically selected and tested on the real data of the Port of Koper. The results show that by including macroeconomic indicators in the forecasting model, more accurate future throughput forecasts can be achieved. The model is also used to produce forecasts for the next four years, indicating more oscillatory behaviour in 2018-2020. Hence, care must be taken concerning any bigger investment decisions initiated from the management side. It is believed that the proposed model might be a useful reinforcement of the existing forecasting module in the observed port.

Investigating the Impact of Competition and Incentive Design on Performance of Crowdfunding Projects: A Case of Independent Movies

Journal of theoretical and applied electronic commerce research ◽

10.3390/jtaer16040045 ◽

2021 ◽

Vol 16 (4) ◽

pp. 791-810

Author(s):

Li Chen

Keyword(s):

Real Data ◽

Theoretical Explanation ◽

Research Model ◽

Key Factors ◽

Incentive Design ◽

Competition Intensity ◽

Academic Researchers ◽

Competition Pressure ◽

The Impact ◽

Funding Level

Recently, crowdfunding has become a popular e-commerce model based on web 2.0 platforms for fundraisers to collect funding from a large group of supporters using the Internet. However, many projects fail to reach their funding targets. Despite the growing interest of academic researchers and e-commerce professionals in identifying drivers of crowdfunding success, important factors like competition and incentive design have not received much attention in prior research. In this study we aim to fill this gap by investigating the impact of competition and incentive design on the performance of crowdfunding projects. Drawing upon the entrepreneurship literature, we develop a research model involving key factors such as competition intensity and the number of reward levels. Using real data on 209 independent movie projects from an online crowdfunding platform, we test the proposed hypotheses on the impact of competition and incentive design on crowdfunding success. Our results show that competition plays a significant role in crowdfunding performance: the higher the competition pressure, the lower the performance of crowdfunding projects. We also find that factors such as the number of reward levels and the plan to attend movie festivals are essential to the success of crowdfunding projects, but the funding level required to get the top reward does not exert a significant impact. Our study contributes to the e-commerce literature by further exploring the mechanism of crowdfunding success with theoretical explanation and empirical evidence. Researchers and professionals can apply our theoretical findings regarding competition and incentive design to other e-commerce platforms. Furthermore, our results provide useful managerial insights and operational policies for project founders and managers of crowdfunding platforms.