iterative peeling
Recently Published Documents


TOTAL DOCUMENTS: 6 (FIVE YEARS: 1)

H-INDEX: 1 (FIVE YEARS: 0)

2022 ◽  
pp. 1531-1555
Author(s):  
Chandra Chekuri ◽  
Kent Quanrud ◽  
Manuel R. Torres


2020 ◽  
Author(s):  
Andrew Whalen ◽  
John M Hickey

Abstract: In this paper we present a new imputation algorithm, AlphaImpute2, which performs fast and accurate pedigree- and population-based imputation for livestock populations of hundreds of thousands of individuals. Genetic imputation is a tool used in genetics to decrease the cost of genotyping a population, by genotyping a small number of individuals at high density and the remaining individuals at low density. Shared haplotype segments between the high-density and low-density individuals can then be used to fill in the missing genotypes of the low-density individuals. As the size of genetics datasets has grown, the computational cost of performing imputation has increased, particularly in agricultural breeding programs where there might be hundreds of thousands of genotyped individuals. To address this issue, we present a new imputation algorithm, AlphaImpute2, which performs population imputation using a particle-based approximation to the Li and Stephens model that exploits the Positional Burrows-Wheeler Transform, and performs pedigree imputation using an approximate version of multi-locus iterative peeling. We tested AlphaImpute2 on four simulated datasets designed to mimic the pedigrees found in a real pig breeding program. We compared AlphaImpute2 to AlphaImpute, AlphaPeel, findhap version 4, and Beagle 5.1. We found that AlphaImpute2 had the highest accuracy, with an accuracy of 0.993 for low-density individuals on the pedigree with 107,000 individuals, compared to an accuracy of 0.942 for Beagle 5.1, 0.940 for AlphaImpute, and 0.801 for findhap. AlphaImpute2 was also the fastest software tested, with a runtime of 105 minutes on a pedigree of 107,000 individuals and 5,000 markers, compared to 190 minutes for Beagle 5.1, 395 minutes for findhap, and 7,859 minutes for AlphaImpute.
We believe that AlphaImpute2 will enable fast and accurate large scale imputation for agricultural populations as they scale to hundreds of thousands or millions of genotyped individuals.



2020 ◽  
Author(s):  
Martin Johnsson ◽  
Andrew Whalen ◽  
Roger Ros-Freixedes ◽  
Gregor Gorjanc ◽  
Ching-Yi Chen ◽  
...  

Abstract. Background: In this paper, we estimated recombination rate variation within the genome and between individuals in the pig using multilocus iterative peeling for 150,000 pigs across nine genotyped pedigrees. We used this to estimate the heritability of recombination and perform a genome-wide association study of recombination in the pig. Results: Our results confirmed known features of the pig recombination landscape, including differences in chromosome length, and marked sex differences. The recombination landscape was repeatable between lines, but at the same time, the lines also showed differences in average genome-wide recombination rate. The heritability of genome-wide recombination was low but non-zero (on average 0.07 for females and 0.05 for males). We found three genomic regions associated with recombination rate, one of them harbouring the RNF212 gene, previously associated with recombination rate in several other species. Conclusion: Our results from the pig agree with the picture of recombination rate variation in vertebrates, with low but non-zero heritability, and a major locus that is homologous to one detected in several other species. This work also highlights the utility of using large-scale livestock data to understand biological processes.



2017 ◽  
Author(s):  
Andrew Whalen ◽  
Roger Ros-Freixedes ◽  
David L Wilson ◽  
Gregor Gorjanc ◽  
John M Hickey

Abstract: In this paper we extend multi-locus iterative peeling to be a computationally efficient method for calling, phasing, and imputing sequence data of any coverage in small or large pedigrees. Our method, called hybrid peeling, uses multi-locus iterative peeling to estimate shared chromosome segments between parents and their offspring, and then uses single-locus iterative peeling to aggregate genomic information across multiple generations. Using a synthetic dataset, we first analysed the performance of hybrid peeling for calling and phasing alleles in disconnected families, that is, families that contained only a focal individual and its parents and grandparents. Second, we analysed the performance of hybrid peeling for calling and phasing alleles in the context of the full pedigree. Third, we analysed the performance of hybrid peeling for imputing whole genome sequence data to the remaining individuals in the population. We found that hybrid peeling substantially increased the number of genotypes that were called and phased by leveraging sequence information on related individuals. The calling rate and accuracy increased when the full pedigree was used compared to a reduced pedigree of just parents and grandparents. Finally, hybrid peeling accurately imputed whole genome sequence information to non-sequenced individuals. We believe that this algorithm will enable the generation of low-cost and high-accuracy whole genome sequence data in many pedigreed populations. We are making this algorithm available as a standalone program called AlphaPeel.
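
To make the idea of passing genotype information from parents to offspring concrete, here is a minimal sketch of a single "peel down" step at one biallelic locus. It is an illustration under simplifying assumptions, not the AlphaPeel implementation: the function names (`transmission_tensor`, `peel_down`) are hypothetical, genotypes are coded as the count of the alternative allele (0, 1, 2), the two parents' genotype probabilities are treated as independent, and genotyping error, mutation, and multi-locus segregation are ignored.

```python
import numpy as np

# P(parent transmits the alternative allele | parent genotype 0, 1, 2).
ALT_TRANSMISSION = np.array([0.0, 0.5, 1.0])


def transmission_tensor():
    """Return T[s, d, c] = P(child genotype c | sire genotype s, dam genotype d)."""
    T = np.zeros((3, 3, 3))
    for s in range(3):
        for d in range(3):
            ps, pd = ALT_TRANSMISSION[s], ALT_TRANSMISSION[d]
            T[s, d, 0] = (1 - ps) * (1 - pd)
            T[s, d, 1] = ps * (1 - pd) + (1 - ps) * pd
            T[s, d, 2] = ps * pd
    return T


def peel_down(sire_probs, dam_probs, child_likelihood):
    """Combine parental genotype probabilities with the child's own data.

    sire_probs, dam_probs: length-3 genotype probability vectors.
    child_likelihood: length-3 vector, P(child's data | child genotype).
    Assumes the child's data are compatible with at least one genotype
    the parents can produce (otherwise the normalisation divides by zero).
    """
    T = transmission_tensor()
    # Information flowing down from the parents (Mendelian expectation).
    from_parents = np.einsum("s,d,sdc->c", sire_probs, dam_probs, T)
    # Weight by the child's own evidence and renormalise.
    combined = from_parents * child_likelihood
    return combined / combined.sum()


# Two heterozygous parents and a child with no data of its own.
het = np.array([0.0, 1.0, 0.0])
flat = np.ones(3)
child = peel_down(het, het, flat)
```

A full peeling algorithm would repeat steps like this up and down the pedigree, also passing information from offspring back to parents, until the genotype probabilities stop changing.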



2017 ◽  
Vol 2017 ◽  
pp. 1-17 ◽  
Author(s):  
Armin Ott ◽  
Alexander Hapfelmeier

Two nonparametric methods for the identification of subgroups with outstanding outcome values are described and compared to each other in a simulation study and an application to clinical data. The Patient Rule Induction Method (PRIM) searches for box-shaped areas in the given data which exceed a minimal size and average outcome. This is achieved via a combination of iterative peeling and pasting steps, where small fractions of the data are removed or added to the current box. As an alternative, Classification and Regression Trees (CART) prediction models perform sequential binary splits of the data to produce subsets which can be interpreted as subgroups of heterogeneous outcome. PRIM and CART were compared in a simulation study to investigate their strengths and weaknesses under various data settings, taking different performance measures into account. PRIM was shown to be superior in rather complex settings such as those with few observations, a smaller signal-to-noise ratio, and more than one subgroup. CART showed the best performance in simpler situations. A practical application of the two methods was illustrated using a clinical data set. For this application, both methods produced similar results but the higher amount of user involvement of PRIM became apparent. PRIM can be flexibly tuned by the user, whereas CART, although simpler to implement, is rather static.
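
The peeling phase described above can be sketched as follows. This is a simplified illustration, not the authors' code: it covers only peeling (no pasting step), handles only numeric covariates, and the names `prim_peel`, `alpha` (fraction peeled per step), and `min_support` (minimal box size as a fraction of the data) are assumptions introduced for the example.

```python
import numpy as np


def prim_peel(X, y, alpha=0.05, min_support=0.1):
    """Greedy peeling: shrink a box to raise the mean outcome inside it.

    At each step, try removing roughly an alpha fraction of the points
    currently in the box from the low or high end of every variable, and
    keep the candidate box with the highest mean outcome. Stop when no
    candidate keeps at least min_support of all observations.
    """
    n, p = X.shape
    lower = X.min(axis=0).astype(float)
    upper = X.max(axis=0).astype(float)
    inside = np.ones(n, dtype=bool)
    while True:
        best = None  # (mean outcome, variable, side, cut, mask)
        for j in range(p):
            xj = X[inside, j]
            candidates = (
                ("lower", np.quantile(xj, alpha)),
                ("upper", np.quantile(xj, 1.0 - alpha)),
            )
            for side, cut in candidates:
                if side == "lower":
                    keep = inside & (X[:, j] >= cut)
                else:
                    keep = inside & (X[:, j] <= cut)
                # Skip peels that remove nothing or make the box too small.
                if keep.sum() == inside.sum() or keep.mean() < min_support:
                    continue
                mean = y[keep].mean()
                if best is None or mean > best[0]:
                    best = (mean, j, side, cut, keep)
        if best is None:
            break
        _, j, side, cut, inside = best
        if side == "lower":
            lower[j] = cut
        else:
            upper[j] = cut
    return lower, upper, inside


# Synthetic example: the outcome is elevated in the corner x0 > 0.6, x1 > 0.6.
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(500, 2))
y_demo = ((X_demo[:, 0] > 0.6) & (X_demo[:, 1] > 0.6)).astype(float)
y_demo = y_demo + 0.1 * rng.normal(size=500)
box_lower, box_upper, in_box = prim_peel(X_demo, y_demo)
```

The amount of user involvement mentioned in the abstract shows up here as the two tuning parameters: a smaller `alpha` peels more patiently, and `min_support` decides how small a subgroup is still acceptable.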



2016 ◽  
Vol 2016 ◽  
pp. 1-10 ◽  
Author(s):  
Weijun Zeng ◽  
Huali Wang

We present a new approach for the analysis of iterative peeling decoding recovery algorithms in the context of Low-Density Parity-Check (LDPC) codes and compressed sensing. The iterative recovery algorithm is particularly interesting for its low measurement cost and low computational complexity. The asymptotic analysis can track the evolution of the fraction of unrecovered signal elements in each iteration, which is similar to the well-known density evolution analysis of LDPC decoding algorithms. Our analysis shows that there exists a threshold on the density factor: below this threshold the recovery algorithm succeeds; otherwise it fails. Simulation results are also provided to verify the agreement between the proposed asymptotic analysis and the recovery algorithm. Compared with existing work on peeling decoding algorithms, which focuses on the failure probability of the recovery algorithm, our proposed approach accurately tracks how performance evolves for different parameters of the measurement matrices and is easy to implement. We also show that the peeling decoding algorithm performs better than other schemes based on LDPC codes.
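
As a concrete picture of what a peeling decoder does, here is a minimal sketch for the simplest setting, a binary code with erased bits. It is a generic illustration, not the recovery algorithm analysed in the paper: the function name `peel_decode` and the small three-check example code are assumptions made for the example. The decoder repeatedly looks for a parity check with exactly one unknown bit, solves that bit, and stops when no such check remains.

```python
def peel_decode(checks, received):
    """Iterative peeling over erasures.

    checks: list of parity checks, each a list of bit positions whose
            XOR must equal 0.
    received: list of 0, 1, or None (None marks an erased bit).
    Returns the word with every bit the peeling process could recover;
    bits that stay None belong to a set that peeling cannot resolve.
    """
    word = list(received)
    progress = True
    while progress:
        progress = False
        for check in checks:
            erased = [i for i in check if word[i] is None]
            if len(erased) == 1:
                # The single unknown bit equals the XOR of the known bits.
                parity = 0
                for i in check:
                    if word[i] is not None:
                        parity ^= word[i]
                word[erased[0]] = parity
                progress = True
    return word


# Hypothetical 7-bit code with three parity checks.
CHECKS = [[0, 1, 2, 4], [1, 2, 3, 5], [0, 2, 3, 6]]
CODEWORD = [1, 0, 1, 1, 0, 0, 1]

# Two erasures: each affected check has a single unknown, so peeling succeeds.
recovered = peel_decode(CHECKS, [1, None, 1, 1, 0, 0, None])

# Three erasures chosen so that every affected check has at least two
# unknowns: peeling makes no progress and the bits stay erased.
stuck = peel_decode(CHECKS, [1, None, None, None, 0, 0, 1])
```

The second call shows the failure mode that threshold analyses of this kind quantify: once the unknowns are too dense, no check has a single unknown left to peel.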


