Haplotype Information: Latest Research Papers

Parental relatedness through time revealed by runs of homozygosity in ancient DNA

Parental relatedness of present-day humans varies substantially across the globe, but little is known about the past. Here we analyze ancient DNA, leveraging the fact that parental relatedness leaves genomic traces in the form of runs of homozygosity. We present an approach to identify such runs in low-coverage ancient DNA data aided by haplotype information from a modern phased reference panel. Simulations and experiments show that this method robustly detects runs of homozygosity longer than 4 centimorgans for ancient individuals with at least 0.3× coverage. Analyzing genomic data from 1,785 ancient humans who lived in the last 45,000 years, we detect low rates of first cousin or closer unions across most ancient populations. Moreover, we find a marked decay in background parental relatedness co-occurring with or shortly after the advent of sedentary agriculture. We observe this signal, likely linked to increasing local population sizes, across several geographic transects worldwide.
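To make the notion of a run of homozygosity concrete, here is a minimal Python sketch. It is not the method described in the abstract, which works on low-coverage data with the help of a phased reference panel. It assumes instead that confident diploid genotype calls are already available, coded as 0, 1 or 2 copies of the alternate allele, together with genetic map positions in centimorgans. The function name `find_roh`, the genotype coding and the toy data are all illustrative assumptions; only the 4 cM threshold is taken from the abstract.

```python
from typing import List, Sequence, Tuple


def find_roh(
    genotypes: Sequence[int],
    positions_cm: Sequence[float],
    min_length_cm: float = 4.0,
) -> List[Tuple[float, float]]:
    """Return (start_cm, end_cm) for every maximal run of homozygous markers
    whose genetic length is at least min_length_cm.

    Assumption: genotypes are confident calls coded 0/1/2, where 1 is
    heterozygous, and positions_cm is sorted in increasing order.
    """
    if len(genotypes) != len(positions_cm):
        raise ValueError("genotypes and positions_cm must have equal length")

    runs: List[Tuple[float, float]] = []
    start = None  # index of the first marker in the current run, if any
    n = len(genotypes)

    # Iterate one step past the end so that a run reaching the last marker is closed.
    for i in range(n + 1):
        homozygous = i < n and genotypes[i] != 1
        if homozygous and start is None:
            start = i
        elif not homozygous and start is not None:
            length = positions_cm[i - 1] - positions_cm[start]
            if length >= min_length_cm:
                runs.append((positions_cm[start], positions_cm[i - 1]))
            start = None
    return runs


# Toy data (made up for illustration): heterozygous calls at 5.0 cM and 7.0 cM
# break the chromosome into three homozygous stretches.
toy_genotypes = [0, 2, 0, 0, 1, 2, 2, 1, 0, 0, 0]
toy_positions = [0.0, 1.0, 2.5, 4.5, 5.0, 5.5, 6.0, 7.0, 8.0, 10.0, 12.5]
toy_runs = find_roh(toy_genotypes, toy_positions)
```

With the toy data, the two stretches spanning 4.5 cM each are reported and the 0.5 cM stretch is dropped. A single miscalled heterozygote splits a run in this naive scan, which is one reason low-coverage data calls for a model-based approach such as the one the abstract describes.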

NanoCaller for accurate detection of SNPs and indels in difficult-to-map regions from long-read sequencing by haplotype-aware deep neural networks

Genome Biology ◽ 10.1186/s13059-021-02472-2 ◽ 2021 ◽ Vol 22 (1)

Author(s): Mian Umair Ahsan ◽ Qian Liu ◽ Li Fang ◽ Kai Wang

Keyword(s): Deep Neural Networks ◽ Short Read Sequencing ◽ Human Genomes ◽ Long Reads ◽ Novel Variants ◽ Long Read ◽ Local Realignment ◽ Variant Detection ◽ Genomic Regions ◽ Haplotype Information

Long-read sequencing enables variant detection in genomic regions that are considered difficult-to-map by short-read sequencing. To fully exploit the benefits of longer reads, here we present a deep learning method, NanoCaller, which detects SNPs using long-range haplotype information, then phases long reads with called SNPs and calls indels with local realignment. Evaluation on 8 human genomes demonstrates that NanoCaller generally achieves better performance than competing approaches. We experimentally validate 41 novel variants in a widely used benchmarking genome, which could not be reliably detected previously. In summary, NanoCaller facilitates the discovery of novel variants in complex genomic regions from long-read sequencing.

An efficient method to identify, date and describe admixture events using haplotype information

10.1101/2021.08.12.455263 ◽ 2021

Author(s): Pongsakorn Wangkumhang ◽ Matthew Greenfield ◽ Garrett Hellenthal

Keyword(s): Roman Empire ◽ Baltic Sea ◽ Large Scale ◽ Computation Time ◽ North African ◽ The Baltic Sea ◽ New Approach ◽ Genome Wide ◽ The Baltic ◽ Haplotype Information

We present fastGLOBETROTTER, an efficient new haplotype-based technique to identify, date and describe admixture events using genome-wide autosomal data. With simulations, we demonstrate how fastGLOBETROTTER reduces computation time by 4- to 20-fold relative to the haplotype-based technique GLOBETROTTER without suffering loss of accuracy. We apply fastGLOBETROTTER to a cohort of >6000 Europeans from ten countries, revealing previously unreported admixture signals. In particular, we infer multiple periods of admixture related to East Asian or Siberian-like sources, starting >2000 years ago, in people living in countries north of the Baltic Sea. In contrast, we infer admixture related to West Asian, North African and/or Southern European sources in populations south of the Baltic Sea, including admixture dated to ≈300-700 CE, overlapping the fall of the Roman Empire, in people from Belgium, France and parts of Germany. Our new approach scales to analysing hundreds to thousands of individuals from a putatively admixed population and hence is applicable to emerging large-scale cohorts of genetically homogeneous populations.

Quinoa genome assembly employing genomic variation for guided scaffolding

Theoretical and Applied Genetics ◽ 10.1007/s00122-021-03915-x ◽ 2021

Author(s): Alexandrina Bodrug-Schepers ◽ Nancy Stralis-Pavese ◽ Hermann Buerstmayr ◽ Juliane C. Dohm ◽ Heinz Himmelbauer

Keyword(s): Genome Assembly ◽ Chenopodium Quinoa ◽ Genomic Variation ◽ Valuable Resource ◽ Sequencing Data ◽ Genome Wide ◽ A Genome ◽ Long Read ◽ Genome Assemblies ◽ Haplotype Information

Key message: We propose to use the natural variation between individuals of a population for genome assembly scaffolding. In today’s genome projects, multiple accessions get sequenced, leading to variant catalogs. Using such information to improve genome assemblies is attractive both cost-wise and scientifically, because the value of an assembly increases with its contiguity. We conclude that haplotype information is a valuable resource to group and order contigs toward the generation of pseudomolecules.

Abstract: Quinoa (Chenopodium quinoa) has been under cultivation in Latin America for more than 7500 years. Recently, quinoa has gained increasing attention due to its stress resistance and its nutritional value. We generated a novel quinoa genome assembly for the Bolivian accession CHEN125 using PacBio long-read sequencing data (assembly size 1.32 Gbp, initial N50 size 608 kbp). Next, we re-sequenced 50 quinoa accessions from Peru and Bolivia. This set of accessions differed at 4.4 million single-nucleotide variant (SNV) positions compared to CHEN125 (1.4 million SNV positions on average per accession). We show how to exploit variation in accessions that are distantly related to establish a genome-wide ordered set of contigs for guided scaffolding of a reference assembly. The method is based on detecting shared haplotypes and their expected continuity throughout the genome (i.e., the effect of linkage disequilibrium), as an extension of what is expected in mapping populations where only a few haplotypes are present. We test the approach using Arabidopsis thaliana data from different populations. After applying the method to our CHEN125 quinoa assembly, we validated the results with mate-pairs, genetic markers, and another quinoa assembly originating from a Chilean cultivar. We show consistency between these information sources and the haplotype-based relations as determined by us and obtain an improved assembly with an N50 size of 1079 kbp and ordered contig groups of up to 39.7 Mbp. We conclude that haplotype information in distantly related individuals of the same species is a valuable resource to group and order contigs according to their adjacency in the genome toward the generation of pseudomolecules.
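The abstract above reports contiguity as N50 (608 kbp before scaffolding, 1079 kbp after). As a reminder of what that statistic measures, the following Python sketch computes N50 under its usual definition: the largest length L such that sequences of length at least L together cover at least half of the total assembly. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the quinoa assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")

    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: running total always reaches the sum")


# Toy lengths (made up): the total is 32, and the two longest
# sequences (8 + 8 = 16) already cover half of it.
toy_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
```

Joining contigs into longer ordered groups raises N50 without adding sequence, which is why scaffolding results are commonly summarized with this statistic.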

Haplotype tagging reveals parallel formation of hybrid races in two butterfly species

Proceedings of the National Academy of Sciences ◽ 10.1073/pnas.2015005118 ◽ 2021 ◽ Vol 118 (25) ◽ pp. e2015005118

Author(s): Joana I. Meier ◽ Patricio A. Salazar ◽ Marek Kučka ◽ Robert William Davies ◽ Andreea Dréau ◽ ...

Keyword(s): Large Scale ◽ Low Cost ◽ Natural Populations ◽ Elevational Gradient ◽ Butterfly Species ◽ Hybrid Zones ◽ Sequencing Data ◽ Sequencing Technologies ◽ Linkage Information ◽ Haplotype Information

Genetic variation segregates as linked sets of variants or haplotypes. Haplotypes and linkage are central to genetics and underpin virtually all genetic and selection analysis. Yet, genomic data often omit haplotype information due to constraints in sequencing technologies. Here, we present “haplotagging,” a simple, low-cost linked-read sequencing technique that allows sequencing of hundreds of individuals while retaining linkage information. We apply haplotagging to construct megabase-size haplotypes for over 600 individual butterflies (Heliconius erato and H. melpomene), which form overlapping hybrid zones across an elevational gradient in Ecuador. Haplotagging identifies loci controlling distinctive high- and lowland wing color patterns. Divergent haplotypes are found at the same major loci in both species, while chromosome rearrangements show no parallelism. Remarkably, in both species, the geographic clines for the major wing-pattern loci are displaced by 18 km, leading to the rise of a novel hybrid morph in the center of the hybrid zone. We propose that shared warning signaling (Müllerian mimicry) may couple the cline shifts seen in both species and facilitate the parallel coemergence of a novel hybrid morph in both comimetic species. Our results show the power of efficient haplotyping methods when combined with large-scale sequencing data from natural populations.

Computational methods for chromosome-scale haplotype reconstruction

Genome Biology ◽ 10.1186/s13059-021-02328-9 ◽ 2021 ◽ Vol 22 (1)

Author(s): Shilpa Garg

Keyword(s): Genetic Variation ◽ Computational Methods ◽ Whole Genome ◽ Haplotype Reconstruction ◽ High Quality ◽ Short Read ◽ Short Read Sequencing ◽ Sequencing Technologies ◽ Long Read ◽ Haplotype Information

High-quality chromosome-scale haplotype sequences of diploid genomes, polyploid genomes, and metagenomes provide important insights into genetic variation associated with disease and biodiversity. However, whole-genome short-read sequencing does not directly yield haplotype information spanning whole chromosomes. Computational assembly of shorter haplotype fragments is required for haplotype reconstruction, which can be challenging owing to limited fragment lengths and high haplotype and repeat variability across genomes. Recent advancements in long-read and chromosome-scale sequencing technologies, alongside computational innovations, are improving the reconstruction of haplotypes at the level of whole chromosomes. Here, we review recent methodological progress and discuss perspectives in these areas.

An assembly-free method of phylogeny reconstruction using short-read sequences from pooled samples without barcodes

10.1101/2021.04.09.439138 ◽ 2021

Author(s): Thomas K. F. Wong ◽ Teng Li ◽ Louis Ranjard ◽ Steven Wu ◽ Jeet Sukumaran ◽ ...

Keyword(s): Dna Sequences ◽ Dna Barcode ◽ Real Data ◽ Reference Sequence ◽ Nucleotide Polymorphisms ◽ Data Set ◽ Single Nucleotide ◽ Short Read ◽ Pooled Samples ◽ Haplotype Information

A current strategy for obtaining haplotype information from several individuals involves short-read sequencing of pooled amplicons, where fragments from each individual are identified by a unique DNA barcode. In this paper, we report a new method to recover the phylogeny of haplotypes from short-read sequences obtained using pooled amplicons from a mixture of individuals, without barcoding. The method, AFPhyloMix, accepts an alignment of the mixture of reads against a reference sequence, obtains the single-nucleotide polymorphism (SNP) patterns along the alignment, and constructs the phylogenetic tree according to the SNP patterns. AFPhyloMix adopts a Bayesian model of inference to estimate the phylogeny of the haplotypes and their relative frequencies, given that the number of haplotypes is known. In our simulations, AFPhyloMix achieved at least 80% accuracy at recovering the phylogenies and frequencies of the constituent haplotypes, for mixtures with up to 15 haplotypes. AFPhyloMix also worked well on a real data set of kangaroo mitochondrial DNA sequences.

Computational Methods for Chromosome-Scale Haplotype Reconstruction

10.20944/preprints202101.0116.v1 ◽ 2021

Author(s): Shilpa Garg

Keyword(s): Genetic Variation ◽ Whole Genome ◽ Haplotype Reconstruction ◽ High Quality ◽ Short Read ◽ Short Read Sequencing ◽ Sequencing Technologies ◽ Evolutionary Studies ◽ Long Read ◽ Haplotype Information

High-quality chromosome-scale haplotype sequences of diploid genomes, polyploid genomes and metagenomes provide important insights into genetic variation associated with disease and biodiversity. However, whole-genome short-read sequencing does not directly yield haplotype information that spans whole chromosomes. Computational assembly of shorter haplotype fragments is required for haplotype reconstruction, which can be challenging owing to limited fragment lengths and high haplotype and repeat variability across genomes. Recent advancements in long-read and chromosome-scale sequencing technologies, alongside computational innovations, are improving the reconstruction of haplotypes at the level of whole chromosomes. Here, we review recent methodological progress in these areas and discuss perspectives that could enable routine high-quality haplotype reconstruction in clinical and evolutionary studies.

The ghosts of propagation past: haplotype information clarifies the relative influence of stocking history and phylogeographic processes on contemporary population structure of walleye (Sander vitreus)

Evolutionary Applications ◽ 10.1111/eva.13186 ◽ 2020

Author(s): Matthew L. Bootsma ◽ Loren Miller ◽ Greg G. Sass ◽ Peter T. Euclide ◽ Wesley A. Larson

Keyword(s): Population Structure ◽ Relative Influence ◽ Sander Vitreus ◽ Haplotype Information

Using Haplotype Information for Conservation Genomics

Trends in Ecology & Evolution ◽ 10.1016/j.tree.2019.10.012 ◽ 2020 ◽ Vol 35 (3) ◽ pp. 245-258 ◽ Cited By ~ 6

Author(s): Maeva Leitwein ◽ Maud Duranton ◽ Quentin Rougemont ◽ Pierre-Alexandre Gagnaire ◽ Louis Bernatchez

Keyword(s): Conservation Genomics ◽ Haplotype Information

Haplotype Information: Recently Published Documents

Parental relatedness through time revealed by runs of homozygosity in ancient DNA

NanoCaller for accurate detection of SNPs and indels in difficult-to-map regions from long-read sequencing by haplotype-aware deep neural networks

An efficient method to identify, date and describe admixture events using haplotype information

Quinoa genome assembly employing genomic variation for guided scaffolding

Haplotype tagging reveals parallel formation of hybrid races in two butterfly species

Computational methods for chromosome-scale haplotype reconstruction

An assembly-free method of phylogeny reconstruction using short-read sequences from pooled samples without barcodes

Computational Methods for Chromosome-Scale Haplotype Reconstruction

The ghosts of propagation past: haplotype information clarifies the relative influence of stocking history and phylogeographic processes on contemporary population structure of walleye (Sander vitreus)

Using Haplotype Information for Conservation Genomics