TKGWV2: An ancient DNA relatedness pipeline for ultra-low coverage whole genome shotgun data

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Daniel M. Fernandes ◽  
Olivia Cheronet ◽  
Pere Gelabert ◽  
Ron Pinhasi

Abstract: Estimation of genetically related individuals is playing an increasingly important role in the ancient DNA field. In recent years, the numbers of sequenced individuals from single sites have been increasing, reflecting a growing interest in understanding the familial and social organisation of ancient populations. Although a few different methods have been developed specifically for ancient DNA, namely to tackle issues such as low-coverage homozygous data, they require a 0.1–1× minimum average genomic coverage per analysed pair of individuals. Here we present an updated version of a method that enables estimates of 1st- and 2nd-degree relatedness with as little as 0.026× average coverage, or around 18,000 SNPs from 1.3 million aligned reads per sample with an average length of 62 bp, four times less data than 0.1× coverage at similar read lengths. By using simulated data to estimate false positive error rates, we further show that even a threshold as low as 0.012×, or around 4000 SNPs from 600,000 reads, will always show 1st-degree relationships as related. Lastly, by applying this method to published data, we are able to identify previously undocumented relationships using individuals that had been excluded from prior kinship analysis due to their very low coverage. This methodological improvement has the potential to enable relatedness estimation on ancient whole genome shotgun data during routine low-coverage screening, and therefore to improve project management when decisions need to be made on which individuals are to be further sequenced.
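The core bookkeeping behind such pairwise estimates can be illustrated with a minimal sketch: compare pseudo-haploid genotype calls at the SNPs two individuals share, and compute the mismatch rate, which is lower for related pairs than for unrelated ones. This is an illustrative simplification, not TKGWV2's actual statistic (which normalises by population allele frequencies); the function name and data layout are hypothetical.

```python
def pairwise_mismatch(gt_a, gt_b):
    """Proportion of overlapping SNPs at which two pseudo-haploid
    genotype calls disagree; related pairs share more alleles and
    therefore show a lower mismatch rate than unrelated pairs.
    gt_a / gt_b map SNP id -> single allele call (None if missing)."""
    shared = [s for s in gt_a
              if s in gt_b and gt_a[s] is not None and gt_b[s] is not None]
    if not shared:
        return None  # no overlapping data for this pair
    mismatches = sum(gt_a[s] != gt_b[s] for s in shared)
    return mismatches / len(shared)
```

At 0.026× coverage only a small fraction of target SNPs overlap between any two samples, which is why the number of shared sites, rather than total coverage, is the quantity that ultimately limits the estimate.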


2016 ◽  
Author(s):  
Daniel Fernandes ◽  
Kendra Sirak ◽  
Mario Novak ◽  
John Finarelli ◽  
John Byrne ◽  
...  

Abstract: Thomas Kent was an Irish rebel who was executed by British forces in the aftermath of the Easter Rising armed insurrection of 1916 and buried in a shallow grave on the grounds of Cork prison. In 2015, ninety-nine years after his death, a state funeral was offered to his living family to honor his role in the struggle for Irish independence. However, inaccuracies in record keeping did not allow the bodily remains that supposedly belonged to Kent to be identified with absolute certainty. Using a novel approach based on homozygous single nucleotide polymorphisms, we identified these remains as those of Kent by comparing his genetic data to that of two known living relatives. As the degradation observed in Kent's DNA, characteristic of ancient DNA, rendered traditional methods of relatedness estimation unusable, we forced all loci homozygous, in a process we refer to as the “forced homozygote approach”. The results were confirmed using simulated data for different relatedness classes. We argue that this method provides a necessary alternative for relatedness estimation, not only in forensic analysis, but also in ancient DNA studies, where reduced amounts of genetic information can limit the application of traditional methods.
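The forced-homozygote step described above can be sketched as follows: at each SNP, one sequenced base is sampled at random and reported as a homozygous diploid call, sidestepping the fact that low coverage makes confident heterozygote calls impossible. A minimal sketch; `force_homozygote` and the pileup layout are hypothetical names, not the authors' implementation.

```python
import random

def force_homozygote(pileup, seed=None):
    """At each SNP, sample one observed base at random and report it
    as a homozygous diploid genotype ('forced homozygote').
    `pileup` maps SNP id -> list of base calls covering that site."""
    rng = random.Random(seed)
    calls = {}
    for snp, bases in pileup.items():
        if bases:  # sites with no coverage are simply dropped
            allele = rng.choice(bases)
            calls[snp] = (allele, allele)
    return calls
```

Because each call carries only one sampled allele, true heterozygous sites come out as one homozygote or the other at random, and downstream relatedness statistics must account for that known bias rather than treat the calls as full diploid genotypes.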


2020 ◽  
Author(s):  
Erin M. Gorden ◽  
Ellen M. Greytak ◽  
Kimberly Sturk-Andreaggi ◽  
Janet Cady ◽  
Timothy P. McMahon ◽  
...  

Abstract: DNA-assisted identification of historical remains requires the genetic analysis of highly degraded DNA, along with a comparison to DNA from known relatives. This can be achieved by targeting single nucleotide polymorphisms (SNPs) using a hybridization capture and next-generation sequencing approach suitable for degraded skeletal samples. In the present study, two SNP capture panels were designed to target ∼25,000 (25K) and ∼95,000 (95K) autosomal SNPs, respectively, to enable distant kinship estimation (up to 4th-degree relatives). Low-coverage SNP data were successfully recovered from 14 skeletal elements 75 years postmortem, with captured DNA having mean insert sizes ranging from 32–170 bp across the 14 samples. SNP comparison with DNA from known family references was performed in the Parabon Fχ Forensic Analysis Platform, which utilizes a likelihood approach for kinship prediction that was optimized for low-coverage sequencing data with DNA damage. The 25K and 95K panels produced 15,000 and 42,000 SNPs on average, respectively, allowing for accurate kinship prediction in 17 and 19 of the 21 pairwise comparisons. Whole genome sequencing was not able to produce sufficient SNP data for accurate kinship prediction, demonstrating that hybridization capture is necessary for historical samples. This study provides the groundwork for the expansion of research involving compromised samples to include SNP hybridization capture.

Author Summary: Our study evaluates ancient DNA techniques involving SNP capture and next-generation sequencing for use in forensic identification. We utilized bone samples from 14 sets of previously identified historical remains aged 70 years postmortem for low-coverage SNP genotyping and extended kinship analysis. We performed whole genome sequencing and hybridization capture with two SNP panels, one targeting ∼25,000 SNPs and the other targeting ∼95,000 SNPs, to assess SNP recovery and accuracy in kinship estimation. A genotype likelihood approach was utilized for SNP profiling of degraded DNA characterized by the cytosine deamination typical of ancient and historical specimens. Family reference samples from known relatives up to the 4th degree were genotyped using a SNP microarray. We then utilized the Parabon Fχ Forensic Analysis Platform to perform pairwise comparisons of all bone and reference samples for kinship prediction. The results showed that both capture panels facilitated accurate kinship prediction in more than 80% of the tested relationships without producing false positive matches (or adventitious hits), which were commonly observed in the whole genome sequencing comparisons. We demonstrate that SNP capture can be an effective method for genotyping of historical remains for distant kinship analysis with known relatives, which will support humanitarian efforts and forensic identification.
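The genotype likelihood idea mentioned above can be sketched for a single biallelic site: compute P(observed bases | genotype) for each of the three diploid genotypes under a per-base error rate, rather than committing to a hard genotype call. This is a textbook simplification, not the Parabon Fχ model; all names and the flat error rate are illustrative. A damage-aware variant would raise the error rate for possible C→T / G→A deamination observations.

```python
def genotype_likelihoods(bases, ref='A', alt='G', err=0.01):
    """Likelihood of the observed base calls under each diploid
    genotype at one biallelic site, assuming independent reads and
    a flat per-base error rate."""
    out = {}
    for g in [(ref, ref), (ref, alt), (alt, alt)]:
        lik = 1.0
        for b in bases:
            # average over the genotype's two alleles:
            # P(base | allele) is 1 - err on a match, err otherwise
            lik *= sum((1 - err) if b == a else err for a in g) / 2
        out[g] = lik
    return out
```

Keeping all three likelihoods lets a downstream kinship model weigh the uncertainty of each low-coverage site instead of propagating a single, possibly wrong, hard call.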


2021 ◽  
Author(s):  
Matthew Hayes ◽  
Angela Nguyen ◽  
Rahib Islam ◽  
Caryn Butler ◽  
Ethan Tran ◽  
...  

Abstract: Double minute chromosomes are acentric extrachromosomal DNA artifacts that are frequently observed in the cells of numerous cancers. They are highly amplified and contain oncogenes and drug resistance genes, making their presence a challenge for effective cancer treatment. Algorithmic discovery of double minutes (DMs) can potentially improve bench-derived therapies for cancer treatment. A hindrance to this task is that DMs evolve, yielding circular chromatin that shares segments from progenitor double minutes. This creates double minutes with overlapping amplicon coordinates. Existing DM discovery algorithms use whole genome shotgun sequencing in isolation, which can misclassify DMs that share overlapping coordinates. In this study, we describe an algorithm called “HolistIC” that can predict double minutes in tumor genomes by integrating whole genome shotgun sequencing (WGS) and Hi-C sequencing data. The consolidation of these sources of information resolves ambiguity in double minute amplicon prediction that exists when WGS data are used in isolation. We implemented and tested our algorithm on matched Hi-C and WGS data from three cancer datasets and one simulated dataset. Results on the cancer datasets demonstrated HolistIC’s ability to predict DMs from Hi-C and WGS data in tandem. The results on the simulated data showed that HolistIC can accurately distinguish double minutes that have overlapping amplicon coordinates, an advance over methods that predict extrachromosomal amplification using WGS data in isolation.

Availability: Our software is available at http://www.github.com/mhayes20/HolistIC.
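The intuition for integrating Hi-C can be sketched as follows: two amplicons carried on the same circular element sit physically adjacent in the nucleus, so their Hi-C bins contact each other far more often than the genome-wide background would predict. The heuristic below is a hypothetical illustration of that signal, not HolistIC's actual model; all names and thresholds are assumptions.

```python
def same_circle(contacts, bins_a, bins_b, background, fold=3.0):
    """Heuristic co-circularity test: report True if the mean Hi-C
    contact count between two amplicons' bins exceeds `fold` times
    the genome-wide background. `contacts` maps frozenset({i, j})
    bin pairs to contact counts."""
    pairs = [frozenset({a, b}) for a in bins_a for b in bins_b if a != b]
    if not pairs:
        return False
    mean_contact = sum(contacts.get(p, 0) for p in pairs) / len(pairs)
    return mean_contact >= fold * background
```

WGS copy-number evidence alone cannot separate two circles that reuse the same amplified segment; a cross-amplicon contact signal like this is what lets the combined approach assign shared coordinates to the correct circle.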


2020 ◽  
Author(s):  
Kristy Martire ◽  
Agnes Bali ◽  
Kaye Ballantyne ◽  
Gary Edmond ◽  
Richard Kemp ◽  
...  

We do not know how often false positive reports are made in a range of forensic science disciplines. In the absence of this information, it is important to understand the naive beliefs held by potential jurors about the reliability of forensic science evidence, as it is these beliefs that will shape evaluations at trial. This descriptive study adds to our knowledge about naive beliefs by: 1) measuring jury-eligible (lay) perceptions of reliability for the largest range of forensic science disciplines to date, over three waves of data collection between 2011 and 2016 (n = 674); 2) calibrating reliability ratings against false positive report estimates; and 3) comparing lay reliability estimates with those of an opportunity sample of forensic practitioners (n = 53). Overall, the data suggest that both jury-eligible participants and practitioners consider forensic evidence highly reliable. When compared to the best or most plausible estimates of reliability and error in the forensic sciences, these views appear to overestimate reliability and underestimate the frequency of false positive errors. This result highlights the importance of collecting and disseminating empirically derived estimates of false positive error rates to ensure that practitioners and potential jurors have a realistic impression of the value of forensic science evidence.

