ShallowHRD: detection of homologous recombination deficiency from shallow whole genome sequencing

2020 ◽  
Vol 36 (12) ◽  
pp. 3888-3889
Author(s):  
Alexandre Eeckhoutte ◽  
Alexandre Houy ◽  
Elodie Manié ◽  
Manon Reverdy ◽  
Ivan Bièche ◽  
...  

Abstract Summary We introduce shallowHRD, a software tool to evaluate tumor homologous recombination deficiency (HRD) based on whole genome sequencing (WGS) at low coverage (shallow WGS or sWGS; ∼1X coverage). The tool, based on mining the copy number alteration profile, implements a fast and straightforward procedure that shows 87.5% sensitivity and 90.5% specificity for HRD detection. shallowHRD could be instrumental in predicting response to poly(ADP-ribose) polymerase inhibitors, to which HRD tumors are selectively sensitive. shallowHRD displays efficiency comparable to most state-of-the-art approaches, is cost-effective, generates outputs that require little storage and is also suitable for formalin-fixed paraffin-embedded tissues. Availability and implementation shallowHRD R script and documentation are available at https://github.com/aeeckhou/shallowHRD. Supplementary information Supplementary data are available at Bioinformatics online.

PLoS ONE ◽  
2021 ◽  
Vol 16 (2) ◽  
pp. e0245488
Author(s):  
Karin Wallander ◽  
Jesper Eisfeldt ◽  
Mats Lindblad ◽  
Daniel Nilsson ◽  
Kenny Billiau ◽  
...  

Background Analysis of cell-free tumour DNA, a liquid biopsy, is a promising biomarker for cancer. We have performed a proof-of-principle study to test the applicability in the clinical setting, analysing copy number alterations (CNAs) in plasma and tumour tissue from 44 patients with gastro-oesophageal cancer. Methods DNA was isolated from blood plasma and a tissue sample from each patient. Array-CGH was applied to the tissue DNA. The cell-free plasma DNA was sequenced by low-coverage whole-genome sequencing using a clinical pipeline for non-invasive prenatal testing. WISECONDOR and ichorCNA, two bioinformatic tools, were used to process the output data and were compared to each other. Results Cancer-associated CNAs could be seen in 59% (26/44) of the tissue biopsies. In the plasma samples, a targeted approach analysing 61 regions of special interest in gastro-oesophageal cancer detected cancer-associated CNAs with a z-score >5 in 11 patients. Broadening the analysis to a whole-genome view, 17/44 patients (39%) had cancer-associated CNAs using WISECONDOR and 13 (30%) using ichorCNA. Of the 26 patients with tissue-verified cancer-associated CNAs, 14 (54%) had corresponding CNAs in plasma. Potentially clinically actionable amplifications overlapping the genes VEGFA, EGFR and FGFR2 were detected in the plasma from three patients. Conclusions We conclude that low-coverage whole-genome sequencing without prior knowledge of the tumour alterations could become a useful tool for cell-free tumour DNA analysis of total CNAs in plasma from patients with gastro-oesophageal cancer.
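The z-score cutoff mentioned in the results can be illustrated with a minimal sketch. It assumes each region is summarised as a normalised read fraction and compared against a panel of reference samples; the function names, the one-sided test and the toy numbers are hypothetical and are not taken from WISECONDOR, ichorCNA or the study's pipeline.

```python
from statistics import mean, stdev


def region_z_score(sample_value, reference_values):
    """Z-score of one region's normalised read fraction against a reference panel."""
    mu = mean(reference_values)
    sigma = stdev(reference_values)  # sample standard deviation of the panel
    if sigma == 0:
        raise ValueError("reference panel has zero variance")
    return (sample_value - mu) / sigma


def is_aberrant(sample_value, reference_values, threshold=5.0):
    """Flag a region whose z-score exceeds the threshold (one-sided, an assumption)."""
    return region_z_score(sample_value, reference_values) > threshold


# Toy reference panel (hypothetical values, not study data).
reference_panel = [1.00, 1.02, 0.98, 1.01, 0.99]
print(is_aberrant(1.10, reference_panel))  # True: z is roughly 6.3
print(is_aberrant(1.03, reference_panel))  # False: z is roughly 1.9
```

A stricter threshold lowers the false-positive rate at the cost of missing low-amplitude alterations, which matters when the tumour fraction in plasma is small.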


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e15078-e15078
Author(s):  
Kai Liu ◽  
Xueyu Hao ◽  
Mengmeng Zhang ◽  
Mingwei Li ◽  
Wang Wang ◽  
...  

e15078 Background: Homologous recombination deficiency (HRD) scores have recently been associated with the efficacy of poly(ADP-ribose) polymerase (PARP) inhibition and platinum-based chemotherapy in a variety of cancers. Evaluating HRD status in patients with cancer is becoming increasingly important, yet so far there is no standard method for clinical use. In this study, we developed an algorithm to detect HRD from next-generation sequencing (NGS) data in order to identify additional patients who may benefit from targeted therapy. Methods: Forty-eight patients with breast, ovarian or prostate cancer were enrolled. Fifteen breast cancer and endometrial carcinoma cell lines were obtained from Cobioer Biosciences Co., Ltd. DNA was extracted from the 48 formalin-fixed, paraffin-embedded (FFPE) samples and the 15 cell lines. We developed an HRD score algorithm, termed AcornHRD. HRD scores were derived from whole-genome sequencing, and GATK Mutect2 was used to detect BRCA1/2 mutations from deep sequencing. Results: Deleterious BRCA1/2 mutations were observed in 20 patients (41.7%). HRD was explained by deficiencies in 17 patients with BRCA mutations (85.0%), whereas eight HRD-high tumors (28.6%) were not BRCA related. Among BRCA wild-type patients, the percentages of HRD-positive patients in breast, ovarian and prostate cancer were 36.3%, 37.5% and 11.1%, respectively. Similar results were obtained in the cell line datasets: 100% (3/3) of BRCA1/2-deficient cell lines were also HRD-high. Furthermore, HRD scores were highly correlated with standard results in the cell line datasets. Conclusions: We report that NGS-based HRD scores distinguish well between BRCA-mutant and BRCA wild-type cases in a cohort from the Chinese population. AcornHRD scores were highly associated with BRCA1/2 deficiency. The AcornHRD algorithm can be a useful tool to detect HRD events in clinical settings.


2018 ◽  
Vol 35 (16) ◽  
pp. 2847-2849 ◽  
Author(s):  
Jos B Poell ◽  
Matias Mendeville ◽  
Daoud Sie ◽  
Arjen Brink ◽  
Ruud H Brakenhoff ◽  
...  

Abstract Summary Chromosomal copy number aberrations can be efficiently detected and quantified using low-coverage whole-genome sequencing, but analysis is hampered by the lack of knowledge on absolute DNA copy numbers and tumor purity. Here, we describe an analytical tool for Absolute Copy number Estimation, ACE, which scales relative copy number signals from chromosomal segments to optimally fit absolute copy numbers, without the need for additional genetic information, such as SNP data. In doing so, ACE derives an estimate of tumor purity as well. ACE facilitates analysis of large numbers of samples, while maintaining the flexibility to customize models and generate output of single samples. Availability and implementation ACE is freely available via www.bioconductor.org and at www.github.com/tgac-vumc/ACE. Supplementary information Supplementary data are available at Bioinformatics online.
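The idea of scaling relative copy number signals to absolute copy numbers can be sketched as follows. This is an illustrative grid search under a standard tumour/normal mixture model, not ACE's actual code: it assumes normal cells are diploid, that segment signals are normalised so the tumour's average ploidy sits at 1.0, and that copy numbers are capped at a hypothetical `max_copies`. All function names are invented for the example.

```python
def expected_ratio(copies, purity, ploidy=2.0):
    """Expected relative signal for a segment with `copies` tumour copies."""
    mixed = purity * copies + (1.0 - purity) * 2.0      # tumour plus diploid normal cells
    baseline = purity * ploidy + (1.0 - purity) * 2.0   # signal at the average ploidy
    return mixed / baseline


def fit_error(segment_ratios, purity, ploidy=2.0, max_copies=5):
    """Root-mean-square distance of segments to the nearest integer-copy level."""
    levels = [expected_ratio(n, purity, ploidy) for n in range(max_copies + 1)]
    squared = [min((r - lvl) ** 2 for lvl in levels) for r in segment_ratios]
    return (sum(squared) / len(squared)) ** 0.5


def best_purity(segment_ratios, ploidy=2.0, max_copies=5):
    """Grid search over purity (5% to 100%) for the lowest fit error."""
    grid = [i / 100 for i in range(5, 101, 5)]
    return min(grid, key=lambda p: fit_error(segment_ratios, p, ploidy, max_copies))


# Toy segments generated with purity 0.6 and copies 1, 2, 3, 2, 4 (hypothetical data).
segments = [0.7, 1.0, 1.3, 1.0, 1.6]
print(best_purity(segments))
```

Several purity values can fit the same profile equally well when high copy numbers are allowed (here, purity 0.3 would fit with copies 0, 2, 4 and 6), which is why the sketch caps the copy number and why candidate fits generally need to be inspected rather than accepted blindly.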


Author(s):  
Runyang Nicolas Lou ◽  
Arne Jacobs ◽  
Aryn Wilder ◽  
Nina Overgaard Therkildsen

Low-coverage whole genome sequencing (lcWGS) has emerged as a powerful and cost-effective approach for population genomic studies in both model and non-model species. However, with read depths too low to confidently call individual genotypes, lcWGS requires specialized analysis tools that explicitly account for genotype uncertainty. A growing number of such tools have become available, but it can be difficult to get an overview of what types of analyses can be performed reliably with lcWGS data, and how the distribution of sequencing effort between the number of samples analyzed and per-sample sequencing depths affects inference accuracy. In this introductory guide to lcWGS, we first illustrate how the per-sample cost for lcWGS is now comparable to RAD-seq and Pool-seq in many systems. We then provide an overview of software packages that explicitly account for genotype uncertainty in different types of population genomic inference. Next, we use both simulated and empirical data to assess the accuracy of allele frequency and genetic diversity estimation, detection of population structure, and selection scans under different sequencing strategies. Our results show that spreading a given amount of sequencing effort across more samples with lower depth per sample consistently improves the accuracy of most types of inference, with a few notable exceptions. Finally, we assess the potential for using imputation to bolster inference from lcWGS data in non-model species, and discuss current limitations and future perspectives for lcWGS-based population genomics research. With this overview, we hope to make lcWGS more approachable and stimulate its broader adoption.
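To make "explicitly account for genotype uncertainty" concrete, here is a minimal sketch of estimating an allele frequency directly from genotype likelihoods with an expectation-maximisation loop, without calling genotypes. It assumes a biallelic site, Hardy-Weinberg proportions, and that every individual has at least one non-zero likelihood; the function name and inputs are hypothetical and do not reproduce any specific package reviewed in the guide.

```python
def em_allele_frequency(genotype_likelihoods, start=0.2, iterations=100, tol=1e-9):
    """EM estimate of the alternate allele frequency from genotype likelihoods.

    Each entry of `genotype_likelihoods` is a tuple of likelihoods for
    carrying 0, 1 or 2 copies of the alternate allele.
    """
    freq = start
    n = len(genotype_likelihoods)
    for _ in range(iterations):
        # Keep the frequency strictly inside (0, 1) so the prior never vanishes.
        freq = min(max(freq, 1e-6), 1.0 - 1e-6)
        # Hardy-Weinberg prior for genotypes 0, 1, 2 at the current frequency.
        prior = ((1.0 - freq) ** 2, 2.0 * freq * (1.0 - freq), freq ** 2)
        expected_alt = 0.0
        for likes in genotype_likelihoods:
            weights = [l * p for l, p in zip(likes, prior)]
            total = sum(weights)
            # Posterior expected number of alternate alleles for this individual.
            expected_alt += (weights[1] + 2.0 * weights[2]) / total
        new_freq = expected_alt / (2.0 * n)
        if abs(new_freq - freq) < tol:
            return new_freq
        freq = new_freq
    return freq


# Toy input: four individuals with fully informative likelihoods (genotypes 0, 1, 2, 0).
certain = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
print(em_allele_frequency(certain))  # 0.375, i.e. 3 alternate alleles out of 8
```

With completely uninformative likelihoods the estimate simply stays at its starting value, which shows why very low depth limits what any single site can tell us and why spreading effort across more samples helps.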


2018 ◽  
Vol 35 (15) ◽  
pp. 2555-2561 ◽  
Author(s):  
Arthur Gilly ◽  
Lorraine Southam ◽  
Daniel Suveges ◽  
Karoline Kuchenbaecker ◽  
Rachel Moore ◽  
...  

Abstract Motivation Very low-depth sequencing has been proposed as a cost-effective approach to capture low-frequency and rare variation in complex trait association studies. However, a full characterization of the genotype quality and association power for very low-depth sequencing designs is still lacking. Results We perform cohort-wide whole-genome sequencing (WGS) at low depth in 1239 individuals (990 at 1× depth and 249 at 4× depth) from an isolated population, and establish a robust pipeline for calling and imputing very low-depth WGS genotypes from standard bioinformatics tools. Using genotyping chip, whole-exome sequencing (75× depth) and high-depth (22×) WGS data in the same samples, we examine in detail the sensitivity of this approach, and show that imputed 1× WGS recapitulates 95.2% of variants found by imputed GWAS with an average minor allele concordance of 97% for common and low-frequency variants. In our study, 1× further allowed the discovery of 140 844 true low-frequency variants with 73% genotype concordance when compared to high-depth WGS data. Finally, using association results for 57 quantitative traits, we show that very low-depth WGS is an efficient alternative to imputed GWAS chip designs, allowing the discovery of up to twice as many true association signals as the classical imputed GWAS design. Availability and implementation The HELIC genotype and WGS datasets have been deposited to the European Genome-phenome Archive (https://www.ebi.ac.uk/ega/home): EGAD00010000518; EGAD00010000522; EGAD00010000610; EGAD00001001636, EGAD00001001637. The peakplotter software is available at https://github.com/wtsi-team144/peakplotter, the transformPhenotype app can be downloaded at https://github.com/wtsi-team144/transformPhenotype. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Runyang Nicolas Lou ◽  
Arne Jacobs ◽  
Aryn Wilder ◽  
Nina Overgaard Therkildsen

Low-coverage whole genome sequencing (lcWGS) has emerged as a powerful and cost-effective approach for population genomic studies in both model and non-model species. However, with read depths too low to confidently call individual genotypes, lcWGS requires specialized analysis tools that explicitly account for genotype uncertainty. A growing number of such tools have become available, but it can be difficult to get an overview of what types of analyses can be performed reliably with lcWGS data and how the distribution of sequencing effort between the number of samples analyzed and per-sample sequencing depths affects inference accuracy. In this introductory guide to lcWGS, we first illustrate that the per-sample cost for lcWGS is now comparable to RAD-seq and Pool-seq in many systems. We then provide an overview of software packages that explicitly account for genotype uncertainty in different types of population genomic inference. Next, we use both simulated and empirical data to assess the accuracy of allele frequency estimation, detection of population structure, and selection scans under different sequencing strategies. Our results show that spreading a given amount of sequencing effort across more samples with lower depth per sample consistently improves the accuracy of most types of inference compared to sequencing fewer samples each at higher depth. Finally, we assess the potential for using imputation to bolster inference from lcWGS data in non-model species, and discuss current limitations and future perspectives for lcWGS-based analysis. With this overview, we hope to make lcWGS more approachable and stimulate broader adoption.

