iStopCancer: A database of 6,016 low pass whole genome sequencing of minimal invasive samples from 21 cancer types of Chinese population.

2020 ◽  
Vol 38 (15_suppl) ◽  
pp. 7062-7062
Author(s):  
Min Yuan ◽  
Qian Ziliang ◽  
Juemin Fang ◽  
Zhongzheng Zhu ◽  
Jianguo Wu ◽  
...  

7062 Background: Cancer is a group of genetic diseases that result from changes in the genome of cells in the body, leading them to grow uncontrollably. Recent research suggests that chromosome instability (CIN), defined as an increased rate of chromosome gains and losses, manifests as cell-to-cell karyotypic heterogeneity and drives cancer initiation and evolution. Methods: Over the past two years, we initiated the iStopCancer project and characterized 4515 ‘best available’ minimally invasive samples from cancer patients and 1501 plasma samples from patients with non-tumor diseases using low-pass whole genome sequencing. DNA from the ‘best available’ minimally invasive samples, including peripheral plasma, urine, pancreatic juice, bile and effusions, was analyzed by low-coverage whole genome sequencing followed by the UCAD Bioinformatics workflow to characterize CIN. In total, 32 Tbp of sequence (coverage = 1.7X per sample) were collected. All the data can be visualized on the website: http://www.istopcancer.net/pgweb/cn/istopcancer.jsp . Results: 3748 (83%) of tumors presented detectable CIN (CIN score > 1000) in minimally invasive samples. The missed cancers were mainly in patients with either a tumor size of less than 2 cm or less-aggressive cancers, including thyroid cancer, low-grade urothelial carcinoma, and lung cancer in situ, among others. Of the 1501 non-tumor individuals, 30 (2.0%) presented detectable CIN (|Z| >= 3) at the time of sample collection, of whom 24 (80.0%) were diagnosed with tumors during 3-6 months of follow-up. Nine (0.59%) non-cancer individuals without detectable CIN were also reported as tumor patients during the 6-month follow-up. In summary, the positive and negative predictive values are 80.0% and 99.4%, respectively. The false alarms were mainly from patients with EBV activation, which suggests that the virus may interfere with chromosome stability and drive virus-associated carcinogenesis. 
For patients with repeated measurements, plasma cfDNA CIN dynamics predicted clinical response and disease recurrence. Quick clearance of plasma cfDNA CIN within 2-3 weeks was found in 153 (83.6%) patients. Meanwhile, no quick clearance was found in the majority of patients with stable or progressive disease (SD/PD; 73/88 = 83.0%). Furthermore, cfDNA CIN predicted clinical response 2-8 weeks ahead of traditional biomarkers (CEA, CA15-3, CA199, AFP and others). Conclusions: Large-scale low-coverage whole genome sequencing data provide useful information for cancer detection and management.
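The predictive values and the per-sample coverage quoted in this abstract can be re-derived from the counts it reports. The following is a minimal sketch of that arithmetic, not the authors' code: the variable names are illustrative, and the 3.1 Gbp human genome size used for the coverage check is an assumption that the abstract does not state.

```python
# Counts as reported in the abstract; variable names are illustrative.
non_tumor_total = 1501          # plasma samples from non-tumor individuals
cin_positive = 30               # detectable CIN (|Z| >= 3)
cin_positive_with_tumor = 24    # CIN-positive, tumor diagnosed on follow-up
cin_negative_with_tumor = 9     # CIN-negative, tumor reported on follow-up

cin_negative = non_tumor_total - cin_positive  # 1471 individuals

# Positive predictive value: fraction of CIN-positive calls that were tumors.
ppv = cin_positive_with_tumor / cin_positive
# Negative predictive value: fraction of CIN-negative calls that stayed tumor-free.
npv = (cin_negative - cin_negative_with_tumor) / cin_negative

# Coverage check: total sequenced bases spread over all samples.
total_bases = 32e12             # "32 Tbp" from the abstract
samples = 4515 + 1501           # cancer + non-tumor samples = 6016
genome_size = 3.1e9             # ASSUMPTION: approximate human genome size in bp
mean_coverage = total_bases / samples / genome_size

print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}, coverage = {mean_coverage:.1f}X")
# prints "PPV = 80.0%, NPV = 99.4%, coverage = 1.7X"
```

Under these assumptions the recomputed values agree with the 80.0%, 99.4% and 1.7X figures given in the abstract.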

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Johannes Smolander ◽  
Sofia Khan ◽  
Kalaimathy Singaravelu ◽  
Leni Kauko ◽  
Riikka J. Lund ◽  
...  

Abstract Background Detection of copy number variations (CNVs) from high-throughput next-generation whole-genome sequencing (WGS) data has become a widely used research method in recent years. However, little is known about the applicability of the developed algorithms to ultra-low-coverage (0.0005–0.8×) data, which are used in various research and clinical applications such as digital karyotyping and single-cell CNV detection. Results Here, the performance of six popular read-depth based CNV detection algorithms (BIC-seq2, Canvas, CNVnator, FREEC, HMMcopy, and QDNAseq) was studied using ultra-low-coverage WGS data. Real-world array- and karyotyping kit-based validation was used as a benchmark in the evaluation. Additionally, ultra-low-coverage WGS data were simulated to investigate the ability of the algorithms to identify CNVs in the sex chromosomes and the theoretical minimum coverage at which these tools can function accurately. Our results suggest that while all the methods were able to detect large CNVs, many were prone to producing false positives when detecting smaller CNVs (< 2 Mbp). There was also significant variability in their ability to identify CNVs in the sex chromosomes. Overall, BIC-seq2 was found to be the best method in terms of statistical performance. However, its significant drawback was that it had by far the slowest runtime among the methods (> 3 h), compared with FREEC (~ 3 min), which we considered the second-best method. Conclusions Our comparative analysis demonstrates that CNV detection from ultra-low-coverage WGS data can be a highly accurate method for detecting large copy number variations whose length is in the millions of base pairs. These findings facilitate applications that utilize ultra-low-coverage CNV detection.
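One way to see why small CNVs are harder to call than large ones at ultra-low coverage is to count how many reads a read-depth method has to work with in a single bin. The sketch below is a back-of-the-envelope illustration only, not the model used by any of the six tools named above. It assumes 150 bp reads, a diploid baseline and pure Poisson counting noise; real data also carry GC bias, mappability effects and overdispersion, so actual performance will be worse than this idealized estimate. The function names are hypothetical.

```python
import math

def reads_per_bin(coverage, bin_size_bp, read_length_bp=150):
    """Expected number of reads falling in one bin at a given mean coverage."""
    return coverage * bin_size_bp / read_length_bp

def gain_z_score(coverage, bin_size_bp, read_length_bp=150, ploidy=2):
    """Idealized signal-to-noise ratio of a one-copy gain spanning one bin.

    ASSUMPTION: read counts are Poisson, so the standard deviation of the
    count equals the square root of its expected value.
    """
    expected = reads_per_bin(coverage, bin_size_bp, read_length_bp)
    shift = expected / ploidy            # extra reads from one additional copy
    return shift / math.sqrt(expected)   # shift measured in standard deviations

# Coverage values span the 0.0005-0.8x range discussed in the abstract.
for coverage in (0.0005, 0.01, 0.8):
    for size_bp in (100_000, 2_000_000, 10_000_000):
        n = reads_per_bin(coverage, size_bp)
        z = gain_z_score(coverage, size_bp)
        print(f"{coverage:>6}x  {size_bp / 1e6:>5.1f} Mbp  reads={n:>10.1f}  z={z:>6.2f}")
```

In this simplified model the signal-to-noise ratio grows with the square root of the read count, so shrinking either the coverage or the CNV size quickly pushes a true one-copy change into the range of random fluctuation.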


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e22510-e22510
Author(s):  
Shunli Yang ◽  
Pei Zhihua ◽  
Jianing Yu ◽  
Xiuyu Zhao ◽  
Yiqian Liu ◽  
...  

e22510 Background: Recent advances in the analysis of plasma circulating cell-free DNA (cfDNA) have shown that tumor diagnosis based on tumor-specific genetic and epigenetic changes (e.g., somatic mutations, copy number variations, and DNA methylation) is a promising non-invasive method. However, the number of tumor-specific genomic variants identified by whole-genome sequencing (WGS) in early-stage cancer patients is very limited. Moreover, mutations generated by clonal hematopoiesis in cfDNA can further confound the detection of cancer-specific mutations. It has been shown that ctDNA and cfDNA fragments differ in length distribution. Compared with the limited number of genomic mutations, cfDNA fragment size index (FSI) signals are more abundant and easier to detect. Methods: We designed a novel method for fragment detection of plasma cfDNA based on low-coverage WGS. The fragment length differences between healthy individuals and tumor patients were systematically analyzed. The training dataset included 50 healthy individuals and 354 patients with eight different cancers. After data preprocessing, we calculated the weight of fragmental bins and built a model for distinguishing healthy individuals from cancer patients. An independent dataset of 22 healthy controls and 340 cancer patients was used to validate the model. The performance of our method was measured by the area under the curve (AUC) using the one-versus-all approach. Results: In our analysis, a total of 504 markers were selected from the dataset for model construction. Our model performed well for all cancer types on both the training (AUC = 0.804) and validation (AUC = 0.837) datasets. Conclusions: The good performance of our model in large-scale plasma samples demonstrates the potential clinical application of cfDNA fragment analysis in early cancer detection based on low-coverage WGS.
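The AUC used as the performance measure here has a simple interpretation: it is the probability that a randomly chosen positive sample receives a higher model score than a randomly chosen negative one. The following is a minimal sketch of that definition, not the authors' pipeline. The scores are hypothetical values invented purely for illustration, and in a one-versus-all setting the same calculation would be repeated with each class in turn treated as the positive group.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve computed from its pairwise-ranking definition.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample scores higher; tied pairs count as half a win.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# HYPOTHETICAL model scores, not data from the study.
cancer_scores = [0.9, 0.8, 0.6, 0.4]
healthy_scores = [0.7, 0.3, 0.2]

print(f"AUC = {auc(cancer_scores, healthy_scores):.3f}")
# prints "AUC = 0.833" (10 of the 12 pairs are ranked correctly)
```

This pairwise form is quadratic in the number of samples, which is fine for cohorts of a few hundred; a rank-based formulation gives the same value more efficiently on larger datasets.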


Breast Cancer ◽  
2021 ◽  
Author(s):  
Liang Zhu ◽  
Jia-Ni Pan ◽  
Ziliang Qian ◽  
Wei-Wu Ye ◽  
Xiao-Jia Wang ◽  
...  

Abstract Background Although BRCA1 mutation is the foremost susceptibility factor for breast cancer, its prognostic value is disputed. In this study, we used a novel method based on whole-genome analysis to evaluate chromosome instability (CIN) and to identify the potential relationship between CIN and the prognosis of breast cancer patients with germline BRCA1 mutations. Materials and methods Sanger sequencing or a 98-gene panel sequencing assay was used to screen for BRCA1 germline small mutations in 1151 breast cancer patients with high-risk factors. An MLPA assay was employed to screen for BRCA1 large genomic rearrangements in familial breast cancer patients who were negative for BRCA1 small mutations. Thirty-two samples with unique BRCA1 germline mutation patterns were further subjected to CIN evaluation by low-pass whole-genome sequencing (LPWGS). Results First, 113 patients with germline BRCA1 mutations were identified in the cohort. Further CIN analysis by the LPWGS assay indicated that CIN was independent of the location or type of the BRCA1 mutation. Patients with high CIN status had shorter disease-free survival (DFS) (HR = 6.54, 95% CI 1.30–32.98, P = 0.034). TP53 copy loss was also characterized by the LPWGS assay. The rates of TP53 copy loss in the CIN-high and CIN-low groups were 85.71% (12/14) and 16.67% (3/18), respectively. Conclusion CIN-high status is a prognostic factor correlated with shorter DFS and is independent of the germline BRCA1 mutation pattern. Higher CIN values were significantly correlated with TP53 copy loss in breast cancer patients with germline BRCA1 mutations. Our results reveal a reliable molecular parameter for identifying patients with poor prognosis among those with BRCA1-mutated breast cancer.
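The TP53 copy-loss rates in the two CIN groups can be recomputed from the reported counts, and the association between CIN status and TP53 loss can be checked with a test for a 2x2 table. The sketch below uses a two-sided Fisher exact test as one reasonable choice for counts this small. The abstract does not say which test the authors applied, so this is an assumption, and the function name is illustrative.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1 = a + b
    col1 = a + c
    total = a + b + c + d
    denom = comb(total, row1)

    def prob(k):
        # Probability that k of the row-1 samples fall in column 1.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    observed = prob(a)
    lowest = max(0, row1 - (total - col1))
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= observed * (1 + 1e-9))

# Counts from the abstract: TP53 copy loss vs. no loss in each CIN group.
cin_high_loss, cin_high_total = 12, 14
cin_low_loss, cin_low_total = 3, 18

rate_high = cin_high_loss / cin_high_total
rate_low = cin_low_loss / cin_low_total
p_value = fisher_exact_two_sided(cin_high_loss, cin_high_total - cin_high_loss,
                                 cin_low_loss, cin_low_total - cin_low_loss)

print(f"CIN-high: {rate_high:.2%}, CIN-low: {rate_low:.2%}, p = {p_value:.2g}")
```

The recomputed rates match the 85.71% and 16.67% given in the abstract, and under this test the difference between the groups is well below the conventional 0.05 significance level.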

