Comparative analysis of novel MGISEQ-2000 sequencing platform vs Illumina HiSeq 2500 for whole-genome sequencing

2019 ◽  
Author(s):  
Dmitriy Korostin ◽  
Nikolay Kulemin ◽  
Vladimir Naumov ◽  
Vera Belova ◽  
Dmitriy Kwon ◽  
...  

Abstract Background MGISEQ-2000, developed by MGI Tech Co. Ltd. (a subsidiary of the BGI Group), is a new competitor to next-generation sequencing platforms such as NovaSeq and HiSeq (Illumina). Its sequencing principle is based on the DNB and cPAS technologies, which were also used in its predecessor, the BGISEQ-500 device. However, the reagents for MGISEQ-2000 have been refined and the platform uses updated software. The cPAS method is an advanced technology based on cPAL, previously created by Complete Genomics. Results In this paper, we compare the results of whole-genome sequencing of a DNA sample from a Russian female donor performed on MGISEQ-2000 and Illumina HiSeq 2500 (both PE150). The two platforms were compared in terms of sequencing quality, number of errors and performance. Additionally, we performed variant calling using four different software packages: Samtools mpileup, Strelka2, Sentieon, and GATK. The accuracy of single nucleotide polymorphism (SNP) detection was similar in the data generated by MGISEQ-2000 and HiSeq 2500, which was used as a reference. At the same time, a separate indel analysis of the overall error rate revealed similar FPR values and lower sensitivity. Conclusions It may be concluded with confidence that the data generated by the analyzed sequencing systems are characterized by comparable magnitudes of error and that MGISEQ-2000 can be used for a wide range of research tasks on par with HiSeq 2500.
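The sensitivity and FPR figures mentioned in this abstract come from comparing one call set against another taken as the reference. A minimal Python sketch of that comparison follows. It is not the authors' pipeline: the function name `call_metrics`, the variant keys and the `n_sites` parameter (the number of evaluated sites, needed to count true negatives) are all assumptions made for illustration.

```python
def call_metrics(reference, calls, n_sites):
    """Compare a call set against a reference call set.

    reference, calls: sets of hashable variant keys, e.g. (chrom, pos, ref, alt).
    n_sites: total number of evaluated sites (hypothetical input), used to
             derive true negatives for the false positive rate.
    """
    tp = len(reference & calls)   # called and present in the reference
    fp = len(calls - reference)   # called but absent from the reference
    fn = len(reference - calls)   # in the reference but not called
    tn = n_sites - tp - fp - fn   # everything else
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn,
            "sensitivity": sensitivity, "FPR": fpr}


# Toy data, invented for the example only.
reference = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
             ("chr2", 50, "G", "A"), ("chr2", 75, "T", "C")}
calls = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
         ("chr2", 50, "G", "A"), ("chr3", 10, "A", "T")}
metrics = call_metrics(reference, calls, n_sites=1000)
print(metrics)
```

Treating one platform as the reference means that any error in the reference data is counted against the other platform, so figures obtained this way are relative rather than absolute.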

2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Kelley Paskov ◽  
Jae-Yoon Jung ◽  
Brianna Chrisman ◽  
Nate T. Stockham ◽  
Peter Washington ◽  
...  

Abstract Background As next-generation sequencing technologies make their way into the clinic, knowledge of their error rates is essential if they are to be used to guide patient care. However, sequencing platforms and variant-calling pipelines are continuously evolving, making it difficult to accurately quantify error rates for the particular combination of assay and software parameters used on each sample. Family data provide a unique opportunity for estimating sequencing error rates since it allows us to observe a fraction of sequencing errors as Mendelian errors in the family, which we can then use to produce genome-wide error estimates for each sample. Results We introduce a method that uses Mendelian errors in sequencing data to make highly granular per-sample estimates of precision and recall for any set of variant calls, regardless of sequencing platform or calling methodology. We validate the accuracy of our estimates using monozygotic twins, and we use a set of monozygotic quadruplets to show that our predictions closely match the consensus method. We demonstrate our method’s versatility by estimating sequencing error rates for whole genome sequencing, whole exome sequencing, and microarray datasets, and we highlight its sensitivity by quantifying performance increases between different versions of the GATK variant-calling pipeline. We then use our method to demonstrate that: 1) Sequencing error rates between samples in the same dataset can vary by over an order of magnitude. 2) Variant calling performance decreases substantially in low-complexity regions of the genome. 3) Variant calling performance in whole exome sequencing data decreases with distance from the nearest target region. 4) Variant calls from lymphoblastoid cell lines can be as accurate as those from whole blood. 5) Whole-genome sequencing can attain microarray-level precision and recall at disease-associated SNV sites. 
Conclusion Genotype datasets from families are powerful resources that can be used to make fine-grained estimates of sequencing error for any sequencing platform and variant-calling methodology.
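The observable part of this idea, spotting genotype calls in a family that cannot arise under Mendelian inheritance, can be sketched in a few lines of Python. This is only the detection step for a single biallelic autosomal site in a mother-father-child trio. It is not the authors' estimation model, and the function names and the 0/1/2 alternate-allele-count encoding are assumptions made for illustration.

```python
def transmissible(genotype):
    """Alleles (0 = ref, 1 = alt) a parent with this genotype can pass on.

    genotype: alternate-allele count at a biallelic autosomal site (0, 1 or 2).
    """
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def is_mendelian_error(mother, father, child):
    """True if the child's genotype cannot be formed from the parents' alleles."""
    possible = {m + f
                for m in transmissible(mother)
                for f in transmissible(father)}
    return child not in possible


def mendelian_error_rate(trios):
    """Fraction of (mother, father, child) genotype triples that are inconsistent."""
    errors = sum(is_mendelian_error(m, f, c) for m, f, c in trios)
    return errors / len(trios)


# Toy trio genotypes, invented for the example only.
trios = [(0, 0, 0), (0, 0, 1), (1, 1, 2), (2, 0, 1)]
rate = mendelian_error_rate(trios)
print(rate)
```

Only some sequencing errors show up this way: a miscalled genotype that still fits the parents' genotypes goes unnoticed, which is why the abstract speaks of observing a fraction of errors and then extrapolating to genome-wide estimates.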


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Kanika Arora ◽  
Minita Shah ◽  
Molly Johnson ◽  
Rashesh Sanghvi ◽  
Jennifer Shelton ◽  
...  

Abstract To test the performance of a new sequencing platform, develop an updated somatic calling pipeline and establish a reference for future benchmarking experiments, we performed whole-genome sequencing of 3 common cancer cell lines (COLO-829, HCC-1143 and HCC-1187) along with their matched normal cell lines to great sequencing depths (up to 278x coverage) on both Illumina HiSeqX and NovaSeq sequencing instruments. Somatic calling was generally consistent between the two platforms despite minor differences at the read level. We designed and implemented a novel pipeline for the analysis of tumor-normal samples, using multiple variant callers. We show that, coupled with a high-confidence filtering strategy, the use of a combination of tools improves the accuracy of somatic variant calling. We also demonstrate the utility of the dataset by creating an artificial purity ladder to evaluate the somatic pipeline and benchmark methods for estimating purity and ploidy from tumor-normal pairs. The data and results of the pipeline are made accessible to the cancer genomics community.


PLoS ONE ◽  
2020 ◽  
Vol 15 (3) ◽  
pp. e0230301 ◽  
Author(s):  
Dmitriy Korostin ◽  
Nikolay Kulemin ◽  
Vladimir Naumov ◽  
Vera Belova ◽  
Dmitriy Kwon ◽  
...  

2021 ◽  
Vol 9 (8) ◽  
pp. 1585
Author(s):  
Ana C. Reis ◽  
Liliana C. M. Salvador ◽  
Suelee Robbe-Austerman ◽  
Rogério Tenreiro ◽  
Ana Botelho ◽  
...  

Classical molecular analyses of Mycobacterium bovis based on spoligotyping and Variable Number Tandem Repeat (MIRU-VNTR) brought the first insights into the epidemiology of animal tuberculosis (TB) in Portugal, showing high genotypic diversity of circulating strains that mostly cluster within the European 2 clonal complex. Previous surveillance provided valuable information on the prevalence and spatial occurrence of TB and highlighted prevalent genotypes in areas where livestock and wild ungulates are sympatric. However, links at the wildlife–livestock interfaces were established mainly via classical genotype associations. Here, we apply whole genome sequencing (WGS) to cattle, red deer and wild boar isolates to reconstruct the M. bovis population structure in a multi-host, multi-region disease system and to explore links at a fine genomic scale between M. bovis from wildlife hosts and cattle. Whole genome sequences of 44 representative M. bovis isolates, obtained between 2003 and 2015 from three TB hotspots, were compared through single nucleotide polymorphism (SNP) variant calling analyses. Consistent with previous results combining classical genotyping with Bayesian population admixture modelling, SNP-based phylogenies support the branching of this M. bovis population into five genetic clades, three with apparent geographic specificities, as well as the establishment of an SNP catalogue specific to each clade, which may be explored in the future as phylogenetic markers. The core genome alignment of SNPs was integrated within a spatiotemporal metadata framework to further structure this M. bovis population by host species and TB hotspots, providing a baseline for network analyses in different epidemiological and disease control contexts. WGS of M. bovis isolates from Portugal is reported for the first time in this pilot study, refining the spatiotemporal context of TB at the wildlife–livestock interface and providing further support for the key role of red deer and wild boar in disease maintenance. The SNP diversity observed within this dataset supports the natural circulation of M. bovis over a long time period, as well as multiple introduction events of the pathogen in this Iberian multi-host system.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Agata Stodolna ◽  
Miao He ◽  
Mahesh Vasipalli ◽  
Zoya Kingsbury ◽  
Jennifer Becq ◽  
...  

Abstract Background Clinical-grade whole-genome sequencing (cWGS) has the potential to become the standard of care within the clinic because of its breadth of coverage and lack of bias towards certain regions of the genome. Colorectal cancer presents a difficult treatment paradigm, with over 40% of patients presenting at diagnosis with metastatic disease. We hypothesised that cWGS coupled with 3′ transcriptome analysis would give new insights into colorectal cancer. Methods Patients underwent PCR-free whole-genome sequencing and alignment and variant calling using a standardised pipeline to output SNVs, indels, SVs and CNAs. Additional insights into the mutational signatures and tumour biology were gained by the use of 3′ RNA-seq. Results Fifty-four patients were studied in total. Driver analysis identified the Wnt pathway gene APC as the only consistently mutated driver in colorectal cancer. Alterations in the PI3K/mTOR pathways were seen as previously observed in CRC. Multiple private CNAs, SVs and gene fusions were unique to individual tumours. Approximately 30% of patients had a tumour mutational burden of > 10 mutations/Mb of DNA, suggesting suitability for immunotherapy. Conclusions Clinical whole-genome sequencing offers a potential avenue for the identification of private genomic variation that may confer sensitivity to targeted agents and offer patients new options for targeted therapies.
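Tumour mutational burden as quoted here is a simple ratio: somatic mutations per megabase of sequence assessed. The Python sketch below shows the arithmetic and the comparison against the 10 mutations/Mb cut-off mentioned in the abstract. The function names, the mutation count and the callable genome size are hypothetical values chosen for illustration, and which mutation classes are counted is left open because the abstract does not say.

```python
def tumour_mutational_burden(n_somatic_mutations, callable_bases):
    """Somatic mutations per megabase of callable sequence."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    return n_somatic_mutations / (callable_bases / 1_000_000)


def exceeds_threshold(tmb, threshold=10.0):
    """True if the burden is above the cut-off (10 mutations/Mb in the abstract)."""
    return tmb > threshold


# Hypothetical tumour: 31,000 somatic mutations over 2.8 Gb of callable genome.
tmb = tumour_mutational_burden(31_000, 2_800_000_000)
print(round(tmb, 2), exceeds_threshold(tmb))
```

With whole-genome data the denominator is the callable portion of the genome rather than a panel or exome footprint, so values are only comparable across studies when the denominators are defined the same way.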


2019 ◽  
Author(s):  
Junhua Rao ◽  
Lihua Peng ◽  
Fang Chen ◽  
Hui Jiang ◽  
Chunyu Geng ◽  
...  

Abstract Background Next-generation sequencing (NGS) has developed rapidly in recent years, making whole-genome sequencing (WGS) a more cost- and time-efficient choice for a wide range of biological research. WGS data are commonly used to detect variants such as single nucleotide polymorphisms (SNPs), insertions and deletions (indels) and copy number variants (CNVs), which play an important role in many human diseases. However, the feasibility of CNV detection based on WGS by DNBSEQ™ platforms was unclear. We systematically analysed the genome-wide CNV detection power of DNBSEQ™ platforms and Illumina platforms on NA12878 with five commonly used tools. Results DNBSEQ™ platforms showed a stable ability to detect slightly more CNVs genome-wide (on average 1.24-fold the number detected by Illumina platforms). CNVs detected on DNBSEQ™ and Illumina platforms were then evaluated against two public benchmarks of NA12878. DNBSEQ™ and Illumina platforms showed similar sensitivities and precisions on both benchmarks. Further, the differences between CNV detection tools were analysed, indicating that the choice of tool could affect CNV detection performance, including count, distribution, sensitivity and precision. Conclusion The major contribution of this paper is to provide, for the first time, a comprehensive guide for CNV detection based on WGS by DNBSEQ™ platforms.


2018 ◽  
Vol 5 (suppl_1) ◽  
pp. S286-S287
Author(s):  
Evangelina Namburete

Abstract Background Knowing the genetic diversity of M. tuberculosis strains causing drug-resistant tuberculosis (DR-TB) in high-burden, low-resource countries such as Mozambique is key to controlling the spread of TB and the global TB epidemic. Whole-genome sequencing (WGS) better describes molecular diversity, lineages and sub-lineages, relationships between strains, and the underlying mutations conferring drug resistance, which may not be shown by molecular and phenotypic tests. As far as we know, this is the first study to use WGS to describe the genetic diversity of M. tuberculosis strains causing DR-TB in the central region of Mozambique. We aim to describe the genetic diversity of M. tuberculosis strains causing DR-TB in central Mozambique. Methods A total of 35 strains from Beira, Mozambique were evaluated with genotypic tests (Genotype MTBDRplus™ and MTBDRsl™) and phenotypic (MGIT-SIRE™) DST. All isolates resistant to isoniazid (H) or rifampicin (R) or both were submitted to WGS on Illumina HiSeq 2000 and analyzed with the TB profiler database, and a phylogenetic tree was built using the Figtree tool. This was a descriptive cross-sectional study. Results WGS showed that the strains analyzed belong to three of the six major lineages: Lineage 4, 25 (71.4%); Lineage 1, 5 (14.3%); and Lineage 2 (Beijing family), 5 (14.3%). All pre-XDR strains, 3 (8.6%), were from lineage 4.3. By WGS, all 35 strains had mutations conferring DR-TB, while in one strain the mutation was not shown by either genotypic or phenotypic DST. Compared with genotypic tests, WGS had the best performance in showing mutations conferring resistance to ethambutol, 12/35 (34.3%) and 7/35 (20%). Conclusion DR-TB in Beira, Mozambique is mainly caused by M. tuberculosis strains of Lineage 4, sub-lineage 3, although lineages 1 and 2 are also present. WGS shows underlying mutations causing DR-TB that are not detected by genotypic and phenotypic DST. Disclosures All authors: No reported disclosures.
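The lineage percentages reported in this abstract can be checked directly from the counts, since all of them share the denominator of 35 strains. A short Python sketch of that arithmetic, using only the counts stated in the abstract (the variable names are illustrative):

```python
# Strain counts per lineage, as stated in the abstract.
lineage_counts = {"Lineage 4": 25, "Lineage 1": 5, "Lineage 2 (Beijing)": 5}
total_strains = sum(lineage_counts.values())

# Percentage of all strains in each lineage, rounded to one decimal place.
lineage_percent = {name: round(100 * n / total_strains, 1)
                   for name, n in lineage_counts.items()}

# Pre-XDR strains: 3 of the 35 strains.
pre_xdr_percent = round(100 * 3 / total_strains, 1)

print(total_strains, lineage_percent, pre_xdr_percent)
```

The three lineage counts sum to 35, so the reported lineages account for every strain in the study.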


2020 ◽  
Vol 58 (11) ◽  
Author(s):  
Thomas A. Kohl ◽  
Katharina Kranzer ◽  
Sönke Andres ◽  
Thierry Wirth ◽  
Stefan Niemann ◽  
...  

ABSTRACT Mycobacterium bovis is the primary cause of bovine tuberculosis (bTB) and infects a wide range of domestic animal and wildlife species and humans. In Germany, bTB still emerges sporadically in cattle herds, free-ranging wildlife, diverse captive animal species, and humans. In order to understand the underlying population structure and estimate the population size fluctuation through time, we analyzed 131 M. bovis strains from animals (n = 38) and humans (n = 93) in Germany from 1999 to 2017 by whole-genome sequencing (WGS), mycobacterial interspersed repetitive-unit–variable-number tandem-repeat (MIRU-VNTR) typing, and spoligotyping. Based on WGS data analysis, 122 out of the 131 M. bovis strains were classified into 13 major clades, of which 6 contained strains from both human and animal cases and 7 only strains from human cases. Bayesian analyses suggest that the M. bovis population went through two sharp anticlimaxes, one in the middle of the 18th century and another one in the 1950s. WGS-based cluster analysis grouped 46 strains into 13 clusters ranging in size from 2 to 11 members and involving strains from distinct host types, e.g., only cattle and also mixed hosts. Animal strains of four clusters were obtained over a 9-year span, pointing toward autochthonous persistent bTB infection cycles. As expected, WGS had a higher discriminatory power than spoligotyping and MIRU-VNTR typing. In conclusion, our data confirm that WGS and suitable bioinformatics constitute the method of choice to implement prospective molecular epidemiological surveillance of M. bovis. The population of M. bovis in Germany is diverse, with subtle, but existing, interactions between different host groups.


BMC Genomics ◽  
2014 ◽  
Vol 15 (1) ◽  
pp. 85 ◽  
Author(s):  
Chris Bizon ◽  
Michael Spiegel ◽  
Scott A Chasse ◽  
Ian R Gizer ◽  
Yun Li ◽  
...  

2016 ◽  
Vol 55 (3) ◽  
pp. 811-823 ◽  
Author(s):  
Tengguo Li ◽  
Elizabeth R. Unger ◽  
Dhwani Batra ◽  
Mili Sheth ◽  
Martin Steinau ◽  
...  

ABSTRACT We designed a universal human papillomavirus (HPV) typing assay based on target enrichment and whole-genome sequencing (eWGS). The RNA bait included 23,941 probes targeting 191 HPV types and 12 probes targeting beta-globin as a control. We used the Agilent SureSelect XT2 protocol for library preparation, Illumina HiSeq 2500 for sequencing, and CLC Genomics Workbench for sequence analysis. Mapping stringency for type assignment was determined based on 8 (6 HPV-positive and 2 HPV-negative) control samples. Using the optimal mapping conditions, types were assigned to 24 blinded samples. eWGS results were 100% concordant with Linear Array (LA) genotyping results for 9 plasmid samples and fully or partially concordant for 9 of the 15 cervical-vaginal samples, with 95.83% overall type-specific concordance for LA genotyping. eWGS identified 7 HPV types not included in the LA genotyping. Since this method does not involve degenerate primers targeting HPV genomic regions, PCR bias in genotype detection is minimized. With further refinements aimed at reducing cost and increasing throughput, this first application of eWGS for universal HPV typing could be a useful method to elucidate HPV epidemiology.

