genome coverage
Recently Published Documents


2021 ◽  
Author(s):  
Amy K. Kim ◽  
Selena Y. Lin ◽  
Surbhi Jain ◽  
Yixiao Cui ◽  
Terence Gade ◽  
...  

Abstract: Cell-free DNA (cfDNA) from blood has become a promising analyte for cancer genetic liquid biopsy. Urinary cfDNA has been shown to contain mutations associated with non-genitourologic cancers including hepatocellular carcinoma (HCC). In this study, we evaluate urine as a noninvasive alternative to blood-based liquid biopsy in both germline and circulating tumor DNA (ctDNA) genotyping in HCC. Using quantitative PCR (qPCR), whole-genome sequencing (WGS), and targeted NGS, DNA isolated from blood or urine of patients with HCC was analyzed for overall genome coverage, HCC hotspot coverage, and germline or somatic mutation concordance. Targeted NGS of plasma and urine cfDNA was also performed for detection of somatic variants. We found that urine cfDNA, like plasma cfDNA, showed a major mononucleosomal species of 150-180 bp in both healthy individuals and patients with HCC. By WGS, overall genome coverage breadth was similar between urine and plasma cfDNA, with a higher fraction of covered cancer-associated mutation hotspots in urine cfDNA. qPCR analyses of HCC-associated mutations (TP53, CTNNB1, and TERT) in 101 patients with HCC revealed 78% overall concordance between plasma and urine. Targeted NGS of HCC-associated gene regions in an additional 15 HCC patients showed a 97% overall position-level concordance between plasma and urine cfDNA. Collectively, urine DNA can potentially be used as a completely noninvasive liquid biopsy for HCC. Significance Statement: Hepatocellular carcinoma (HCC) is the most common liver cancer worldwide and the fastest growing gastrointestinal cancer in the U.S. Cell-free DNA (cfDNA), which originates from various cells undergoing apoptosis or necrosis, including tumor cells, is present in all body fluids, including urine. Urinary cfDNA isolated from patients with HCC showed a similar fragment size distribution, overall genome coverage, and comparable sensitivity for detecting HCC-associated variants compared to plasma cfDNA. 
Urine was also determined to be a reliable source of germline genotype information, similar to peripheral blood mononuclear cells in blood-based liquid biopsies. Urine cfDNA can be used as a completely non-invasive liquid biopsy in HCC.


2021 ◽  
Vol 1 ◽  
Author(s):  
Gregory S. Orf ◽  
Kenn Forberg ◽  
Todd V. Meyer ◽  
Illya Mowerman ◽  
Aurash Mohaimani ◽  
...  

Background: Surveillance of SARS-CoV-2 across the globe has enabled detection of new variants and informed the public health response. With highly sensitive methods like qPCR widely adopted for diagnosis, the ability to sequence and characterize specimens with low titers needs to keep pace. Methods: Nucleic acids extracted from nasopharyngeal swabs collected from four sites in the United States in early 2020 were converted to NGS libraries to sequence SARS-CoV-2 genomes using metagenomic and xGen target enrichment approaches. Single nucleotide polymorphism (SNP) analysis and phylogeny were used to determine clade assignments and geographic origins of strains. Results: SARS-CoV-2-specific xGen enrichment enabled full genome coverage for 87 specimens with Ct values <29, corresponding to viral loads of >10,000 cp/ml. For samples with viral loads between 10³ and 10⁶ cp/ml, the median genome coverage for xGen was 99.1%, sequence depth was 605X, and the “on-target” rate was 57 ± 21%, compared to 13%, 2X and 0.001 ± 0.016%, respectively, for metagenomic sequencing alone. Phylogenetic analysis revealed the presence of most clades that existed at the time of the study, though clade GH dominated in the Midwest. Conclusions: Even as vaccines are being widely distributed, a high case load of SARS-CoV-2 infection persists around the world. Viral genetic surveillance has succeeded in warning the public of new variants in circulation and ensured that diagnostic tools remain resilient to a steadily increasing number of mutations. Target capture offers a means of characterizing low viral load samples which would normally pose a challenge for metagenomic sequencing.
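
The three metrics reported above (genome coverage, sequence depth, and on-target rate) can be illustrated with a minimal Python sketch. It assumes per-base read depths are already available as a list of integers; the function names and the toy numbers are hypothetical and are not taken from the study.

```python
from statistics import mean


def coverage_summary(depths, min_depth=1):
    """Summarize per-base read depths for one reference genome.

    depths: one integer per reference position (hypothetical input format).
    min_depth: a position counts as covered when its depth is >= min_depth.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return {
        "breadth_pct": 100.0 * covered / len(depths),  # genome coverage (%)
        "mean_depth": mean(depths),                    # average depth (X)
    }


def on_target_rate(target_reads, total_reads):
    """Percentage of all reads that map to the target genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * target_reads / total_reads


# Toy 10-position genome; three positions have no reads.
toy_depths = [0, 0, 5, 10, 20, 20, 10, 5, 0, 30]
summary = coverage_summary(toy_depths)
```

Raising `min_depth` turns the same function into "fraction of the genome covered at N-fold depth or more", which is how breadth is often reported alongside mean depth.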


2021 ◽  
Vol 12 ◽  
Author(s):  
Lisa M. Hodges ◽  
Eduardo N. Taboada ◽  
Adam Koziol ◽  
Steven Mutschall ◽  
Burton W. Blais ◽  
...  

The increasing prevalence of antimicrobial resistance (AMR) in Campylobacter spp. is a global concern. This study evaluated the use of whole-genome sequencing (WGS) to predict AMR in Campylobacter jejuni and C. coli. A panel of 271 isolates recovered from Canadian poultry was used to compare AMR genotype to antimicrobial susceptibility testing (AST) results (azithromycin, ciprofloxacin, erythromycin, gentamicin, tetracycline, florfenicol, nalidixic acid, telithromycin, and clindamycin). The presence of antibiotic resistance genes (ARGs) was determined for each isolate using five computational approaches to evaluate the effect of: ARG screening software, input data (i.e., raw reads, draft genome assemblies), genome coverage and genome assembly software. Overall, concordance between the genotype and phenotype was influenced by the computational pipelines, level of genome coverage and the type of ARG but not by input data. For example, three of the pipelines showed a 99% agreement between detection of a tet(O) gene and tetracycline resistance, whereas agreement between the detection of tet(O) and TET resistance was 98% and 93% for the other two pipelines. Overall, higher levels of genome coverage were needed to reliably detect some ARGs; for example, at 15X coverage a tet(O) gene was detected in >70% of the genomes, compared to <60% of the genomes for bla(OXA). No genes associated with florfenicol or gentamicin resistance were found in the set of strains included in this study, consistent with AST results. Macrolide and fluoroquinolone resistance was associated 100% with mutations in the 23S rRNA (A2075G) and gyrA (T86I) genes, respectively. A lower association between an A2075G 23S rRNA gene mutation and resistance to clindamycin and telithromycin (92.8 and 78.6%, respectively) was found. 
While WGS is an effective approach to predicting AMR in Campylobacter, this study demonstrated the impact that computational pipelines, genome coverage and the genes can have on the reliable identification of an AMR genotype.


2021 ◽  
Vol 12 ◽  
Author(s):  
Xiao Xiong ◽  
Yogeshwar D Kelkar ◽  
Chris J Geden ◽  
Chao Zhang ◽  
Yidong Wang ◽  
...  

The parasitoid wasp Muscidifurax raptorellus (Hymenoptera: Pteromalidae) is a gregarious species that has received extensive attention for its potential in biological pest control against house fly, stable fly, and other filth flies. It has a high reproductive capacity and can be reared easily. However, no genome assembly is available for M. raptorellus or any other species in this genus. Previously, we assembled a complete circular mitochondrial genome with a length of 24,717 bp. Here, we assembled and annotated a high-quality nuclear genome of M. raptorellus, using a combination of long-read (104× genome coverage) and short-read (326× genome coverage) sequencing technologies. The assembled genome size is 314 Mb in 226 contigs, with a 97.9% BUSCO completeness score and a contig N50 of 4.67 Mb, suggesting excellent continuity of this assembly. Our assembly builds the foundation for comparative and evolutionary genomic analysis in the genus Muscidifurax and possible future biocontrol applications.
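
For readers unfamiliar with the contig N50 statistic quoted above, the following minimal Python sketch shows the standard definition: the length of the shortest contig in the smallest set of longest contigs that together span at least half of the assembly. The function name and the toy lengths are illustrative only.

```python
def contig_n50(lengths):
    """Return the N50 of a list of contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half of the assembly size.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy assembly of five contigs (arbitrary units).
toy_n50 = contig_n50([2, 3, 4, 5, 6])
```

In the toy case the assembly totals 20 units; the two longest contigs (6 and 5) already span 11, so the N50 is 5.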


2021 ◽  
Vol 8 (Supplement_1) ◽  
pp. S281-S282
Author(s):  
Heather L Wells ◽  
Joseph Barrows ◽  
Mara Couto-Rodriguez ◽  
Xavier O Jirau Serrano ◽  
Marilyne Debieu ◽  
...  

Abstract Background: The quantitative level of pathogens present in a host is a major driver of infectious disease (ID) state and outcome. However, the majority of ID diagnostics are qualitative. Next-generation sequencing (NGS) is an emerging ID diagnostics and research tool to provide insights, including tracking transmission, evolution, and identifying novel strains. Methods: We built a novel likelihood-based computational method to leverage pathogen-specific genome-wide NGS data to detect SARS-CoV-2, profile genetic variants, and furthermore quantify levels of these pathogens. We used de-identified clinical specimens tested for SARS-CoV-2 using RT-PCR, SARS-CoV-2 NGS Assay (hybrid capture, Twist Bioscience), or ARTIC (amplicon-based) platform, and COVID-DX software. A training set (n=87) and a validation set (n=22) were selected to establish the strength of our quantification model. We fit non-uniform probabilistic error profiles to a deterministic sigmoidal equation that more realistically represents observed data and used likelihood maximized over several different read depths to improve accuracy over a wide range of values of viral load. Given the proportion of the genome covered at varying depths for a single sample as input data, our model estimated the Ct of that sample as the value that produces the maximum likelihood of generating the observed genome coverage data. Results: The model fit on 87 SARS-CoV-2 NGS Assay training samples produced a good fit to the 22 validation samples, with a coefficient of correlation (r²) of ~0.8. The accuracy of the model was high (mean absolute % error of ~10%, meaning our model is able to predict the Ct value of each sample within a margin of ±10% on average). Because of the nature of the commonly used ARTIC protocol, we found that all quantitative signals in this data were lost during PCR amplification and the model is not applicable for quantification of samples captured this way. 
The ability to model quantification is a major advantage of the SARS-CoV-2 NGS assay protocol. Figure: The likelihood-based model to estimate SARS-CoV-2 viral titer. Left: Observed genome coverage (y-axis) plotted against Ct value (x-axis). The best-fitting logistic curve is shown as a red line, with shaded areas above and below representing the fitted error profile. Right: Model-estimated Ct values (y-axis) compared to laboratory Ct values (x-axis), with grey bars representing estimated confidence intervals. The 1:1 diagonal is shown as a dotted line. Conclusion: To our knowledge, this is the first model to incorporate sequence data mapped across the genome of a pathogen to quantify the level of that pathogen in a clinical specimen. This has implications in ID diagnostics, research, and metagenomics. Disclosures: Heather L. Wells, MPH, Biotia, Inc. (Consultant); Joseph Barrows, MS, Biotia (Employee); Mara Couto-Rodriguez, MS, Biotia (Employee); Xavier O. Jirau Serrano, B.S., Biotia (Employee); Marilyne Debieu, PhD, Biotia (Employee); Karen Wessel, PhD, Labor Zotz/Klimas (Employee); Christopher Mason, PhD, Biotia (Board Member, Advisor or Review Panel member, Shareholder); Dorottya Nagy-Szakal, MD PhD, Biotia Inc (Employee, Shareholder); Niamh B. O’Hara, PhD, Biotia (Board Member, Employee, Shareholder)
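
The general idea of estimating Ct as the value that maximizes the likelihood of the observed coverage can be sketched in Python as below. This is not the authors' model: the logistic curve parameters in `CURVES`, the constant Gaussian error (the abstract describes non-uniform error profiles), and the grid search are all simplifying assumptions made for illustration.

```python
import math

# Hypothetical fitted curves, one per depth threshold:
# depth threshold -> (midpoint Ct, slope, error standard deviation).
CURVES = {1: (32.0, 0.8, 0.05), 10: (28.0, 0.8, 0.05), 100: (24.0, 0.8, 0.05)}


def expected_coverage(ct, midpoint, slope):
    """Logistic curve: fraction of genome covered falls as Ct rises."""
    return 1.0 / (1.0 + math.exp(slope * (ct - midpoint)))


def log_likelihood(ct, observed):
    """Sum of Gaussian log-likelihoods over all depth thresholds.

    observed: depth threshold -> observed fraction of genome covered.
    """
    total = 0.0
    for depth, fraction in observed.items():
        midpoint, slope, sd = CURVES[depth]
        residual = fraction - expected_coverage(ct, midpoint, slope)
        total += -0.5 * (residual / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))
    return total


def estimate_ct(observed, ct_min=10.0, ct_max=40.0, step=0.1):
    """Grid search for the Ct value with the highest log-likelihood."""
    n_steps = int(round((ct_max - ct_min) / step))
    grid = [ct_min + i * step for i in range(n_steps + 1)]
    return max(grid, key=lambda ct: log_likelihood(ct, observed))


# Simulate a noise-free sample at Ct 27 and recover its Ct from coverage alone.
true_ct = 27.0
observed = {d: expected_coverage(true_ct, m, s) for d, (m, s, _) in CURVES.items()}
estimate = estimate_ct(observed)
```

Combining several depth thresholds is what keeps the estimate informative across a wide Ct range: each curve is only steep, and therefore sensitive, near its own midpoint.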


2021 ◽  
Author(s):  
Teodora Ribarska ◽  
Pål Marius Bjørnstad ◽  
Arvind Y.M. Sundaram ◽  
Gregor D. Gilfillan

Abstract Background: Novel commercial kits for whole genome library preparation for next-generation sequencing on Illumina platforms promise shorter workflows, lower inputs and cost savings. Time savings are achieved by employing enzymatic DNA fragmentation and by combining end-repair and tailing reactions. Fewer cleanup steps also allow greater DNA input flexibility (1 ng-1 µg), PCR-free options from 100 ng DNA, and lower price as compared to the well-established sonication and tagmentation-based DNA library preparation kits. Results: We compared the performance of four enzymatic fragmentation-based DNA library preparation kits (from New England Biolabs, Roche, Swift Biosciences and Quantabio) to a tagmentation-based kit (Illumina) using low input DNA amounts (10 ng) and PCR-free reactions with 100 ng DNA. With four technical replicates of each input amount and kit, we compared the kits' fragmentation sequence-bias as well as performance parameters such as sequence coverage and the clinically relevant detection of single nucleotide and indel variants. While all kits produced high quality sequence data and demonstrated similar performance, several enzymatic fragmentation methods produced library insert sizes which deviated from those intended. Libraries with longer insert lengths performed better in terms of coverage, SNV and indel detection. Lower performance of shorter-insert libraries could be explained by loss of sequence coverage to overlapping paired-end reads, exacerbated by the preferential sequencing of shorter fragments on Illumina sequencers. We also observed that libraries prepared with minimal or no PCR performed best with regard to indel detection. Conclusions: The enzymatic fragmentation-based DNA library preparation kits from NEB, Roche, Swift and Quantabio are good alternatives to the tagmentation-based Nextera DNA flex kit from Illumina, offering reproducible results using flexible DNA inputs, quick workflows and lower prices. 
Libraries with insert DNA fragments longer than the cumulative sum of both read lengths avoid read overlap, and thus produce more informative data that leads to strongly improved genome coverage and consequently also increased sensitivity and precision of SNP and indel detection. In order to best utilize such enzymatic fragmentation reagents, researchers should be prepared to invest time to optimize fragmentation conditions for their particular samples.
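
The coverage loss from overlapping paired-end reads follows from simple arithmetic, sketched below in Python. The sketch assumes both mates have the same read length and ignores adapter trimming and quality filtering; the function name and example sizes are hypothetical.

```python
def unique_base_fraction(insert_size, read_length):
    """Fraction of sequenced cycles that yield non-redundant insert bases.

    A read pair sequences 2 * read_length cycles, but it can never report
    more distinct bases than the insert contains, so any overlap between
    the two mates is sequenced twice and adds no new coverage.
    """
    if insert_size <= 0 or read_length <= 0:
        raise ValueError("insert_size and read_length must be positive")
    sequenced = 2 * read_length
    unique = min(insert_size, sequenced)
    return unique / sequenced


# With 2 x 150 bp reads, a 200 bp insert wastes a third of the cycles,
# while a 350 bp insert (longer than both reads combined) wastes none.
short_insert = unique_base_fraction(200, 150)
long_insert = unique_base_fraction(350, 150)
```

This is why, under these assumptions, inserts at least as long as the sum of both read lengths give the most coverage per sequenced base.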


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Adriana Maria Sanabria ◽  
Jessin Janice ◽  
Erik Hjerde ◽  
Gunnar Skov Simonsen ◽  
Anne-Merethe Hanssen

Abstract: Shotgun-metagenomics may give valuable clinical information beyond the detection of potential pathogen(s). Identification of antimicrobial resistance (AMR), virulence genes and typing directly from clinical samples has been limited due to challenges arising from incomplete genome coverage. We assessed the performance of shotgun-metagenomics on positive blood culture bottles (n = 19) with periprosthetic tissue for typing and prediction of AMR and virulence profiles in Staphylococcus aureus. We used different approaches to determine if sequence data from reads provides more information than from assembled contigs. Only 0.18% of total reads were derived from human DNA. Shotgun-metagenomics results and conventional method results were consistent in detecting S. aureus in all samples. AMR and known periprosthetic joint infection virulence genes were predicted from S. aureus. Mean coverage depth when predicting AMR genes was 209×. Resistance phenotypes could be explained by genes predicted in the sample in most of the cases. The choice of bioinformatic data analysis approach clearly influenced the results, i.e. read-based analysis was more accurate for pathogen identification, while contigs seemed better for AMR profiling. Our study demonstrates high genome coverage and potential for typing and prediction of AMR and virulence profiles in S. aureus from shotgun-metagenomics data.


2021 ◽  
Vol 6 ◽  
pp. 241
Author(s):  
Ingra M. Claro ◽  
Mariana S. Ramundo ◽  
Thais M. Coletti ◽  
Camila A. M. da Silva ◽  
Ian N. Valenca ◽  
...  

Emerging and re-emerging viruses are a global health concern. Genome sequencing as an approach for monitoring circulating viruses is currently hampered by complex and expensive methods. Untargeted, metagenomic nanopore sequencing can provide genomic information to identify pathogens, prepare for or even prevent outbreaks. SMART (Switching Mechanism at the 5′ end of RNA Template) is a popular method for RNA-Seq but most current methods rely on oligo-dT priming to target polyadenylated mRNA molecules. We have developed two random primed SMART-Seq approaches, ‘SMART-9N’, and a version compatible with barcoded PCR primers available from Oxford Nanopore Technologies, ‘Rapid SMART-9N’, for the detection, characterization, and whole-genome sequencing of RNA viruses. The methods were developed using viral isolates and clinical samples, and compared to a gold-standard amplicon-based method. From a Zika virus isolate the SMART-9N approach recovered 10 kb of the 10.8 kb RNA genome in a single nanopore read. We also obtained full genome coverage at high depth using the Rapid SMART-9N, which takes only 10 minutes and costs up to 45% less than other methods. We found the limits of detection of these methods to be 6 × 10⁰ focus-forming units (FFU)/mL with 99.02% and 87.58% genome coverage for SMART-9N and Rapid SMART-9N respectively. Yellow fever virus plasma samples and SARS-CoV-2 nasopharyngeal samples previously confirmed by RT-qPCR with a broad range of Ct-values were selected for validation. Both methods produced greater genome coverage when compared to the multiplex PCR approach, and we obtained the longest single read of this study (18.5 kb, 60% of the virus genome) from a SARS-CoV-2 clinical sample using the Rapid SMART-9N method. This work demonstrates that SMART-9N and Rapid SMART-9N are sensitive, low input, and long-read compatible alternatives for RNA virus detection and genome sequencing and Rapid SMART-9N improves the cost, time, and complexity of laboratory work.


2021 ◽  
Author(s):  
Kyle Fletcher ◽  
Rongkui Han ◽  
Diederik Smilde ◽  
Richard Michelmore

Polyploidy and heterokaryosis are common and consequential genetic phenomena that increase the number of haplotypes in an organism and complicate whole-genome sequence analysis. Allele balance has been used to infer polyploidy and heterokaryosis in diverse organisms using read sets sequenced to greater than 50x whole-genome coverage. However, sequencing to adequate depth is costly if applied to multiple individuals or large genomes. We developed VCFvariance.pl to utilize the variance of allele balance to infer polyploidy and/or heterokaryosis at low sequence coverage. This analysis requires as little as 10x whole-genome coverage and reduces the allele balance profile down to a single value, which can be used to determine if an individual has two or more haplotypes. This approach was validated on simulated, synthetic, and authentic read sets from an oomycete, fungus, and plant. The approach was deployed to ascertain the genome status of multiple isolates of Bremia lactucae and Phytophthora infestans. VCFvariance.pl is a Perl script available at https://github.com/kfletcher88/VCFvariance.
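
The idea of collapsing an allele balance profile into a single variance can be illustrated with the Python sketch below. It is not a port of VCFvariance.pl, whose exact filters and formula are not described here; the input format (reference and alternate read counts at heterozygous sites), the `min_depth` filter, and the toy counts are assumptions.

```python
from statistics import pvariance


def allele_balance_variance(sites, min_depth=10):
    """Variance of allele balance across heterozygous sites.

    sites: iterable of (ref_count, alt_count) read counts per site
           (hypothetical input; a real pipeline would parse these from a VCF).
    min_depth: sites with fewer total reads are skipped.
    """
    balances = [
        alt / (ref + alt)
        for ref, alt in sites
        if ref + alt >= min_depth
    ]
    if len(balances) < 2:
        raise ValueError("need at least two sites passing the depth filter")
    return pvariance(balances)


# Toy data: balances near 1/2 (two haplotypes) versus near 1/3 and 2/3
# (more than two haplotypes), which spreads the profile and raises variance.
two_haplotypes = [(10, 10), (9, 11), (11, 9), (10, 10)]
three_haplotypes = [(20, 10), (10, 20), (21, 9), (9, 21)]
var_two = allele_balance_variance(two_haplotypes)
var_three = allele_balance_variance(three_haplotypes)
```

Because sampling noise in allele balance also grows as depth falls, any threshold separating the two cases would have to be calibrated for the coverage at hand; none is proposed here.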


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0255663
Author(s):  
Efrat Bucris ◽  
Victoria Indenbaum ◽  
Roberto Azar ◽  
Oran Erster ◽  
Eric Haas ◽  
...  

Measles outbreaks escalated globally despite worldwide elimination efforts. Molecular epidemiological investigations utilizing partial measles virus (MeV) genomes are challenged by reduction in global genotypes and low evolutionary rates. Greater resolution was reached using MeV complete genomes; however, time and costs limit the application to numerous samples. We developed an approach to unbiasedly sequence complete MeV genomes directly from patient urine samples. Samples were enriched for MeV using filtration or nucleases and the minimal number of sequence reads to allocate per sample based on its MeV content was assessed using in-silico reduction of sequencing depth. Application of limited-resource sequencing to treated MeV-positive samples demonstrated that 1–5 million sequences for samples with high/medium MeV quantities and 10–15 million sequences for samples with lower MeV quantities are sufficient to obtain >98% MeV genome coverage and over 50× average depth. This approach enables real-time high-resolution molecular epidemiological investigations of large-scale MeV outbreaks.
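
In-silico reduction of sequencing depth amounts to randomly subsampling reads and recomputing coverage, as in the minimal Python sketch below. Reads are represented as hypothetical `(start, end)` alignment intervals on a toy genome; this is an illustration of the general technique, not the authors' pipeline.

```python
import random


def downsample(reads, n_reads, seed=0):
    """Randomly keep n_reads reads (without replacement), reproducibly."""
    rng = random.Random(seed)
    return rng.sample(reads, min(n_reads, len(reads)))


def breadth(reads, genome_length):
    """Fraction of genome positions covered by at least one read.

    reads: list of (start, end) half-open alignment intervals, 0-based.
    """
    covered = [False] * genome_length
    for start, end in reads:
        for pos in range(max(start, 0), min(end, genome_length)):
            covered[pos] = True
    return sum(covered) / genome_length


# Toy data: 100 bp reads tiled every 50 bp across a 1,000 bp genome.
GENOME_LENGTH = 1000
all_reads = [(start, start + 100) for start in range(0, GENOME_LENGTH, 50)]
full_breadth = breadth(all_reads, GENOME_LENGTH)
subset = downsample(all_reads, 5)
subset_breadth = breadth(subset, GENOME_LENGTH)
```

Repeating the subsampling at several read counts and seeds gives a curve of coverage against sequencing effort, from which a minimal read allocation per sample can be read off.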

