Chemistry and Bioinformatics Considerations in Using Next-Generation Sequencing Technologies to Inferring HIV Proviral DNA Genome-Intactness

Viruses ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 1874
Author(s):  
Guinevere Q. Lee

HIV persists via integration of the viral DNA into the human genome. The HIV DNA pool within an infected individual is a complex population that comprises both intact and defective viral genomes, each with a distinct integration site, in addition to a unique repertoire of viral quasi-species. Obtaining an accurate profile of the viral DNA pool is critical to understanding viral persistence and resolving interhost differences. Recent advances in next-generation deep sequencing (NGS) technologies have enabled the development of two sequencing assays: one captures near-full-length viral genome sequences at single-molecule resolution (FLIP-seq), and the other co-captures full-length viral genome sequences together with their associated viral integration sites (MIP-seq). This commentary aims to provide an overview of both FLIP-seq and MIP-seq, discuss their strengths and limitations, and outline specific chemistry and bioinformatics concerns when using these assays to study HIV persistence.

BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Kelsi A. Lindblad ◽  
Jananan S. Pathmanathan ◽  
Sandrine Moreira ◽  
John R. Bracht ◽  
Robert P. Sebra ◽  
...  

Background: Whole-genome shotgun sequencing, which stitches together millions of short sequencing reads into a single genome, ushered in the era of modern genomics and led to a rapid expansion of the number of genome sequences available. Nevertheless, assembly of short reads remains difficult, resulting in fragmented genome sequences. Ultimately, only a sequencing technology capable of capturing complete chromosomes in a single run could resolve all ambiguities. Even “third generation” sequencing technologies produce reads far shorter than most eukaryotic chromosomes. However, the ciliate Oxytricha trifallax has a somatic genome with thousands of chromosomes averaging only 3.2 kbp, making it an ideal candidate for exploring the benefits of sequencing whole chromosomes without assembly.

Results: We used single-molecule real-time sequencing to capture thousands of complete chromosomes in single reads and to update the published Oxytricha trifallax JRB310 genome assembly. In this version, over 50% of the completed chromosomes with two telomeres derive from single reads. The improved assembly includes over 12,000 new chromosome isoforms, and demonstrates that somatic chromosomes derive from variable rearrangements between somatic segments encoded up to 191,000 base pairs away. However, while long reads reduce the need for assembly, a hybrid approach that supplements long-read sequencing with short reads for error correction produced the most complete and accurate assembly overall.

Conclusions: This assembly provides the first example of complete eukaryotic chromosomes captured by single sequencing reads and demonstrates that traditional approaches to genome assembly can mask considerable structural variation.


mBio ◽  
2014 ◽  
Vol 5 (3) ◽  
Author(s):  
Jason T. Ladner ◽  
Brett Beitzel ◽  
Patrick S. G. Chain ◽  
Matthew G. Davenport ◽  
Eric Donaldson ◽  
...  

ABSTRACT Thanks to high-throughput sequencing technologies, genome sequencing has become a common component in nearly all aspects of viral research; thus, we are experiencing an explosion in both the number of available genome sequences and the number of institutions producing such data. However, there are currently no common standards used to convey the quality, and therefore utility, of these various genome sequences. Here, we propose five “standard” categories that encompass all stages of viral genome finishing, and we define them using simple criteria that are agnostic to the technology used for sequencing. We also provide genome finishing recommendations for various downstream applications, keeping in mind the cost-benefit trade-offs associated with different levels of finishing. Our goal is to define a common vocabulary that will allow comparison of genome quality across different research groups, sequencing platforms, and assembly techniques.


Author(s):  
Maarit Suomalainen ◽  
Vibhu Prasad ◽  
Abhilash Kannan ◽  
Urs F. Greber

Abstract: In clonal cultures, not all cells are equally susceptible to virus infection. Underlying mechanisms of infection variability are poorly understood. Here, we developed image-based single cell measurements to scrutinize the heterogeneity of adenovirus (AdV) infection. AdV delivers, transcribes and replicates a linear double-stranded DNA genome in the nucleus. We measured the abundance of viral transcripts by single-molecule RNA fluorescence in situ hybridization (FISH), and the incoming ethynyl-deoxy-cytidine (EdC)-tagged viral genome by copper(I)-catalyzed azide-alkyne cycloaddition (click) reaction. The early transcripts increased from 2-12 hours, the late ones from 12-23 hours post infection (pi), indicating distinct accumulation kinetics. Surprisingly, the expression of the immediate early transactivator gene E1A only moderately correlated with the number of viral genomes in the cell nucleus, although the incoming viral DNA remained largely intact until 7 hours pi. Genome-to-genome heterogeneity was found at the level of viral transcription, as indicated by colocalization with the large intron containing early region E4 transcripts, uncorrelated to the multiplicity of incoming genomes in the nucleus. In accordance, individual genomes exhibited heterogeneous replication activity, as shown by single-strand DNA-FISH and immunocytochemistry. These results indicate that the variability in viral gene expression and replication is due not to defective genomes but to host cell heterogeneity. By analyzing the cell cycle state, we found that G1 cells exhibited the highest E1A expression and a significantly increased correlation between E1A expression and viral genome copy numbers. This combined image-based single molecule procedure is ideally suited to explore the cell-to-cell variability in viral infection, including transcriptional activators and repressors, RNA splicing mechanisms, and the impact of the 3-dimensional nuclear topology on gene regulation.

Author Summary: Adenoviruses (AdV) are ubiquitous pathogens in vertebrates. They persist in infected people, and cause unpredictable outbreaks, morbidity and mortality across the globe. Here we report that the common human AdV type C5 (AdV-C5) gives rise to considerable infection variability at the level of single cells in culture, and that a major underlying reason is the cell-to-cell heterogeneity. By combining sensitive single molecule in situ technology for detecting the incoming viral DNA and newly synthesized viral transcripts, we show that viral gene expression is heterogeneous between infected human cells, as well as between individual genomes. We report a moderate correlation between the number of viral genomes in the nucleus and immediate early E1A transcripts. This correlation is increased in the G1 phase of the cell cycle, where the E1A transcripts were found to be more abundant than in any other cell cycle phase. Our results demonstrate the importance of cell-to-cell variability measurements for understanding transcription and replication in viral infections.


2020 ◽  
Vol 48 (6) ◽  
pp. 2399-2414
Author(s):  
Anireddy S.N. Reddy ◽  
Jie Huang ◽  
Naeem H. Syed ◽  
Asa Ben-Hur ◽  
Suomeng Dong ◽  
...  

Next-generation sequencing (NGS) technologies - Illumina RNA-seq, Pacific Biosciences isoform sequencing (PacBio Iso-seq), and Oxford Nanopore direct RNA sequencing (DRS) - have revealed the complexity of plant transcriptomes and their regulation at the co-/post-transcriptional level. Global analysis of mature mRNAs, transcripts from nuclear run-on assays, and nascent chromatin-bound mRNAs using short as well as full-length and single-molecule DRS reads has uncovered potential roles of different forms of RNA polymerase II during the transcription process, and the extent of co-transcriptional pre-mRNA splicing and polyadenylation. These tools have also allowed mapping of transcriptome-wide start sites in cap-containing RNAs, poly(A) site choice, poly(A) tail length, and RNA base modifications. The emerging theme from recent studies is that reprogramming of gene expression in response to developmental cues and stresses at the co-/post-transcriptional level likely plays a crucial role in eliciting appropriate responses for optimal growth and plant survival under adverse conditions. Although the mechanisms by which developmental cues and different stresses regulate co-/post-transcriptional splicing are largely unknown, a few recent studies indicate that the external cues target spliceosomal and splicing regulatory proteins to modulate alternative splicing. In this review, we provide an overview of recent discoveries on the dynamics and complexities of plant transcriptomes, offer mechanistic insights into splicing regulation, and discuss critical gaps in co-/post-transcriptional research that need to be addressed using diverse genomic and biochemical approaches.


2016 ◽  
Author(s):  
Xian Fan ◽  
Mark Chaisson ◽  
Luay Nakhleh ◽  
Ken Chen

Abstract: Achieving complete, accurate and cost-effective assembly of the human genome is of great importance for realizing the promises of precision medicine. The abundance of repeats and genetic variations in the human genome and the limitations of existing sequencing technologies call for the development of novel assembly methods that could leverage the complementary strengths of multiple technologies. We propose a Hybrid Structural variant Assembly (HySA) approach that integrates sequencing reads from next generation sequencing (NGS) and single-molecule sequencing (SMS) technologies to accurately assemble and detect structural variations (SV) in the human genome. By identifying homologous SV-containing reads from different technologies through a bipartite-graph-based clustering algorithm, our approach turns a whole genome assembly problem into a set of independent SV assembly problems, each of which can be effectively solved to enhance assembly of structurally altered regions in the human genome. In testing our approach using data generated from a haploid hydatidiform mole genome (CHM1) and a diploid human genome (NA12878), we found that, compared with existing approaches, it substantially improved the detection of many types of SVs, particularly novel large insertions, small INDELs (10-50bp) and short tandem repeat expansions and contractions, with a low false discovery rate. Our work highlights the strengths and limitations of current approaches and provides an effective solution for extending the power of existing sequencing technologies for SV discovery.


2009 ◽  
Vol 11 (1) ◽  
pp. 31-46 ◽  
Author(s):  
Michael L. Metzker

Viruses ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 321
Author(s):  
Ashley N. Della Fera ◽  
Alix Warburton ◽  
Tami L. Coursey ◽  
Simran Khurana ◽  
Alison A. McBride

Persistent infection with oncogenic human papillomavirus (HPV) types is responsible for ~5% of human cancers. The HPV infectious cycle can sustain long-term infection in stratified epithelia because viral DNA is maintained as low copy number extrachromosomal plasmids in the dividing basal cells of a lesion, while progeny viral genomes are amplified to large numbers in differentiated superficial cells. The viral E1 and E2 proteins initiate viral DNA replication and maintain and partition viral genomes, in concert with the cellular replication machinery. Additionally, the E5, E6, and E7 proteins are required to evade host immune responses and to produce a cellular environment that supports viral DNA replication. An unfortunate consequence of the manipulation of cellular proliferation and differentiation is that cells are placed at high risk of carcinogenesis.

