NanoHIV: A Bioinformatics Pipeline for Producing Accurate, Near Full-Length HIV Proviral Genomes Sequenced Using the Oxford Nanopore Technology

Cells ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 2577
Author(s):  
Imogen A. Wright ◽  
Kayla E. Delaney ◽  
Mary Grace K. Katusiime ◽  
Johannes C. Botha ◽  
Susan Engelbrecht ◽  
...  

HIV-1 proviral single-genome sequencing by limiting-dilution polymerase chain reaction (PCR) amplification is important for differentiating the sequence-intact from defective proviruses that persist during antiretroviral therapy (ART). Intact proviruses may rebound if ART is interrupted and are the barrier to an HIV cure. Oxford Nanopore Technologies (ONT) sequencing offers a promising, cost-effective approach to the sequencing of long amplicons such as near full-length HIV-1 proviruses, but the high diversity of HIV-1 and the ONT sequencing error render analysis of the generated data difficult. NanoHIV is a new tool that uses an iterative consensus generation approach to construct accurate, near full-length HIV-1 proviral single-genome sequences from ONT data. To validate the approach, single-genome sequences generated using NanoHIV consensus building were compared to Illumina® consensus building of the same nine single-genome near full-length amplicons and an average agreement of 99.4% was found between the two sequencing approaches.
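The abstract reports an average agreement of 99.4% between NanoHIV and Illumina consensus sequences but does not say how agreement was computed. As a minimal sketch, assuming the two consensus sequences have already been aligned to equal length by an external aligner, percent identity over aligned columns could be calculated as follows. The function name `percent_identity` and the gap character are illustrative assumptions, not part of NanoHIV.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of aligned columns in which both sequences carry the same symbol.

    Assumption: seq_a and seq_b are rows of a pairwise alignment, so they
    have equal length. Columns that are gaps in both rows are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    # Compare case-insensitively and drop columns that are gaps in both rows.
    columns = [
        (x, y)
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("no comparable columns")

    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy aligned sequences, not real data: 3 of 4 columns agree.
    print(percent_identity("ACGT", "ACGA"))
```

Averaging this value across the nine amplicons would give a summary figure of the kind quoted above, although the published pipeline may count gaps or ambiguous bases differently.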

mSphere ◽  
2020 ◽  
Vol 5 (5) ◽  
Author(s):  
Bhavna Hora ◽  
Naila Gulzar ◽  
Yue Chen ◽  
Konstantinos Karagiannis ◽  
Fangping Cai ◽  
...  

ABSTRACT: High-throughput sequencing (HTS) has been widely used to characterize HIV-1 genome sequences. There are no algorithms currently that can directly determine genotype and quasispecies population using short HTS reads generated from long genome sequences without additional software. To establish a robust subpopulation, subtype, and recombination analysis workflow, we amplified the HIV-1 3′-half genome from plasma samples of 65 HIV-1-infected individuals and sequenced the entire amplicon (∼4,500 bp) by HTS. With direct analysis of raw reads using HIVE-hexahedron, we showed that 48% of samples harbored 2 to 13 subpopulations. We identified various subtypes (17 A1s, 4 Bs, 27 Cs, 6 CRF02_AGs, and 11 unique recombinant forms) and defined recombinant breakpoints of 10 recombinants. These results were validated with viral genome sequences generated by single genome sequencing (SGS) or the analysis of consensus sequence of the HTS reads. The HIVE-hexahedron workflow is more sensitive and accurate than just evaluating the consensus sequence and also more cost-effective than SGS.

IMPORTANCE: The highly recombinogenic nature of human immunodeficiency virus type 1 (HIV-1) leads to recombination and emergence of quasispecies. It is important to reliably identify subpopulations to understand the complexity of a viral population for drug resistance surveillance and vaccine development. High-throughput sequencing (HTS) provides improved resolution over Sanger sequencing for the analysis of heterogeneous viral subpopulations. However, current methods of analysis of HTS reads are unable to fully address accurate population reconstruction. Hence, there is a dire need for a more sensitive, accurate, user-friendly, and cost-effective method to analyze viral quasispecies. For this purpose, we have improved the HIVE-hexahedron algorithm that we previously developed with in silico short sequences to analyze raw HTS short reads. The significance of this study is that our standalone algorithm enables a streamlined analysis of quasispecies, subtype, and recombination patterns from long HIV-1 genome regions without the need of additional sequence analysis tools. Distinct viral populations and recombination patterns identified by HIVE-hexahedron are further validated by comparison with sequences obtained by single genome sequencing (SGS).


2012 ◽  
Vol 93 (6) ◽  
pp. 1173-1184 ◽  
Author(s):  
Chunhua Li ◽  
Hong Cao ◽  
Ling Lu ◽  
Donald Murphy

In this study, we characterized full-length hepatitis C virus (HCV) genome sequences for 11 genotype 2 isolates. They were isolated from the sera of 11 patients residing in Canada, of whom four had an African origin. Full-length genomes, each with 18–25 overlapping fragments, were obtained by PCR amplification. Five isolates represent the first complete genomes of subtypes 2d, 2e, 2j, 2m and 2r, while the other six correspond to variants that do not group within any assigned subtypes. These sequences had lengths of 9508–9825 nt and each contained a single ORF encoding 3012–3106 aa. Predicted amino acids were carefully inspected and unique variation patterns were recognized, especially for a 2e isolate, QC64. Phylogenetic analysis of complete genome sequences provides evidence that there are a total of 16 subtypes, of which 11 have been described here. Co-analysis with 68 partial NS5B sequences also differentiated 18 assigned subtypes, 2a–2r, and eight additional lineages within genotype 2, which is consistent with the analysis of complete genome sequences. The data from this study will now allow 10 assigned subtypes and six additional lineages of HCV genotype 2 to have their full-length genomes defined. Further analysis with 2021 genotype 2 sequences available in the HCV database indicated that the geographical distribution of these subtypes is consistent with an African origin, with particular subtypes having spread to Asia and the Americas.


2020 ◽  
Vol 222 (10) ◽  
pp. 1670-1680
Author(s):  
Vlad Novitsky ◽  
Melissa Zahralban-Steele ◽  
Sikhulile Moyo ◽  
Tapiwa Nkhisang ◽  
Dorcas Maruapula ◽  
...  

Background: Phylogenetic mapping of HIV-1 lineages circulating across defined geographical locations is promising for better understanding HIV transmission networks to design optimal prevention interventions. Methods: We obtained near full-length HIV-1 genome sequences from people living with HIV (PLWH), including participants on antiretroviral treatment in the Botswana Combination Prevention Project, conducted in 30 Botswana communities in 2013–2018. Phylogenetic relationships among viral sequences were estimated by maximum likelihood. Results: We obtained 6078 near full-length HIV-1C genome sequences from 6075 PLWH. We identified 984 phylogenetically distinct HIV-1 lineages (molecular HIV clusters) circulating in Botswana by mid-2018, with 2–27 members per cluster. Of these, dyads accounted for 62%, approximately 32% (n = 316) were found in single communities, and 68% (n = 668) were spread across multiple communities. Men in clusters were approximately 3 years older than women (median age 42 years, vs 39 years; P < .0001). In 65% of clusters, men were older than women, while in 35% of clusters women were older than men. The majority of identified viral lineages were spread across multiple communities. Conclusions: A large number of circulating phylogenetically distinct HIV-1C lineages (molecular HIV clusters) suggests highly diversified HIV transmission networks across Botswana communities by 2018.
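The cluster counts reported in the Results can be checked against the quoted percentages with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract (984 clusters, 316 in single communities, 668 across multiple communities); the helper name `share_percent` is an illustrative assumption.

```python
def share_percent(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * count / total, 1)


# Figures quoted in the abstract.
TOTAL_CLUSTERS = 984
SINGLE_COMMUNITY = 316
MULTI_COMMUNITY = 668

if __name__ == "__main__":
    # The two categories partition the full set of clusters.
    assert SINGLE_COMMUNITY + MULTI_COMMUNITY == TOTAL_CLUSTERS
    print(share_percent(SINGLE_COMMUNITY, TOTAL_CLUSTERS))  # 32.1
    print(share_percent(MULTI_COMMUNITY, TOTAL_CLUSTERS))   # 67.9
```

The results, 32.1% and 67.9%, are consistent with the rounded 32% and 68% given in the text.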


2021 ◽  
Vol 9 (12) ◽  
pp. 2598
Author(s):  
Anton Pembaur ◽  
Erwan Sallard ◽  
Patrick Philipp Weil ◽  
Jennifer Ortelt ◽  
Parviz Ahmad-Nejad ◽  
...  

The scale of the ongoing SARS-CoV-2 pandemic warrants the urgent establishment of a global decentralized surveillance system to recognize local outbreaks and the emergence of novel variants of concern. Among available deep-sequencing technologies, nanopore-sequencing could be an important cornerstone, as it is mobile, scalable, and cost-effective. Therefore, streamlined nanopore-sequencing protocols need to be developed and optimized for SARS-CoV-2 variant identification. We adapted and simplified existing workflows using the ‘midnight’ 1200 bp amplicon split primer sets for PCR, which produce tiled overlapping amplicons covering almost the entire SARS-CoV-2 genome. Subsequently, we applied Oxford Nanopore Rapid Barcoding and the portable MinION Mk1C sequencer combined with the interARTIC bioinformatics pipeline. We tested a simplified and less time-consuming workflow using SARS-CoV-2-positive specimens from clinical routine and identified the CT value as a useful pre-analytical parameter, which may help to decrease sequencing failure rates. Complete pipeline duration was approx. 7 h for one specimen and approx. 11 h for 12 multiplexed barcoded specimens. The adapted protocol contains fewer processing steps and can be completely conducted within one working day. Diagnostic CT values deduced from qPCR standardization experiments can act as principal criteria for specimen selection. As a guideline, SARS-CoV-2 genome copy numbers lower than 4 × 10^6 were associated with a coverage threshold below 20-fold and incompletely assembled SARS-CoV-2 genomes. Thus, based on the described thermocycler/chemistry combination, we recommend CT values of ~26 or lower to achieve full and high-quality SARS-CoV-2 (+)RNA genome coverage.
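The 20-fold coverage threshold mentioned above lends itself to a simple per-position check. The following is a minimal sketch, assuming per-base read depths are already available as a list of integers (for example, exported from a depth table); the function name `coverage_summary` and the rule that every position must reach the threshold are illustrative assumptions, not the criteria used by the interARTIC pipeline.

```python
from statistics import median
from typing import Dict, Iterable, Union


def coverage_summary(
    depths: Iterable[int], min_depth: int = 20
) -> Dict[str, Union[float, bool]]:
    """Summarise per-base depth against a minimum-coverage threshold.

    Assumption: a genome "passes" only if every position has at least
    min_depth reads. Real pipelines may tolerate some low-depth positions.
    """
    values = list(depths)
    if not values:
        raise ValueError("no depth values supplied")

    below = sum(1 for d in values if d < min_depth)
    return {
        "median_depth": median(values),
        "fraction_below": below / len(values),
        "passes": below == 0,
    }


if __name__ == "__main__":
    # Toy depths, not real data: one of four positions falls below 20-fold.
    print(coverage_summary([25, 30, 19, 40]))
```

A specimen whose summary reports a non-zero `fraction_below` would be a candidate for the incomplete assemblies the authors associate with low genome copy numbers.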


2019 ◽  
Author(s):  
Alejandro R. Gener

Objective(s): To evaluate nanopore DNA sequencing for sequencing full-length HIV-1 provirus. Design: I used nanopore sequencing to sequence full-length HIV-1 from a plasmid (pHXB2). Methods: pHXB2 plasmid was processed with the Rapid PCR-Barcoding library kit and sequenced on the MinION sequencer (Oxford Nanopore Technologies, Oxford, UK). Raw fast5 reads were converted into fastq (base called) with Albacore, Guppy, and FlipFlop base callers. Reads were first aligned to the reference with BWA-MEM to evaluate sample coverage manually. Reads were then assembled with Canu into contigs, and contigs manually finished in SnapGene. Results: I sequenced full-length HXB2 HIV-1 from 5’ to 3’ LTR (100%), with median per-base coverage of over 9000x in one 12-barcoded experiment on a single MinION flow cell. The longest HIV-spanning read to date was generated, at a length of 11,487 bases, which included full-length HIV-1 and plasmid backbone on either side. At least 20 variants were discovered in pHXB2 compared to reference. Conclusions: The MinION sequencer performed as expected, covering full-length HIV. The discovery of variants in a dogmatic reference plasmid demonstrates the need for single-molecule sequence verification moving forward. These results illustrate the utility of long read sequencing to advance the study of HIV at single integration site resolution.


2016 ◽  
Vol 32 (6) ◽  
pp. 601-606 ◽  
Author(s):  
Yindi Song ◽  
Yue Feng ◽  
Zhijiang Miao ◽  
Binghui Wang ◽  
Ming Yang ◽  
...  
