Sub-dominant principal components inform new vaccine targets for HIV Gag

2019 ◽  
Vol 35 (20) ◽  
pp. 3884-3889 ◽  
Author(s):  
Syed Faraz Ahmed ◽  
Ahmed A Quadeer ◽  
David Morales-Jimenez ◽  
Matthew R McKay

Abstract Motivation Patterns of mutational correlations, learnt from patient-derived sequences of human immunodeficiency virus (HIV) proteins, are informative of biochemically linked networks of interacting sites that may enable viral escape from the host immune system. Accurate identification of these networks is important for rationally designing vaccines which can effectively block immune escape pathways. Previous computational methods have partly identified such networks by examining the principal components (PCs) of the mutational correlation matrix of HIV Gag proteins. However, driven by a conservative approach, these methods analyze the few dominant (strongest) PCs, potentially missing information embedded within the sub-dominant (relatively weaker) ones that may be important for vaccine design. Results By using sequence data for HIV Gag, complemented by model-based simulations, we revealed that certain networks of interacting sites that appear important for vaccine design purposes are not accurately reflected by the dominant PCs. Rather, these networks are encoded jointly by both dominant and sub-dominant PCs. By incorporating information from the sub-dominant PCs, we identified a network of interacting sites of HIV Gag that associated very strongly with viral control. Based on this network, we propose several new candidates for a potent T-cell-based HIV vaccine. Availability and implementation Accession numbers of all sequences used and the source code scripts for all analysis and figures reported in this work are available online at https://github.com/faraz107/HIV-Gag-Immunogens. Supplementary information Supplementary data are available at Bioinformatics online.
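
To make the distinction between dominant and sub-dominant PCs concrete, the following is a minimal sketch and not the authors' pipeline (their scripts are in the repository linked above). It assumes a binary mutation matrix (rows are sequences, columns are sites), uses a plain eigen-decomposition of the site-by-site correlation matrix, and selects sites with a hypothetical loading threshold of 0.3. The toy alignment, the function names and the threshold are all illustrative assumptions.

```python
import numpy as np


def toy_alignment(n_seqs=200, seed=0):
    """Hypothetical binary mutation matrix: rows are sequences, columns are sites."""
    rng = np.random.default_rng(seed)

    def noisy_copy(col, flip_rate):
        # Copy a site but flip a fraction of entries, so the two sites co-mutate.
        flips = rng.random(col.size) < flip_rate
        return np.where(flips, 1 - col, col)

    a = (rng.random(n_seqs) < 0.3).astype(int)
    b = (rng.random(n_seqs) < 0.3).astype(int)
    cols = [
        a, noisy_copy(a, 0.05),                    # sites 0-1: strongly coupled
        b, noisy_copy(b, 0.25),                    # sites 2-3: weakly coupled
        (rng.random(n_seqs) < 0.3).astype(int),    # site 4: independent
        (rng.random(n_seqs) < 0.3).astype(int),    # site 5: independent
    ]
    return np.column_stack(cols)


def correlation_pcs(binary_msa):
    """Eigenvalues (descending) and matching eigenvectors of the site correlation matrix."""
    x = np.asarray(binary_msa, dtype=float)
    corr = np.corrcoef(x, rowvar=False)        # columns (sites) are the variables
    evals, evecs = np.linalg.eigh(corr)        # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order]


def sites_loaded_on(evecs, pc_indices, threshold=0.3):
    """Sites whose absolute loading exceeds the (assumed) threshold on any chosen PC."""
    loadings = np.abs(evecs[:, list(pc_indices)])
    hits = np.where((loadings > threshold).any(axis=1))[0]
    return [int(i) for i in hits]
```

Comparing `sites_loaded_on(evecs, [0])` with `sites_loaded_on(evecs, [0, 1])` on the toy data illustrates the idea in the abstract: a weakly coupled group of sites can be invisible in the leading PC alone and only appear once weaker PCs are included.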

2017 ◽  
Vol 92 (5) ◽  
Author(s):  
Blake Schouest ◽  
Andrea M. Weiler ◽  
Sanath Kumar Janaka ◽  
Tereance A. Myers ◽  
Arpita Das ◽  
...  

ABSTRACT Nef-specific CD8+ T lymphocytes (CD8TL) are linked to extraordinary control of primate lentiviral replication, but the mechanisms underlying their efficacy remain largely unknown. The immunodominant, Mamu-B*017:01+-restricted Nef195-203MW9 epitope in SIVmac239 partially overlaps a sorting motif important for interactions with host AP-2 proteins and, hence, downmodulation of several host proteins, including Tetherin (CD317/BST-2), CD28, CD4, SERINC3, and SERINC5. We reasoned that CD8TL-driven evolution in this epitope might compromise Nef's ability to modulate these important molecules. Here, we used deep sequencing of SIV from nine B*017:01+ macaques throughout infection with SIVmac239 to characterize the patterns of viral escape in this epitope and then assayed the impacts of these variants on Nef-mediated modulation of multiple host molecules. Acute variation in multiple Nef195-203MW9 residues significantly compromised Nef's ability to downregulate surface Tetherin, CD4, and CD28 and reduced its ability to prevent SERINC5-mediated reduction in viral infectivity but did not impact downregulation of CD3 or major histocompatibility complex class I, suggesting the selective disruption of immunomodulatory pathways involving Nef AP-2 interactions. Together, our data illuminate a pattern of viral escape dictated by a selective balance to maintain AP-2-mediated downregulation while evading epitope-specific CD8TL responses. These data could shed light on mechanisms of both CD8TL-driven viral control generally and on Mamu-B*017:01-mediated viral control specifically.
IMPORTANCE A rare subset of humans infected with HIV-1 and macaques infected with SIV can control the virus without aid of antiviral medications. A common feature of these individuals is the ability to mount unusually effective CD8 T lymphocyte responses against the virus. One of the most formidable aspects of HIV is its ability to evolve to evade immune responses, particularly CD8 T lymphocytes. We show that macaques that target a specific peptide in the SIV Nef protein are capable of better control of the virus and that, as the virus evolves to escape this response, it does so at a cost to specific functions performed by the Nef protein. Our results help show how the virus can be controlled by an immune response, which could help in designing effective vaccines.


Author(s):  
Amnon Koren ◽  
Dashiell J Massey ◽  
Alexa N Bracci

Abstract Motivation Genomic DNA replicates according to a reproducible spatiotemporal program, with some loci replicating early in S phase while others replicate late. Although DNA replication is a central cellular process, studies of its timing have been limited in scale due to technical challenges. Results We present TIGER (Timing Inferred from Genome Replication), a computational approach for extracting DNA replication timing information from whole genome sequence data obtained from proliferating cell samples. The presence of replicating cells in a biological specimen leads to non-uniform representation of genomic DNA that depends on the timing of replication of different genomic loci. Replication dynamics can hence be observed in genome sequence data by analyzing DNA copy number along chromosomes while accounting for other sources of sequence coverage variation. TIGER is applicable to any species with a contiguous genome assembly and rivals the quality of experimental measurements of DNA replication timing. It provides a straightforward approach for measuring replication timing and can readily be applied at scale. Availability and implementation TIGER is available at https://github.com/TheKorenLab/TIGER. Supplementary information Supplementary data are available at Bioinformatics online.
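
The core idea, that early-replicating loci are over-represented in reads from proliferating cells, can be sketched in a few lines. This is an illustrative simplification and not TIGER's implementation: it assumes read counts have already been tallied in equal-sized genomic windows, and it omits the corrections for other sources of coverage variation that the abstract mentions. The function name and the `smooth` parameter are hypothetical.

```python
from statistics import mean, pstdev


def replication_profile(window_counts, smooth=3):
    """Z-score windowed read counts and apply a moving average.

    Higher values suggest earlier replication (more DNA copies in S-phase cells).
    `smooth` is the moving-average width in windows (an assumed parameter).
    """
    mu = mean(window_counts)
    sd = pstdev(window_counts)
    if sd == 0:
        raise ValueError("coverage is flat; no replication signal to extract")
    z = [(c - mu) / sd for c in window_counts]

    half = smooth // 2
    profile = []
    for i in range(len(z)):
        lo = max(0, i - half)
        hi = min(len(z), i + half + 1)
        profile.append(mean(z[lo:hi]))   # average over the neighbouring windows
    return profile
```

In a real analysis the windows would also need filtering and normalisation (for example for GC content and mappability) before any such profile could be interpreted.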


Author(s):  
Yanrong Ji ◽  
Zhihan Zhou ◽  
Han Liu ◽  
Ramana V Davuluri

Abstract Motivation Deciphering the language of non-coding DNA is one of the fundamental problems in genome research. Gene regulatory code is highly complex due to the existence of polysemy and distant semantic relationships, which previous informatics methods often fail to capture, especially in data-scarce scenarios. Results To address this challenge, we developed a novel pre-trained bidirectional encoder representation, named DNABERT, to capture a global and transferable understanding of genomic DNA sequences based on upstream and downstream nucleotide contexts. We compared DNABERT to the most widely used programs for genome-wide regulatory element prediction and demonstrated its ease of use, accuracy and efficiency. We show that the single pre-trained transformer model can simultaneously achieve state-of-the-art performance on prediction of promoters, splice sites and transcription factor binding sites, after easy fine-tuning using small task-specific labeled data. Further, DNABERT enables direct visualization of nucleotide-level importance and semantic relationships within input sequences for better interpretability and accurate identification of conserved sequence motifs and functional genetic variant candidates. Finally, we demonstrate that DNABERT pre-trained on the human genome can be readily applied to other organisms with exceptional performance. We anticipate that the pre-trained DNABERT model can be fine-tuned for many other sequence analysis tasks. Availability and implementation The source code and the pre-trained and fine-tuned models for DNABERT are available at GitHub (https://github.com/jerryji1993/DNABERT). Supplementary information Supplementary data are available at Bioinformatics online.


Science ◽  
2021 ◽  
Vol 371 (6526) ◽  
pp. 284-288 ◽  
Author(s):  
Brian Hie ◽  
Ellen D. Zhong ◽  
Bonnie Berger ◽  
Bryan Bryson

The ability for viruses to mutate and evade the human immune system and cause infection, called viral escape, remains an obstacle to antiviral and vaccine development. Understanding the complex rules that govern escape could inform therapeutic design. We modeled viral escape with machine learning algorithms originally developed for human natural language. We identified escape mutations as those that preserve viral infectivity but cause a virus to look different to the immune system, akin to word changes that preserve a sentence’s grammaticality but change its meaning. With this approach, language models of influenza hemagglutinin, HIV-1 envelope glycoprotein (HIV Env), and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Spike viral proteins can accurately predict structural escape patterns using sequence data alone. Our study represents a promising conceptual bridge between natural language and viral evolution.
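
The two criteria described above (a mutation should stay "grammatical", i.e. remain viable, while changing "meaning", i.e. looking different to the immune system) can be combined into a single ranking. The sketch below assumes that each candidate mutation already has two scores from some language model and combines them by summing ranks. The scoring inputs, the rank-sum rule as written here, and the mutation names are illustrative assumptions, not the authors' code.

```python
def rank_escape_candidates(scores):
    """Order mutations from most to least escape-like.

    scores: dict mapping a mutation label to a tuple
            (grammaticality, semantic_change), higher meaning "more" of each.
    A mutation ranks highly only if it does well on BOTH criteria.
    """
    mutations = list(scores)
    by_grammar = sorted(mutations, key=lambda m: scores[m][0])    # ascending
    by_semantics = sorted(mutations, key=lambda m: scores[m][1])  # ascending
    # Position in each ascending list is the rank; sum the two ranks.
    combined = {m: by_grammar.index(m) + by_semantics.index(m) for m in mutations}
    return sorted(mutations, key=lambda m: combined[m], reverse=True)


# Hypothetical scores for three made-up mutations.
example_scores = {
    "A10T": (0.9, 0.8),   # viable and antigenically different
    "G20D": (0.2, 0.9),   # different, but unlikely to be viable
    "K30R": (0.8, 0.1),   # viable, but barely different
}
ranking = rank_escape_candidates(example_scores)
```

Using ranks rather than raw scores avoids having to put the two quantities on a common scale, at the cost of discarding how large the score differences are.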


Plant Disease ◽  
2021 ◽  
Author(s):  
Terry Torres-Cruz ◽  
Briana Whitaker ◽  
Robert Proctor ◽  
Kirk Broders ◽  
Imane Laraba ◽  
...  

Species within Fusarium are of global agricultural, medical, and food/feed safety concern and have been extensively characterized. However, accurate identification of species is challenging and usually requires DNA sequence data. FUSARIUM-ID (http://isolate.fusariumdb.org/) is a publicly available database designed to support the identification of Fusarium species using sequences of multiple phylogenetically informative loci, especially the highly informative ~680 bp 5' portion of the translation elongation factor 1-alpha (TEF1) gene that has been adopted as the primary barcoding locus in the genus. However, FUSARIUM-ID v.1.0 and 2.0 had several limitations, including inconsistent metadata annotation for the archived sequences and poor representation of some species complexes and marker loci. Here, we present FUSARIUM-ID v.3.0, which provides the following improvements: (i) additional and updated annotation of metadata for isolates associated with each sequence, (ii) expanded taxon representation in the TEF1 sequence database, (iii) availability of the sequence database as a downloadable file to enable local BLAST queries, and (iv) a tutorial file for users to perform local BLAST searches using either freely-available software, such as SequenceServer, BLAST+ executable in the command line, and Galaxy, or the proprietary Geneious software. FUSARIUM-ID will be updated on a regular basis by archiving sequences of TEF1 and other loci from newly identified species and greater in-depth sampling of currently recognized species.


2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Tiziana Larussa ◽  
Isabella Leone ◽  
Evelina Suraci ◽  
Maria Imeneo ◽  
Francesco Luzza

Helicobacter pylori colonizes the gastric mucosa of at least half of the human population, causing a worldwide infection that appears in early childhood and, if not treated, can persist for life. The presence of symptoms and their severity depend on bacterial components, host susceptibility, and environmental factors, which allow H. pylori to switch between commensalism and pathogenicity. H. pylori-driven interactions with the host immune system underlie the persistence of the infection in humans, since the bacterium is able to interfere with the activity of innate and adaptive immune cells, reducing the inflammatory response in its favour. Gastritis due to H. pylori results from a complex interaction between several T cell subsets. In particular, H. pylori is known to induce a T helper (Th)1/Th17 cell response-driven gastritis, whose impaired modulation caused by the bacterium is thought to sustain the ongoing inflammatory condition and the unsuccessful clearing of the infection. In this review we discuss the current findings underlying the mechanisms implemented by H. pylori to alter T helper lymphocyte proliferation, thus facilitating the development of chronic infections and allowing the survival of the bacterium in the human host.


2018 ◽  
Vol 35 (14) ◽  
pp. 2492-2494
Author(s):  
Tania Cuppens ◽  
Thomas E Ludwig ◽  
Pascal Trouvé ◽  
Emmanuelle Genin

Abstract Summary When analyzing sequence data, genetic variants are considered one by one, taking no account of whether or not they are found in the same individual. However, variant combinations might be key players in some diseases, as variants that are neutral on their own can become deleterious when combined. GEMPROT is a new analysis tool that takes a phased VCF file and visualizes the consequences of the genetic variants on the protein. At the level of an individual, the program shows the variants on each of the two protein sequences and the Pfam functional protein domains. When data on several individuals are available, GEMPROT lists the haplotypes found in the sample and can compare the haplotype distributions between different sub-groups of individuals. By offering a global visualization of the gene with the genetic variants present, GEMPROT makes it possible to better understand the impact of combinations of genetic variants on the protein sequence. Availability and implementation GEMPROT is freely available at https://github.com/TaniaCuppens/GEMPROT. An on-line version is also available at http://med-laennec.univ-brest.fr/GEMPROT/. Supplementary information Supplementary data are available at Bioinformatics online.
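
The reason phasing matters can be shown with a small sketch. It is not GEMPROT's code: it assumes biallelic variants whose genotypes have already been read from a phased VCF as strings such as `0|1`, and it only tracks which alternate alleles sit on each of an individual's two haplotypes. The variant identifiers, sample names and function names are hypothetical.

```python
from collections import Counter


def split_phased_genotypes(variants):
    """Split one individual's phased genotypes into two haplotypes.

    variants: list of (variant_id, genotype) in genomic order, with genotype
              written as 'a|b' (the '|' separator marks a phased call).
    Returns two tuples holding the variant ids carried on each haplotype.
    """
    hap1, hap2 = [], []
    for variant_id, genotype in variants:
        if "|" not in genotype:
            raise ValueError(f"genotype {genotype!r} for {variant_id} is not phased")
        left, right = genotype.split("|")
        if left == "1":
            hap1.append(variant_id)
        if right == "1":
            hap2.append(variant_id)
    return tuple(hap1), tuple(hap2)


def haplotype_distribution(sample_to_variants):
    """Count how often each haplotype occurs across all individuals."""
    counts = Counter()
    for variants in sample_to_variants.values():
        counts.update(split_phased_genotypes(variants))  # adds both haplotypes
    return counts


# Hypothetical cohort of two individuals and two variants.
cohort = {
    "sample1": [("v1", "0|1"), ("v2", "1|1")],
    "sample2": [("v1", "0|0"), ("v2", "1|0")],
}
distribution = haplotype_distribution(cohort)
```

Here only one haplotype in the cohort carries `v1` and `v2` together, which a variant-by-variant tally would not reveal.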


2020 ◽  
Author(s):  
Brett Whitty ◽  
John F. Thompson

Abstract Background Low levels of sample contamination can have disastrous effects on the accurate identification of somatic variation in tumor samples. Detection of sample contamination in DNA is generally based on observation of low frequency variants that suggest more than a single source of DNA is present. This strategy works with standard DNA samples but is especially problematic in solid tumor FFPE samples because there can be huge variations in allele frequency (AF) due to massive copy number changes arising from large gains and losses across the genome. The tremendously variable allele frequencies make detection of contamination challenging. A method not based on individual AF is needed for accurate determination of whether a sample is contaminated and to what degree. Methods We used microhaplotypes to determine whether sample contamination is present. Microhaplotypes are sets of variants on the same sequencing read that can be unambiguously phased. Instead of measuring AF, the number and frequency of microhaplotypes is determined. Contamination detection becomes based on fundamental genomic properties, linkage disequilibrium (LD) and the diploid nature of human DNA, rather than variant frequencies. We optimized microhaplotype content based on 164 single nucleotide variant sets located in genes already sequenced within a cancer panel. Thus, contamination detection uses existing sequence data and does not require sequencing of any extraneous regions. The content is chosen based on LD data from the 1000 Genomes Project to be ancestry agnostic, providing the same sensitivity for contamination detection with samples from individuals of African, East Asian, and European ancestry. Results Detection of contamination at 1% and below is possible using this design. The methods described here can also be extended to other DNA mixtures such as forensic and non-invasive prenatal testing samples where DNA mixes of 1% or less can be similarly detected. Conclusions The microhaplotype method allows sensitive detection of DNA contamination in FFPE tumor samples. These methods provide a foundation for examining DNA mixtures in a variety of contexts. With the appropriate panels and high sequencing depth, low levels of secondary DNA can be detected and this can be valuable in a variety of applications.
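
The diploid argument can be illustrated with a short sketch: a single uncontaminated individual should show at most two microhaplotypes at a locus, so reads supporting a third or fourth haplotype hint at a second DNA source. This is a simplified illustration, not the authors' method. It assumes each read is represented as a mapping from variant position to observed base, and the `min_reads` guard against sequencing errors is a hypothetical parameter.

```python
from collections import Counter


def microhaplotype_counts(reads, positions):
    """Count haplotypes over reads that cover every position in the set.

    reads: list of dicts mapping position -> base observed on that read.
    positions: ordered variant positions that define the microhaplotype.
    """
    counts = Counter()
    for read in reads:
        if all(p in read for p in positions):      # read must span the whole set
            counts["".join(read[p] for p in positions)] += 1
    return counts


def extra_haplotype_fraction(counts, min_reads=2):
    """Fraction of reads carrying haplotypes beyond the two most common.

    Haplotypes seen in fewer than `min_reads` reads are ignored as likely
    sequencing errors (an assumed, adjustable cut-off).
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    ranked = counts.most_common()
    extra = sum(n for _, n in ranked[2:] if n >= min_reads)
    return extra / total


# Hypothetical locus with two variant positions and 100 spanning reads.
toy_reads = (
    [{101: "A", 135: "C"}] * 50
    + [{101: "G", 135: "T"}] * 48
    + [{101: "A", 135: "T"}] * 2    # third haplotype: possible contaminant
)
toy_counts = microhaplotype_counts(toy_reads, [101, 135])
toy_fraction = extra_haplotype_fraction(toy_counts)
```

Because the test counts haplotypes rather than comparing allele frequencies, it is less sensitive to the copy number swings that distort AF in tumor samples, which is the motivation given in the abstract.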


2021 ◽  
Author(s):  
Yuanyuan Qu ◽  
Xueyan Zhang ◽  
Meiyu Wang ◽  
Lina Sun ◽  
Yongzhong Jiang ◽  
...  

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has given rise to multiple variants resistant to therapeutic antibodies. In this study, 12 high-affinity antibodies were generated from convalescent donors in early outbreaks using immune antibody phage display libraries. Of these, two RBD-binding antibodies (F61 and H121) showed high-affinity neutralization of SARS-CoV-2, whereas three S2-targeting antibodies failed to neutralize SARS-CoV-2. Structural analysis showed that F61 recognized a linear epitope located in residues G446-S494, which overlaps with the angiotensin-converting enzyme 2 (ACE2) binding sites, while H121 recognized a conformational epitope located on the side face of the RBD, outside the ACE2-binding domain. Hence, the cocktail of the two antibodies neutralized SARS-CoV-2 more effectively. Importantly, F61 and H121 exhibited efficient neutralizing activity against variants B.1.1.7 and B.1.351, which showed immune escape. Efficient neutralization by F61 and H121 of multiple mutations within the RBD revealed a broad neutralizing activity against SARS-CoV-2 variants, which mitigates the risk of viral escape. Our findings define the basis of therapeutic cocktails of F61 and H121 with broad neutralization and provide guidance for current and future vaccine design, therapeutic antibody development, and antigen diagnosis of SARS-CoV-2 and its novel variants.


2021 ◽  
Author(s):  
Yuka Koizumi ◽  
Sheny Ahmad ◽  
Miyuki Ikeda ◽  
Akiko Yashima-Abo ◽  
Ginny Espina ◽  
...  

Background: Paradoxically, Helicobacter pylori-positive (HP+) advanced gastric cancer patients have a better prognosis than those who are HP-negative (HP-). Immunologic and statistical analyses can be used to verify whether systematic mechanisms modulated by HP are involved in this more favorable outcome. Methods: A total of 658 advanced gastric cancer patients who underwent gastrectomy were enrolled. HP infection, mismatch repair, programmed death-ligand 1 (PD-L1) and CD4/CD8 proteins, and microsatellite instability were analyzed. Overall survival (OS) and relapse-free survival (RFS) rates were analyzed after stratifying clinicopathological factors. Cox proportional hazards regression analysis was performed to identify independent prognostic factors. Results: Among 491 cases that were analyzed, 175 (36%) and 316 (64%) cases were HP+ and HP-, respectively. Analysis of RFS indicated an interaction of HP status among the subgroups for S-1 dose (P=0.0487) and PD-L1 (P=0.016). HP+ patients in the PD-L1- group had significantly higher five-year OS and RFS than HP- patients (81% vs. 68%; P=0.0011; HR 0.477; and 76% vs. 63%; P=0.0011; HR 0.508, respectively). The five-year OS and RFS were also significantly higher for HP+ compared to HP- patients in the PD-L1-/S-1-reduced group (86% vs. 46%; P=0.0014; HR 0.205; 83% vs. 34%; P=0.001; HR 0.190, respectively). Thus, HP status was identified as one of the most potentially important independent factors to predict prolonged survival. Conclusion: Modulation of host immune system function by HP may contribute to prolonged survival in the absence of immune escape mechanisms of gastric cancer.

