Predicting HIV-1 Protease Cleavage Sites With Positive-Unlabeled Learning

2021, Vol 12
Author(s): Zhenfeng Li, Lun Hu, Zehai Tang, Cheng Zhao

Understanding the substrate specificity of HIV-1 protease plays an essential role in the prevention of HIV infection. A variety of computational models have thus been developed to predict substrate sites that are cleaved by HIV-1 protease, but most of them follow a supervised learning scheme that builds classifiers by treating experimentally verified cleavable sites as positive samples and unknown sites as negative samples. However, the negative set can contain noise, because some of the unknown sites may be false negatives. The resulting bias makes these classifiers less accurate than they could be. In this work, unknown substrate sites are regarded as unlabeled samples instead of negative ones. We propose a novel positive-unlabeled learning algorithm, namely PU-HIV, for the effective prediction of HIV-1 protease cleavage sites. Features used by PU-HIV are encoded from different perspectives of substrate sequences, including amino acid identities, coevolutionary patterns and chemical properties. By adjusting the weights of the errors generated by positive and unlabeled samples, a biased support vector machine classifier can be built to complete the prediction task. In comparison with state-of-the-art prediction models, benchmarking experiments using cross-validation and independent tests demonstrated the superior performance of PU-HIV in terms of AUC, PR-AUC, and F-measure. With PU-HIV, it is therefore possible to identify previously unknown but physiologically existing substrate sites that can be cleaved by HIV-1 protease, providing valuable insights into the design of novel HIV-1 protease inhibitors for HIV treatment.
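The biased-SVM idea in the abstract (errors on labeled positives penalized more heavily than errors on unlabeled samples) can be illustrated with a minimal sketch. This is not the PU-HIV implementation: the 8-residue window, the one-hot encoding (amino acid identity only, with no coevolutionary or chemical features), the cost values `c_pos` and `c_unl`, the plain subgradient-descent trainer, and all peptides below are assumptions made up for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_octamer(peptide):
    """Encode an 8-residue window as a 160-dim 0/1 vector (identity features only)."""
    if len(peptide) != 8:
        raise ValueError("expected an 8-residue peptide")
    vec = [0.0] * (8 * len(AMINO_ACIDS))
    for pos, residue in enumerate(peptide):
        vec[pos * len(AMINO_ACIDS) + AMINO_ACIDS.index(residue)] = 1.0
    return vec


def decision_value(w, b, x):
    """Signed score of the linear classifier; larger means more 'cleavable'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_biased_svm(X, s, c_pos=10.0, c_unl=1.0, lam=0.01,
                     lr=0.01, epochs=100, seed=0):
    """Linear SVM with a class-weighted hinge loss, fit by subgradient descent.

    s[i] == 1 marks a labeled positive; s[i] == 0 marks an unlabeled sample,
    which is treated as a negative but with the smaller error cost c_unl.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            y = 1.0 if s[i] == 1 else -1.0
            cost = c_pos if s[i] == 1 else c_unl
            margin = y * decision_value(w, b, X[i])
            # L2 regularization step (shrinks every weight toward zero).
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Hinge-loss subgradient step, scaled by the per-class cost.
                w = [wj + lr * cost * y * xj for wj, xj in zip(w, X[i])]
                b += lr * cost * y
    return w, b


# Synthetic toy peptides (NOT real substrates): an "FP" motif marks positives.
positives = ["AAAFPAAA", "CCCFPCCC", "DDDFPDDD"]
# The unlabeled pool hides one motif-bearing peptide among plain ones.
unlabeled = ["GGGGGGGG", "SSSSSSSS", "TTTTTTTT", "KKKFPKKK"]

X = [one_hot_octamer(p) for p in positives + unlabeled]
s = [1] * len(positives) + [0] * len(unlabeled)
w, b = train_biased_svm(X, s)

for query in ["VVVFPVVV", "VVVVVVVV"]:  # held-out toy queries
    print(query, round(decision_value(w, b, one_hot_octamer(query)), 3))
```

Because mistakes on labeled positives cost more than mistakes on unlabeled samples, a positive hidden in the unlabeled pool has less power to pull the shared motif weights toward the negative side than it would in a standard SVM with equal costs. In practice the two costs would be tuned, for example by cross-validation.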

2009, Vol 84 (3), pp. 1513-1526
Author(s): Bin Yu, Dora P. A. J. Fonseca, Sara M. O'Rourke, Phillip W. Berman

ABSTRACT The identification of vaccine immunogens able to elicit broadly neutralizing antibodies (bNAbs) is a major goal in HIV vaccine research. Although it has been possible to produce recombinant envelope glycoproteins able to adsorb bNAbs from HIV-positive sera, immunization with these proteins has failed to elicit antibody responses effective against clinical isolates of HIV-1. Thus, the epitopes recognized by bNAbs are present on recombinant proteins, but they are not immunogenic. These results led us to consider the possibility that changes in the pattern of antigen processing might alter the immune response to the envelope glycoprotein to better elicit protective immunity. In these studies, we have defined protease cleavage sites on HIV gp120 recognized by three major human proteases (cathepsins L, S, and D) important for antigen processing and presentation. Remarkably, six of the eight sites identified in gp120 were highly conserved and clustered in regions of the molecule associated with receptor binding and/or the binding of neutralizing antibodies. These results suggested that HIV may have evolved to take advantage of major histocompatibility complex (MHC) class II antigen processing enzymes in order to evade or direct the antiviral immune response.


2014, Vol 31 (8), pp. 1204-1210
Author(s): T. Rognvaldsson, L. You, D. Garwicz
