binding prediction
Recently Published Documents


TOTAL DOCUMENTS

232
(FIVE YEARS 113)

H-INDEX

25
(FIVE YEARS 8)

2022 ◽  
Vol 9 (1) ◽  
pp. 2270004
Author(s):  
Jon Lundstrøm ◽  
Emma Korhonen ◽  
Frédérique Lisacek ◽  
Daniel Bojar

2021 ◽  
Author(s):  
Cheng Chen ◽  
Zongzhao Qiu ◽  
Zhenghe Yang ◽  
Bin Yu ◽  
Xuefeng Cui

2021 ◽  
pp. 2103807
Author(s):  
Jon Lundstrøm ◽  
Emma Korhonen ◽  
Frédérique Lisacek ◽  
Daniel Bojar

2021 ◽  
Vol 22 (23) ◽  
pp. 12882
Author(s):  
Paul T. Kim ◽  
Robin Winter ◽  
Djork-Arné Clevert

In silico protein–ligand binding prediction is an ongoing area of research in computational chemistry and machine-learning-based drug discovery, as an accurate predictive model could greatly reduce the time and resources needed to detect and prioritize possible drug candidates. Proteochemometric modeling (PCM) attempts to create an accurate model of the protein–ligand interaction space by combining explicit protein and ligand descriptors. This requires information-rich, uniform, and computer-interpretable representations of proteins and ligands. Previous PCM studies rely on pre-defined, handcrafted feature extraction methods, and many use protein descriptors that require alignment or are otherwise specific to a particular group of related proteins. However, recent advances in representation learning have shown that unsupervised machine learning can generate embeddings that outperform complex, human-engineered representations. Several embedding methods for proteins and molecules have been developed based on various language-modeling methods. Here, we demonstrate the utility of these unsupervised representations and compare three protein embeddings and two compound embeddings in a fair manner. We evaluate performance on various splits of a benchmark dataset, as well as on an internal dataset of protein–ligand binding activities, and find that unsupervised-learned representations significantly outperform handcrafted representations.
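
The core PCM idea described above, combining a protein descriptor with a ligand descriptor into one input per pair, can be sketched in a few lines. This is only an illustration, not the authors' pipeline: the embeddings are tiny made-up vectors, the identifiers `P1`, `L1`, etc. are hypothetical, and the protein-held-out split is just one assumed example of the "various splits" the abstract mentions.

```python
from typing import Dict, List, Sequence, Set, Tuple

Vector = List[float]
Pair = Tuple[str, str]  # (protein id, ligand id)


def pcm_features(protein_vec: Sequence[float], ligand_vec: Sequence[float]) -> Vector:
    """Concatenate a protein descriptor and a ligand descriptor into one pair vector."""
    return list(protein_vec) + list(ligand_vec)


def build_pair_matrix(
    pairs: Sequence[Pair],
    protein_emb: Dict[str, Vector],
    ligand_emb: Dict[str, Vector],
) -> List[Vector]:
    """Build one feature row per protein-ligand pair."""
    return [pcm_features(protein_emb[p], ligand_emb[l]) for p, l in pairs]


def split_by_protein(pairs: Sequence[Pair], held_out: Set[str]) -> Tuple[List[Pair], List[Pair]]:
    """Assumed split: every pair involving a held-out protein goes to the test set."""
    train = [pair for pair in pairs if pair[0] not in held_out]
    test = [pair for pair in pairs if pair[0] in held_out]
    return train, test


# Toy, made-up embeddings (real ones would come from a pretrained model).
protein_emb = {"P1": [0.1, 0.2, 0.3], "P2": [0.9, 0.8, 0.7]}
ligand_emb = {"L1": [1.0, 0.0], "L2": [0.0, 1.0]}
pairs = [("P1", "L1"), ("P1", "L2"), ("P2", "L1")]

X = build_pair_matrix(pairs, protein_emb, ligand_emb)
train_pairs, test_pairs = split_by_protein(pairs, held_out={"P2"})
```

Each row of `X` could then be fed to any regression or classification model. Holding out whole proteins tests whether the model generalizes to targets it has never seen, which a random split over pairs does not.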


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Shitao Zhao ◽  
Michiaki Hamada

Abstract. Background: Protein-RNA interactions play key roles in many processes regulating gene expression. To understand the underlying binding preference, ultraviolet cross-linking and immunoprecipitation (CLIP)-based methods have been used to identify the binding sites for hundreds of RNA-binding proteins (RBPs) in vivo. Using these large-scale experimental data to infer RNA binding preference and predict missing binding sites has become a great challenge. Some existing deep-learning models have demonstrated high prediction accuracy for individual RBPs. However, it remains difficult to avoid significant bias due to the experimental protocol. The DeepRiPe method was recently developed to solve this problem by introducing multi-task or multi-label learning into this field. However, this method has not reached an ideal level of prediction power due to its weak neural network architecture. Results: Compared to the DeepRiPe approach, our Multi-resBind method demonstrated substantial improvements on the same large-scale PAR-CLIP dataset, measured as increases in the area under the receiver operating characteristic curve and in average precision. We conducted extensive experiments to evaluate the impact of various types of input data on the final prediction accuracy. The same approach was used to evaluate the effect of loss functions. Finally, a modified integrated gradient was employed to generate attribution maps. The patterns disentangled from relative contributions according to context offer biological insights into the underlying mechanism of protein-RNA interactions. Conclusions: Here, we propose Multi-resBind as a new multi-label deep-learning approach to infer protein-RNA binding preferences and predict novel interactions. The results clearly demonstrate that Multi-resBind is a promising tool to predict unknown binding sites in vivo and gain biological insights into why the neural network makes a given prediction.
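
To make the multi-label framing concrete, the sketch below shows how one RNA sequence can carry one binary label per RBP, and how the area under the ROC curve can be computed per label. It is a minimal illustration under stated assumptions, not the Multi-resBind implementation: the RBP names, label sets, and scores are invented, and the network itself is left out.

```python
from typing import List, Sequence, Set

RNA_ALPHABET = "ACGU"


def one_hot_rna(seq: str) -> List[List[int]]:
    """One-hot encode an RNA sequence; unknown bases (e.g. N) become all zeros."""
    return [[1 if base == letter else 0 for letter in RNA_ALPHABET] for base in seq.upper()]


def multi_label_target(bound_rbps: Set[str], rbp_order: Sequence[str]) -> List[int]:
    """One binary entry per RBP: 1 if that RBP binds the sequence, else 0."""
    return [1 if rbp in bound_rbps else 0 for rbp in rbp_order]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical RBP names and made-up annotations for four sequences.
rbp_order = ["RBP_A", "RBP_B", "RBP_C"]
bound = [{"RBP_A", "RBP_C"}, {"RBP_B"}, {"RBP_A", "RBP_B"}, {"RBP_C"}]
Y = [multi_label_target(b, rbp_order) for b in bound]

# Made-up model scores, one row per sequence and one column per RBP.
S = [[0.9, 0.2, 0.7], [0.1, 0.8, 0.4], [0.6, 0.3, 0.65], [0.3, 0.1, 0.6]]

per_label_auroc = [
    auroc([row[j] for row in Y], [row[j] for row in S]) for j in range(len(rbp_order))
]
```

A single model that predicts the whole label vector at once shares what it learns across RBPs, which is the motivation for multi-label learning given in the abstract.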


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Janik Sielemann ◽  
Donat Wulf ◽  
Romy Schmidt ◽  
Andrea Bräutigam

Abstract. Understanding gene expression will require understanding where regulatory factors bind genomic DNA. The frequently used sequence-based motifs of protein-DNA binding are not predictive, since a genome contains many more binding sites than are actually bound and transcription factors of the same family share similar DNA-binding motifs. Traditionally, these motifs only depict sequence but neglect DNA shape. Since shape may contribute non-linearly and combinatorially to binding, machine learning approaches ought to be able to better predict transcription factor binding. Here we show that a random forest machine learning approach, which incorporates the 3D shape of DNA, enhances binding prediction for all 216 tested Arabidopsis thaliana transcription factors and improves the resolution of differential binding by transcription factor family members which share the same binding motif. We observed that DNA shape features were individually weighted for each transcription factor, even if they shared the same binding sequence.
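
One way to picture "sequence plus shape" as model input is a feature vector that joins a one-hot encoding of the candidate site with per-position shape values. The sketch below is an assumption-laden illustration, not the authors' feature set: the shape track names and all numeric values are placeholders, and the classifier is omitted.

```python
from typing import Dict, List, Sequence

DNA_ALPHABET = "ACGT"


def one_hot_dna(site: str) -> List[int]:
    """Flattened one-hot encoding: four entries per base, in A, C, G, T order."""
    flat: List[int] = []
    for base in site.upper():
        flat.extend(1 if base == letter else 0 for letter in DNA_ALPHABET)
    return flat


def site_features(site: str, shape_tracks: Dict[str, Sequence[float]]) -> List[float]:
    """Sequence one-hot followed by each shape track, in sorted track-name order."""
    features = [float(v) for v in one_hot_dna(site)]
    for name in sorted(shape_tracks):
        track = shape_tracks[name]
        if len(track) != len(site):
            raise ValueError(f"shape track {name!r} must have one value per base")
        features.extend(float(v) for v in track)
    return features


# Two occurrences of the same core motif; shape values are PLACEHOLDERS, not measurements.
site = "CACGTG"
features_a = site_features(site, {
    "minor_groove_width": [5.1, 4.8, 5.0, 5.3, 4.9, 5.2],
    "propeller_twist": [-7.0, -8.1, -6.5, -6.9, -8.0, -7.2],
})
features_b = site_features(site, {
    "minor_groove_width": [4.6, 4.4, 4.9, 5.6, 5.1, 5.0],
    "propeller_twist": [-9.2, -8.8, -6.1, -6.0, -7.5, -7.9],
})
```

The two vectors share the same sequence part but differ in the shape part, so a tree-based model such as a random forest can separate sites that a sequence motif alone treats as identical.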


2021 ◽  
pp. 203-221
Author(s):  
Erick I. Navarro-Delgado ◽  
Marisol Salgado-Albarrán ◽  
Karla Torres-Arciga ◽  
Nicolas Alcaraz ◽  
Ernesto Soto-Reyes ◽  
...  

2021 ◽  
Vol 23 (Supplement_6) ◽  
pp. vi99-vi99
Author(s):  
Darwin Kwok ◽  
Takahide Nejo ◽  
Joseph Costello ◽  
Hideho Okada

Abstract. BACKGROUND: While immunotherapy is profoundly efficacious in certain cancers, its success is limited in cancers with lower mutational burden, such as gliomas. Therefore, investigating neoantigens beyond those from somatic mutations can expand the repertoire of immunotherapy targets. Recent studies detected alternative-splicing (AS) events in various cancer types that could potentially translate into tumor-specific proteins. Our study investigates AS within glioma to identify novel MHC-I-presented neoantigen targets through an integrative transcriptomic and proteomic computational pipeline, complemented by an extensive spatiotemporal analysis of the AS candidates. METHODS: Bulk RNA-seq data from high-tumor-purity TCGA-GBM/LGG samples (n=429) were analyzed through a novel systematic pipeline, and tumor-specific splicing junctions (neojunctions) were identified in silico by cross-referencing with bulk RNA-seq of GTEx normal tissue (n=9,166). Two HLA-binding prediction algorithms were subsequently incorporated to predict peptide sequences with a high likelihood of HLA presentation. The tumor-wide clonality and temporal stability of the candidates were investigated using extensive RNA-seq data from our spatially mapped intratumoral samples and longitudinally collected tumor tissue. Proteomic validation was conducted through mass-spec analysis of the Clinical Proteomic Tumor Analysis Consortium (CPTAC)-GBM repository (n=99). RESULTS: Our analysis of TCGA-GBM/LGG bulk RNA-seq identified 249 putative neojunctions that translate into 222 cancer-specific peptide sequences, which yield 21,489 tumor-specific n-mers (8-11 amino acids in length). Both prediction algorithms concurrently identified 271 n-mers likely to bind and be presented by HLA-A*01:01, HLA-A*02:01, HLA-A*03:01, HLA-A*11:01, or HLA-A*24:02. We confirmed the expression of 15 out of 58 HLA-A*02:01-binding candidates in RNA-seq of HLA-A*02:01+ patient-derived glioma cell lines, with a subset of candidates conserved spatially. Analysis of CPTAC-GBM mass-spec data detected 23 tumor-specific peptides, 5 of which contain n-mers predicted with high likelihood to be HLA-presented. CONCLUSION: Tumor-specific neojunctions identified in our unique integrative pipeline present novel candidate immunotherapy targets for gliomas and offer a new avenue in neoantigen discovery across cancer types.
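
Two steps in the pipeline above are easy to sketch: enumerating every 8- to 11-residue n-mer from a peptide, and keeping only the n-mers that both predictors call as binders. The code below is an illustration only. The peptide is an arbitrary placeholder, and the two sets of "calls" stand in for the output of the two HLA-binding prediction algorithms, which the abstract does not name.

```python
from typing import List, Set


def enumerate_nmers(peptide: str, min_len: int = 8, max_len: int = 11) -> List[str]:
    """Return every contiguous substring of length min_len..max_len (sliding window)."""
    nmers: List[str] = []
    for n in range(min_len, max_len + 1):
        for start in range(0, len(peptide) - n + 1):
            nmers.append(peptide[start:start + n])
    return nmers


def consensus_binders(calls_a: Set[str], calls_b: Set[str]) -> Set[str]:
    """Keep only n-mers that both predictors flagged as likely binders."""
    return calls_a & calls_b


# Arbitrary 12-residue placeholder peptide, not a real neojunction product.
peptide = "ACDEFGHIKLMN"
nmers = enumerate_nmers(peptide)

# Placeholder predictor outputs (hypothetical; no real tool was run here).
calls_a = set(nmers[:3])
calls_b = set(nmers[1:5])
consensus = consensus_binders(calls_a, calls_b)
```

Requiring agreement between two predictors trades sensitivity for specificity, which is consistent with the drop from 21,489 n-mers to 271 consensus candidates reported above.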


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Jing Qu ◽  
Sheng S. Yin ◽  
Han Wang

The metal ion binding of transmembrane proteins (TMPs) plays a fundamental role in biological processes, pharmaceutics, and medicine, but it is hard to obtain enough TMP structures with experimental techniques to discover their binding mechanism comprehensively. To predict the metal ion binding sites of TMPs on a large scale, we present a simple and effective two-stage prediction method, TMP-MIBS, which identifies the corresponding binding residues from TMP sequences. At present, there is no research specifically on metal ion binding prediction for TMPs. Therefore, we compared our model with published tools that do not distinguish TMPs from water-soluble proteins. The results on the independent verification dataset show that TMP-MIBS has superior performance. This paper explores the interaction mechanism between TMPs and metal ions, which helps in understanding the structure and function of TMPs and is of great significance for further constructing transport mechanisms and identifying potential drug targets.


Author(s):  
Gianvito Grasso ◽  
Arianna Di Gregorio ◽  
Bojan Mavkov ◽  
Dario Piga ◽  
Giuseppe Falvo D’Urso Labate ◽  
...  
