Theoretical investigation on the geometries of DOTA and DOTA-like complexes and on the transition states of their conformational equilibria

Electronic supplementary information (ESI) available: Tables S1–S3, extended versions of Tables 1–3, and Table S4, comparison of structural features. See http://www.rsc.org/suppdata/nj/b1/b106168m/

Abstract. Motivation: Protein structure determination has primarily been performed using X-ray crystallography. To overcome the high cost, high attrition rate and series of trial-and-error settings, many in-silico methods have been developed to predict crystallization propensities of proteins based on their sequences. However, the majority of these methods build their predictors by extracting features from protein sequences, which is computationally expensive and can explode the feature space. We propose DeepCrystal, a deep learning framework for sequence-based protein crystallization prediction. It uses deep learning to identify proteins which can produce diffraction-quality crystals without the need to manually engineer additional biochemical and structural features from sequence. Our model is based on convolutional neural networks, which can exploit frequently occurring k-mers and sets of k-mers from the protein sequences to distinguish proteins that will result in diffraction-quality crystals from those that will not. Results: Our model surpasses previous sequence-based protein crystallization predictors in terms of recall, F-score, accuracy and Matthews correlation coefficient (MCC) on three independent test sets. DeepCrystal achieves average improvements in recall of 1.4% and 12.1% over its closest competitors, Crysalis II and Crysf, respectively. In addition, DeepCrystal attains average improvements of 2.1% and 6.0% in F-score, 1.9% and 3.9% in accuracy, and 3.8% and 7.0% in MCC relative to Crysalis II and Crysf, respectively, on the independent test sets. Availability and implementation: The standalone source code and models are available at https://github.com/elbasir/DeepCrystal and a web-server is also available at https://deeplearning-protein.qcri.org. Supplementary information: Supplementary data are available at Bioinformatics online.
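The four metrics named in the abstract above (recall, F-score, accuracy and MCC) all follow from the counts of a binary confusion matrix. The snippet below is a minimal sketch of those standard definitions. It is not taken from the DeepCrystal code, and the function name `classification_metrics` is a hypothetical one chosen for illustration.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives. Name and signature are illustrative.
    """
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F-score (F1): harmonic mean of precision and recall.
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "recall": recall,
        "precision": precision,
        "f_score": f_score,
        "accuracy": accuracy,
        "mcc": mcc,
    }
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy for imbalanced test sets.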


Identification of metabolites from tandem mass spectra with a machine learning approach utilizing structural features

Bioinformatics ◽

10.1093/bioinformatics/btz736 ◽

2019 ◽

Cited By ~ 1

Author(s):

Yuanyue Li ◽

Michael Kuhn ◽

Anne-Claude Gavin ◽

Peer Bork

Keyword(s):

Structural Features ◽

Rapid Identification ◽

Supplementary Information ◽

Training Dataset ◽

Fragmentation Pattern ◽

Accurate Identification ◽

Rarefaction Analysis ◽

Tandem Mass Spectra ◽

Random Forest Models ◽

Fragmentation Patterns

Abstract. Motivation: Untargeted mass spectrometry (MS/MS) is a powerful method for detecting metabolites in biological samples. However, fast and accurate identification of the metabolites’ structures from MS/MS spectra is still a great challenge. Results: We present a new analysis method, called SubFragment-Matching (SF-Matching), that is based on the hypothesis that molecules with similar structural features will exhibit similar fragmentation patterns. We combine information on fragmentation patterns of molecules with shared substructures and then use random forest models to predict whether a given structure can yield a certain fragmentation pattern. These models can then be used to score candidate molecules for a given mass spectrum. For rapid identification, we pre-compute such scores for common biological molecular structure databases. Using benchmarking datasets, we find that our method has similar performance to CSI:FingerID and that very high accuracies can be achieved by combining our method with CSI:FingerID. Rarefaction analysis of the training dataset shows that the performance of our method will increase as more experimental data become available. Availability and implementation: SF-Matching is available from http://www.bork.embl.de/Docu/sf_matching. Supplementary information: Supplementary data are available at Bioinformatics online.
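The abstract above says that per-pattern models are used to score candidate molecules for a spectrum, but it does not say how the individual predictions are combined. The sketch below illustrates one possible combination rule, summing log-probabilities over the observed patterns. That rule, the probability floor, and the names `score_candidate` and `rank_candidates` are assumptions for illustration, not the SF-Matching implementation.

```python
import math


def score_candidate(pattern_probs, observed_patterns, floor=1e-6):
    """Sum of log-probabilities that a candidate yields each observed pattern.

    pattern_probs maps a fragmentation-pattern id to a predicted probability
    (for example from a per-pattern classifier). The log-sum rule and the
    floor for unseen patterns are assumptions, not the published method.
    """
    return sum(
        math.log(max(pattern_probs.get(pattern, 0.0), floor))
        for pattern in observed_patterns
    )


def rank_candidates(candidates, observed_patterns):
    """Return (name, score) pairs sorted from best to worst score."""
    scored = [
        (name, score_candidate(probs, observed_patterns))
        for name, probs in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Because the scores depend only on the candidate structure and the pattern, they could in principle be computed ahead of time for a whole structure database, which matches the pre-computation the abstract mentions.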


InterPep2: global peptide–protein docking using interaction surface templates

Bioinformatics ◽

10.1093/bioinformatics/btaa005 ◽

2020 ◽

Vol 36 (8) ◽

pp. 2458-2465 ◽

Cited By ~ 2

Author(s):

Isak Johansson-Åkhe ◽

Claudio Mirabello ◽

Björn Wallner

Keyword(s):

Protein Interactions ◽

Protein Complexes ◽

Structural Features ◽

Protein Docking ◽

Supplementary Information ◽

Peptide Ligand ◽

Protein Protein Interactions ◽

Intrinsically Disordered ◽

Intrinsically Disordered Regions ◽

Improved Performance

Abstract. Motivation: Interactions between proteins and peptides or peptide-like intrinsically disordered regions are involved in many important biological processes, such as gene expression and cell life-cycle regulation. Experimentally determining the structure of such interactions is time-consuming and difficult because of the inherent flexibility of the peptide ligand. Although several prediction methods exist, most are limited in performance or availability. Results: InterPep2 is a freely available method for predicting the structure of peptide–protein interactions. Improved performance is obtained by using templates from both peptide–protein and regular protein–protein interactions, and by a random forest trained to predict the DockQ-score for a given template using sequence and structural features. When tested on 252 bound peptide–protein complexes from structures deposited after the complexes used in the construction of the training and template sets of InterPep2, InterPep2-Refined correctly positioned 67 peptides within 4.0 Å LRMSD among the top 10 predictions, similar to another state-of-the-art template-based method, which positioned 54 peptides correctly. However, InterPep2 displays a superior ability to evaluate the quality of its own predictions. On a previously established set of 27 non-redundant unbound-to-bound peptide–protein complexes, InterPep2 performs on par with leading methods. The extended InterPep2-Refined protocol managed to correctly model 15 of these complexes within 4.0 Å LRMSD among the top 10 predictions, without using templates from homologs. In addition, combining the template-based predictions from InterPep2 with ab initio predictions from PIPER-FlexPepDock resulted in 22% more near-native predictions compared to the best single method (22 versus 18). Availability and implementation: The program is available from: http://wallnerlab.org/InterPep2. Supplementary information: Supplementary data are available at Bioinformatics online.
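The success criterion quoted above (a peptide placed within 4.0 Å LRMSD among the top 10 ranked models) and the "22% more" figure can be reproduced with a few lines of arithmetic. This is a minimal sketch. The input layout (a dict of ranked LRMSD lists) and the function names are hypothetical and not part of InterPep2.

```python
def top_n_successes(ranked_lrmsd, n=10, cutoff=4.0):
    """Count complexes with at least one model within `cutoff` Å in the top n.

    ranked_lrmsd maps a complex id to the LRMSD values of its models,
    ordered from best-ranked to worst-ranked. The layout is an assumption.
    """
    return sum(
        1
        for values in ranked_lrmsd.values()
        if any(value <= cutoff for value in values[:n])
    )


def relative_gain(new, baseline):
    """Fractional improvement of `new` over `baseline`."""
    return (new - baseline) / baseline


# 22 near-native predictions versus 18 for the best single method:
# (22 - 18) / 18 is about 0.222, i.e. the 22% quoted in the abstract.
gain = relative_gain(22, 18)
```

Note that a good model ranked eleventh or lower does not count as a success under this criterion, so the measure rewards scoring quality as well as sampling.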


Combined Experimental and Theoretical Investigation of Ligand and Anion Controlled Complex Formation with Unprecedented Structural Features and Photoluminescence Properties of Zinc(II) Complexes

Crystal Growth & Design ◽

10.1021/cg500717n ◽

2014 ◽

Vol 14 (8) ◽

pp. 4111-4123 ◽

Cited By ~ 24

Author(s):

Prateeti Chakraborty ◽

Jaydeep Adhikary ◽

Sugata Samanta ◽

Daniel Escudero ◽

Abril C. Castro ◽

...

Keyword(s):

Complex Formation ◽

Theoretical Investigation ◽

Structural Features ◽

Photoluminescence Properties


A comprehensive theoretical investigation of the transition states and a proposed kinetic model for the cinchoninium ion asymmetric phase-transfer catalyzed alkylation reaction

Journal of Molecular Catalysis A Chemical ◽

10.1016/j.molcata.2016.03.009 ◽

2016 ◽

Vol 417 ◽

pp. 192-199 ◽

Cited By ~ 12

Author(s):

Ernane F. Martins ◽

Josefredo R. Pliego

Keyword(s):

Kinetic Model ◽

Theoretical Investigation ◽

Phase Transfer ◽

Transition States ◽

Alkylation Reaction


Theoretical study of hydrogen abstraction from dimethyl ether and methyl tert-butyl ether by hydroxyl radical

Electronic supplementary information (ESI) available: optimized structural parameters, energies, zero point energies and dipole moments for reactants, products, and transition states (Tables S1–8). See http://www.rsc.org/suppdata/cp/b1/b109970c/

Physical Chemistry Chemical Physics ◽

10.1039/b109970c ◽

2002 ◽

Vol 4 (10) ◽

pp. 1797-1806 ◽

Cited By ~ 18

Author(s):

F. Atadinç ◽

C. Selçuki ◽

L. Sari ◽

V. Aviyente

Keyword(s):

Dimethyl Ether ◽

Theoretical Study ◽

Dipole Moments ◽

Structural Parameters ◽

Hydrogen Abstraction ◽

Transition States ◽

Zero Point ◽

Supplementary Information ◽

Methyl Tert Butyl Ether ◽

Butyl Ether


ConPlot: web-based application for the visualization of protein contact maps integrated with other data

Bioinformatics ◽

10.1093/bioinformatics/btab049 ◽

2021 ◽

Author(s):

Filomeno Sánchez Rodríguez ◽

Shahram Mesdaghi ◽

Adam J Simpkin ◽

J Javier Burgos-Mármol ◽

David L Murphy ◽

...

Keyword(s):

Empty Space ◽

Structural Features ◽

Supplementary Information ◽

Contact Map ◽

Web Based ◽

Contact Maps ◽

File Formats ◽

Residue Contacts ◽

Contact Data ◽

Protein Contact Maps

Abstract. Summary: Covariance-based predictions of residue contacts and inter-residue distances are an increasingly popular data type in protein bioinformatics. Here we present ConPlot, a web-based application for convenient display and analysis of contact maps and distograms. Integration of predicted contact data with other predictions is often required to facilitate inference of structural features. ConPlot can therefore use the empty space near the contact map diagonal to display multiple coloured tracks representing other sequence-based predictions. Popular file formats are natively read and bespoke data can also be flexibly displayed. This novel visualization will enable easier interpretation of predicted contact maps. Availability and implementation: ConPlot is available online at www.conplot.org, along with documentation and examples. Alternatively, ConPlot can be installed and used locally using the docker image from the project’s Docker Hub repository. ConPlot is licensed under the BSD 3-Clause. Supplementary information: Supplementary data are available at Bioinformatics online.
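The idea of reusing the empty space near the contact-map diagonal for per-residue annotation can be illustrated with a small text-only sketch. It does not use ConPlot or its file formats. The function `build_map`, the one-character track encoding, and the `band` parameter are all hypothetical choices made for this example.

```python
def build_map(length, contacts, track, band=0):
    """Render a symmetric contact map with a per-residue track on the diagonal.

    contacts: iterable of 0-based (i, j) residue pairs predicted to be in contact.
    track: string of `length` one-character labels (e.g. secondary structure).
    band: half-width of the diagonal band reserved for the track; any contact
          with |i - j| <= band is overwritten by the track label.
    """
    grid = [["." for _ in range(length)] for _ in range(length)]
    # Contacts are symmetric, so mark both triangles of the map.
    for i, j in contacts:
        grid[i][j] = "x"
        grid[j][i] = "x"
    # Fill the diagonal band with the label of the row's residue.
    for i in range(length):
        for j in range(max(0, i - band), min(length, i + band + 1)):
            grid[i][j] = track[i]
    return ["".join(row) for row in grid]


rows = build_map(4, [(0, 3)], "HHEE")
# rows == ["H..x", ".H..", "..E.", "x..E"]
```

Cells close to the diagonal correspond to residues that are neighbours in sequence, so little information is lost by drawing annotation there instead.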
