structure databases
Recently Published Documents


TOTAL DOCUMENTS

111
(FIVE YEARS 16)

H-INDEX

20
(FIVE YEARS 4)

2022 ◽  
Author(s):  
Emre Brookes ◽  
Mattia Rocco

Abstract Recent spectacular advances by AI programs in 3D structure predictions from protein sequences have revolutionized the field in terms of accuracy and speed. The resulting "folding frenzy" has already produced predicted protein structure databases for the entire human and other organisms' proteomes. However, rapidly ascertaining a predicted structure's reliability based on measured properties in solution should be considered. Shape-sensitive hydrodynamic parameters such as the diffusion and sedimentation coefficients (D0t(20,w), s0(20,w)) and the intrinsic viscosity ([η]) can provide a rapid assessment of the overall structure likeliness, and SAXS would yield the structure-related pair-wise distance distribution function p(r) vs. r. Using the extensively validated UltraScan SOlution MOdeler (US-SOMO) suite, we have calculated from the AlphaFold structures the corresponding D0t(20,w), s0(20,w), [η], p(r) vs. r, and other parameters. Circular dichroism spectra were also computed. The resulting US-SOMO-AF database should aid in rapidly evaluating the consistency in solution of AlphaFold predicted protein structures.


2021 ◽  
Vol 8 (3) ◽  
pp. 103-111
Author(s):  
Krishna R Gupta ◽  
Uttam Patle ◽  
Uma Kabra ◽  
P. Mishra ◽  
Milind J Umekar

Three-dimensional protein structure prediction from amino acid sequence has been a thought-provoking task for decades, but it is of pivotal importance because it provides a better understanding of protein function. In recent years, the methods for prediction of protein structures have advanced considerably. Computational techniques and the growth of protein sequence and structure databases have influenced the laborious protein structure determination process. Still, there is no single method that can predict all protein structures. In this review, we describe the four stages of protein structure determination. We have also explored the current techniques used to uncover protein structure and highlight the most suitable method for a given protein.


2021 ◽  
Vol 19 (1) ◽  
pp. 1171-1182
Author(s):  
Wed Mohammed Ali ALaerjani ◽  
Saraa Abdullah Abu-Melha ◽  
Khalid Ali Khan ◽  
Hamed A. Ghramh ◽  
Ali Yahya A. Alalmie ◽  
...  

Abstract Acacia honey is characterized by high nutritional, antioxidant, antibacterial and immuno-modulatory values. This work investigated the presence of short and cyclic peptides in Acacia and Ziziphus honey samples. Acacia honey samples (Acacia tortilis and Acacia hamulosa) and three Ziziphus honeys (Ziziphus spina-christi) were screened for their short and cyclic peptide contents using LC-MS and chemical structure databases. Moreover, the total protein content was determined using the Bradford method. The A. tortilis honey contained three short peptides: HWCC, DSST, and ECH. The A. hamulosa honey sample contained five short peptides and one cyclic peptide. The short peptides of the A. hamulosa honey were Ac-GMGHG-OH (Ac-MGGHG-OH), Boc-R(Aloc)2-C(Pal)-OH, H-C (1)-NEt2·H-C (1)-NEt2, APAP (AAPP), and GAFQ (deamino-2-pyrid-4-yl-glycyl-dl-alanyl-dl-norvalyl-dl-asparagine). The cyclic peptide of the A. hamulosa honey was cyclo[Aad-RGD-d-F] (cyclo[Aad-Arg-Gly-Asp-d-Phe]). The Ziziphus honey was characterized by the presence of either Almiramide B or Auristatin-6-AQ. A. tortilis, A. hamulosa, and Ziziphus honeys are thus characterized by the presence of short and cyclic peptides, which may contribute to their medicinal values.


Author(s):  
Yury Smirnov ◽  

The Russian National Public Library for Science and Technology has published the Electronic Dictionary of Standardized Abbreviations in the Russian and 25 Foreign European Languages for Bibliographic Records. Its structure, databases, user interface, functionality and purpose are briefly characterized. A link to purchase the dictionary is given.


2020 ◽  
Author(s):  
Ziyan Zhang ◽  
Aria Mansouri Tehrani ◽  
Anton Oliynyk ◽  
Blake Day ◽  
Jakoah Brgoch

We report an ensemble machine-learning method capable of finding new superhard materials by directly predicting the load-dependent Vickers hardness based only on the chemical composition. A total of 1062 experimentally measured load-dependent Vickers hardness data were extracted from the literature and used to train a supervised machine-learning algorithm utilizing boosting, achieving excellent accuracy (R² = 0.97). This new model was then tested by synthesizing and measuring the load-dependent hardness of several unreported disilicides as well as analyzing the predicted hardness of several classic superhard materials. The trained ensemble method was then employed to screen for superhard materials by examining more than 66,000 compounds in crystal structure databases, which showed that only 68 known materials surpass the superhard threshold. The hardness model was then combined with our data-driven phase diagram generation tool to expand the limited number of reported compounds. Eleven ternary borocarbide phase spaces were studied, and more than ten thermodynamically favorable compositions with superhard potential were identified, proving this ensemble model’s ability to find previously unknown superhard materials.


Author(s):  
Kai Dührkop ◽  
Louis Felix Nothias ◽  
Markus Fleischauer ◽  
Marcus Ludwig ◽  
Martin A. Hoffmann ◽  
...  

Abstract Metabolomics experiments can employ non-targeted tandem mass spectrometry to detect hundreds to thousands of molecules in a biological sample. Structural annotation of molecules is typically carried out by searching their fragmentation spectra in spectral libraries or, recently, in structure databases. Annotations are limited to structures present in the library or database employed, prohibiting a thorough utilization of the experimental data. We present a computational tool for systematic compound class annotation: CANOPUS uses a deep neural network to predict 1,270 compound classes from fragmentation spectra, and explicitly targets compounds where neither spectral nor structural reference data are available. CANOPUS even predicts classes for which no MS/MS training data are available. We demonstrate the broad utility of CANOPUS by investigating the effect of the microbial colonization in the digestive system in mice, and through analysis of the chemodiversity of different Euphorbia plants; both uniquely revealing biological insights at the compound class level.


2020 ◽  
Vol 12 (3) ◽  
pp. 103-121 ◽  
Author(s):  
Rijja Hussain Bokhari ◽  
Nooreen Amirjan ◽  
Hyeonsoo Jeong ◽  
Kyung Mo Kim ◽  
Gustavo Caetano-Anollés ◽  
...  

Abstract The candidate phyla radiation (CPR) is a proposed subdivision within the bacterial domain comprising several candidate phyla. CPR organisms are united by small genome and physical sizes, lack several metabolic enzymes, and populate deep branches within the bacterial subtree of life. These features raise intriguing questions regarding their origin and mode of evolution. In this study, we performed a comparative and phylogenomic analysis to investigate CPR origin and evolution. Unlike previous gene/protein sequence-based reports of CPR evolution, we used protein domain superfamilies classified by protein structure databases to resolve the evolutionary relationships of CPR with non-CPR bacteria, Archaea, Eukarya, and viruses. Across all supergroups, CPR shared maximum superfamilies with non-CPR bacteria and were placed as deep branching bacteria in most phylogenomic trees. CPR contributed 1.22% of new superfamilies to bacteria including the ribosomal protein L19e and encoded four core superfamilies that are likely involved in cell-to-cell interaction and establishing episymbiotic lifestyles. Although CPR and non-CPR bacterial proteomes gained common superfamilies over the course of evolution, CPR and Archaea had more common losses. These losses mostly involved metabolic superfamilies. In fact, phylogenies built from only metabolic protein superfamilies separated CPR and non-CPR bacteria. These findings indicate that CPR are bacterial organisms that have probably evolved in an Archaea-like manner via the early loss of metabolic functions. We also discovered that phylogenies built from metabolic and informational superfamilies gave contrasting views of the groupings among Archaea, Bacteria, and Eukarya, which add to the current debate on the evolutionary relationships among superkingdoms.


2019 ◽  
Author(s):  
Marcus Ludwig ◽  
Louis-Félix Nothias ◽  
Kai Dührkop ◽  
Irina Koester ◽  
Markus Fleischauer ◽  
...  

Abstract The confident high-throughput identification of small molecules remains one of the most challenging tasks in mass spectrometry-based metabolomics. SIRIUS has become a powerful tool for the interpretation of tandem mass spectra, and shows outstanding performance for identifying the molecular formula of a query compound, which is the first step of structure identification. Nevertheless, the identification of both molecular formulas for large compounds above 500 Daltons and novel molecular formulas remains highly challenging. Here, we present ZODIAC, a network-based algorithm for the de novo estimation of molecular formulas. ZODIAC reranks SIRIUS’ molecular formula candidates, combining fragmentation tree computation with Bayesian statistics using Gibbs sampling. Through careful algorithm engineering, ZODIAC’s Gibbs sampling is very swift in practice. ZODIAC decreases incorrect annotations 16.2-fold on a challenging plant extract dataset with most compounds above 700 Daltons; we then show improvements on four additional, diverse datasets. Our analysis led to the discovery of compounds with novel molecular formulas such as C24H47BrNO8P which, as of today, is not present in any publicly available molecular structure database.
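
As a small illustration of the mass range discussed in the abstract above, the monoisotopic mass of the reported formula C24H47BrNO8P can be worked out from standard isotope masses. The sketch below is not part of SIRIUS or ZODIAC; the helper names `parse_formula` and `monoisotopic_mass` are hypothetical, the parser assumes a flat formula without brackets or charges, and the mass table covers only the six elements needed here.

```python
import re

# Monoisotopic masses (Da) of the lightest stable isotope of each element
# (79Br for bromine). Only the elements in the example formula are listed.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307400,
    "O": 15.99491462,
    "P": 30.97376200,
    "Br": 78.9183376,
}


def parse_formula(formula):
    """Split a flat formula such as 'C24H47BrNO8P' into element counts."""
    counts = {}
    # Each token is an element symbol followed by an optional count.
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(
        MONOISOTOPIC_MASS[element] * count
        for element, count in parse_formula(formula).items()
    )


mass = monoisotopic_mass("C24H47BrNO8P")
print(f"C24H47BrNO8P: {mass:.4f} Da")
```

Under these assumptions the neutral molecule comes out at roughly 587.22 Da, which places it in the "above 500 Daltons" range that the abstract describes as difficult for molecular formula identification.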

