Virtual Database Screening Algorithm for the Detection of Practically Valuable Proteins of Bovine and Pig Lungs

Abstract: An algorithm for virtual database screening has been developed to detect proteins of practical significance for the pharmaceutical and biotechnological industries. The algorithm was written in the Python programming language, v. 3.6.5, using the Notepad++ editor. The UniProt database served as the source of information about the structure of the proteins comprising the bovine and pig lung proteomes, and the open DrugBank database was used for the subsequent search for matches in the protein structures. The virtual screening detected more than 5,500 proteins present in the proteomes of bovine and pig lungs; an assessment of practical significance was absent for 99% of these proteins, although a manual search of the DrugBank database showed that some of them are components of drugs. The algorithm also made it possible to find drug target proteins in the human lung proteome that are similar to those contained in the bovine (46) and pig (84) lung proteomes. Pairwise alignment of amino acid sequences was used to compare the human and animal target proteins. Overall, the developed algorithm identified, to a first approximation, the proteins of practical significance that are included to varying degrees in the lung proteomes of farm animals. In the future, more detailed screening will be possible through optimization of the algorithm and the use of closed databases, which will provide more complete information about practically valuable proteins for biotechnology and medicine.

Keywords: proteome, database, DrugBank, UniProt, virtual screening, Python, lungs

The work was carried out with financial support from the Russian Foundation for Fundamental Research and the administration of the Volgograd region within the framework of scientific project No. 18-44-343003.
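The pairwise comparison of amino acid sequences mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not say which alignment method or scoring scheme was used, so the Needleman-Wunsch global alignment below, the function names `needleman_wunsch` and `percent_identity`, and the simple match/mismatch/gap scores are all assumptions chosen for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align sequences a and b; return the two gapped strings.

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns holding the same residue in both rows."""
    if not aligned_a:
        return 0.0
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    # Toy sequences, not real UniProt entries.
    row_a, row_b = needleman_wunsch("MKT", "MT")
    print(row_a, row_b, round(percent_identity(row_a, row_b), 1))
```

In a real screening pipeline a substitution matrix such as BLOSUM62 and an established alignment library would normally replace the fixed scores used here; the sketch only shows how a similarity figure between a human target protein and an animal protein could be derived.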

Integrative analysis of histomorphology, transcriptome and whole genome resequencing identified DIO2 gene as a crucial gene for the protuberant knob located on forehead in geese

BMC Genomics ◽

10.1186/s12864-021-07822-9 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Yan Deng ◽

Shenqiang Hu ◽

Chenglong Luo ◽

Qingyuan Ouyang ◽

Li Li ◽

...

Keyword(s):

Molecular Mechanisms ◽

Protein Secondary Structure ◽

Genomic Analysis ◽

Production Performance ◽

Genetic Mechanism ◽

Farm Animals ◽

Practical Significance ◽

Genome Resequencing ◽

Genomic Analyses ◽

Whole Genome Resequencing

Abstract Background During domestication, remarkable changes in behavior, morphology, physiology and production performance have taken place in farm animals. As one of the most economically important poultry species, the goose has a distinctive feature called the knob, which is located at the base of the upper bill. However, neither the histomorphology nor the genetic mechanism of the knob phenotype has been revealed in geese. Results In the present study, integrated radiographic, histological, transcriptomic and genomic analyses revealed the histomorphological characteristics and genetic mechanism of the goose knob. The knob skin was developed, and radiographic results demonstrated that the knob bone was obviously protuberant and pneumatized. Histologically, there were major structural differences in both the knob skin and bone between geese with a knob (knob geese) and those without one (non-knob geese). Transcriptome analysis identified 592 and 952 differentially expressed genes in knob skin and bone, respectively; these genes were significantly enriched in the PPAR and calcium pathways, respectively, which revealed the molecular mechanisms behind the histomorphological differences between knob and non-knob geese. Furthermore, integrated transcriptomic and genomic analysis identified 17 and 21 candidate genes associated with knob formation in the skin and bone, respectively. Of these, the DIO2 gene could play a pivotal role in determining the knob phenotype in geese, because a non-synonymous mutation (c.642,923 G > A, P265L) changed the secondary structure of the DIO2 protein in knob geese, and Sanger sequencing further showed that the AA genotype was present in the population of knob geese and was prevalent in a crossing population that had been artificially selected for 10 generations.
Conclusions This study was the first to uncover the knob histomorphological characteristics and genetic mechanism in geese, and DIO2 was identified as the crucial gene associated with the knob phenotype. These data not only expand and enrich our knowledge on the molecular mechanisms underlying the formation of head appendages in both mammalian and avian species, but also have important theoretical and practical significance for goose breeding.

Predicting secondary structures, contact numbers, and residue-wise contact orders of native protein structures from amino acid sequences using critical random networks

BIOPHYSICS ◽

10.2142/biophysics.1.67 ◽

2005 ◽

Vol 1 ◽

pp. 67-74 ◽

Cited By ~ 14

Author(s):

Akira R. Kinjo ◽

Ken Nishikawa

Keyword(s):

Amino Acid ◽

Protein Structures ◽

Secondary Structures ◽

Amino Acid Sequences ◽

Random Networks ◽

Native Protein

In-silico prediction and modeling of the Entamoeba histolytica proteins: Serine-rich Entamoeba histolytica protein and 29 kDa Cysteine-rich protease

PeerJ ◽

10.7717/peerj.3160 ◽

2017 ◽

Vol 5 ◽

pp. e3160 ◽

Cited By ~ 5

Author(s):

Kumar Manochitra ◽

Subhash Chandra Parija

Keyword(s):

Amino Acid ◽

Structure Prediction ◽

Tertiary Structure ◽

Protein Structures ◽

Amino Acid Sequences ◽

Treatment Modalities ◽

Bioinformatic Tools ◽

Complex Protein ◽

A Cell ◽

Quaternary Structures

Background: Amoebiasis is the third most common parasitic cause of morbidity and mortality, particularly in countries with poor hygienic settings. There exists an ambiguity in the diagnosis of amoebiasis, and hence there arises a necessity for a better diagnostic approach. Serine-rich Entamoeba histolytica protein (SREHP), peroxiredoxin and Gal/GalNAc lectin are pivotal in E. histolytica virulence and are extensively studied as diagnostic and vaccine targets. For elucidating the cellular function of these proteins, details regarding their respective quaternary structures are essential. However, studies in this aspect are scant. Hence, this study was carried out to predict the structure of these target proteins and characterize them structurally as well as functionally using appropriate in-silico methods. Methods: The amino acid sequences of the proteins were retrieved from the National Centre for Biotechnology Information database and aligned using ClustalW. Bioinformatic tools were employed in the secondary structure and tertiary structure prediction. The predicted structure was validated, and final refinement was carried out. Results: The protein structures predicted by I-TASSER were found to be more accurate than those predicted by Phyre2, based on validation using the SAVES server. The prediction suggests SREHP to be an extracellular protein and peroxiredoxin a peripheral membrane protein, while Gal/GalNAc lectin was found to be a cell-wall protein. Signal peptides were found in the amino acid sequences of SREHP and Gal/GalNAc lectin, whereas they were not present in the peroxiredoxin sequence. Gal/GalNAc lectin showed better antigenicity than the other two proteins studied. All three proteins exhibited similarity in their structures and were mostly composed of loops. Discussion: The structures of SREHP and peroxiredoxin were predicted successfully, while the structure of Gal/GalNAc lectin could not be predicted as it was a complex protein composed of sub-units. Also, this protein showed less similarity with the available structural homologs. The quaternary structures of SREHP and peroxiredoxin predicted from this study would provide better structural and functional insights into these proteins and may aid in the development of newer diagnostic assays or enhancement of the available treatment modalities.

Identification of Immune Related LRR-Containing Genes in Maize (Zea mays L.) by Genome-Wide Sequence Analysis

International Journal of Genomics ◽

10.1155/2015/231358 ◽

2015 ◽

Vol 2015 ◽

pp. 1-11 ◽

Cited By ~ 10

Author(s):

Wei Song ◽

Baoqiang Wang ◽

Xinghua Li ◽

Jianfen Wei ◽

Ling Chen ◽

...

Keyword(s):

Zea Mays ◽

Plant Disease Resistance ◽

Interleukin 1 ◽

Protein Structures ◽

Gene Clusters ◽

Amino Acid Sequences ◽

Leucine Rich Repeat ◽

Immune Receptors ◽

Genome Wide ◽

First Time

A large number of immune receptors consist of nucleotide binding site-leucine rich repeat (NBS-LRR) proteins and leucine rich repeat-receptor-like kinases (LRR-RLK) that play a crucial role in plant disease resistance. Although many NBS-LRR genes have been previously identified in Zea mays, there are no reports on identifying NBS-LRR genes encoded in the N-terminal Toll/interleukin-1 receptor (TIR) motif and identifying genome-wide LRR-RLK genes. In the present study, 151 NBS-LRR genes and 226 LRR-RLK genes were identified after performing bioinformatics analysis of the entire maize genome. Of these identified genes, 64 NBS-LRR genes and four TIR-NBS-LRR genes were identified for the first time. The NBS-LRR genes are unevenly distributed on each chromosome with gene clusters located at the distal end of each chromosome, while LRR-RLK genes have a random chromosomal distribution with more paired genes. Additionally, six LRR-RLK/RLPs including FLS2, PSY1R, PSKR1, BIR1, SERK3, and Cf5 were characterized in Zea mays for the first time. Their predicted amino acid sequences have protein structures similar to those of their respective homologues in other plants, indicating that these maize LRR-RLK/RLPs have the same functions as their homologues and act as immune receptors. The identified gene sequences would assist in the study of their functions in maize.

Odontogenic Keratocyst

AL-Kindy College Medical Journal ◽

10.47723/kcmj.v17i2.266 ◽

2021 ◽

Vol 17 (2) ◽

pp. 52-61

Author(s):

Marwa A. Hamied ◽

Salwa M. Al-Shaikhani ◽

Zana D. Ali

Keyword(s):

English Language ◽

Odontogenic Keratocyst ◽

Hard Copy ◽

English Studies ◽

Eligibility Criteria ◽

Virtual Database ◽

Manual Search ◽

Language Studies ◽

Wait And See ◽

Early Diagnose

Purpose: to review in detail various aspects of odontogenic keratocyst, emphasizing its recent nomenclature, clinical and histopathological features, recurrence, and management. Methods: To achieve the objective of this review, a manual search was done in hard copy books of oral and maxillofacial pathology, and an electronic search was done on Google, in oral and maxillofacial pathology e-books, and in virtual database sites such as PubMed, ResearchGate, Academia, and Google Scholar using the descriptors: odontogenic cyst, kerato odontogenic tumor, odontogenic keratocyst, and jaws cystic lesion. The eligibility criteria for selecting articles were: to be in the English language, and to be studies published in journals or indexed in these databases until 2021. Exclusion criteria were: articles in any language other than English, studies duplicated between the databases, studies whose theme did not address the objective proposed in this review, and those not available in the digital environment. Data collection occurred from October to December 2020, followed by a thorough evaluation of the studies found, including an exploratory, selective, analytical, and interpretative reading. Summary and conclusions: the odontogenic keratocyst is noteworthy because of its unusual growth pattern, its tendency to recur, and its association with an inherited syndrome. The renaming of odontogenic keratocysts as keratocystic odontogenic tumors has been one of the most debatable changes in the terminology of odontogenic lesions in recent years. Early diagnosis of this lesion is important so that more conservative treatment can be performed. A wait-and-see policy, with yearly follow-up for the first five years and every two years after that, is strongly advocated.

The role of water and protein flexibility in the structure-based virtual screening of allosteric GPCR modulators: an mGlu5 receptor case study

Journal of Computer-Aided Molecular Design ◽

10.1007/s10822-019-00224-w ◽

2019 ◽

Vol 33 (9) ◽

pp. 787-797 ◽

Cited By ~ 1

Author(s):

Zoltán Orgován ◽

György G. Ferenczy ◽

György M. Keserű

Keyword(s):

Virtual Screening ◽

Ligand Binding ◽

Conformational Changes ◽

Metabotropic Glutamate Receptor ◽

Protein Structures ◽

Protein Flexibility ◽

Water Molecules ◽

Large Set ◽

Allosteric Modulators

Abstract Stabilizing unique receptor conformations, allosteric modulators of G-protein coupled receptors (GPCRs) might open novel treatment options due to their new pharmacological action, their enhanced specificity and selectivity in both binding and signaling. Ligand binding occurs at intrahelical allosteric sites and involves significant induced fit effects that include conformational changes in the local protein environment and water networks. Based on the analysis of available crystal structures of metabotropic glutamate receptor 5 (mGlu5) we investigated these effects in the binding of mGlu5 receptor negative allosteric modulators. A large set of retrospective virtual screens revealed that the use of multiple protein structures and the inclusion of selected water molecules improves virtual screening performance compared to conventional docking strategies. The role of water molecules and protein flexibility in ligand binding can be taken into account efficiently by the proposed docking protocol that provided reasonable enrichment of true positives. This protocol is expected to be useful also for identifying intrahelical allosteric modulators for other GPCR targets.
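The abstract above reports "reasonable enrichment of true positives" in retrospective virtual screens but does not state which metric was used. As a hedged illustration only, the sketch below computes the enrichment factor, one commonly used measure of screening performance: the hit rate among the top-ranked fraction of compounds divided by the hit rate in the whole library. The function name `enrichment_factor`, its parameters, and the toy data are assumptions and do not come from the paper.

```python
def enrichment_factor(scores, is_active, fraction=0.1, higher_is_better=True):
    """Enrichment factor of a ranked screen at the given top fraction.

    scores           -- one score per compound (hypothetical docking scores)
    is_active        -- one bool per compound, True for known actives
    fraction         -- share of the ranked list counted as "top"
    higher_is_better -- set to False when lower scores mean better ranking
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    n = len(scores)
    total_actives = sum(1 for flag in is_active if flag)
    if n == 0 or total_actives == 0:
        raise ValueError("need at least one compound and one active")

    # Size of the top slice, never smaller than one compound.
    n_top = max(1, int(round(n * fraction)))

    # Rank compound indices from best to worst score.
    order = sorted(range(n), key=lambda i: scores[i], reverse=higher_is_better)
    hits_in_top = sum(1 for i in order[:n_top] if is_active[i])

    # Hit rate in the top slice relative to the hit rate of the whole library.
    return (hits_in_top / n_top) / (total_actives / n)


if __name__ == "__main__":
    # Toy library of ten compounds with two actives ranked first.
    toy_scores = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
    toy_labels = [True, True] + [False] * 8
    print(enrichment_factor(toy_scores, toy_labels, fraction=0.2))
```

A value of 1 corresponds to random ranking, so values well above 1 at small fractions indicate that known actives are concentrated near the top of the ranked list.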

The Basis for Target-Based Virtual Screening: Protein Structures

Methods and Principles in Medicinal Chemistry - Virtual Screening ◽

10.1002/9783527633326.ch4 ◽

2011 ◽

pp. 87-114 ◽

Cited By ~ 6

Author(s):

Jason C. Cole ◽

Oliver Korb ◽

Tjelvar S. G. Olsson ◽

John Liebeschuetz

Keyword(s):

Virtual Screening ◽

Protein Structures

Multiple protein structures and multiple ligands: effects on the apparent goodness of virtual screening results

Journal of Computer-Aided Molecular Design ◽

10.1007/s10822-008-9168-9 ◽

2008 ◽

Vol 22 (3-4) ◽

pp. 257-265 ◽

Cited By ~ 39

Author(s):

Robert P. Sheridan ◽

Georgia B. McGaughey ◽

Wendy D. Cornell

Keyword(s):

Virtual Screening ◽

Protein Structures ◽

Multiple Protein ◽

Multiple Ligands

Sequence Analysis and Structure Prediction of Malaysia SARS-CoV-2 Strain’s Structural and Accessory Proteins

Biointerface Research in Applied Chemistry ◽

10.33263/briac123.32593304 ◽

2021 ◽

Vol 12 (3) ◽

pp. 3259-3304

Keyword(s):

Structure Prediction ◽

Active Sites ◽

Homo Sapiens ◽

Functional Characterization ◽

Protein Structures ◽

Three Dimensional ◽

Experimental Models ◽

Amino Acid Sequences ◽

Accessory Proteins ◽

Evolutionary Analysis

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which was transmitted from animals to humans, became a life-threatening pandemic in 2020. Scientists are currently testing several drugs to eradicate the COVID-19 outbreak. However, no 100% effective drug or vaccine against SARS-CoV-2 has been discovered so far. In this study, we explored the structure prediction and functional analysis of the structural and accessory proteins of 75 Malaysian SARS-CoV-2 strains in the absence of experimental models. Physicochemical analysis, secondary structure analysis, structure prediction, functional characterization, active site identification, and evolutionary analysis were performed based on the amino acid sequences retrieved from the National Centre for Biotechnology Information (NCBI). Three-dimensional (3-D) protein structures were built using SWISS-MODEL. The quality of the protein models was verified with the ERRAT, PROCHECK, and Verify 3D tools. Active site prediction revealed high-potential active sites of the proteins where an anti-viral drug or vaccine may bind and inhibit viral activities. Molecular phylogenetic analysis of the ORF10, ORF8, and ORF6 proteins from five different species was performed. The results of this analysis showed that Homo sapiens SARS-CoV-2 had high genetic similarity with the bat coronavirus. These analyses may help in designing structure-based anti-viral drugs or in developing potential vaccines for SARS-CoV-2.

Multi-Scale Structural Analysis of Proteins by Deep Semantic Segmentation

10.1101/474627 ◽

2018 ◽

Author(s):

Raphael R. Eguchi ◽

Po-Ssu Huang

Keyword(s):

Image Classification ◽

Protein Design ◽

Large Scale ◽

De Novo ◽

Protein Structures ◽

Semantic Segmentation ◽

Amino Acid Sequences ◽

Structural Quality ◽

Small Subset ◽

Structural Prediction

Abstract: Recent advancements in computational methods have facilitated large-scale sampling of protein structures, leading to breakthroughs in protein structural prediction and enabling de novo protein design. Establishing methods to identify candidate structures that can lead to native folds or designable structures remains a challenge, since few existing metrics capture high-level structural features such as architectures, folds, and conformity to conserved structural motifs. Convolutional Neural Networks (CNNs) have been successfully used in semantic segmentation, a subfield of image classification in which a class label is predicted for every pixel. Here, we apply semantic segmentation to protein structures as a novel strategy for fold identification and structural quality assessment. We represent protein structures as 2D α-carbon distance matrices (“contact maps”), and train a CNN that assigns each residue in a multi-domain protein to one of 38 architecture classes designated by the CATH database. Our model performs exceptionally well, achieving a per-residue accuracy of 90.8% on the test set (95.0% average accuracy over all classes; 87.8% average within-structure accuracy). The unique aspect of our classifier is that it encodes sequence-agnostic residue environments from the PDB and can assess structural quality as quantitative probabilities. We demonstrate that individual class probabilities can be used as a metric that indicates the degree to which a randomly generated structure assumes a specific fold, as well as a metric that highlights non-conformative regions of a protein belonging to a known class. These capabilities yield a powerful tool for guiding structural sampling for both structural prediction and design. Significance: Recent computational advances have allowed researchers to predict the structure of many proteins from their amino acid sequences, as well as to design new sequences that fold into predefined structures. However, these tasks are often challenging because they require selection of a small subset of promising structural models from a large pool of stochastically generated ones. Here, we describe a novel approach to protein model selection that uses 2D image classification techniques to evaluate 3D protein models. Our method can be used to select structures based on the fold that they adopt, and can also be used to identify regions of low structural quality. These capabilities yield a powerful tool for both protein design and structure prediction.
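The 2D α-carbon distance matrix that the abstract above uses as network input can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' preprocessing code: the function names `ca_distance_matrix` and `binary_contact_map` are hypothetical, the coordinates are toy values rather than PDB data, and the 8 Å cutoff used for the optional binary map is a common convention that the abstract does not mention.

```python
import math


def ca_distance_matrix(coords):
    """Pairwise Euclidean distances between alpha-carbon coordinates.

    coords -- list of (x, y, z) tuples, one per residue, in angstroms
    Returns a symmetric list-of-lists matrix with zeros on the diagonal.
    """
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def binary_contact_map(distances, cutoff=8.0):
    """Threshold a distance matrix: 1 where residues lie within the cutoff.

    The 8 angstrom default is an assumed convention, not a value from the paper.
    """
    return [[1 if d <= cutoff else 0 for d in row] for row in distances]


if __name__ == "__main__":
    # Three toy residues; real input would be parsed from a structure file.
    toy_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 10.0)]
    dist = ca_distance_matrix(toy_coords)
    for row in binary_contact_map(dist):
        print(row)
```

Because the matrix depends only on inter-residue distances, it is unchanged by rotating or translating the structure, which is one reason such matrices are convenient image-like inputs for a CNN.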