protein sequences
Recently Published Documents


TOTAL DOCUMENTS: 2054 (FIVE YEARS: 503)

H-INDEX: 99 (FIVE YEARS: 18)

2022 ◽  
Author(s):  
Emre Brookes ◽  
Mattia Rocco

Abstract Recent spectacular advances by AI programs in 3D structure predictions from protein sequences have revolutionized the field in terms of accuracy and speed. The resulting "folding frenzy" has already produced predicted protein structure databases for the entire human and other organisms' proteomes. However, rapidly ascertaining a predicted structure's reliability based on measured properties in solution should be considered. Shape-sensitive hydrodynamic parameters such as the diffusion and sedimentation coefficients (D0t(20,w), s0(20,w)) and the intrinsic viscosity ([η]) can provide a rapid assessment of the overall likelihood of a structure, and SAXS would yield the structure-related pair-wise distance distribution function p(r) vs. r. Using the extensively validated UltraScan SOlution MOdeler (US-SOMO) suite, we have calculated from the AlphaFold structures the corresponding D0t(20,w), s0(20,w), [η], p(r) vs. r, and other parameters. Circular dichroism spectra were also computed. The resulting US-SOMO-AF database should aid in rapidly evaluating the consistency in solution of AlphaFold predicted protein structures.
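As a rough illustration of how the diffusion and sedimentation coefficients named above constrain a structure, the classical Svedberg equation relates them to molar mass: M = s·R·T / (D·(1 − v̄ρ)). The sketch below is not the US-SOMO method (which computes these parameters from atomic structures); it only shows the consistency check that measured s and D values allow. The function name, the partial specific volume, and the numeric inputs are illustrative assumptions, not values from the US-SOMO-AF database.

```python
R_GAS = 8.314462618  # molar gas constant, J/(mol*K)


def molar_mass_svedberg(s, d, vbar, rho, temperature=293.15):
    """Estimate molar mass (kg/mol) from the Svedberg equation.

    s           sedimentation coefficient in seconds (1 S = 1e-13 s)
    d           translational diffusion coefficient in m^2/s
    vbar        partial specific volume in m^3/kg (assumed, protein-typical)
    rho         solvent density in kg/m^3
    temperature absolute temperature in K (20 degrees C by default)
    """
    buoyancy = 1.0 - vbar * rho
    if d <= 0 or buoyancy <= 0:
        raise ValueError("d and the buoyancy term (1 - vbar*rho) must be positive")
    return s * R_GAS * temperature / (d * buoyancy)


# Illustrative inputs only: roughly 4.3 S and 6.1e-11 m^2/s in water at 20 degrees C.
mass = molar_mass_svedberg(s=4.3e-13, d=6.1e-11, vbar=0.733e-3, rho=998.2)
print(f"Estimated molar mass: {mass:.1f} kg/mol")
```

If the molar mass implied by measured s and D disagrees strongly with the mass expected from the sequence, that alone suggests the solution state differs from the predicted monomeric structure.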


2022 ◽  
Author(s):  
Artis Linārs ◽  
Ivars Silamikelis ◽  
Dita Gudra ◽  
Ance Roga ◽  
Dāvids Fridmanis

Over the decades, the improvement of naturally occurring proteins and the creation of novel ones have been primary goals for many practical biotechnology researchers, and it is widely recognized that randomization of protein sequences coupled with various effect-screening methodologies is one of the most powerful techniques for the fast, efficient and purposeful acquisition of desired improvements. Over the years, considerable advances have been made in this field; however, the development of PCR-based or template-guided methodologies has been hampered by the resulting template sequence bias. In this article we present a novel approach based on whole-plasmid amplification, which we named OverFlap PCR, for randomization of virtually any region of plasmid DNA without introducing this bias.


Author(s):  
S. Dinesh

Abstract: Homology detection plays a major role in bioinformatics. Different types of methods are used for homology detection. Here we extract information from protein sequences and then use various algorithms to predict the similarity between protein families. SVM is the most commonly used algorithm in homology detection. Classification techniques are not well suited to homology detection because they do not handle high-dimensional datasets well. Reducing the dimensionality is therefore very important, so that the similarity of protein families can be predicted more easily. Keywords: Homology detection, Protein, Sequence, Reducing dimensionality, BLAST, SCOP.
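The abstract does not say which features are extracted, so the following is only a minimal sketch of one common low-dimensional representation: the 20-value amino-acid composition vector, compared with cosine similarity. The function names and the toy sequences are hypothetical, and this is not presented as the author's method.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition_vector(sequence):
    """Return the relative frequency of each standard residue (20 values)."""
    counts = Counter(res for res in sequence.upper() if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy sequences, made up for illustration only.
seq_a = "MKVLAAGIVGLLLA"
seq_b = "MKVLAGGIVALLLA"
similarity = cosine_similarity(composition_vector(seq_a), composition_vector(seq_b))
print(f"Composition similarity: {similarity:.3f}")
```

A fixed 20-dimensional vector is far smaller than a full k-mer spectrum (20^k values), which is the kind of reduction the abstract argues for; such vectors could then be passed to a classifier such as an SVM.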


2021 ◽  
Author(s):  
Swathika RS ◽  
Vimal S ◽  
Bhagya Shree E ◽  
Elakkiya Elumalai ◽  
Krishna Kant Gupta

The SARS-CoV-2 virus has caused the severe COVID-19 pandemic, and since then it has been critical to produce a potent vaccine to prevent rapid transmission and avoid an alarming number of deaths. Among all types of vaccines, peptide-based epitope designs tend to stand out for their low production cost and higher efficacy. Therefore, we started by obtaining the necessary SARS-CoV-2 protein sequences from the NCBI database and filtered them with respect to antigenicity, virulence, pathogenicity and non-homology with the human proteome using different available online tools and servers. The promising proteins were checked for common B- and T-cell epitopes. The structures of these proteins were modeled with the I-TASSER server, followed by refinement and validation. The predicted common epitopes were mapped onto the modeled protein structures using the Pepitope server. The surface-exposed epitopes were docked with the most common allele, DRB1*0101, using the GalaxyPepDock server. The epitopes ELEGIQYGRS from Leader protein (NSP1), YGPFVDRQTA from 3C-like proteinase (nsp5), DLKWARFPKS from NSP9 and YQDVNCTEVP from Surface glycoprotein (spike protein) formed the most hydrogen bonds. Hence, these four epitopes could be considered the most promising candidates and can be used in future studies.


Genes ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 61
Author(s):  
Roberto Del Amparo ◽  
Miguel Arenas

Diverse phylogenetic methods require a substitution model of evolution that should mimic, as accurately as possible, the real substitution process. At the protein level, empirical substitution models have traditionally been based on a large number of different proteins from particular taxonomic levels. However, these models assume that all of the proteins of a taxonomic level evolve under the same substitution patterns. We believe that this assumption is highly unrealistic and should be relaxed by considering protein-specific substitution models that account for protein-specific selection processes. In order to test this hypothesis, we inferred and evaluated four new empirical substitution models for the protease and integrase of HIV and other viruses. We found that these models more accurately fit, compared with any of the currently available empirical substitution models, the evolutionary process of these proteins. We conclude that evolutionary inferences from protein sequences are more accurate if they are based on protein-specific substitution models rather than taxonomic-specific (generalist) substitution models. We also present four new empirical substitution models of protein evolution that could be useful for phylogenetic inferences of viral protease and integrase.


Plants ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 52
Author(s):  
Jie Luo ◽  
Junhao Chen ◽  
Wenlei Guo ◽  
Zhengfu Yang ◽  
Kean-Jin Lim ◽  
...  

Due to its peculiar morphological characteristics, there is dispute as to whether the genus of Annamocarya sinensis, a species of Juglandaceae, is Annamocarya or Carya. Most morphologists believe it should be distinguished from the Carya genus, while genomicists suggest that A. sinensis belongs to the Carya genus. To explore the taxonomic status of A. sinensis using chloroplast genes, we collected chloroplast genomes of 16 plant species and assembled chloroplast genomes of 10 unpublished Carya species. We analyzed all 26 species' chloroplast genomes through two analytical approaches (concatenation and coalescence), using the entire and unique chloroplast coding sequences (CDS) and the entire and unique protein sequences. Our results indicate that analysis of either the CDS and protein sequences or the unique CDS and unique protein sequences of chloroplast genomes places A. sinensis in the Carya genus. In addition, our analysis shows that phylogenetic trees constructed using numerous genes showed higher consistency than those built from single chloroplast genes. Moreover, the phylogenetic analysis calculated with the coalescence method and unique gene sequences was more robust than that done with the concatenation method, particularly for analyzing phylogenetically controversial species. Based on this analysis, we conclude that A. sinensis should be called C. sinensis.


2021 ◽  
Author(s):  
Vladimir Gligorijevic ◽  
Daniel Berenberg ◽  
Stephen Ra ◽  
Andrew Watkins ◽  
Simon Kelow ◽  
...  

Protein design is challenging because it requires searching through a vast combinatorial space that is only sparsely functional. Self-supervised learning approaches offer the potential to navigate through this space more effectively and thereby accelerate protein engineering. We introduce a sequence denoising autoencoder (DAE) that learns the manifold of protein sequences from a large amount of potentially unlabelled proteins. This DAE is combined with a function predictor that guides sampling towards sequences with higher levels of desired functions. We train the sequence DAE on more than 20M unlabeled protein sequences spanning many evolutionarily diverse protein families and train the function predictor on approximately 0.5M sequences with known function labels. At test time, we sample from the model by iteratively denoising a sequence while exploiting the gradients from the function predictor. We present a few preliminary case studies of protein design that demonstrate the effectiveness of this proposed approach, which we refer to as "deep manifold sampling", including metal binding site addition, function-preserving diversification, and global fold change.


2021 ◽  
Author(s):  
Karla Helena-Bueno ◽  
Charlotte Rebecca Brown ◽  
Egor Konyk ◽  
Sergey Melnikov

Despite the rapidly increasing number of organisms with sequenced genomes, there is no existing resource that simultaneously contains information about genome sequences and the optimal growth conditions for a given species. In the absence of such a resource, we cannot immediately sort genomic sequences by growth conditions, making it difficult to study how organisms and biological molecules adapt to distinct environments. To address this problem, we have created a database called GSHC (Genome Sequences: Hot, Cold, and everything in between). This database, available at http://melnikovlab.com/gshc, brings together information about the genomic sequences and optimal growth temperatures for 25,324 species, including ~89% of the bacterial species with known genome sequences. Using this database, it is now possible to readily compare genomic sequences from thousands of species and correlate variations in genes and genomes with optimal growth temperatures, at the scale of the entire tree of life. The database interface allows users to retrieve protein sequences sorted by optimal growth temperature for their corresponding species, providing a tool to explore how organisms, genomes, and individual proteins and nucleic acids adapt to certain temperatures. We hope that this database will contribute to medicine and biotechnology by helping to create a better understanding of molecular adaptations to heat and cold, leading to new ways to preserve biological samples, engineer useful enzymes, and develop new biological materials and organisms with the desired tolerance to heat and cold.


Author(s):  
Kirthick Kumaran A. S ◽  
Vijayashree Priyadharsini J. ◽  
A. S. Smiline Girija ◽  
P. Sankar Ganesh

Introduction: Antimicrobial peptides (AMPs) are small molecules which are known to exert destructive effects upon pathogenic microorganisms. AMPs are designed from proteins obtained from various sources and tested under in vitro conditions to deduce their antimicrobial activity. Materials and Methods: A few peptidoglycan hydrolases, namely lysostaphin (AAB53783.1), enterolysin (AGG79281.1), and endolysin (YP_009901016.1), were selected for the study based on an extensive text mining process. The sequences of these proteins were retrieved from the NCBI (National Centre for Biotechnology Information) database in the FASTA format (https://www.ncbi.nlm.nih.gov/protein/). Results and Discussion: In the antimicrobial protein lysostaphin, three antimicrobial peptides were found, of which two were active and one was inactive; one had antifungal properties with a score of -0.15, and one had cell-penetrating properties; all were non-toxic. Conclusion: The present study predicts AMPs from lysostaphin, enterolysin and endolysin. These peptides were found to possess antifungal and anti-biofilm properties. Most of the peptides predicted were found to be non-cell-penetrating and non-toxic.
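Since the sequences in this study were retrieved in FASTA format, a minimal sketch of reading such a file may help readers who want to repeat the first step. The parser below is a generic illustration using only the Python standard library; the function name, record identifiers and residues are placeholders, not the actual NCBI entries cited above.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {identifier: sequence}.

    The identifier is the first whitespace-delimited token after '>'.
    Sequence lines belonging to one record are concatenated.
    """
    records = {}
    header = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Placeholder records for illustration; not real database entries.
example = """>example_1 hypothetical hydrolase
MKKTAIAIAV
ALAGFATVAQ
>example_2 hypothetical lysin
MSTNPKPQRK
"""

for identifier, sequence in parse_fasta(example).items():
    print(identifier, len(sequence))
```

The resulting sequences could then be submitted to whichever peptide prediction tools a study uses.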

