primary sequence alignment
Recently Published Documents


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Rahil Taujale ◽  
Zhongliang Zhou ◽  
Wayland Yeung ◽  
Kelley W. Moremen ◽  
Sheng Li ◽  
...  

Glycosyltransferases (GTs) play fundamental roles in nearly all cellular processes through the biosynthesis of complex carbohydrates and glycosylation of diverse protein and small molecule substrates. The extensive structural and functional diversification of GTs presents a major challenge in mapping the relationships connecting sequence, structure, fold and function using traditional bioinformatics approaches. Here, we present a convolutional neural network with attention (CNN-attention) based deep learning model that leverages simple secondary structure representations generated from primary sequences to provide GT fold prediction with high accuracy. The model learns distinguishing secondary structure features free of primary sequence alignment constraints and is highly interpretable. It delineates sequence and structural features characteristic of individual fold types, while classifying them into distinct clusters that group evolutionarily divergent families based on shared secondary structural features. We further extend our model to classify GT families of unknown folds and variants of known folds. By identifying families that are likely to adopt novel folds such as GT91, GT96 and GT97, our studies expand the GT fold landscape and prioritize targets for future structural studies.


2021 ◽  
Author(s):  
Rahil Taujale ◽  
Zhongliang Zhou ◽  
Wayland Yeung ◽  
Kelley W Moremen ◽  
Sheng Li ◽  
...  

Glycosyltransferases (GTs) play fundamental roles in nearly all cellular processes through the biosynthesis of complex carbohydrates and glycosylation of diverse protein and small molecule substrates. The extensive structural and functional diversification of GTs presents a major challenge in mapping the relationships connecting sequence, structure, fold and function using traditional bioinformatics approaches. Here, we present a convolutional neural network with attention (CNN-attention) based deep learning model that leverages simple secondary structure representations generated from primary sequences to provide GT fold prediction with high accuracy. The model learned distinguishing features free of primary sequence alignment constraints and, unlike other models, is highly interpretable and helped identify common secondary structural features shared by divergent families. The model delineated sequence and structural features characteristic of individual fold types, while classifying them into distinct clusters that group evolutionarily divergent families based on shared secondary structural features. We further extend our model to classify GT families of unknown folds and variants of known folds. By identifying families that are likely to adopt novel folds such as GT91, GT96 and GT97, our studies identify targets for future structural studies and expand the GT fold landscape.


Author(s):  
Abu Sajib

Respiratory transmission is the primary route of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) infection. Angiotensin I converting enzyme 2 (ACE2) is the known receptor of the SARS-CoV-2 surface spike glycoprotein for entry into human cells. A recent study reported absent to low expression of ACE2 in a variety of human lung epithelial cell samples. Three bioprojects (PRJEB4337, PRJNA270632 and PRJNA280600) invariably found abundant expression of ACE1 (a homolog of ACE2, also known as ACE) in human lungs compared to very low expression of ACE2. In fact, ACE1 has a wider and more abundant tissue distribution than ACE2. Although it is not obvious from the primary sequence alignment of ACE1 and ACE2, comparison of X-ray crystallographic structures shows striking similarities in the regions of the peptidase domains (PD) of these proteins, which are known (for ACE2) to interact with the receptor binding domain (RBD) of the SARS-CoV-2 spike protein. Critical amino acids in ACE2 that mediate interaction with the viral spike protein are present and organized in the same order in the PD of ACE1. In silico analysis predicts comparable interaction of the SARS-CoV-2 spike protein with ACE1 and ACE2. In addition, this study predicts, from a list of 1263 already approved drugs, those that may interact with ACE2 and/or ACE1, potentially interfere with the entry of SARS-CoV-2 into host cells and alleviate the symptoms of coronavirus disease (COVID-19).


2013 ◽  
Vol 33 (3) ◽  
Author(s):  
Takamitsu Miyafusa ◽  
Jose M. M. Caaveiro ◽  
Yoshikazu Tanaka ◽  
Martin E. Tanner ◽  
Kouhei Tsumoto

Enzymes synthesizing the bacterial CP (capsular polysaccharide) are attractive antimicrobial targets. However, we lack critical information about the structure and mechanism of many of them. In an effort to reduce that gap, we have determined three different crystal structures of the enzyme CapE of the human pathogen Staphylococcus aureus. The structure reveals that CapE is a member of the SDR (short-chain dehydrogenase/reductase) super-family of proteins. CapE assembles in a hexameric complex stabilized by three major contact surfaces between protein subunits. Turnover of substrate and/or coenzyme induces major conformational changes at the contact interface between protein subunits, and a displacement of the substrate-binding domain with respect to the Rossmann domain. A novel dynamic element that we called the latch is essential for remodelling of the protein–protein interface. Structural and primary sequence alignment identifies a group of SDR proteins involved in polysaccharide synthesis that share the two salient features of CapE: the mobile loop (latch) and a distinctive catalytic site (MxxxK). The relevance of these structural elements was evaluated by site-directed mutagenesis.

