Missense variants reveal functional insights into the human ARID family of gene regulators

2021 ◽  
Author(s):  
Gauri Deák ◽  
Atlanta G Cook

Missense variants are alterations to protein coding sequences that result in amino acid substitutions. They can be deleterious if the amino acid is required for maintaining structure and/or function, but are likely to be tolerated at other sites. Consequently, missense variation within a healthy population can mirror the effects of negative selection on protein structure and function, such that functional sites on proteins are often depleted of missense variants. Advances in high-throughput sequencing have dramatically increased the sample size of available human variation data, allowing for population-wide analysis of selective pressures. In this study, we developed a convenient set of tools, called 1D-to-3D, for visualizing the positions of missense variants on protein sequences and structures. We used these tools to characterize human homologues of the ARID family of gene regulators. ARID family members are implicated in multiple cancer types, developmental disorders, and immunological diseases, but current understanding of their mechanistic roles is incomplete. Combined with phylogenetic and structural analyses, our approach allowed us to characterize sites important for protein-protein interactions, histone modification recognition, and DNA binding by the ARID proteins. We find that comparing missense depletion patterns among paralogs can reveal sub-functionalization at the level of domains. We propose that visualizing missense variants and their depletion on structures can serve as a valuable tool for complementing evolutionary and experimental findings.
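The notion of missense depletion described above can be illustrated with a small sketch. This is not the 1D-to-3D code; the function name `missense_depletion`, the sliding-window size, and the assumption that variants would otherwise be spread uniformly along the protein are all hypothetical choices made only for illustration.

```python
from typing import List


def missense_depletion(length: int, variant_positions: List[int], window: int = 5) -> List[float]:
    """Return an observed/expected missense ratio for a window centred on each residue.

    Assumption (hypothetical): in the absence of selection, variants are
    distributed uniformly along the sequence. Ratios below 1.0 then suggest
    local depletion; ratios above 1.0 suggest local enrichment.
    Positions are 1-based, as in most variant annotations.
    """
    if length <= 0:
        raise ValueError("length must be positive")

    # Count variants per residue, ignoring positions outside the protein.
    counts = [0] * length
    for pos in variant_positions:
        if 1 <= pos <= length:
            counts[pos - 1] += 1

    per_residue_rate = sum(counts) / length
    half = window // 2
    ratios = []
    for i in range(length):
        # Windows are truncated at the termini rather than padded.
        lo, hi = max(0, i - half), min(length, i + half + 1)
        observed = sum(counts[lo:hi])
        expected = per_residue_rate * (hi - lo)
        ratios.append(observed / expected if expected > 0 else float("nan"))
    return ratios


# Toy example: variants cluster in the first half of a 10-residue protein.
scores = missense_depletion(10, [1, 2, 3, 4, 5], window=3)
```

In this toy input the second half of the sequence receives ratios of zero, which is the kind of pattern one would then inspect on a structure. A real analysis would need a mutational model for the expected counts rather than the uniform assumption used here.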

2018 ◽  
Author(s):  
Yanhui Hu ◽  
Richelle Sopko ◽  
Verena Chung ◽  
Romain A. Studer ◽  
Sean D. Landry ◽  
...  

Abstract Post-translational modification (PTM) serves as a regulatory mechanism for protein function, influencing stability, protein interactions, activity and localization, and is critical in many signaling pathways. The best characterized PTM is phosphorylation, whereby a phosphate is added to an acceptor residue, commonly serine, threonine, or tyrosine. As proteins are often phosphorylated at multiple sites, identifying those sites that are important for function is a challenging problem. Considering that many phosphorylation sites may be non-functional, prioritizing evolutionarily conserved phosphosites provides a general strategy to identify the putative functional sites with regard to regulation and function. To facilitate the identification of conserved phosphosites, we generated a large-scale phosphoproteomics dataset from Drosophila embryos collected from six closely-related species. We built iProteinDB (https://www.flyrnai.org/tools/iproteindb/), a resource integrating these data with other high-throughput PTM datasets, including vertebrates, and manually curated information for Drosophila. At iProteinDB, scientists can view the PTM landscape for any Drosophila protein and identify predicted functional phosphosites based on a comparative analysis of data from closely-related Drosophila species. Further, iProteinDB enables comparison of PTM data from Drosophila to that of orthologous proteins from other model organisms, including human, mouse, rat, Xenopus laevis, Danio rerio, and Caenorhabditis elegans.
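The idea of prioritizing conserved phosphosites can be sketched as follows. This is not how iProteinDB is implemented; it is a simplified illustration that checks only whether an acceptor residue (S, T, or Y) is present at the aligned position in other species, whereas the resource described above also draws on measured phosphorylation data. The function name, the species labels, and the toy alignment are hypothetical.

```python
from typing import Dict

ACCEPTORS = {"S", "T", "Y"}  # serine, threonine, tyrosine


def conserved_acceptor_fraction(alignment: Dict[str, str], ref: str, site: int) -> float:
    """Fraction of non-reference sequences with an S/T/Y at the aligned column.

    `alignment` maps a species label to its gapped, equal-length aligned sequence.
    `site` is the 1-based residue position in the ungapped reference sequence.
    """
    ref_seq = alignment[ref]

    # Map the ungapped reference position to an alignment column.
    column = None
    residues_seen = 0
    for idx, char in enumerate(ref_seq):
        if char != "-":
            residues_seen += 1
            if residues_seen == site:
                column = idx
                break
    if column is None:
        raise ValueError("site lies outside the reference sequence")

    others = [seq for name, seq in alignment.items() if name != ref]
    if not others:
        return 0.0
    hits = sum(1 for seq in others if seq[column] in ACCEPTORS)
    return hits / len(others)


# Toy alignment with hypothetical species labels.
toy = {"dmel": "MS-TA", "dsim": "MSATA", "dyak": "MA-TA"}
frac_site2 = conserved_acceptor_fraction(toy, "dmel", 2)  # S in dmel
frac_site3 = conserved_acceptor_fraction(toy, "dmel", 3)  # T in dmel
```

Sites with a high fraction would be ranked ahead of sites whose acceptor residue is lost in close relatives.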


2020 ◽  
Vol 48 (W1) ◽  
pp. W132-W139
Author(s):  
Sumaiya Iqbal ◽  
David Hoksza ◽  
Eduardo Pérez-Palma ◽  
Patrick May ◽  
Jakob B Jespersen ◽  
...  

Abstract Human genome sequencing efforts have greatly expanded, and a plethora of missense variants identified both in patients and in the general population is now publicly accessible. Interpretation of the molecular-level effect of missense variants, however, remains challenging and requires a particular investigation of amino acid substitutions in the context of protein structure and function. Answers to questions like ‘Is a variant perturbing a site involved in key macromolecular interactions and/or cellular signaling?’, or ‘Is a variant changing an amino acid located at the protein core or part of a cluster of known pathogenic mutations in 3D?’ are crucial. Motivated by these needs, we developed MISCAST (missense variant to protein structure analysis web suite; http://miscast.broadinstitute.org/). MISCAST is an interactive and user-friendly web server to visualize and analyze missense variants in protein sequence and structure space. Additionally, a comprehensive set of protein structural and functional features have been aggregated in MISCAST from multiple databases, and displayed on structures alongside the variants to provide users with the biological context of the variant location in an integrated platform. We further made the annotated data and protein structures readily downloadable from MISCAST to foster advanced offline analysis of missense variants by a wide biological community.


2020 ◽  
Vol 15 (4) ◽  
pp. 300-308
Author(s):  
Haixia Long ◽  
Zhao Sun ◽  
Manzhi Li ◽  
Hai Yan Fu ◽  
Ming Cai Lin

Background: Protein phosphorylation is one of the most important Post-translational Modifications (PTMs) occurring at amino acid residues serine (S), threonine (T), and tyrosine (Y). It plays critical roles in predicting protein structure and function. With the development of novel high-throughput sequencing technologies, a huge number of protein sequences are being generated and stored in databases. Objective: It is of great importance in both basic research and drug development to quickly and accurately predict which S, T, or Y residues can be phosphorylated. Methods: To solve this problem, a novel hybrid deep learning model combining a convolutional neural network with a bi-directional long short-term memory recurrent neural network (CNN+BLSTM) is proposed for predicting phosphorylation sites in proteins. The model contains a list of layers that transform the input data into an output class, in which the convolution layer captures higher-level abstraction features of amino acids, while the recurrent layer captures long-term dependencies between amino acids to improve predictions. The joint model learns interactions between higher-level features derived from the protein sequence to predict the phosphorylated sites. Results: We compared our model with two canonical methods, namely iPhos-PseEn and MusiteDeep. A 5-fold cross-validation process indicated that CNN+BLSTM outperforms the two competitors in various evaluation metrics, such as the area under the receiver operating characteristic and precision-recall curves, the Matthews correlation coefficient, F-measure, and accuracy. Conclusion: CNN+BLSTM is promising in identifying potential protein phosphorylation sites for further experimental validation.
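Two of the evaluation metrics named above, the Matthews correlation coefficient and the F-measure, can be computed directly from a binary confusion matrix. The sketch below uses the standard textbook definitions; it is not the authors' evaluation code, and the helper names and the toy labels are hypothetical.

```python
import math
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels, where 1 = phosphorylated site."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def f_measure(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); 0.0 if there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy labels for four candidate sites.
tp, tn, fp, fn = confusion_counts([1, 1, 0, 0], [1, 0, 0, 0])
mcc = matthews_corrcoef(tp, tn, fp, fn)
f1 = f_measure(tp, fp, fn)
```

MCC is often preferred for phosphosite prediction because negatives usually far outnumber positives, and accuracy alone can look high on such imbalanced data.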


2010 ◽  
Vol 57 (1) ◽  
Author(s):  
Liping Zhang ◽  
Binyun Ma ◽  
Jianping Wu ◽  
Chunhong Fei ◽  
Lian Yang ◽  
...  

The calcium-activated neutral proteases, mu- and m-calpain, along with their inhibitor, calpastatin, have been demonstrated to mediate a variety of Ca(2+)-dependent processes including signal transduction, cell proliferation, cell cycle progression, differentiation, apoptosis, membrane fusion, platelet activation and skeletal muscle protein degradation. The cDNA coding for yak calpastatin was amplified and cloned by RT-PCR to investigate and characterize the nucleotide/amino-acid sequence and to predict the structure and function of calpastatin. The present study suggests that the yak calpastatin gene encodes a protein of 786 amino acids that shares 99% sequence identity with the amino-acid sequence of cattle calpastatin, and that the yak protein is composed of an N-terminal region (domains L and XL) and four repetitive homologous C-terminal domains (d1-d4), in which several prosite motifs are present including short peptide L54-64 (EVKPKEHTEPK in domain L) and GXXE/DXTIPPXYR (in subdomain B), where X is a variable amino acid. Our results suggest the existence of other functional sites including potential phosphorylation sites for protein kinase C, cAMP- and cGMP-dependent protein kinase, casein kinase II, as well as N-myristoylation and amidation sites that play an important role in molecular regulation of the calpain/calpastatin system. The regulation of the calpain/calpastatin system is determined by the interaction between dIV and dVI in calpains and subdomains A, B, and C in calpastatin.


2019 ◽  
Vol 5 (1) ◽  
Author(s):  
Adrian Israel Lehvy ◽  
Guy Horev ◽  
Yarden Golan ◽  
Fabian Glaser ◽  
Yael Shammai ◽  
...  

Abstract Zinc is vital for the structure and function of ~3000 human proteins and hence plays key physiological roles. Consequently, impaired zinc homeostasis is associated with various human diseases including cancer. Intracellular zinc levels are tightly regulated by two families of zinc transporters: ZIPs and ZnTs; ZIPs import zinc into the cytosol from the extracellular milieu, or from the lumen of organelles into the cytoplasm. In contrast, the vast majority of ZnTs compartmentalize zinc within organelles, whereas the ubiquitously expressed ZnT1 is the sole zinc exporter. Herein, we explored the hypothesis that qualitative and quantitative alterations in ZnT1 activity impair cellular zinc homeostasis in cancer. Towards this end, we first used bioinformatics to analyze inactivating mutations in ZIPs and ZnTs, catalogued in the COSMIC and gnomAD databases, representing tumor specimens and healthy population controls, respectively. ZnT1, ZnT10, ZIP8, and ZIP10 showed extremely high rates of loss of function mutations in cancer as compared to healthy controls. Analysis of the putative functional impact of missense mutations in ZnT1-ZnT10 and ZIP1-ZIP14, using homologous protein alignment and structural predictions, revealed that ZnT1 displays a markedly increased frequency of predicted functionally deleterious mutations in malignant tumors, as compared to a healthy population. Furthermore, examination of ZnT1 expression in 30 cancer types in the TCGA database revealed five tumor types with significant ZnT1 overexpression, which predicted dismal prognosis for cancer patient survival. Novel functional zinc transport assays, which allowed for the indirect measurement of cytosolic zinc levels, established that wild type ZnT1 overexpression results in low intracellular zinc levels. In contrast, overexpression of predicted deleterious ZnT1 missense mutations did not reduce intracellular zinc levels, validating eight missense mutations as loss of function (LoF) mutations. Thus, alterations in ZnT1 expression and LoF mutations in ZnT1 provide a molecular mechanism for impaired zinc homeostasis in cancer formation and/or progression.


2021 ◽  
Vol 4 (3) ◽  
pp. 51
Author(s):  
Satish Kantipudi ◽  
Daniel Harder ◽  
Sara Bonetti ◽  
Dimitrios Fotiadis ◽  
Jean-Marc Jeckelmann

Heterodimeric amino acid transporters (HATs) are protein complexes composed of two subunits, a heavy and a light subunit belonging to the solute carrier (SLC) families SLC3 and SLC7. HATs transport amino acids and derivatives thereof across the plasma membrane. The human HAT 4F2hc-LAT1 is composed of the type-II membrane N-glycoprotein 4F2hc (SLC3A2) and the L-type amino acid transporter LAT1 (SLC7A5). 4F2hc-LAT1 is medically relevant, and its dysfunction and overexpression are associated with autism and tumor progression. Here, we provide a generally applicable protocol on how to screen for the best membrane transport protein-expressing clone in terms of protein amount and function using Pichia pastoris as expression host. Furthermore, we describe an overexpression and purification procedure for the production of the HAT 4F2hc-LAT1. The isolated heterodimeric complex is pure, correctly assembled, stable, binds the substrate L-leucine, and is thus properly folded. Therefore, this Pichia pastoris-derived recombinant human 4F2hc-LAT1 sample can be used for downstream biochemical and biophysical characterizations.


2021 ◽  
pp. 002203452110048
Author(s):  
G.B. Proctor ◽  
A.M. Shaalan

Although the physiological control of salivary secretion has been well studied, the impact of disease on salivary gland function and how this changes the composition and function of saliva is less well understood and is considered in this review. Secretion of saliva is dependent upon nerve-mediated stimuli, which activate glandular fluid and protein secretory mechanisms. The volume of saliva secreted by salivary glands depends upon the frequency and intensity of nerve-mediated stimuli, which increase dramatically with food intake and are subject to facilitatory or inhibitory influences within the central nervous system. Longer-term changes in saliva secretion have been found to occur in response to dietary change and aging, and these physiological influences can alter the composition and function of saliva in the mouth. Salivary gland dysfunction is associated with different diseases, including Sjögren syndrome, sialadenitis, and iatrogenic disease due to radiotherapy and medications, and is usually reported as a loss of secretory volume, which can range in severity. Defining salivary gland dysfunction by measuring salivary flow rates can be difficult since these vary widely in the healthy population. However, saliva can be sampled noninvasively and repeatedly, which facilitates longitudinal studies of subjects, providing a clearer picture of altered function. The application of omics technologies has revealed changes in saliva composition in many systemic diseases, offering disease biomarkers, but these compositional changes may not be related to salivary gland dysfunction. In Sjögren syndrome, there appears to be a change in the rheology of saliva due to altered mucin glycosylation. Analysis of glandular saliva in diseases or therapeutic interventions causing salivary gland inflammation frequently shows increased electrolyte concentrations and increased presence of innate immune proteins, most notably lactoferrin. Altering nerve-mediated signaling of salivary gland secretion contributes to medication-induced dysfunction and may also contribute to altered saliva composition in neurodegenerative disease.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Hilary C. Martin ◽  
Eugene J. Gardner ◽  
Kaitlin E. Samocha ◽  
Joanna Kaplanis ◽  
...  

Abstract Over 130 X-linked genes have been robustly associated with developmental disorders, and X-linked causes have been hypothesised to underlie the higher developmental disorder rates in males. Here, we evaluate the burden of X-linked coding variation in 11,044 developmental disorder patients, and find a similar rate of X-linked causes in males and females (6.0% and 6.9%, respectively), indicating that such variants do not account for the 1.4-fold male bias. We develop an improved strategy to detect X-linked developmental disorders and identify 23 significant genes, all of which were previously known, consistent with our inference that the vast majority of the X-linked burden is in known developmental disorder-associated genes. Importantly, we estimate that, in male probands, only 13% of inherited rare missense variants in known developmental disorder-associated genes are likely to be pathogenic. Our results demonstrate that statistical analysis of large datasets can refine our understanding of modes of inheritance for individual X-linked disorders.


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Siddhartha Kundu

Abstract Objective Non-haem iron(II)- and 2-oxoglutarate-dependent dioxygenases (i2OGdd) are a taxonomically and functionally diverse group of enzymes. The active site comprises ferrous iron in a hexa-coordinated distorted octahedron with the apoenzyme, 2-oxoglutarate and a displaceable water molecule. Current information on novel i2OGdd members is sparse and relies on computationally-derived annotation schema. The dissimilar amino acid composition and variable active site geometry thereof result in differing reaction chemistries amongst i2OGdd members. An additional need of researchers is a curated list of sequences with putative i2OGdd function which can be probed further for empirical data. Results This work reports the implementation of Fe(2)OG, a web server with dual functionality and an extension of previous work on i2OGdd enzymes (Fe(2)OG ≡ {H2OGpred, DB2OG}). Fe(2)OG in this form is completely revised, updated (URL, scripts, repository) and will strengthen the knowledge base of investigators on i2OGdd biochemistry and function. Fe(2)OG utilizes the superior predictive propensity of HMM-profiles of laboratory validated i2OGdd members to predict probable active site geometries in user-defined protein sequences. Fe(2)OG also provides researchers with a pre-compiled list of analyzed and searchable i2OGdd-like sequences, many of which may be clinically relevant. Fe(2)OG is freely available (http://204.152.217.16/Fe2OG.html) and supersedes all previous versions, i.e., H2OGpred, DB2OG.

