Sequence conservation, domain architectures, and phylogenetic distribution of the HD-GYP type c-di-GMP phosphodiesterases

10.1101/2021.11.05.467447 ◽

2021 ◽

Author(s):

Michael Y. Galperin ◽

Shan-Ho Chou

Keyword(s):

Protein Interactions ◽

Regulatory Protein ◽

Second Messenger ◽

Distribution Patterns ◽

Distribution Functions ◽

Amino Acid Residues ◽

Sequence Motifs ◽

Protein Protein Interactions ◽

Conserved Sequence ◽

Common Domain

The HD-GYP domain, named after two of its conserved sequence motifs, was first described in 1999 as a specialized version of the widespread HD phosphohydrolase domain that had additional highly conserved amino acid residues. Domain associations of HD-GYP indicated its involvement in bacterial signal transduction, and distribution patterns of this domain suggested that it could serve as a hydrolase of the bacterial second messenger c-di-GMP, in addition to or instead of the EAL domain. Subsequent studies confirmed the ability of various HD-GYP domains to hydrolyze c-di-GMP to linear pGpG and/or GMP. Certain HD-GYP-containing proteins hydrolyze another second messenger, cGAMP, and some HD-GYP domains participate in regulatory protein-protein interactions. The recently solved structures of HD-GYP domains from four distinct organisms clarified the mechanisms of c-di-GMP binding and metal-assisted hydrolysis. However, the HD-GYP domain is poorly represented in public domain databases, which causes some confusion about its phylogenetic distribution, functions, and domain architectures. Here, we present a refined sequence model for the HD-GYP domain and describe the roles of its most conserved residues in metal and/or substrate binding. We also calculate the numbers of HD-GYPs encoded in various genomes and list the most common domain combinations involving HD-GYP, such as the RpfG (REC-HD-GYP), Bd1817 (DUF3391-HD-GYP), and PmGH (GAF-HD-GYP) protein families. In addition, we describe six HD-GYP-associated domains, including four novel integral membrane sensor domains. This work is expected to stimulate studies of diverse HD-GYP-containing proteins, their N-terminal sensor domains, and the signals to which they respond.

snRNP Sm proteins share two evolutionarily conserved sequence motifs which are involved in Sm protein-protein interactions.

The EMBO Journal ◽

10.1002/j.1460-2075.1995.tb07199.x ◽

1995 ◽

Vol 14 (9) ◽

pp. 2076-2088 ◽

Cited By ~ 137

Author(s):

H. Hermann ◽

P. Fabrizio ◽

V.A. Raker ◽

K. Foulaki ◽

H. Hornig ◽

...

Keyword(s):

Protein Interactions ◽

Sequence Motifs ◽

Protein Protein Interactions ◽

Sm Proteins ◽

Conserved Sequence ◽

Evolutionarily Conserved ◽

Conserved Sequence Motifs ◽

Sm Protein

Characterizing protein-DNA binding event subtypes in ChIP-exo data

10.1101/266536 ◽

2018 ◽

Cited By ~ 3

Author(s):

Naomi Yamada ◽

William K.M. Lai ◽

Nina Farrell ◽

B. Franklin Pugh ◽

Shaun Mahony

Keyword(s):

Dna Binding ◽

Protein Interactions ◽

Regulatory Protein ◽

Distribution Patterns ◽

Dna Interaction ◽

Binding Modes ◽

Dna Motifs ◽

Multiple Protein ◽

Binding Event ◽

Dna Crosslinking

Motivation: Regulatory proteins associate with the genome either by directly binding cognate DNA motifs or via protein-protein interactions with other regulators. Each recruitment mechanism may be associated with distinct motifs and may also result in distinct characteristic patterns in high-resolution protein-DNA binding assays. For example, the ChIP-exo protocol precisely characterizes protein-DNA crosslinking patterns by combining chromatin immunoprecipitation (ChIP) with 5’ → 3’ exonuclease digestion. Since different regulatory complexes will result in different protein-DNA crosslinking signatures, analysis of ChIP-exo tag enrichment patterns should enable detection of multiple protein-DNA binding modes for a given regulatory protein. However, current ChIP-exo analysis methods either treat all binding events as being of a uniform type or rely on motifs to cluster binding events into subtypes. Results: To systematically detect multiple protein-DNA interaction modes in a single ChIP-exo experiment, we introduce the ChIP-exo mixture model (ChExMix). ChExMix probabilistically models the genomic locations and subtype memberships of binding events using both ChIP-exo tag distribution patterns and DNA motifs. We demonstrate that ChExMix achieves accurate detection and classification of binding event subtypes using in silico mixed ChIP-exo data. We further demonstrate the unique analysis abilities of ChExMix using a collection of ChIP-exo experiments that profile the binding of key transcription factors in MCF-7 cells. In these data, ChExMix identifies possible recruitment mechanisms of FoxA1 and ERα, thus demonstrating that ChExMix can effectively stratify ChIP-exo binding events into biologically meaningful subtypes. Availability: ChExMix is available from https://github.com/seqcode/[email protected]

Octamer transcription factors 1 and 2 each bind to two different functional elements in the immunoglobulin heavy-chain promoter.

Molecular and Cellular Biology ◽

10.1128/mcb.9.2.747 ◽

1989 ◽

Vol 9 (2) ◽

pp. 747-756 ◽

Cited By ~ 43

Author(s):

L Poellinger ◽

R G Roeder

Keyword(s):

Transcription Factors ◽

Heavy Chain ◽

Protein Interactions ◽

Transcription Initiation ◽

Immunoglobulin Heavy Chain ◽

Sequence Motifs ◽

Cell Type Specificity ◽

Conserved Sequence ◽

Octamer Motif ◽

Sequence Elements

Immunoglobulin heavy-chain genes contain two conserved sequence elements 5' to the site of transcription initiation: the octamer ATGCAAAT and the heptamer CTCATGA. Both of these elements are required for normal cell-specific promoter function. The present study demonstrates that both the ubiquitous and lymphoid-cell-specific octamer transcription factors (OTF-1 and OTF-2, respectively) interact specifically with each of the two conserved sequence elements, forming either homo- or heterodimeric complexes. This was surprising, since the heptamer and octamer sequence motifs bear no obvious similarity to each other. Binding of either factor to the octamer element occurred independently. However, OTF interaction with the heptamer sequence appeared to require the presence of an intact octamer motif and occurred with a spacing of either 2 or 14 base pairs between the two elements, suggesting coordinate binding resulting from protein-protein interactions. The degeneracy in sequences recognized by the OTFs may be important in widening the range over which gene expression can be modulated and in establishing cell type specificity.
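
The arrangement described in this abstract (an octamer ATGCAAAT and a heptamer CTCATGA separated by either 2 or 14 base pairs) can be illustrated with a small sequence scan. The following is only a minimal sketch, not the authors' method: the function names `find_all` and `motif_pairs` are hypothetical, matching is exact and single-stranded, and the example sequence is made up for illustration. The abstract does not state which element lies upstream, so the sketch accepts either order.

```python
import re

# Conserved elements named in the abstract.
OCTAMER = "ATGCAAAT"
HEPTAMER = "CTCATGA"


def find_all(seq, motif):
    """Return 0-based start positions of every exact match (overlaps allowed)."""
    return [m.start() for m in re.finditer(f"(?={motif})", seq)]


def motif_pairs(seq, allowed_gaps=(2, 14)):
    """Return (heptamer_start, octamer_start, gap) for each heptamer/octamer
    pair whose intervening spacer length (in bp) is in allowed_gaps.

    Assumption: only the given strand is scanned and only exact matches count.
    """
    seq = seq.upper()
    pairs = []
    for h in find_all(seq, HEPTAMER):
        for o in find_all(seq, OCTAMER):
            if h < o:
                gap = o - (h + len(HEPTAMER))  # heptamer lies upstream
            else:
                gap = h - (o + len(OCTAMER))   # octamer lies upstream
            if gap in allowed_gaps:
                pairs.append((h, o, gap))
    return pairs


# Made-up example: heptamer, a 2-bp spacer, then the octamer.
example = "GG" + HEPTAMER + "AT" + OCTAMER + "CC"
print(motif_pairs(example))
```

A real promoter analysis would also need to scan the reverse complement and tolerate degenerate matches, which the abstract suggests are functionally relevant.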

Senescence-associated Barley NAC (NAM, ATAF1,2, CUC) Transcription Factor Interacts with Radical-induced Cell Death 1 through a Disordered Regulatory Domain

Journal of Biological Chemistry ◽

10.1074/jbc.m111.247221 ◽

2011 ◽

Vol 286 (41) ◽

pp. 35418-35429 ◽

Cited By ~ 56

Author(s):

Trine Kjaersgaard ◽

Michael K. Jensen ◽

Michael W. Christiansen ◽

Per Gregersen ◽

Birthe B. Kragelund ◽

...

Keyword(s):

Cell Death ◽

Regulatory Protein ◽

Intrinsic Disorder ◽

Target Sequence ◽

Sequence Motifs ◽

Conserved Sequence ◽

Intrinsically Disordered ◽

Age Related ◽

Terminal Domains

Senescence in plants involves massive nutrient relocation and age-related cell death. Characterization of the molecular components, such as transcription factors (TFs), involved in these processes is required to understand senescence. We found that HvNAC005 and HvNAC013 of the plant-specific NAC (NAM, ATAF1,2, CUC) TF family are up-regulated during senescence in barley (Hordeum vulgare). Both HvNAC005 and HvNAC013 bound the conserved NAC DNA target sequence. Computational and biophysical analyses showed that both proteins are intrinsically disordered in their large C-terminal domains, which are transcription regulatory domains (TRDs) in many NAC TFs. Using motif searches and interaction studies in yeast, we identified an evolutionarily conserved sequence, the LP motif, in the TRD of HvNAC013. This motif was sufficient for transcriptional activity. In contrast, HvNAC005 did not function as a transcriptional activator, suggesting that HvNAC013 and HvNAC005 are involved in senescence in different ways. HvNAC013 interacted with barley radical-induced cell death 1 (RCD1) via the very C-terminal part of its TRD, outside of the region containing the LP motif. No significant secondary structure was induced in the HvNAC013 TRD upon interaction with RCD1. RCD1 also interacted with regions dominated by intrinsic disorder in TFs of the MYB and basic helix-loop-helix families. We propose that RCD1 is a regulatory protein capable of interacting with many different TFs by exploiting their intrinsic disorder. In addition, we present the first structural characterization of NAC C-terminal domains and relate intrinsic disorder and sequence motifs to activity and protein-protein interactions.

Octamer transcription factors 1 and 2 each bind to two different functional elements in the immunoglobulin heavy-chain promoter

Molecular and Cellular Biology ◽

10.1128/mcb.9.2.747-756.1989 ◽

1989 ◽

Vol 9 (2) ◽

pp. 747-756

Author(s):

L Poellinger ◽

R G Roeder

Keyword(s):

Transcription Factors ◽

Heavy Chain ◽

Protein Interactions ◽

Transcription Initiation ◽

Immunoglobulin Heavy Chain ◽

Sequence Motifs ◽

Cell Type Specificity ◽

Conserved Sequence ◽

Octamer Motif ◽

Sequence Elements

Immunoglobulin heavy-chain genes contain two conserved sequence elements 5' to the site of transcription initiation: the octamer ATGCAAAT and the heptamer CTCATGA. Both of these elements are required for normal cell-specific promoter function. The present study demonstrates that both the ubiquitous and lymphoid-cell-specific octamer transcription factors (OTF-1 and OTF-2, respectively) interact specifically with each of the two conserved sequence elements, forming either homo- or heterodimeric complexes. This was surprising, since the heptamer and octamer sequence motifs bear no obvious similarity to each other. Binding of either factor to the octamer element occurred independently. However, OTF interaction with the heptamer sequence appeared to require the presence of an intact octamer motif and occurred with a spacing of either 2 or 14 base pairs between the two elements, suggesting coordinate binding resulting from protein-protein interactions. The degeneracy in sequences recognized by the OTFs may be important in widening the range over which gene expression can be modulated and in establishing cell type specificity.

Loss of beta-adrenergic receptor-guanine nucleotide regulatory protein interactions accompanies decline in catecholamine responsiveness of adenylate cyclase in maturing rat erythrocytes.

Journal of Biological Chemistry ◽

10.1016/s0021-9258(19)85960-4 ◽

1980 ◽

Vol 255 (5) ◽

pp. 1854-1861 ◽

Cited By ~ 2

Author(s):

L.E. Limbird ◽

D.M. Gill ◽

J.M. Stadel ◽

A.R. Hickey ◽

R.J. Lefkowitz

Keyword(s):

Adenylate Cyclase ◽

Protein Interactions ◽

Adrenergic Receptor ◽

Regulatory Protein ◽

Guanine Nucleotide ◽

Beta Adrenergic Receptor ◽

Rat Erythrocytes ◽

Beta Adrenergic ◽

Guanine Nucleotide Regulatory Protein ◽

Catecholamine Responsiveness

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

Bioinformatics ◽

10.1093/bioinformatics/btab083 ◽

2021 ◽

Author(s):

Yanrong Ji ◽

Zhihan Zhou ◽

Han Liu ◽

Ramana V Davuluri

Keyword(s):

Dna Sequences ◽

Regulatory Elements ◽

Ease Of Use ◽

Fine Tuning ◽

Supplementary Information ◽

Sequence Motifs ◽

Semantic Relationship ◽

Accurate Identification ◽

Conserved Sequence ◽

Genome Wide

Motivation: Deciphering the language of non-coding DNA is one of the fundamental problems in genome research. Gene regulatory code is highly complex due to the existence of polysemy and distant semantic relationships, which previous informatics methods often fail to capture, especially in data-scarce scenarios. Results: To address this challenge, we developed a novel pre-trained bidirectional encoder representation, named DNABERT, to capture a global and transferable understanding of genomic DNA sequences based on upstream and downstream nucleotide contexts. We compared DNABERT to the most widely used programs for genome-wide regulatory element prediction and demonstrate its ease of use, accuracy and efficiency. We show that the single pre-trained transformer model can simultaneously achieve state-of-the-art performance on prediction of promoters, splice sites and transcription factor binding sites, after easy fine-tuning using small task-specific labeled data. Further, DNABERT enables direct visualization of nucleotide-level importance and semantic relationships within input sequences for better interpretability and accurate identification of conserved sequence motifs and functional genetic variant candidates. Finally, we demonstrate that DNABERT pre-trained on the human genome can be readily applied to other organisms with exceptional performance. We anticipate that the pre-trained DNABERT model can be fine-tuned for many other sequence analysis tasks. Availability and implementation: The source code and the pre-trained and fine-tuned models for DNABERT are available at GitHub (https://github.com/jerryji1993/DNABERT). Supplementary information: Supplementary data are available at Bioinformatics online.

Olfactory expression of trace amine-associated receptors requires cooperative cis-acting enhancers

Nature Communications ◽

10.1038/s41467-021-23824-3 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Ami Shah ◽

Madison Ratkowski ◽

Alessandro Rosa ◽

Paul Feinstein ◽

Thomas Bozza

Keyword(s):

Gene Expression ◽

Large Family ◽

Sequence Motifs ◽

Specific Expression ◽

Cis Acting ◽

Conserved Sequence ◽

Trace Amine ◽

Sequence Elements ◽

Cell Type Specific Expression ◽

Cell Type Specific

Olfactory sensory neurons express a large family of odorant receptors (ORs) and a small family of trace amine-associated receptors (TAARs). While both families are subject to so-called singular expression (expression of one allele of one gene), the mechanisms underlying TAAR gene choice remain obscure. Here, we report the identification of two conserved sequence elements in the mouse TAAR cluster (T-elements) that are required for TAAR gene expression. We observed that cell-type-specific expression of a TAAR-derived transgene required either T-element. Moreover, deleting either element reduced or abolished expression of a subset of TAAR genes, while deleting both elements abolished olfactory expression of all TAARs in cis with the mutation. The T-elements exhibit several features of known OR enhancers but also contain highly conserved, unique sequence motifs. Our data demonstrate that TAAR gene expression requires two cooperative cis-acting enhancers and suggest that ORs and TAARs share similar mechanisms of singular expression.

Spatial organization enhances versatility and specificity in cyclic di-GMP signaling

Biological Chemistry ◽

10.1515/hsz-2020-0202 ◽

2020 ◽

Vol 401 (12) ◽

pp. 1323-1334

Author(s):

Sandra Kunz ◽

Peter L. Graumann

Keyword(s):

Protein Interactions ◽

Cell Cycle Progression ◽

Spatial Organization ◽

Caulobacter Crescentus ◽

Second Messenger ◽

Protein Protein Interactions ◽

Spatial Cues ◽

Signaling Specificity ◽

Regulatory Circuits ◽

Second Messenger Signaling

The second messenger cyclic di-GMP regulates a variety of processes in bacteria, many of which are centered around the decision whether to adopt a sessile or a motile lifestyle. Regulatory circuits include pathogenicity, biofilm formation, and motility in a wide variety of bacteria, and play a key role in cell cycle progression in Caulobacter crescentus. Interestingly, multiple, seemingly independent c-di-GMP pathways have been found in several species, where deletions of individual c-di-GMP synthetases (DGCs) or hydrolases (PDEs) have resulted in distinct phenotypes that would not be expected based on a freely diffusible second messenger. Several recent studies have shown that individual signaling nodes exist and that protein-protein interactions between DGCs, PDEs and c-di-GMP receptors play an important role in signaling specificity. In addition, bacteria have been shown to employ subcellular clustering, likely to generate local second messenger signaling and/or to increase signaling specificity. This review highlights recent findings that reveal how bacteria employ spatial cues to increase the versatility of second messenger signaling.
