New computational model for miRNA-mediated repression reveals novel regulatory roles of miRNA bindings inside the coding region

Author(s):  
Shaked Bergman ◽  
Alon Diament ◽  
Tamir Tuller

Abstract. Motivation: MicroRNAs (miRNAs) are short (∼24 nt), non-coding RNAs that downregulate gene expression in many species and physiological processes. Many details of the mechanism that governs miRNA-mediated repression continue to elude researchers. Results: We elucidate the interplay between the coding sequence and the 3′UTR (untranslated region) by using elastic net regularization and incorporating translation-related features to predict miRNA-mediated repression. We find that miRNA binding sites at the end of the coding sequence contribute to repression, and that weak binding sites are linked to effective de-repression, possibly as a result of competing with stronger binding sites. Furthermore, we propose a recycling model for miRNAs dissociated from the open reading frame (ORF) by traversing ribosomes, explaining the observed link between increased ribosome density/traversal speed and increased repression. We uncover a novel layer of interaction between the coding sequence and the 3′UTR and suggest that the ORF has a larger role than previously thought in the mechanism of miRNA-mediated repression. Availability and implementation: The code is freely available at https://github.com/aescrdni/miRNA_model. Supplementary information: Supplementary data are available at Bioinformatics online.
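As a rough illustration of what elastic net regularization does, the following sketch fits a linear model by coordinate descent with a combined L1/L2 penalty. It is not the authors' code (see their repository for that); the function names, the penalty settings and the toy data are all assumptions made for this example. The toy data has one informative column and one uninformative column, standing in for hypothetical sequence features.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 part of the update)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1
    + 0.5*alpha*(1 - l1_ratio)*||w||^2 by cyclic coordinate descent.
    No intercept is fitted; alpha and l1_ratio are illustrative defaults."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1 - l1_ratio))
    return w


# Toy data (invented): column 0 drives the response, column 1 is noise.
X = [[1.0, 0.1], [2.0, -0.1], [3.0, 0.1], [4.0, -0.1]]
y = [1.0, 2.0, 3.0, 4.0]
weights = elastic_net(X, y)
print(weights)
```

In this toy case the L1 term sets the weight of the uninformative column exactly to zero, while the L2 term slightly shrinks the informative weight below 1. That combination of feature selection and shrinkage is the usual reason for choosing an elastic net when many correlated features are available.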

1996 ◽  
Vol 315 (1) ◽  
pp. 315-321 ◽  
Author(s):  
Kent L. REDMAN ◽  
Glenn W. BURRIS

Rat cDNAs for a 52-amino-acid ribosomal protein (CEP52), which is typically formed as a ubiquitin fusion protein, were cloned following reverse transcription and PCR amplification. CEP52 sequence conservation is demonstrated by the similarity of the human and rat cDNA sequences and the identity of the predicted proteins. Amplification of rat cDNA with a primer specific for the 3′ non-coding region of the CEP52 gene, in combination with a consensus primer for the 5′ end of the ubiquitin coding sequence, provided evidence that the rat CEP52 gene is fused to a ubiquitin reading frame. Direct sequence analysis of this PCR product confirmed the in-frame fusion of a ubiquitin coding sequence to the rat CEP52 gene. Antibodies against a synthetic CEP52 peptide were used to show that expressed CEP52 is associated with the 60S ribosomal subunit and that it is not linked to ubiquitin. The quantity of CEP52 found in different tissues is quite variable, but appears to correspond to the amount of ribosomes present. Although the human, Arabidopsis thaliana and Nicotiana tabacum CEP52 genes contain introns within the CEP52 coding region, the rat CEP52 coding sequence appears to lack insertions.


2017 ◽  
Author(s):  
Daniel Kogan ◽  
Vijay Kumar Ulaganathan

Abstract. Motivation: Human individuals differ because of variations in the DNA sequences of all 46 chromosomes. Information on genetic variations altering the membrane-proximal binding sites for signal transducer and activator of transcription 3 (STAT3) is valuable for understanding the genetic basis of cancer prognosis and disease progression (Ulaganathan et al., 2015). In this regard, non-synonymous coding region mutations resulting in the alteration of protein sequence in the juxtamembrane region of the type I membrane proteins are biologically and clinically relevant. The knowledge of such rare cell line- and individual-specific germline receptor variants is crucial for the investigation of cell-line-specific biological mechanisms and genotype-centric therapeutic approaches. Results: Here we present TraPS-VarI (Transmembrane Protein Sequence Variant Identifier), a Python module to rapidly identify human germline receptor variants modulating STAT3 binding sites by using genetic variation datasets in the variant call format 4.0. For the protein variants found, the module also checks for the availability of associated therapeutic agents and ongoing clinical trial studies. Availability: The source code and binaries are freely available for download at https://gitlab.com/VJ-Ulaganathan/TraPS-VarI and the documentation can be found at http://traps-vari.readthedocs.io/. Supplementary information: Supplementary data enclosed with the manuscript file.
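To make the input format concrete, here is a minimal sketch of reading variant records from variant call format (VCF) text. It does not reproduce TraPS-VarI or its API; the function name `parse_vcf_records` and the example record are invented for illustration. Only the fixed leading VCF columns (CHROM, POS, ID, REF, ALT) are used.

```python
def parse_vcf_records(lines):
    """Yield (chrom, pos, ref, alt) for every alternate allele in VCF data lines.

    Hypothetical helper for illustration only; lines starting with '#'
    (meta-information and the column header) are skipped.
    """
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _variant_id, ref, alt = fields[:5]
        # The ALT column may list several comma-separated alleles.
        for allele in alt.split(","):
            yield chrom, int(pos), ref, allele


# Made-up example content, not taken from any real dataset.
example = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t12345\t.\tT\tC,G\t50\tPASS\t.",
]
records = list(parse_vcf_records(example))
print(records)
```

A tool of the kind described would then need to map each variant onto a protein sequence and test whether it falls in the juxtamembrane region; that step depends on annotation data and is not sketched here.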


2019 ◽  
Vol 35 (22) ◽  
pp. 4760-4763 ◽  
Author(s):  
Saber HafezQorani ◽  
Aissa Houdjedj ◽  
Mehmet Arici ◽  
Abdesselam Said ◽  
Hilal Kazan

Abstract. Summary: Long non-coding RNAs (lncRNAs) can act as molecular sponges or decoys for an RNA-binding protein (RBP) through their RBP-binding sites, thereby modulating the expression of all target genes of the corresponding RBP of interest. Here, we present a web tool named RBPSponge to explore lncRNAs based on their potential to act as a sponge for an RBP of interest. RBPSponge identifies the occurrences of RBP-binding sites and CLIP peaks on lncRNAs, and enables users to run statistical analyses to investigate the regulatory network between lncRNAs, RBPs and targets of RBPs. Availability and implementation: The web server is available at https://www.RBPSponge.com. Supplementary information: Supplementary data are available at Bioinformatics online.


2013 ◽  
Vol 94 (7) ◽  
pp. 1486-1495 ◽  
Author(s):  
Graham J. Belsham

The foot-and-mouth disease virus (FMDV) Leader (L) protein is produced in two forms, Lab and Lb, differing only at their amino-termini, due to the use of separate initiation codons, usually 84 nt apart. It has been shown previously, and confirmed here, that precise deletion of the Lab coding sequence is lethal for the virus, whereas loss of the Lb coding sequence results in a virus that is viable in BHK cells. In addition, it is now shown that deletion of the ‘spacer’ region between these two initiation codons can be tolerated. Growth of the virus precisely lacking just the Lb coding sequence resulted in a previously undetected accumulation of frameshift mutations within the ‘spacer’ region. These mutations block the inappropriate fusion of amino acid sequences to the amino-terminus of the capsid protein precursor. Modification, by site-directed mutagenesis, of the Lab initiation codon, in the context of the virus lacking the Lb coding region, was also tolerated by the virus within BHK cells. However, precise loss of the Lb coding sequence alone blocked FMDV replication in primary bovine thyroid cells. Thus, the requirement for the Leader protein coding sequences is highly dependent on the nature and extent of the residual Leader protein sequences and on the host cell system used. FMDVs precisely lacking Lb and with the Lab initiation codon modified may represent safer seed viruses for vaccine production.


Author(s):  
Yang Lin ◽  
Xiaoyong Pan ◽  
Hong-Bin Shen

Abstract. Motivation: Long non-coding RNAs (lncRNAs) are generally expressed in a tissue-specific way, and the subcellular localizations of lncRNAs depend on the tissues or cell lines in which they are expressed. Previous computational methods for predicting subcellular localizations of lncRNAs do not take this characteristic into account; they train a unified machine learning model for pooled lncRNAs from all available cell lines. It is important to develop a cell-line-specific computational method to predict lncRNA locations in different cell lines. Results: In this study, we present an updated cell-line-specific predictor, lncLocator 2.0, which trains an end-to-end deep model per cell line for predicting lncRNA subcellular localization from sequences. We first construct benchmark datasets of lncRNA subcellular localizations for 15 cell lines. Then we learn word embeddings using natural language models, and these learned embeddings are fed into a convolutional neural network, long short-term memory and multilayer perceptron to classify subcellular localizations. lncLocator 2.0 achieves varying effectiveness for different cell lines and demonstrates the necessity of training cell-line-specific models. Furthermore, we adopt Integrated Gradients to explain the proposed model in lncLocator 2.0, and find some potential patterns that determine the subcellular localizations of lncRNAs, suggesting that the subcellular localization of lncRNAs is linked to some specific nucleotides. Availability: lncLocator 2.0 is available at www.csbio.sjtu.edu.cn/bioinf/lncLocator2 and the source code can be found at https://github.com/Yang-J-LIN/lncLocator2. Supplementary information: Supplementary data are available at Bioinformatics online.
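Learning word embeddings from a sequence first requires turning the sequence into "words". One common way to do this, sketched below, is to split the sequence into overlapping k-mers and map each k-mer to an integer index for an embedding lookup. The abstract does not say how lncLocator 2.0 tokenizes sequences, so the k-mer scheme, the value k=3 and the helper names are assumptions for illustration only.

```python
def kmer_tokens(sequence, k=3):
    """Split a nucleotide sequence into overlapping k-mers (stride 1)."""
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocabulary(token_lists):
    """Assign each distinct k-mer an integer id in order of first appearance."""
    vocab = {}
    for tokens in token_lists:
        for token in tokens:
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


tokens = kmer_tokens("ACGUAC", k=3)
vocab = build_vocabulary([tokens])
ids = [vocab[t] for t in tokens]
print(tokens, ids)
```

The resulting id sequence is what an embedding layer would consume before the downstream classifier.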


2010 ◽  
Vol 84 (16) ◽  
pp. 8219-8230 ◽  
Author(s):  
Monika Somberg ◽  
Stefan Schwartz

ABSTRACT Our results presented here demonstrate that the most abundant human papillomavirus type 16 (HPV-16) mRNAs expressing the viral oncogenes E6 and E7 are regulated by cellular ASF/SF2, itself defined as a proto-oncogene and overexpressed in cervical cancer cells. We show that the most frequently used 3′-splice site on the HPV-16 genome, site SA3358, which is used to produce primarily E4, E6, and E7 mRNAs, is regulated by ASF/SF2. Splice site SA3358 is immediately followed by 15 potential binding sites for the splicing factor ASF/SF2. Recombinant ASF/SF2 binds to the cluster of ASF/SF2 sites. Mutational inactivation of all 15 sites abolished splicing to SA3358 and redirected splicing to the downstream-located, late 3′-splice site SA5639. Overexpression of a mutant ASF/SF2 protein that lacks the RS domain also totally inhibited the usage of SA3358 and redirected splicing to the late 3′-splice site SA5639. The 15 ASF/SF2 binding sites could be replaced by an ASF/SF2-dependent, HIV-1-derived splicing enhancer named GAR. This enhancer was also inhibited by the mutant ASF/SF2 protein that lacks the RS domain. Finally, small interfering RNA (siRNA)-mediated knockdown of ASF/SF2 caused a reduction in spliced HPV-16 mRNA levels. Taken together, our results demonstrate that the major HPV-16 3′-splice site SA3358 is dependent on ASF/SF2. SA3358 is used by the most abundantly expressed HPV-16 mRNAs, including those encoding E6 and E7. High levels of ASF/SF2 may therefore be a requirement for progression to cervical cancer. This is supported by our earlier findings that ASF/SF2 is overexpressed in high-grade cervical lesions and cervical cancer.


2013 ◽  
Vol 368 (1632) ◽  
pp. 20130018 ◽  
Author(s):  
Andrea I. Ramos ◽  
Scott Barolo

In the era of functional genomics, the role of transcription factor (TF)–DNA binding affinity is of increasing interest: for example, it has recently been proposed that low-affinity genomic binding events, though frequent, are functionally irrelevant. Here, we investigate the role of binding site affinity in the transcriptional interpretation of Hedgehog (Hh) morphogen gradients. We noted that enhancers of several Hh-responsive Drosophila genes have low predicted affinity for Ci, the Gli family TF that transduces Hh signalling in the fly. Contrary to our initial hypothesis, improving the affinity of Ci/Gli sites in enhancers of dpp, wingless and stripe, by transplanting optimal sites from the patched gene, did not result in ectopic responses to Hh signalling. Instead, we found that these enhancers require low-affinity binding sites for normal activation in regions of relatively low signalling. When Ci/Gli sites in these enhancers were altered to improve their binding affinity, we observed patterning defects in the transcriptional response that are consistent with a switch from Ci-mediated activation to Ci-mediated repression. Synthetic transgenic reporters containing isolated Ci/Gli sites confirmed this finding in imaginal discs. We propose that the requirement for gene activation by Ci in the regions of low-to-moderate Hh signalling results in evolutionary pressure favouring weak binding sites in enhancers of certain Hh target genes.


1993 ◽  
Vol 13 (8) ◽  
pp. 5034-5042
Author(s):  
C L Wellington ◽  
M E Greenberg ◽  
J G Belasco

The protein-coding region of the c-fos proto-oncogene transcript contains elements that direct the rapid deadenylation and decay of this mRNA in mammalian cells. The function of these coding region instability determinants requires movement of ribosomes across mRNAs containing them. Three types of mechanisms could account for this translational requirement. Two of these possibilities, (i) that rapid mRNA decay might be mediated by the nascent polypeptide chain and (ii) that it might result from an unusual codon usage, have experimental precedent. Here, we present evidence that the destabilizing elements in the c-fos coding region are not recognized in either of these two ways. Instead, the ability of the c-fos coding region to function as a potent mRNA destabilizer when translated in the +1 reading frame indicates that the signals for rapid deadenylation and decay reside in the sequence or structure of the RNA comprising this c-fos domain.


1988 ◽  
Vol 8 (8) ◽  
pp. 3439-3447 ◽  
Author(s):  
W Bajwa ◽  
T E Torchia ◽  
J E Hopper

GAL3 gene expression is required for rapid GAL4-mediated galactose induction of the galactose-melibiose regulon genes in Saccharomyces cerevisiae. Here we show by Northern (RNA) blot analysis that GAL3 gene expression is itself galactose inducible. Like the GAL1, GAL7, GAL10, and MEL1 genes, the GAL3 gene is severely glucose repressed. Like the MEL1 gene, but in contrast to the GAL1, GAL7, and GAL10 genes, GAL3 is expressed at readily detectable basal levels in cells grown in noninducing, nonrepressing media. We determined the sequence of the S. cerevisiae GAL3 gene and its 5'-noncoding region. Within the 5'-noncoding region of the GAL3 gene, we found two sequences similar to the UASGal elements of the other galactose-melibiose regulon genes. Deletion analysis indicated that only the most ATG proximal of these sequences is required for GAL3 expression. The coding region of GAL3 consists of a 1,275-base-pair open reading frame in the direction of transcription. A comparison of the deduced 425-amino-acid sequence with the protein data bank revealed three regions of striking similarity between the GAL3 protein and the GAL1-specified galactokinase of Saccharomyces carlsbergensis. One of these regions also showed striking similarity to sequences within the galactokinase protein of Escherichia coli. On the basis of these protein sequence similarities, we propose that the GAL3 protein binds a molecule identical to or structurally related to one of the substrates or products of the galactokinase-catalyzed reaction.


2001 ◽  
Vol 204 (16) ◽  
pp. 2803-2816 ◽  
Author(s):  
P. K. LOI ◽  
S. A. EMMAL ◽  
Y. PARK ◽  
N. J. TUBLITZ

SUMMARY The crustacean cardioactive peptide (CCAP) gene was isolated from the tobacco hawkmoth Manduca sexta. The gene has an open reading frame of 125 amino acid residues containing a single, complete copy of CCAP. Analysis of the gene structure revealed three introns interrupting the coding region. A comparison of the M. sexta CCAP gene with the Drosophila melanogaster genome database reveals significant similarities in sequence and gene structure. The spatial and temporal expression patterns of the CCAP gene in the M. sexta central nervous system were determined in all major post-embryonic stages using in situ hybridization techniques. The CCAP gene is expressed in a total of 116 neurons in the post-embryonic M. sexta central nervous system. Nine pairs of cells are observed in the brain, 4.5 pairs in the subesophageal ganglion, three pairs in each thoracic ganglion (T1-T3), three pairs in the first abdominal ganglion (A1), five pairs each in the second to sixth abdominal ganglia (A2-A6) and 7.5 pairs in the terminal ganglion. The CCAP gene is expressed in every ganglion in each post-embryonic stage, except in the thoracic ganglia of first- and second-instar larvae. The number of cells expressing the CCAP gene varies during post-embryonic life, starting at 52 cells in the first instar and reaching a maximum of 116 shortly after pupation. One set of thoracic neurons expressing CCAP mRNA shows unusual variability in expression levels immediately prior to larval ecdysis. Using previously published CCAP immunocytochemical data, it was determined that 91 of 95 CCAP-immunopositive neurons in the M. sexta central nervous system also express the M. sexta CCAP gene, indicating that there is likely to be only a single CCAP gene in M. sexta.
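The per-ganglion counts in the summary can be checked against the stated total of 116 neurons with a short tally. The dictionary keys are just labels chosen for this sketch. The half pairs (4.5 and 7.5) are taken at face value from the text, which does not explain them; they presumably reflect unpaired cells.

```python
# Pairs of CCAP-expressing cells per region, as listed in the summary.
pairs_per_region = {
    "brain": 9,
    "subesophageal ganglion": 4.5,
    "thoracic ganglia T1-T3": 3 * 3,    # three pairs in each of 3 ganglia
    "abdominal ganglion A1": 3,
    "abdominal ganglia A2-A6": 5 * 5,   # five pairs in each of 5 ganglia
    "terminal ganglion": 7.5,
}

total_pairs = sum(pairs_per_region.values())
total_neurons = int(total_pairs * 2)
print(total_pairs, total_neurons)
```

The tally gives 58 pairs, i.e. 116 cells, which matches the total and the post-pupation maximum reported in the summary.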

