utr.annotation: a tool for annotating genomic variants that could influence post-transcriptional regulation

2021 ◽  
Author(s):  
Yating Liu ◽  
Joseph Dougherty

Whole genome sequencing of patient populations is identifying thousands of new variants in UnTranslated Regions (UTRs). While the consequences of UTR mutations are not as easily predicted from primary sequence as coding mutations are, there are some known features of UTRs that modulate their function. utr.annotation is an R package that can be used to annotate potentially deleterious variants in the UTRs of both human and mouse. Given a variant file in CSV or VCF format, utr.annotation reports, for each variant, whether and how it alters known translational regulators, including: upstream Open Reading Frames (uORFs), upstream Kozak sequences, polyA signals, the Kozak sequence at the annotated translation initiation site, the start codon, and the stop codon. It also reports conservation scores at the variant position, and whether and how the variant changes ribosome loading based on a model from empirical data.

Author(s):  
Yating Liu ◽  
Joseph D Dougherty

Abstract. Summary: Whole genome sequencing of patient populations is identifying thousands of new variants in UnTranslated Regions (UTRs). While the consequences of UTR mutations are not as easily predicted from primary sequence as coding mutations are, there are some known features of UTRs that modulate their function. utr.annotation is an R package that can be used to annotate potentially deleterious variants in the UTRs of both human and mouse. Given a variant file in CSV or VCF format, utr.annotation reports, for each variant, whether and how it alters known translational regulators, including: upstream Open Reading Frames (uORFs), upstream Kozak sequences, polyA signals, Kozak sequences at the annotated translation start site, start codons, and stop codons. It also reports conservation scores at the variant position, and whether and how the variant changes ribosome loading based on a model derived from empirical data. Availability: utr.annotation is freely available on Bitbucket (https://bitbucket.org/jdlabteam/utr.annotation/src/master/) and CRAN (https://cran.r-project.org/web/packages/utr.annotation/index.html). Supplementary information: Supplementary data are available at https://wustl.box.com/s/yye99bryfin89nav45gv91l5k35fxo7z.
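As a rough illustration of one annotation named above (whether a variant creates or removes an upstream start codon), the following Python sketch scans a 5′ UTR for ATG triplets and compares the set of upstream starts before and after a single-nucleotide substitution. It is not the utr.annotation implementation or its API: the function names `find_uorfs`, `apply_snv`, and `uorf_change`, the 0-based coordinates, and the toy sequence are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr5: str) -> List[Tuple[int, Optional[int]]]:
    """Return (start, end) for every ATG in a 5' UTR (0-based, DNA alphabet).

    `end` is the index just past the first in-frame stop codon, or None when
    no in-frame stop occurs inside the UTR (the ORF may run into the CDS).
    """
    seq = utr5.upper()
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        end = None
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                break
        uorfs.append((start, end))
    return uorfs


def apply_snv(seq: str, pos: int, alt: str) -> str:
    """Substitute one base at 0-based position `pos` (hypothetical helper)."""
    return seq[:pos] + alt + seq[pos + 1:]


def uorf_change(utr5: str, pos: int, alt: str) -> dict:
    """Report upstream ATG positions gained or lost after an SNV.

    An SNV does not shift coordinates, so start positions are comparable.
    """
    before = {start for start, _ in find_uorfs(utr5)}
    after = {start for start, _ in find_uorfs(apply_snv(utr5, pos, alt))}
    return {"gained": sorted(after - before), "lost": sorted(before - after)}


if __name__ == "__main__":
    toy_utr = "CCACGGCTTAAGG"  # toy sequence, contains no ATG
    print(uorf_change(toy_utr, 3, "T"))  # C>T at index 3 creates an ATG
```

A real annotator would also need transcript models, strand handling, and indel support, none of which this sketch attempts.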


2017 ◽  
Vol 29 (1) ◽  
pp. 153
Author(s):  
K. Uh ◽  
J. Ryu ◽  
C. Ray ◽  
K. Lee

Ten-eleven translocation (TET) enzymes catalyse oxidation of 5-methylcytosine to 5-hydroxymethyl cytosine. This TET-mediated conversion of 5-methylcytosine to 5-hydroxymethyl cytosine is implicated in initiating the DNA demethylation process observed post-fertilization. Three members (TET1–3) of the TET family are differentially expressed during embryo development and appear to have different roles. Previous studies in mice suggest that TET1 is a key regulator in maintaining pluripotency in embryonic stem cells by managing epigenetic marks such as DNA methylation. This would imply that TET1 should be a regulator of epigenetic marks during embryo development, although this has not been demonstrated. Previously, we cloned porcine TET1 from blastocysts (GenBank accession number KC137683) and demonstrated that the level of TET1 (mRNA and protein) was high in blastocysts. The protein level was greater in the inner cell mass than in the trophectoderm. In this study, we generated TET1 knockout porcine embryos using the CRISPR/Cas9 system to study the role of TET1 in controlling epigenetic marks during porcine embryo development. First, 2 sgRNA immediately downstream of the presumed translation initiation site were designed and synthesised; the sgRNA targeted nucleotide positions 2 to 21 bp and 23 to 42 bp, respectively (KC137683). Then, sgRNA (10 ng μL−1 each) and Cas9 mRNA (20 ng μL−1) were injected into the cytoplasm of IVF zygotes, and Day 7 blastocysts were genotyped. All embryos carried mutations on both alleles of TET1 (10/10): 1 homozygous and 9 biallelic mutations. However, immunocytochemistry analysis of other CRISPR/Cas9-injected embryos revealed that TET1 was not removed (10/10), indicating that the sgRNA may not have introduced a premature stop codon 3′ to the presumed translation initiation site. Therefore, 2 new sgRNA were designed to generate a premature stop codon on the 5′ side of a key functional domain, the 2-oxoglutarate-Fe(II)-dependent oxygenase domain (4690 to 5160 bp); the locations of the 2 sgRNA were 4450 to 4469 bp and 4501 to 4520 bp, respectively. Similarly, all of the embryos carried mutations in TET1 (7/7): 2 homozygous and 5 biallelic mutations. In addition, TET1 protein was not detected in 11 of 16 blastocysts, as confirmed by immunocytochemistry. In this study, we successfully generated embryos lacking TET1 by introducing a designed CRISPR/Cas9 system during embryogenesis. The presence of TET1 in the first injection experiment suggests that the presumed translation initiation site is not accurate. The discrepancy between genotyping and immunocytochemistry results in the second injection experiment indicates that embryos possessing TET1 protein probably carry mutations whose length is a multiple of three nucleotides, so that the reading frame is preserved and no premature stop codon is introduced. Further studies will focus on identifying the role of TET1 in maintaining pluripotency and epigenetic modification during the pre-implantation stage using these embryos.
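The reasoning about triplet-length mutations can be made concrete with a small Python sketch. It uses a made-up 15-nucleotide coding sequence, not the porcine TET1 sequence, and the helper names `delete`, `indel_effect`, and `first_stop_codon` are hypothetical. A 2-nucleotide deletion shifts the reading frame and exposes an early stop codon, while a 3-nucleotide deletion removes one codon and leaves the original stop in frame.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def delete(seq: str, start: int, length: int) -> str:
    """Remove `length` bases beginning at 0-based index `start`."""
    return seq[:start] + seq[start + length:]


def indel_effect(ref_len: int, alt_len: int) -> str:
    """Classify a length change as in-frame or frameshift."""
    delta = alt_len - ref_len
    if delta == 0:
        return "no length change"
    return "in-frame" if delta % 3 == 0 else "frameshift"


def first_stop_codon(cds: str) -> Optional[int]:
    """Return the 0-based codon index of the first in-frame stop, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


if __name__ == "__main__":
    wild_type = "ATGCCTGACCTGTAA"      # toy CDS: ATG CCT GAC CTG TAA
    del2 = delete(wild_type, 3, 2)     # ATG TGA ... frame shifted
    del3 = delete(wild_type, 3, 3)     # ATG GAC CTG TAA, frame kept
    for name, seq in [("wild type", wild_type), ("2-nt del", del2), ("3-nt del", del3)]:
        print(name, indel_effect(len(wild_type), len(seq)), first_stop_codon(seq))
```

In this toy case the frameshift happens to create a stop immediately; in general a frameshift only makes an early stop likely, and an in-frame indel can still disrupt function without truncating the protein.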


Viruses ◽  
2019 ◽  
Vol 11 (1) ◽  
pp. 83 ◽  
Author(s):  
Hong Liu ◽  
Rui Liu ◽  
Chang Li ◽  
Hui Wang ◽  
Hong Zhu ◽  
...  

Three dsRNAs, approximately 2.5–5 kbp in size, were detected in the plant pathogenic fungus Nigrospora oryzae strain CS-7.5-4. Genomic analysis showed that the 5.0 kb dsRNA was a victorivirus, named Nigrospora oryzae victorivirus 2 (NoRV2). The genome of NoRV2 was 5166 bp in length and contained two overlapping open reading frames (ORFs), ORF1 and ORF2. ORF1 was deduced to encode a coat protein (CP) showing homology to the CPs of viruses belonging to the Totiviridae family. The stop codon of ORF1 and the start codon of ORF2 overlapped in the tetranucleotide sequence AUGA. ORF2 was predicted to encode an RNA-dependent RNA polymerase (RdRp), which was highly similar to the RdRps of victoriviruses. Virus-like particle examination demonstrated that the genome of NoRV2 was solely encapsidated by viral particles with a diameter of approximately 35 nm. The other two dsRNAs, each less than 3.0 kb, were predicted to be the genomes of two mitoviruses, named Nigrospora oryzae mitovirus 1 (NoMV1) and Nigrospora oryzae mitovirus 2 (NoMV2). Both NoMV1 and NoMV2 were A-U rich, with lengths of 2865 and 2507 bp, respectively. Analysis using mitochondrial codon usage indicated that each of the two mitoviruses contains one large ORF encoding a mitoviral RdRp. Horizontal transfer experiments showed that NoMV1 and NoMV2 could be cotransmitted horizontally via hyphal contact to other virus-free N. oryzae strains and cause phenotypic changes in the recipient, such as an increase in growth rate. This is the first report of mitoviruses in N. oryzae.
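The AUGA overlap can be read as two codons sharing the bases UG: the start codon AUG of ORF2 begins at the first base, and the stop codon UGA of ORF1 begins one base later, so the two ORFs lie in different reading frames. A minimal Python sketch, using a toy sequence rather than the NoRV2 genome and a hypothetical function name `overlap_junctions`, locates such junctions and reports the frame of each codon.

```python
from typing import List, Tuple


def overlap_junctions(rna: str, motif: str = "AUGA") -> List[Tuple[int, int, int]]:
    """Find AUGA junctions in an RNA (or DNA) string.

    Returns tuples of (index, start_codon_frame, stop_codon_frame), where
    frames are 0-based positions modulo 3. The AUG start codon begins at
    `index`; the UGA stop codon begins at `index + 1`.
    """
    seq = rna.upper().replace("T", "U")
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        if seq[i:i + len(motif)] == motif:
            hits.append((i, i % 3, (i + 1) % 3))
    return hits


if __name__ == "__main__":
    toy = "GCCAUGACG"  # toy sequence with one AUGA junction
    for index, start_frame, stop_frame in overlap_junctions(toy):
        print(f"AUGA at {index}: AUG in frame {start_frame}, UGA in frame {stop_frame}")
```

Because the stop codon frame is always one greater (modulo 3) than the start codon frame, the downstream ORF sits in the −1 frame relative to the upstream ORF at every such junction.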


2001 ◽  
Vol 67 (3) ◽  
pp. 1262-1267 ◽  
Author(s):  
Shuhei Fujimoto ◽  
Yasuyoshi Ike

ABSTRACT Two novel Enterococcus faecalis-Escherichia coli shuttle vectors that utilize the promoter and ribosome binding site of bacA on the E. faecalis plasmid pPD1 were constructed. The vectors were named pMGS100 and pMGS101. pMGS100 was designed to overexpress cloned genes in E. coli and E. faecalis and encodes the bacA promoter followed by a cloning site and stop codon. pMGS101 was designed for the overexpression and purification of a cloned protein fused to a Strep-tag consisting of 9 amino acids at the carboxyl terminus. The Strep-tag provides the cloned protein with an affinity to immobilized streptavidin that facilitates protein purification. We cloned a promoterless β-galactosidase gene from E. coli and cloned the traA gene of the E. faecalis plasmid pAD1 into the vectors to test gene expression and protein purification, respectively. β-Galactosidase was expressed in E. coli and E. faecalis at levels of 103 and 10 Miller units, respectively. By cloning the pAD1 traA into pMGS101, the protein could be purified directly from a crude lysate of E. faecalis or E. coli with an immobilized streptavidin matrix by one-step affinity chromatography. The ability of TraA to bind DNA was demonstrated by the DNA-associated protein tag affinity chromatography method using lysates prepared from both E. coli and E. faecalis that overexpress TraA. The results demonstrated the usefulness of the vectors for the overexpression and cis/trans analysis of regulatory genes, purification and copurification of proteins from E. faecalis, DNA binding analysis, determination of translation initiation sites, and other applications that require proteins purified from E. faecalis.


2012 ◽  
Vol 78 (19) ◽  
pp. 7082-7089 ◽  
Author(s):  
Y. S. Lapteva ◽  
O. E. Zolova ◽  
M. G. Shlyapnikov ◽  
I. M. Tsfasman ◽  
T. A. Muranova ◽  
...  

ABSTRACT Lytic enzymes are the group of hydrolases that break down structural polymers of the cell walls of various microorganisms. In this work, we determined the nucleotide sequences of the Lysobacter sp. strain XL1 alpA and alpB genes, which code for, respectively, secreted lytic endopeptidases L1 (AlpA) and L5 (AlpB). In silico analysis of their amino acid sequences showed these endopeptidases to be homologous proteins synthesized as precursors similar in structural organization: the mature enzyme sequence is preceded by an N-terminal signal peptide and a pro region. On the basis of phylogenetic analysis, endopeptidases AlpA and AlpB were assigned to the S1E family [clan PA(S)] of serine peptidases. Expression of the alpA and alpB open reading frames (ORFs) in Escherichia coli confirmed that they code for functionally active lytic enzymes. Each ORF was predicted to have the Shine-Dalgarno sequence located at a canonical distance from the start codon and a potential Rho-independent transcription terminator immediately after the stop codon. The alpA and alpB mRNAs were experimentally found to be monocistronic; transcription start points were determined for both mRNAs. The synthesis of the alpA and alpB mRNAs was shown to occur predominantly in the late logarithmic growth phase. The amount of alpA mRNA in cells of Lysobacter sp. strain XL1 was much higher, which correlates with greater production of endopeptidase L1 than of L5.


1998 ◽  
Vol 180 (7) ◽  
pp. 1822-1830 ◽  
Author(s):  
Rafael Maldonado ◽  
Alan J. Herr

ABSTRACT Ribosomes translating bacteriophage T4 gene 60 mRNA bypass 50 noncoding nucleotides from a takeoff site at codon 46 to a landing site just upstream of codon 47. A key signal for efficient bypassing is contained within the nascent peptide synthesized prior to takeoff. Here we show that this signal is insensitive to the addition of coding information at its N terminus. In addition, analysis of amino-terminal fusions, which allow detection of all major products synthesized from the gene 60 mRNA, shows that 50% of ribosomes bypass the coding gap while the rest either terminate at a UAG stop codon immediately following codon 46 or fail to resume coding. Bypassing efficiency estimates significantly lower than 50% were obtained with enzymatic reporter systems that relied on comparing test constructs to constructs with a precise excision of the gap (gap deletion). Further analysis showed that these estimates are distorted by differences between test and gap deletion functional mRNA levels. An internal translation initiation site at Met12 of gene 60 (which eliminates part of the essential nascent peptide) also distorts these estimates. Together, these results support an efficiency estimate of ∼50%, less than previously reported. This estimate suggests that bypassing efficiency is determined by the competition between reading signals and release factors and gives new insight into the kinetics of bypassing signal action.


2020 ◽  
Author(s):  
Alexandra Walton ◽  
Diego Revinski ◽  
Arnauld Sergé ◽  
Stéphane Audebert ◽  
Luc Camoin ◽  
...  

Abstract First described in Drosophila melanogaster, planar cell polarity (PCP) is a developmental process essential for embryogenesis and development of polarized structures in Metazoans. This signaling pathway involves a set of evolutionarily conserved genes encoding transmembrane (Vangl, Frizzled, Celsr) and cytoplasmic (Prickle, Dishevelled) molecules. Vangl2 is of major importance in embryonic development as illustrated by its pivotal role during neural tube closure in human, mouse, Xenopus and zebrafish embryos. The regulated and poorly understood traffic of Vangl2 to the plasma membrane is a key event for its function in development. Here we report on the molecular and functional characterization of a novel 569-amino acid N-terminally extended Vangl2 isoform, Vangl2-Long, that arises from an alternative non-AUG translation initiation site, lying 144 base pairs upstream of the conventional start codon. While missing in Vangl1 paralogs and in all invertebrates, including Drosophila melanogaster, this N-terminal extension is conserved in all vertebrate Vangl2 sequences and confers a subcellular localization in the Golgi apparatus, probably as a result of an extended retention time in this organelle. Vangl2-Long belongs to a multimeric complex with Vangl1 and Vangl2 and we show that its down-regulation leads to severe PCP-related phenotypes in Xenopus embryos, including shorter body axis and neural tube closure defects. Altogether, our study unveils a novel level of complexity in Vangl2 expression, trafficking and function.


Author(s):  
Delano James ◽  
James Phelan ◽  
Daniel Sanderson

Blackcurrant leaf chlorosis associated virus (BCLCaV) was detected recently by next-generation sequencing (NGS) and proposed as a new and distinct species in the genus Idaeovirus. Genomic components of BCLCaV that were detected and confirmed include: 1) RNA-1 that is monocistronic and encodes the replicase complex; 2) a bicistronic RNA-2 that encodes a movement protein (MP) and the coat protein (CP) of the virus, with open reading frames (ORFs) that overlap by a single adenine (A) nucleotide (nt) representing the third position of an opal stop codon of the MP ORF2a and the first position of the start codon of the CP ORF2b; 3) a subgenomic form of RNA-2 (RNA-3) that contains ORF2b; and 4) a concatenated form of RNA-2 that consists of a complementary and inverted RNA-3 conjoined to the full-length RNA-2. Analysis of NGS-derived paired-end reads revealed the existence of bridge reads encompassing the 3’-terminus and 5’-terminus of RNA-2 or RNA-3 of BCLCaV. The full RNA-2 or RNA-3 could be amplified using outward-facing or abutting primers; also, RNA-2/RNA-3 could be detected even after three consecutive RNase R enzyme treatments with denaturation at 95 °C preceding each digestion. Evidence was obtained indicating that there are circular forms of BCLCaV RNA-2 and RNA-3.

