Refactoring gene sequences for broad assembly standards compatibility

2017 ◽  
Author(s):  
Tyson R. Shepherd

Abstract Four cloning standards in synthetic biology are BioBrick, BglBrick, MoClo and GoldenBraid, each requiring that its constitutive parts be compatible with the associated restriction enzymes. To standardize parts for the broadest usage, it would be useful to synthesize genes that are simultaneously compatible with all four popular assembly strategies. Here it is shown that, using a defined set of rules implemented in a computational program, any protein coding sequence can be made compatible with all four standards by silent mutations. Using a coding sequence as an input, all BioBrick, BglBrick, MoClo, and GoldenBraid restriction sites and chi recombination hot spots can be destroyed with silent mutations that approximate the codon usage of the organism. As an application, all open reading frames in the model organisms Escherichia coli and Bacillus subtilis are computationally refactored, showing the feasibility of implementing this umbrella strategy for synthesizing genes with the broadest compatibility.
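The idea of removing forbidden sites by silent mutation can be sketched in a few lines of Python. This is a minimal illustration, not the program described in the abstract. The site list is a deliberately small, assumed subset (EcoRI, BsaI and the E. coli chi sequence), the function names are hypothetical, and the greedy search ignores codon usage, which the abstract says the real method approximates.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Illustrative subset only (assumption): EcoRI, BsaI, chi hot spot.
FORBIDDEN = ["GAATTC", "GGTCTC", "GCTGGTGG"]


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate an in-frame coding sequence to one-letter amino acids."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def find_sites(seq):
    """Return sorted (start, end) spans of forbidden sites on either strand."""
    patterns = {p for s in FORBIDDEN for p in (s, revcomp(s))}
    hits = []
    for p in patterns:
        start = seq.find(p)
        while start != -1:
            hits.append((start, start + len(p)))
            start = seq.find(p, start + 1)
    return sorted(hits)


def fix_first_site(seq, hits):
    """Try one synonymous codon swap that lowers the total site count."""
    start, end = hits[0]
    for pos in range(start - start % 3, end, 3):  # codons overlapping the site
        old = seq[pos:pos + 3]
        for new, aa in CODON_TABLE.items():
            if new == old or aa != CODON_TABLE[old]:
                continue
            candidate = seq[:pos] + new + seq[pos + 3:]
            if len(find_sites(candidate)) < len(hits):
                return candidate
    return None


def refactor(seq):
    """Greedily remove forbidden sites while keeping the encoded protein."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    while True:
        hits = find_sites(seq)
        if not hits:
            return seq
        fixed = fix_first_site(seq, hits)
        if fixed is None:
            raise RuntimeError(f"no silent mutation removes site at {hits[0][0]}")
        seq = fixed


original = "ATGGAATTCGGTCTCTAA"  # toy ORF containing an EcoRI and a BsaI site
cleaned = refactor(original)
print(cleaned, translate(cleaned))
```

Each accepted swap strictly lowers the number of sites, so the loop terminates. A fuller implementation would rank the synonymous codons by the host organism's codon usage rather than taking the first one that works.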


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Robin-Lee Troskie ◽  
Yohaann Jafrani ◽  
Tim R. Mercer ◽  
Adam D. Ewing ◽  
Geoffrey J. Faulkner ◽  
...  

Abstract Pseudogenes are gene copies presumed to be mainly functionless relics of evolution due to acquired deleterious mutations or transcriptional silencing. Using deep full-length PacBio cDNA sequencing of normal human tissues and cancer cell lines, we identify here hundreds of novel transcribed pseudogenes expressed in tissue-specific patterns. Some pseudogene transcripts have intact open reading frames and are translated in cultured cells, representing unannotated protein-coding genes. To assess the biological impact of noncoding pseudogenes, we CRISPR-Cas9 delete the nucleus-enriched pseudogene PDCL3P4 and observe hundreds of perturbed genes. This study highlights pseudogenes as a complex and dynamic component of the human transcriptional landscape.



Biomedicines ◽  
2021 ◽  
Vol 9 (8) ◽  
pp. 911
Author(s):  
Joana Silva ◽  
Pedro Nina ◽  
Luísa Romão

ATP-binding cassette subfamily E member 1 (ABCE1) belongs to the ABC protein family of transporters; however, it does not behave as a drug transporter. Instead, ABCE1 actively participates in different stages of translation and is also associated with oncogenic functions. Ribosome profiling analysis in colorectal cancer cells has revealed a high ribosome occupancy in the human ABCE1 mRNA 5′-leader sequence, indicating the presence of translatable upstream open reading frames (uORFs). These cis-acting translational regulatory elements usually act as repressors of translation of the main coding sequence. In the present study, we dissect the regulatory function of the five AUG and five non-AUG uORFs identified in the human ABCE1 mRNA 5′-leader sequence. We show that the expression of the main coding sequence is tightly regulated by the ABCE1 AUG uORFs in colorectal cells. Our results are consistent with a model wherein uORF1 is efficiently translated, behaving as a barrier to downstream uORF translation. The few ribosomes that can bypass uORF1 (and/or uORF2) must probably initiate at the inhibitory uORF3 or uORF5 that efficiently repress translation of the main ORF. This inhibitory property is slightly overcome in conditions of endoplasmic reticulum stress. In addition, we observed that these potent translation-inhibitory AUG uORFs function equally in cancer and in non-tumorigenic colorectal cells, which is consistent with a lack of oncogenic function. In conclusion, we establish human ABCE1 as an additional example of uORF-mediated translational regulation and that this tight regulation contributes to control ABCE1 protein levels in different cell environments.
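To make the notion of a uORF concrete, the following Python sketch scans a 5′-leader sequence for start codons followed by an in-frame stop codon. It is an illustrative assumption, not the pipeline used in the study: the function name, the toy leader sequence and the choice to report only uORFs that terminate inside the leader are all hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}


def find_uorfs(leader, start_codons=("ATG",)):
    """Return (start, end, sequence) for each uORF fully contained in the leader.

    `leader` is the 5' leader as DNA or RNA; `start_codons` can be extended
    with near-cognate codons such as "CTG" to include non-AUG starts.
    uORFs that run past the end of the leader (overlapping the main ORF)
    are not reported in this simplified sketch.
    """
    leader = leader.upper().replace("U", "T")
    uorfs = []
    for i in range(len(leader) - 2):
        if leader[i:i + 3] not in start_codons:
            continue
        # Walk downstream codon by codon until the first in-frame stop.
        for j in range(i + 3, len(leader) - 2, 3):
            if leader[j:j + 3] in STOPS:
                uorfs.append((i, j + 3, leader[i:j + 3]))
                break
    return uorfs


toy_leader = "GGATGAAATAGCCATGCCCTGACC"  # made-up sequence, not ABCE1
for start, end, seq in find_uorfs(toy_leader):
    print(start, end, seq)
```

Sequence scanning of this kind only nominates candidates; whether a given uORF is translated and repressive is what the ribosome profiling and mutational analysis described above establish.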



2021 ◽  
Vol 12 (1) ◽  
Author(s):  
David S. M. Lee ◽  
Joseph Park ◽  
Andrew Kromer ◽  
Aris Baras ◽  
Daniel J. Rader ◽  
...  

Abstract Ribosome profiling has uncovered pervasive translation in non-canonical open reading frames; however, the biological significance of this phenomenon remains unclear. Using genetic variation from 71,702 human genomes, we assess patterns of selection in translated upstream open reading frames (uORFs) in 5′ UTRs. We show that uORF variants introducing new stop codons, or strengthening existing stop codons, are under strong negative selection comparable to protein-coding missense variants. Using these variants, we map and validate gene-disease associations in two independent biobanks containing exome sequencing from 10,900 and 32,268 individuals, respectively, and elucidate their impact on protein expression in human cells. Our results suggest translation-disrupting mechanisms relating uORF variation to reduced protein expression, and demonstrate that translation at uORFs is genetically constrained in 50% of human genes.



2021 ◽  
Author(s):  
Yanyi Jiang ◽  
Xiaofan Chen ◽  
Wei Zhang

Abstract In the RNA field, the demarcation between coding and non-coding has been renegotiated by the recent discovery of occasionally translated circular RNAs (circRNAs). Although they lack a 5′ cap structure, circRNAs can be translated cap-independently. Complementary intron-mediated overexpression is one of the most widely used methods in circRNA research, but it has drawn persistent skepticism for its poorly defined mechanism and latent coexistent side products. In this study, leveraging such a circRNA overexpression system, we have interrogated the protein-coding potential of 30 human circRNAs containing infinite open reading frames in HEK293T cells. Surprisingly, pervasive translation signals are detected by immunoblotting. However, intensive mutagenesis reveals that numerous translation signals are generated independently of circRNA synthesis. We have developed a dual tag strategy to isolate translation noise and directly demonstrate that the fallacious translation signals originate from cryptically spliced linear transcripts. The concomitant linear RNA byproducts, presumably concatemers, can be translated to produce pseudo rolling circle translation signals, and can include the backsplicing junction (BSJ), which disqualifies BSJ-based evidence for circRNA translation. We also find that non-AUG start codons may engage in the translation initiation of circRNAs. Taken together, our systematic evaluation sheds light on heterogeneous translational outputs from circRNA overexpression vectors and comes with a caveat that the ectopic overexpression technique necessitates an extremely rigorous control setup in circRNA translation and functional investigation.



2020 ◽  
Vol 6 (21) ◽  
pp. eaaz2059 ◽  
Author(s):  
Liman Niu ◽  
Fangzhou Lou ◽  
Yang Sun ◽  
Libo Sun ◽  
Xiaojie Cai ◽  
...  

Many annotated long noncoding RNAs (lncRNAs) harbor predicted short open reading frames (sORFs), but the coding capacities of these sORFs and the functions of the resulting micropeptides remain elusive. Here, we report that human lncRNA MIR155HG encodes a 17–amino acid micropeptide, which we termed miPEP155 (P155). MIR155HG is highly expressed by inflamed antigen-presenting cells, leading to the discovery that P155 interacts with the adenosine 5′-triphosphate binding domain of heat shock cognate protein 70 (HSC70), a chaperone required for antigen trafficking and presentation in dendritic cells (DCs). P155 modulates major histocompatibility complex class II–mediated antigen presentation and T cell priming by disrupting the HSC70-HSP90 machinery. Exogenously injected P155 improves two classical mouse models of DC-driven autoinflammation. Collectively, we demonstrate the endogenous existence of a micropeptide encoded by a transcript annotated as “non-protein coding” and characterize a micropeptide as a regulator of antigen presentation and a suppressor of inflammatory diseases.



2020 ◽  
Vol 40 (6) ◽  
Author(s):  
Corrine Corrina R. Hartford ◽  
Ashish Lal

ABSTRACT Recent advancements in genetic and proteomic technologies have revealed that more of the genome encodes proteins than originally thought possible. Specifically, some putative long noncoding RNAs (lncRNAs) have been misannotated as noncoding. Numerous lncRNAs have been found to contain short open reading frames (sORFs) which have been overlooked because of their small size. Many of these sORFs encode small proteins or micropeptides with fundamental biological importance. These micropeptides can aid in diverse processes, including cell division, transcription regulation, and cell signaling. Here we discuss strategies for establishing the coding potential of putative lncRNAs and describe various functions of known micropeptides.



2020 ◽  
Vol 36 (6-7) ◽  
pp. 675-677
Author(s):  
Bertrand Jordan

A systematic search for non-conventional open reading frames in human DNA reveals a large number of small ORFs encoding peptides generally smaller than 100 amino acids. These ORFs are transcribed and translated into small proteins, which are demonstrated to have functional significance by bulk CRISPR inactivation. Evidence is also found for bicistronic mRNAs including such a small ORF upstream of a canonical coding sequence. These findings add a new facet to our understanding of biological processes.



1998 ◽  
Vol 62 (3) ◽  
pp. 985-1019 ◽  
Author(s):  
Kenneth E. Rudd

SUMMARY A physical map, EcoMap10, of the now completely sequenced Escherichia coli chromosome is presented. Calculated genomic positions for the eight restriction enzymes BamHI, HindIII, EcoRI, EcoRV, BglI, KpnI, PstI, and PvuII are depicted. Both sequenced and unsequenced Kohara/Isono miniset clones are aligned to this calculated restriction map. DNA sequence searches identify the precise locations of insertion sequence elements and repetitive extragenic palindrome clusters. EcoGene10, a revised set of genes and functionally uncharacterized open reading frames (ORFs), is also depicted on EcoMap10. The complete set of unnamed ORFs in EcoGene10 are assigned provisional names beginning with the letter “y” by using a systematic nomenclature.



1991 ◽  
Vol 11 (9) ◽  
pp. 4306-4313 ◽  
Author(s):  
B A Arrick ◽  
A L Lee ◽  
R L Grendell ◽  
R Derynck

We have cloned and sequenced the 5' untranslated region of the transforming growth factor-beta 3 (TGF-beta 3) mRNA as well as the adjacent genomic sequence. S1 nuclease analysis identified a single transcription start site. We have thus determined that the 5' untranslated region is about 1.1 kb long and contains 11 open reading frames. In vitro translation of the TGF-beta 3 precursor coding sequence was markedly inhibited by the presence of the 5' untranslated region. Similarly, when the 5' untranslated region of TGF-beta 3 was introduced upstream of the coding sequence of chloramphenicol acetyltransferase, in vitro translation was inhibited. Furthermore, upon transfection into 293 cells, chloramphenicol acetyltransferase expression was inhibited by the 5' untranslated region of TGF-beta 3. The degree of translational inhibition was inversely proportional to the amount of transfected DNA. Mutation analysis implicated multiple segments of the 5' untranslated region as contributing to the inhibitory effect. Deletion of much of the 5'-most 640 nucleotides, including 8 of the 11 upstream ATGs, relieved much but not all of the inhibitory influence of the 5' untranslated region of TGF-beta 3 mRNA. The two upstream open reading frames closest to the initiator codon for the TGF-beta 3 coding sequence also decreased translational efficiency, since mutation of either ATG resulted in increased translation. Transfection results with T47-D cells, a cell line which expresses TGF-beta 3 mRNA, were similar to those obtained with the 293 cell line. Thus, TGF-beta 3 mRNA is a recent example of an expanding group of growth-related mRNAs in which the 5' untranslated region contains upstream open reading frames and other sequences which inhibit translation.



2004 ◽  
Vol 78 (20) ◽  
pp. 11187-11197 ◽  
Author(s):  
Lisa M. Kattenhorn ◽  
Ryan Mills ◽  
Markus Wagner ◽  
Alexandre Lomsadze ◽  
Vsevolod Makeev ◽  
...  

ABSTRACT Proteins associated with the murine cytomegalovirus (MCMV) viral particle were identified by a combined approach of proteomic and genomic methods. Purified MCMV virions were dissociated by complete denaturation and subjected to either separation by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and in-gel digestion or treated directly by in-solution tryptic digestion. Peptides were separated by nanoflow liquid chromatography and analyzed by tandem mass spectrometry (LC-MS/MS). The MS/MS spectra obtained were searched against a database of MCMV open reading frames (ORFs) predicted to be protein coding by an MCMV-specific version of the gene prediction algorithm GeneMarkS. We identified 38 proteins from the capsid, tegument, glycoprotein, replication, and immunomodulatory protein families, as well as 20 genes of unknown function. Observed irregularities in coding potential suggested possible sequence errors in the 3′-proximal ends of m20 and M31. These errors were experimentally confirmed by sequencing analysis. The MS data further indicated the presence of peptides derived from the unannotated ORFs ORFc225441-226898 (m166.5) and ORF105932-106072. Immunoblot experiments confirmed expression of m166.5 during viral infection.


