overlapping reading frames
Recently Published Documents


TOTAL DOCUMENTS: 43 (FIVE YEARS: 2)

H-INDEX: 21 (FIVE YEARS: 0)

2021 ◽  
Vol 17 (10) ◽  
pp. e1009475
Author(s):  
Antoine L. Decrulle ◽  
Antoine Frénoy ◽  
Thomas A. Meiller-Legrand ◽  
Aude Bernheim ◽  
Chantal Lotton ◽  
...  

Evolution is often an obstacle to the engineering of stable biological systems due to the selection of mutations inactivating costly gene circuits. Gene overlaps induce important constraints on sequences and their evolution. We show that these constraints can be harnessed to increase the stability of costly genes by purging loss-of-function mutations. We combine computational and synthetic biology approaches to rationally design an overlapping reading frame expressing an essential gene within an existing gene to protect. Our algorithm succeeded in creating overlapping reading frames in 80% of E. coli genes. Experimentally, scoring mutations in both genes of such an overlapping construct, we found that a significant fraction of mutations impacting the gene to protect have a deleterious effect on the essential gene. Such an overlap thus protects a costly gene from removal by natural selection by associating the benefit of this removal with a larger or even lethal cost. In our synthetic constructs, the overlap converts many of the possible mutants into evolutionary dead-ends, reducing the evolutionary potential of the system and thus increasing its stability over time.
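
The constraint described in this abstract can be illustrated with a toy example. The sketch below is not the authors' design algorithm; it only shows, under the standard genetic code, how one nucleotide substitution in an overlapping region changes the protein read in two frames at once. The sequence and the helper names `translate` and `point_mutant` are hypothetical and chosen for illustration only.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str, offset: int = 0) -> str:
    """Translate complete codons of seq, starting at the given offset."""
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(offset, len(seq) - 2, 3)
    )


def point_mutant(seq: str, pos: int, base: str) -> str:
    """Return seq with the nucleotide at 0-based position pos replaced."""
    return seq[:pos] + base + seq[pos + 1:]


# Hypothetical toy sequence read in frame +0 and in frame +1.
wild_type = "ATGGCATGA"
mutant = point_mutant(wild_type, 4, "T")  # C -> T at position 4

for frame in (0, 1):
    print(frame, translate(wild_type, frame), "->", translate(mutant, frame))
```

In this toy case the single substitution alters the amino acid sequence in both frames, which is the kind of coupling the abstract relies on: a mutation that inactivates one gene can also damage the gene overlapping it.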



2021 ◽  
Author(s):  
Laura Munoz-Baena ◽  
Art Poon

Gene overlap occurs when two or more genes are encoded by the same nucleotides. This phenomenon is found in all taxonomic domains, but is particularly common in viruses, where it may increase the information content of compact genomes or influence the creation of new genes. Here we report a global comparative study of overlapping reading frames (OvRFs) of 12,609 virus reference genomes in the NCBI database. We retrieved metadata associated with all annotated reading frames in each genome record to calculate the number, length, and frameshift of OvRFs. Our results show that while the number of OvRFs increases with genome length, they tend to be shorter in longer genomes. The majority of overlaps involve +2 frameshifts, predominantly found in dsDNA viruses. However, the longest overlaps involve no shift in reading frame (+0), increasing the selective burden of the same nucleotide positions within codons, instead of exposing additional sites to purifying selection. Next, we develop a new graph-based representation of the distribution of OvRFs among the reading frames of genomes in a given virus family. In the absence of an unambiguous partition of reading frames by homology at this taxonomic level, we used an alignment-free k-mer based approach to cluster protein coding sequences by similarity. We connect these clusters with two types of directed edges to indicate (1) that constituent reading frames are adjacent in one or more genomes, and (2) that the reading frames overlap. These adjacency graphs not only provide a natural visualization scheme, but also a novel statistical framework for analyzing the effects of gene- and genome-level attributes on the frequencies of overlaps.
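
As a rough illustration of the per-pair quantities mentioned above (overlap length and frameshift), the following sketch computes both from reading-frame coordinates. It is a minimal sketch, not the authors' pipeline: it assumes both frames lie on the same strand, uses 0-based half-open coordinates, and defines the shift as the offset of the second frame's start relative to the first, modulo 3. The type `ReadingFrame` and function `overlap` are hypothetical names.

```python
from typing import NamedTuple, Optional, Tuple


class ReadingFrame(NamedTuple):
    start: int  # 0-based position of the first nucleotide of the first codon
    end: int    # exclusive end position


def overlap(a: ReadingFrame, b: ReadingFrame) -> Optional[Tuple[int, int]]:
    """Return (overlap length in nt, frameshift of b relative to a), or None.

    Assumption: both frames are on the same strand, so the shift is
    simply the difference of start positions modulo 3 (+0, +1 or +2).
    """
    lo = max(a.start, b.start)
    hi = min(a.end, b.end)
    if hi <= lo:
        return None  # adjacent or disjoint frames do not overlap
    shift = (b.start - a.start) % 3
    return hi - lo, shift


# Hypothetical coordinates: 10 nt shared, second frame shifted by +2.
print(overlap(ReadingFrame(0, 30), ReadingFrame(20, 50)))
```

Frames on opposite strands would need a separate convention for the shift, which this sketch deliberately leaves out.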



2019 ◽  
Author(s):  
Carol Smith ◽  
Jill G. Canestrari ◽  
Jing Wang ◽  
Keith M. Derbyshire ◽  
Todd A. Gray ◽  
...  

ABSTRACT ORF boundaries in bacterial genomes have largely been drawn by gene prediction algorithms. These algorithms often fail to predict ORFs with non-canonical features. Recent developments in genome-scale mapping of translation have facilitated the empirical identification of ORFs. Here, we use ribosome profiling approaches to map initiating and elongating ribosomes in Mycobacterium tuberculosis. Thus, we identify over 1,000 novel ORFs, revealing that much of the genome encodes proteins in overlapping reading frames, and/or on both strands. Most of the novel ORFs are short (sORFs), impeding their identification by traditional methods. The strong codon bias that characterizes annotated mycobacterial ORFs is not evident in the aggregate novel sORFs; hence most are unlikely to encode functional proteins. Our data suggest that bacterial transcriptomes are subject to pervasive translation. We speculate that the inefficiency of expressing spurious sORFs may be offset by positive contributions to M. tuberculosis biology through activities of a small subset.



PeerJ ◽  
2019 ◽  
Vol 6 ◽  
pp. e6176 ◽  
Author(s):  
Mikk Puustusmaa ◽  
Aare Abroi

Identifying cis-acting elements and understanding regulatory mechanisms of a gene is crucial to fully understand the molecular biology of an organism. In general, it is difficult to identify previously uncharacterised cis-acting elements with an unknown consensus sequence. The task is especially problematic with viruses containing regions of limited or no similarity to other previously characterised sequences. Fortunately, the fast increase in the number of sequenced genomes allows us to detect some of these elusive cis-elements. In this work, we introduce a web-based tool called cRegions. It was developed to identify regions within a protein-coding sequence where the conservation in the amino acid sequence is caused by the conservation in the nucleotide sequence. The cRegion can be the first step in discovering novel cis-acting sequences from diverged protein-coding genes. The results can be used as a basis for future experimental analysis. We applied cRegions on the non-structural and structural polyproteins of alphaviruses as an example and successfully detected all known cis-acting elements. In this publication and in previous work, we have shown that cRegions is able to detect a wide variety of functional elements in DNA and RNA viruses. These functional elements include splice sites, stem-loops, overlapping reading frames, internal promoters, ribosome frameshifting signals and other embedded elements with yet unknown function. The cRegions web tool is available at http://bioinfo.ut.ee/cRegions/.



2015 ◽  
Vol 31 (10) ◽  
pp. 947-947 ◽  
Author(s):  
Christopher Monit ◽  
Richard A. Goldstein ◽  
Greg Towers ◽  
Stéphane Hué


2013 ◽  
Vol 9 (1) ◽  
pp. 20120396 ◽  
Author(s):  
Jasna Lalić ◽  
Santiago F. Elena

How, and to what extent, does the environment influence the way mutations interact? Do environmental changes affect both the sign and the magnitude of epistasis? Are there any correlations between environments in the variability, sign or magnitude of epistasis? Very few studies have tackled these questions. Here, we addressed them in the context of viral emergence. Most emerging viruses are RNA viruses with small genomes, overlapping reading frames and multifunctional proteins for which epistasis is abundant. Understanding the effect of host species on the sign and magnitude of epistasis will provide insights into the evolutionary ecology of infectious diseases and the predictability of viral emergence.



2013 ◽  
Vol 2013 ◽  
pp. 1-7
Author(s):  
Debnath Bhattacharyya ◽  
Bijoy Kumar Mandal ◽  
Tai-hoon Kim

We design an algorithm for a bioengine. The program enables searching for optimal alignments between two sequences: the host sequence (normal plant) and the query sequence (virus). Searching for homologues has become a routine operation on biological sequences; here it is performed in a 4 × 4 combination with different subsequence lengths (word sizes). The program takes advantage of the high degree of homology between such sequences to construct an alignment of the matching regions. Its main aim is to detect overlapping reading frames. The program also makes it possible to select highly infected clones by finding the highest-matching regions with minimal gap or mismatch zones, and to find unique virus clone matches. It is a small, portable, interactive, front-end program intended to find the regions of matching between the host sequence and query subsequences. All operations are carried out in a fraction of a second, depending on the required task and on the sequence length.



2009 ◽  
Vol 83 (22) ◽  
pp. 11996-12001 ◽  
Author(s):  
Yuichiro Nakatsu ◽  
Makoto Takeda ◽  
Masaharu Iwasaki ◽  
Yusuke Yanagi

ABSTRACT The P, V, and C proteins of measles virus are encoded in overlapping reading frames of the P gene, which makes it difficult to analyze the functions of the individual proteins in the context of virus infection. We established a system to analyze the C protein independently from the P and V proteins by placing its gene in an additional transcription unit between the H and L genes. Analyses with this system indicated that a highly attenuated Edmonston lineage vaccine strain encodes a fully functional C protein, and the P and/or V protein is involved in the attenuated phenotype.



2009 ◽  
Vol 83 (20) ◽  
pp. 10719-10736 ◽  
Author(s):  
Corinne Rancurel ◽  
Mahvash Khosravi ◽  
A. Keith Dunker ◽  
Pedro R. Romero ◽  
David Karlin

ABSTRACT It is widely assumed that new proteins are created by duplication, fusion, or fission of existing coding sequences. Another mechanism of protein birth is provided by overlapping genes. They are created de novo by mutations within a coding sequence that lead to the expression of a novel protein in another reading frame, a process called “overprinting.” To investigate this mechanism, we have analyzed the sequences of the protein products of manually curated overlapping genes from 43 genera of unspliced RNA viruses infecting eukaryotes. Overlapping proteins have a sequence composition globally biased toward disorder-promoting amino acids and are predicted to contain significantly more structural disorder than nonoverlapping proteins. By analyzing the phylogenetic distribution of overlapping proteins, we were able to confirm that 17 of these had been created de novo and to study them individually. Most proteins created de novo are orphans (i.e., restricted to one species or genus). Almost all are accessory proteins that play a role in viral pathogenicity or spread, rather than proteins central to viral replication or structure. Most proteins created de novo are predicted to be fully disordered and have a highly unusual sequence composition. This suggests that some viral overlapping reading frames encoding hypothetical proteins with highly biased composition, often discarded as noncoding, might in fact encode proteins. Some proteins created de novo are predicted to be ordered, however, and whenever a three-dimensional structure of such a protein has been solved, it corresponds to a fold previously unobserved, suggesting that the study of these proteins could enhance our knowledge of protein space.


