Stepwise Evolution and Exceptional Conservation of ORF1a/b Overlap in Coronaviruses

Author(s):  
Han Mei ◽  
Sergei Kosakovsky Pond ◽  
Anton Nekrutenko

Abstract The programmed frameshift element (PFE) rerouting translation from ORF1a to ORF1b is essential for propagation of coronaviruses. The overlap between the two reading frames, a slippery sequence, and an ensemble of secondary structure elements place severe constraints on this region, as most possible nucleotide substitutions may disrupt one or more of these features. Here we performed a comparative analysis of all coronaviral genomic data available to date to demonstrate exceptional conservation and detect signatures of selection within the PFE region.

2021 ◽  
Author(s):  
Han Mei ◽  
Anton Nekrutenko

The programmed frameshift element (PFE) rerouting translation from ORF1a to ORF1b is essential for propagation of coronaviruses. A combination of genomic features that make up the PFE--the overlap between the two reading frames, a slippery sequence, as well as an ensemble of complex secondary structure elements--puts severe constraints on this region, as most possible nucleotide substitutions may disrupt one or more of these elements. The vast amount of SARS-CoV-2 sequencing data generated within the past year provides an opportunity to assess evolutionary dynamics of the PFE in great detail. Here we performed a comparative analysis of all coronaviral genomic data available to date. We show that the overlap between ORF1a and ORF1b evolved as a set of discrete 7, 16, 22, 25, and 31 nucleotide stretches with a well-defined phylogenetic specificity. We further examined sequencing data from over 350,000 complete genomes and 55,000 raw read datasets to demonstrate exceptional conservation of the PFE region.
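As an illustration of why the overlap region is so constrained, the following minimal Python sketch translates a toy sequence once in the original frame and once with a -1 frameshift at a slippery heptamer. It is not the authors' analysis pipeline. The sequence `TOY`, the function names, and the choice of `TTTAAAC` as the slippery heptamer are assumptions made for this example only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate seq from its first nucleotide until a stop codon or the end."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def translate_with_minus_one_shift(seq, slippery="TTTAAAC"):
    """Translate seq, slipping back one nucleotide after the slippery heptamer.

    Assumes the heptamer X_XXY_YYZ sits so that XXY and YYZ are codons of
    the original frame; after the slip, the final nucleotide Z is read again
    as the first position of the next codon.
    """
    site = seq.find(slippery)
    if site == -1 or (site + 1) % 3 != 0:
        raise ValueError("slippery heptamer not found in the expected frame")
    reread = site + len(slippery) - 1  # index of Z
    upstream = translate(seq[:reread + 1])   # original frame, through YYZ
    downstream = translate(seq[reread:])     # -1 frame, starting again at Z
    return upstream + downstream


# Hypothetical toy sequence: short leader, heptamer, then a short tail.
TOY = "ATGTC" + "TTTAAAC" + "GGGTTTGCGGTGTAAGTGCAGCC"

print(translate(TOY))                       # original frame ends at a stop codon
print(translate_with_minus_one_shift(TOY))  # shifted frame reads past that stop
```

In this toy case the same nucleotides downstream of the heptamer encode two different peptides, so a single substitution there can change both reading frames at once.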


Gene ◽  
1989 ◽  
Vol 82 (1) ◽  
pp. 65-75 ◽  
Author(s):  
Norman R. Pace ◽  
David K. Smith ◽  
Gary J. Olsen ◽  
Bryan D. James

2001 ◽  
Vol 69 (4) ◽  
pp. 2612-2620 ◽  
Author(s):  
Takeshi Haneda ◽  
Nobuhiko Okada ◽  
Noriko Nakazawa ◽  
Takatoshi Kawakami ◽  
Hirofumi Danbara

ABSTRACT The complete nucleotide sequence of pKDSC50, a large virulence plasmid from Salmonella enterica serovar Choleraesuis strain RF-1, has been determined. We identified 48 of the open reading frames (ORFs) encoded by the 49,503-bp molecule. pKDSC50 encodes a known virulence-associated operon, the spv operon, which is composed of genes essential for systemic infection by nontyphoidal Salmonella. Analysis of the genetic organization of pKDSC50 suggests that the plasmid is composed of several virulence-associated genes, which include the spvRABCD genes, plasmid replication and maintenance genes, and one insertion sequence element. A second virulence-associated region including the pef (plasmid-encoded fimbria) operon and rck (resistance to complement killing) gene, which has been identified on the virulence plasmid of S. enterica serovar Typhimurium, was absent. Two different replicon regions, similar to the RepFIIA and RepFIB replicons, were found. Both showed high similarity to those of the pO157 plasmid of enterohemorrhagic Escherichia coli O157:H7 and the enteropathogenic E. coli (EPEC) adherence factor plasmid harbored by EPEC strain B171 (O111:NM), as well as the virulence plasmids of Salmonella serovars Typhimurium and Enteritidis. Comparative analysis of the nucleotide sequences of the 50-kb virulence plasmid of serovar Choleraesuis and the 94-kb virulence plasmid of serovar Typhimurium revealed that 47 out of 48 ORFs of the virulence plasmid of serovar Choleraesuis are highly homologous to the corresponding ORFs of the virulence plasmid of serovar Typhimurium, suggesting a common ancestry.


Proteins are made up of basic units called amino acids, which are held together by bonds, namely hydrogen and ionic bonds. The way in which the amino acids are sequenced has been categorized into two-dimensional and three-dimensional structures. The main advantage of predicting secondary structure is to produce tertiary structure likelihoods that are in great demand for continuous detection of proteins. This paper reviews the different methods adopted for predicting protein secondary structure and provides a comparative analysis of the accuracies obtained from various input datasets [1].


2018 ◽  
Author(s):  
Yang Yang ◽  
Quanquan Gu ◽  
Yang Zhang ◽  
Takayo Sasaki ◽  
Julianna Crivello ◽  
...  

Summary A large amount of multi-species functional genomic data from high-throughput assays is becoming available to help understand the molecular mechanisms for phenotypic diversity across species. However, continuous-trait probabilistic models, which are key to such comparative analysis, remain underexplored. Here we develop a new model, called phylogenetic hidden Markov Gaussian processes (Phylo-HMGP), to simultaneously infer heterogeneous evolutionary states of functional genomic features in a genome-wide manner. Both simulation studies and real data application demonstrate the effectiveness of Phylo-HMGP. Importantly, we applied Phylo-HMGP to analyze a new cross-species DNA replication timing (RT) dataset from the same cell type in five primate species (human, chimpanzee, orangutan, gibbon, and green monkey). We demonstrate that our Phylo-HMGP model enables discovery of genomic regions with distinct evolutionary patterns of RT. Our method provides a generic framework for comparative analysis of multi-species continuous functional genomic signals to help reveal regions with conserved or lineage-specific regulatory roles.


2021 ◽  
Author(s):  
Cooper Alastair Grace ◽  
Sarah Forrester ◽  
Vladimir Costa Silva ◽  
Aleksander Aare ◽  
Hannah Kilford ◽  
...  

Abstract The Leishmania donovani species complex comprises the causative agents of visceral leishmaniasis, which causes 20,000-40,000 fatalities a year. Here, we conduct a screen for balancing selection in this species complex. We sequence 93 isolates of L. infantum from Brazil and use 387 publicly available L. donovani and L. infantum genomes to describe the global diversity of this species complex. We identify five genetically distinct populations that are sufficiently represented by genomic data to search for signatures of selection. We show that multiple metrics identify genes with robust signatures of balancing selection. We produce a curated set of 19 genes with robust signatures, including zeta toxin, nodulin-like and flagellum attachment proteins. Candidate genes were generally not shared between populations, consistent with divergent rather than long-term balancing selection in these species. This study highlights the extent of genetic divergence between L. donovani complex parasites and provides candidate genes for further study.

