Conserved recombination patterns across coronavirus subgenera

Recombination contributes to the genetic diversity found in coronaviruses and is known to be a prominent mechanism whereby they evolve. It is apparent, both from controlled experiments and in genome sequences sampled from nature, that patterns of recombination in coronaviruses are non-random and that this is likely attributable to a combination of sequence features that favour the occurrence of recombination breakpoints at specific genomic sites, and selection disfavouring the survival of recombinants within which favourable intra-genome interactions have been disrupted. Here we leverage available whole-genome sequence data for six coronavirus subgenera to identify specific patterns of recombination that are conserved between multiple subgenera and then identify the likely factors that underlie these conserved patterns. Specifically, we confirm the non-randomness of recombination breakpoints across all six tested coronavirus subgenera, locate conserved recombination hot- and cold-spots, and determine that the locations of transcriptional regulatory sequences are likely major determinants of conserved recombination breakpoint hot-spot locations. We find that while the locations of recombination breakpoints are not uniformly associated with degrees of nucleotide sequence conservation, they display significant tendencies in multiple coronavirus subgenera to occur in low guanine-cytosine content genome regions, in non-coding regions, at the edges of genes, and at sites within the Spike gene that are predicted to be minimally disruptive of Spike protein folding. While it is apparent that sequence features such as transcriptional regulatory sequences are likely major determinants of where the template-switching events that yield recombination breakpoints most commonly occur, it is evident that selection against misfolded recombinant proteins also strongly impacts observable recombination breakpoint distributions in coronavirus genomes sampled from nature.
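One of the associations reported above, that breakpoints tend to occur in low guanine-cytosine content regions, can be illustrated with a simple permutation test. The sketch below is not the authors' pipeline: the function names, the window half-width, and the uniform-placement null model are all assumptions made for illustration. It compares the mean GC content around observed breakpoints with that around randomly placed positions.

```python
import random


def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq.upper()) / len(seq)


def window_gc(genome: str, pos: int, half_width: int) -> float:
    """GC fraction in a window centred on pos, clipped at the genome ends."""
    start = max(0, pos - half_width)
    end = min(len(genome), pos + half_width + 1)
    return gc_fraction(genome[start:end])


def low_gc_permutation_p(genome, breakpoints, half_width=50, n_perm=1000, seed=0):
    """Return (observed mean GC, one-sided p-value).

    Null model (an assumption): breakpoints are placed uniformly at random.
    The p-value is the share of random placements whose mean window GC
    is at most the observed mean.
    """
    rng = random.Random(seed)
    observed = sum(window_gc(genome, p, half_width) for p in breakpoints) / len(breakpoints)
    hits = 0
    for _ in range(n_perm):
        sample = [rng.randrange(len(genome)) for _ in breakpoints]
        mean_gc = sum(window_gc(genome, p, half_width) for p in sample) / len(sample)
        if mean_gc <= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an exact zero.
    return observed, (hits + 1) / (n_perm + 1)
```

A real analysis would also need to account for uncertainty in inferred breakpoint positions and for regional variation in sequence diversity, which this uniform null ignores.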

Characterizing transcriptional regulatory sequences in coronaviruses and their role in recombination

10.1101/2020.06.21.163410 ◽

2020 ◽

Cited By ~ 2

Author(s):

Yiyan Yang ◽

Wei Yan ◽

A. Brantley Hall ◽

Xiaofang Jiang

Keyword(s):

Secondary Structure ◽

RNA Viruses ◽

Leader Sequence ◽

Template Switching ◽

Regulatory Sequences ◽

Core Sequence ◽

Recombination Hotspots ◽

Transcriptional Regulatory ◽

Recombination Breakpoints

Abstract Novel coronaviruses, including SARS-CoV-2, SARS, and MERS, often originate from recombination events. The mechanism of recombination in RNA viruses is template switching. Coronavirus transcription also involves template switching at specific regions, called transcriptional regulatory sequences (TRS). It is hypothesized but not yet verified that TRS sites are prone to recombination events. Here, we developed a tool called SuPER to systematically identify TRS in coronavirus genomes and then investigated whether recombination is more common at TRS. We ran SuPER on 506 coronavirus genomes and identified 465 TRS-L and 3,509 TRS-B. We found that the TRS-L core sequence (CS) and the secondary structure of the leader sequence are generally conserved within coronavirus genera but different between genera. By examining the location of recombination breakpoints with respect to TRS-B CS, we observed that recombination hotspots are more frequently co-located with TRS-B sites than expected.

Characterizing Transcriptional Regulatory Sequences in Coronaviruses and Their Role in Recombination

Molecular Biology and Evolution ◽

10.1093/molbev/msaa281 ◽

2020 ◽

Author(s):

Yiyan Yang ◽

Wei Yan ◽

A Brantley Hall ◽

Xiaofang Jiang

Keyword(s):

Secondary Structure ◽

RNA Viruses ◽

Leader Sequence ◽

Template Switching ◽

Regulatory Sequences ◽

Core Sequence ◽

Recombination Hotspots ◽

Transcriptional Regulatory ◽

Recombination Breakpoints

Abstract Novel coronaviruses, including SARS-CoV-2, SARS, and MERS, often originate from recombination events. The mechanism of recombination in RNA viruses is template switching. Coronavirus transcription also involves template switching at specific regions, called transcriptional regulatory sequences (TRS). It is hypothesized but not yet verified that TRS sites are prone to recombination events. Here, we developed a tool called SuPER to systematically identify TRS in coronavirus genomes and then investigated whether recombination is more common at TRS. We ran SuPER on 506 coronavirus genomes and identified 465 TRS-L and 3,509 TRS-B. We found that the TRS-L core sequence (CS) and the secondary structure of the leader sequence are generally conserved within coronavirus genera but different between genera. By examining the location of recombination breakpoints with respect to TRS-B CS, we observed that recombination hotspots are more frequently colocated with TRS-B sites than expected.
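The co-location idea in this abstract can be sketched in a few lines. This is not how SuPER works (the abstract does not describe its algorithm); it is a minimal illustration that scans a genome for approximate matches to a given core sequence and then measures what fraction of breakpoints lie near a match. The function names, the mismatch tolerance, and the distance cut-off are hypothetical, and the core sequence is supplied by the caller.

```python
def find_core_sites(genome: str, core: str, max_mismatches: int = 0) -> list:
    """Return 0-based start positions where core matches genome
    with at most max_mismatches substitutions (no indels)."""
    genome = genome.upper()
    core = core.upper()
    sites = []
    for i in range(len(genome) - len(core) + 1):
        window = genome[i:i + len(core)]
        mismatches = sum(a != b for a, b in zip(window, core))
        if mismatches <= max_mismatches:
            sites.append(i)
    return sites


def fraction_near_sites(breakpoints, sites, max_distance: int) -> float:
    """Fraction of breakpoints within max_distance bases of any site start."""
    if not breakpoints:
        return 0.0
    near = sum(
        any(abs(bp - site) <= max_distance for site in sites)
        for bp in breakpoints
    )
    return near / len(breakpoints)
```

To judge whether an observed fraction is higher "than expected", it would have to be compared against a null distribution, for example the same fraction computed for randomly placed breakpoints.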

Shared Common Ancestry of Rodent Alphacoronaviruses Sampled Globally

Viruses ◽

10.3390/v11020125 ◽

2019 ◽

Vol 11 (2) ◽

pp. 125 ◽

Cited By ~ 7

Author(s):

Theocharis Tsoleridis ◽

Joseph Chappell ◽

Okechukwu Onianwa ◽

Denise Marston ◽

Anthony Fooks ◽

...

Keyword(s):

High Throughput Sequencing ◽

Full Genome Sequence ◽

Regulatory Sequences ◽

Spike Gene ◽

Transcriptional Regulatory ◽

Geographic Origins ◽

History Of ◽

The UK ◽

Host Jumping

The recent discovery of novel alphacoronaviruses (alpha-CoVs) in European and Asian rodents revealed that rodent coronaviruses (CoVs) sampled worldwide formed a discrete phylogenetic group within this genus. To determine the evolutionary history of rodent CoVs in more detail, particularly the relative frequencies of virus-host co-divergence and cross-species transmission, we recovered longer fragments of CoV genomes from previously discovered European rodent alpha-CoVs using a combination of PCR and high-throughput sequencing. Accordingly, the full genome sequence was retrieved from the UK rat coronavirus, along with partial genome sequences from the UK field vole and Poland-resident bank vole CoVs, and a short conserved ORF1b fragment from the French rabbit CoV. Genome and phylogenetic analysis showed that despite their diverse geographic origins, all rodent alpha-CoVs formed a single monophyletic group and shared similar features, such as the same gene constellations, a recombinant beta-CoV spike gene, and similar core transcriptional regulatory sequences (TRS). These data suggest that all rodent alpha-CoVs sampled so far originate from a single common ancestor, and that there has likely been a long-term association between alpha-CoVs and rodents. Despite this likely antiquity, the phylogenetic pattern of the alpha-CoVs was also suggestive of relatively frequent host-jumping among the different rodent species.

Faculty Opinions recommendation of Optimal algorithms for haplotype assembly from whole-genome sequence data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.13339986.14707085 ◽

2011 ◽

Author(s):

Alejandro Schaffer

Keyword(s):

Genome Sequence ◽

Sequence Data ◽

Whole Genome Sequence ◽

Whole Genome ◽

Optimal Algorithms ◽

Genome Sequence Data ◽

Haplotype Assembly

TIGER: inferring DNA replication timing from whole-genome sequence data

Bioinformatics ◽

10.1093/bioinformatics/btab166 ◽

2021 ◽

Cited By ~ 1

Author(s):

Amnon Koren ◽

Dashiell J Massey ◽

Alexa N Bracci

Keyword(s):

DNA Replication ◽

Genome Sequence ◽

Genomic DNA ◽

Sequence Data ◽

Replication Timing ◽

Whole Genome Sequence ◽

Supplementary Information ◽

Whole Genome ◽

Genome Sequence Data ◽

DNA Replication Timing

Abstract Motivation: Genomic DNA replicates according to a reproducible spatiotemporal program, with some loci replicating early in S phase while others replicate late. Despite being a central cellular process, DNA replication timing studies have been limited in scale due to technical challenges. Results: We present TIGER (Timing Inferred from Genome Replication), a computational approach for extracting DNA replication timing information from whole genome sequence data obtained from proliferating cell samples. The presence of replicating cells in a biological specimen leads to non-uniform representation of genomic DNA that depends on the timing of replication of different genomic loci. Replication dynamics can hence be observed in genome sequence data by analyzing DNA copy number along chromosomes while accounting for other sources of sequence coverage variation. TIGER is applicable to any species with a contiguous genome assembly and rivals the quality of experimental measurements of DNA replication timing. It provides a straightforward approach for measuring replication timing and can readily be applied at scale. Availability and Implementation: TIGER is available at https://github.com/TheKorenLab/TIGER. Supplementary information: Supplementary data are available at Bioinformatics online.
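The core observation, that earlier-replicating loci are over-represented in reads from proliferating cells, can be illustrated with a toy depth calculation. The sketch below is not TIGER's implementation or interface: the function names and the median normalisation are assumptions, and it makes no attempt at the correction for other sources of coverage variation that the abstract mentions.

```python
from statistics import median


def windowed_depth(read_starts, chrom_length: int, window: int) -> list:
    """Count read start positions (0-based) in fixed-size windows."""
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in read_starts:
        if 0 <= pos < chrom_length:
            counts[pos // window] += 1
    return counts


def relative_copy_number(counts) -> list:
    """Scale window counts by the median of the non-empty windows.

    Values above 1.0 would suggest over-represented (earlier replicating)
    windows, values below 1.0 under-represented (later replicating) ones.
    """
    nonzero = [c for c in counts if c > 0]
    if not nonzero:
        return [0.0] * len(counts)
    centre = median(nonzero)
    return [c / centre for c in counts]
```

For example, read starts `[0, 1, 2, 3, 10, 11, 20, 21, 35]` on a 40-base toy chromosome with 10-base windows give counts `[4, 2, 2, 1]` and relative values `[2.0, 1.0, 1.0, 0.5]`.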

Whole genome sequence data of Bacillus australimaris strain B28A, isolated from Marine Water in India

Data in Brief ◽

10.1016/j.dib.2021.107240 ◽

2021 ◽

pp. 107240

Author(s):

Wael Ali Mohammed Hadi ◽

Boby T Edwin ◽

A Jayakumaran Nair

Keyword(s):

Genome Sequence ◽

Sequence Data ◽

Marine Water ◽

Whole Genome Sequence ◽

Whole Genome ◽

Genome Sequence Data

Whole genome sequence data of Mycobacterium tuberculosis XDR strain, isolated from patient in Kazakhstan

Data in Brief ◽

10.1016/j.dib.2020.106416 ◽

2020 ◽

Vol 33 ◽

pp. 106416

Author(s):

Asset Daniyarov ◽

Askhat Molkenov ◽

Saule Rakhimova ◽

Ainur Akhmetova ◽

Zhannur Nurkina ◽

...

Keyword(s):

Mycobacterium Tuberculosis ◽

Genome Sequence ◽

Sequence Data ◽

Whole Genome Sequence ◽

Whole Genome ◽

Genome Sequence Data

Whole-genome sequence data suggests environmental adaptation of Ethiopian sheep populations

Genome Biology and Evolution ◽

10.1093/gbe/evab014 ◽

2021 ◽

Author(s):

Pamela Wiener ◽

Christelle Robert ◽

Abulgasim Ahbara ◽

Mazdak Salavati ◽

Ayele Abebe ◽

...

Keyword(s):

High Altitude ◽

Environmental Variables ◽

Large Scale ◽

Sequence Data ◽

Strong Association ◽

Environmental Adaptation ◽

Whole Genome Sequence ◽

Single Nucleotide Variants ◽

High Altitude Adaptation ◽

Altitude Adaptation

Abstract Great progress has been made over recent years in the identification of selection signatures in the genomes of livestock species. This work has primarily been carried out in commercial breeds for which the dominant selection pressures are associated with artificial selection. As agriculture and food security are likely to be strongly affected by climate change, a better understanding of environment-imposed selection on agricultural species is warranted. Ethiopia is an ideal setting to investigate environmental adaptation in livestock due to its wide variation in geo-climatic characteristics and the extensive genetic and phenotypic variation of its livestock. Here, we identified over three million single nucleotide variants across 12 Ethiopian sheep populations and applied landscape genomics approaches to investigate the association between these variants and environmental variables. Our results suggest that environmental adaptation for precipitation-related variables is stronger than that related to altitude or temperature, consistent with large-scale meta-analyses of selection pressure across species. The set of genes showing association with environmental variables was enriched for genes highly expressed in human blood and nerve tissues. There was also evidence of enrichment for genes associated with high-altitude adaptation, although no strong association was identified with hypoxia-inducible-factor (HIF) genes. One of the strongest altitude-related signals was for a collagen gene, consistent with previous studies of high-altitude adaptation. Several altitude-associated genes also showed evidence of adaptation with temperature, suggesting a relationship between responses to these environmental factors. These results provide a foundation to investigate further the effects of climatic variables on small ruminant populations.

Elucidating the genetic basis of an oligogenic birth defect using whole genome sequence data in a non-model organism, Bubalus bubalis

Scientific Reports ◽

10.1038/srep39719 ◽

2017 ◽

Vol 7 (1) ◽

Cited By ~ 10

Author(s):

Lynsey K. Whitacre ◽

Jesse L. Hoff ◽

Robert D. Schnabel ◽

Sara Albarella ◽

Francesca Ciotola ◽

...

Keyword(s):

Genome Sequence ◽

Birth Defect ◽

Genetic Basis ◽

Sequence Data ◽

Model Organism ◽

Bubalus Bubalis ◽

Whole Genome Sequence ◽

Whole Genome ◽

Genome Sequence Data

Phylogeny and Taxonomical Investigation of Trichoderma spp. from Indian Region of Indo-Burma Biodiversity Hot Spot Region with Special Reference to Manipur

BioMed Research International ◽

10.1155/2015/285261 ◽

2015 ◽

Vol 2015 ◽

pp. 1-21 ◽

Cited By ~ 8

Author(s):

Th. Kamala ◽

S. Indira Devi ◽

K. Chandradev Sharma ◽

K. Kennedy

Keyword(s):

Sequence Data ◽

Hot Spot ◽

Morphological Characteristics ◽

Indian Region ◽

Agricultural Practice ◽

Phenotypic Data ◽

Unique Region ◽

Seedborne Pathogens ◽

Fungal Phytopathogens ◽

Agroclimatic Zones

Towards assessing the genetic diversity and occurrence of Trichoderma species from the Indian region of the Indo-Burma Biodiversity hotspot, a total of 193 Trichoderma strains were isolated from cultivated soils of nine different districts of Manipur comprising 4 different agroclimatic zones. The isolates were grouped based on their morphological characteristics. ITS-RFLP of the rDNA region using three restriction enzymes (Mob1, Taq1, and Hinf1) showed interspecific variations among 65 isolates of Trichoderma. Based on ITS sequence data, a total of 22 different representative Trichoderma species were reported, and phylogenetic analysis showed 4 well-separated main clades in which T. harzianum was found to be the most prevalent species among all the Trichoderma spp. Combined molecular and phenotypic data led to the development of a taxonomy of all 22 Trichoderma spp., which were reported for the first time from this unique region. All these species were found to produce different extrolites and enzymes responsible for biocontrol activities against harmful fungal phytopathogens that hamper food production. These potential indigenous Trichoderma spp. can be targeted for the development of suitable bioformulations against soil- and seedborne pathogens in sustainable agricultural practice.
