Tandem Repeats in Bacillus: Unique Features and Taxonomic Distribution

2021 ◽  
Vol 22 (10) ◽  
pp. 5373
Author(s):  
Juan A. Subirana ◽  
Xavier Messeguer

Little is known about DNA tandem repeats across prokaryotes. We have recently described an enigmatic group of tandem repeats in bacterial genomes with a constant repeat size but variable sequence. These findings strongly suggest that tandem repeat size in some bacteria is under strong selective constraints. Here, we extend these studies and describe tandem repeats in a large set of Bacillus species. Some species have very few repeats, while others have a large number. Most tandem repeats have a constant repeat size (either 52 or 20–21 nt) but a variable sequence. We characterize these intriguing tandem repeats in detail. Individual species have several families of tandem repeats with the same repeat length and different sequences. This result is in strong contrast with eukaryotes, where tandem repeats of many sizes are found in any species. We discuss the possibility that they are transcribed as small RNA molecules. They may also be involved in the stabilization of the nucleoid through interaction with proteins. We also show that the distribution of tandem repeats in different species has taxonomic significance. The data we present for all tandem repeats and their families in these bacterial species will be useful for further genomic studies.
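
The idea of a tandem repeat with a fixed repeat size but an arbitrary sequence can be illustrated with a minimal sketch. It is not the authors' detection method: the function name `find_tandem_repeats`, its parameters, and the toy sequence are hypothetical, and only exact copies are detected, whereas real satellite searches usually tolerate mismatches.

```python
def find_tandem_repeats(seq, period, min_copies=2):
    """Return (start, end, copies) for maximal exact tandem repeats.

    A position i "matches" when seq[i] == seq[i + period]; a maximal run
    of matching positions marks a region that repeats with that period,
    whatever the repeated sequence is.
    """
    hits = []
    run_start = None
    n = len(seq)
    # Visit one position past the last comparable index so that a run
    # reaching the end of the sequence is still closed.
    for i in range(n - period + 1):
        match = i + period < n and seq[i] == seq[i + period]
        if match:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            span = i - run_start + period  # length of the repeated region
            copies = span // period        # complete copies only
            if copies >= min_copies:
                hits.append((run_start, run_start + span, copies))
            run_start = None
    return hits


# Toy example (hypothetical data): three copies of a 5-nt unit with flanks.
toy = "GG" + "ACGTA" * 3 + "TT"
print(find_tandem_repeats(toy, period=5))
```

Because the scan depends only on the period, the same call would report families with different sequences but the same repeat length. Note that regions with a shorter true period (for example a homopolymer run) also satisfy the test for multiples of that period, so a practical search would need to filter them.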

2020 ◽  
Vol 202 (21) ◽  
Author(s):  
Juan A. Subirana ◽  
Xavier Messeguer

ABSTRACT DNA tandem repeats, or satellites, are well described in eukaryotic species, but little is known about their prevalence across prokaryotes. Here, we performed the most complete characterization to date of satellites in bacteria. We identified 121,638 satellites from 12,233 fully sequenced and assembled bacterial genomes with a very uneven distribution. We also determined the families of satellites which have a related sequence. There are 85 genomes that are particularly satellite rich and contain several families of satellites of yet unknown function. Interestingly, we only found two main types of noncoding satellites, depending on their repeat sizes, 22/44 or 52 nucleotides (nt). An intriguing feature is the constant size of the repeats in the genomes of different species, whereas their sequences show no conservation. Individual species also have several families of satellites with the same repeat length and different sequences. This result is in marked contrast with previous findings in eukaryotes, where noncoding satellites of many sizes are found in any species investigated. We describe in greater detail these noncoding satellites in the spirochete Leptospira interrogans and in several bacilli. These satellites undoubtedly play a specific role in the species which have acquired them. We discuss the possibility that they represent binding sites for transcription factors not previously described or that they are involved in the stabilization of the nucleoid through interaction with proteins. IMPORTANCE We found an enigmatic group of noncoding satellites in 85 bacterial genomes with a constant repeat size but variable sequence. This pattern of DNA organization is unique and had not been previously described in bacteria. These findings strongly suggest that satellite size in some bacteria is under strong selective constraints and thus that satellites are very likely to play a fundamental role. 
We also provide a list and properties of all satellites in 12,233 genomes, which may be used for further genomic analysis.


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Chelsea A. Weitekamp ◽  
Allison Kvasnicka ◽  
Scott P. Keely ◽  
Nichole E. Brinkman ◽  
Xia Meng Howey ◽  
...  

Abstract Background Across taxa, animals with depleted intestinal microbiomes show disrupted behavioral phenotypes. Axenic (i.e., microbe-free) mice, zebrafish, and fruit flies exhibit increased locomotor behavior, or hyperactivity. The mechanism through which bacteria interact with host cells to trigger normal neurobehavioral development in larval zebrafish is not well understood. Here, we monoassociated zebrafish with either one of six different zebrafish-associated bacteria, mixtures of these host-associates, or with an environmental bacterial isolate. Results As predicted, the axenic cohort was hyperactive. Monoassociation with three different host-associated bacterial species, as well as with the mixtures, resulted in control-like locomotor behavior. Monoassociation with one host-associate and the environmental isolate resulted in the hyperactive phenotype characteristic of axenic larvae, while monoassociation with two other host-associated bacteria partially blocked this phenotype. Furthermore, we found an inverse relationship between the total concentration of bacteria per larvae and locomotor behavior. Lastly, in the axenic and associated cohorts, but not in the larvae with complex communities, we detected unexpected bacteria, some of which may be present as facultative predators. Conclusions These data support a growing body of evidence that individual species of bacteria can have different effects on host behavior, potentially related to their success at intestinal colonization. Specific to the zebrafish model, our results suggest that differences in the composition of microbes in fish facilities could affect the results of behavioral assays within pharmacological and toxicological studies.


Genetics ◽  
2000 ◽  
Vol 156 (2) ◽  
pp. 549-557 ◽  
Author(s):  
Anne J Welcker ◽  
Jacky de Montigny ◽  
Serge Potier ◽  
Jean-Luc Souciet

Abstract Chromosomal rearrangements, such as deletions, duplications, or Ty transposition, are rare events. We devised a method to select for such events as Ura+ revertants of a particular ura2 mutant. Among 133 Ura+ revertants, 14 were identified as the result of a deletion in URA2. Of seven classes of deletions, six had very short regions of identity at their junctions (from 7 to 13 bp long). This strongly suggests a nonhomologous recombination mechanism for the formation of these deletions. The total Ura+ reversion rate was increased 4.2-fold in a rad52Δ strain compared to the wild type, and the deletion rate was significantly increased. All the deletions selected in the rad52Δ context had microhomologies at their junctions. We propose two mechanisms to explain the occurrence of these deletions and discuss the role of microhomology stretches in the formation of fusion proteins.


mSystems ◽  
2020 ◽  
Vol 5 (1) ◽  
Author(s):  
Matthew R. Olm ◽  
Alexander Crits-Christoph ◽  
Spencer Diamond ◽  
Adi Lavy ◽  
Paula B. Matheus Carnevali ◽  
...  

ABSTRACT Longstanding questions relate to the existence of naturally distinct bacterial species and genetic approaches to distinguish them. Bacterial genomes in public databases form distinct groups, but these databases are subject to isolation and deposition biases. To avoid these biases, we compared 5,203 bacterial genomes from 1,457 environmental metagenomic samples to test for distinct clouds of diversity and evaluated metrics that could be used to define the species boundary. Bacterial genomes from the human gut, soil, and the ocean all exhibited gaps in whole-genome average nucleotide identities (ANI) near the previously suggested species threshold of 95% ANI. While genome-wide ratios of nonsynonymous and synonymous nucleotide differences (dN/dS) decrease until ANI values approach ∼98%, two methods for estimating homologous recombination approached zero at ∼95% ANI, supporting breakdown of recombination due to sequence divergence as a species-forming force. We evaluated 107 genome-based metrics for their ability to distinguish species when full genomes are not recovered. Full-length 16S rRNA genes were least useful, in part because they were underrecovered from metagenomes. However, many ribosomal proteins displayed both high metagenomic recoverability and species discrimination power. Taken together, our results verify the existence of sequence-discrete microbial species in metagenome-derived genomes and highlight the usefulness of ribosomal genes for gene-level species discrimination. IMPORTANCE There is controversy about whether bacterial diversity is clustered into distinct species groups or exists as a continuum. To address this issue, we analyzed bacterial genome databases and reports from several previous large-scale environment studies and identified clear discrete groups of species-level bacterial diversity in all cases. 
Genetic analysis further revealed that quasi-sexual reproduction via horizontal gene transfer is likely a key evolutionary force that maintains bacterial species integrity. We next benchmarked over 100 metrics to distinguish these bacterial species from each other and identified several genes encoding ribosomal proteins with high species discrimination power. Overall, the results from this study provide best practices for bacterial species delineation based on genome content and insight into the nature of bacterial species population genetics.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Mathys Grapotte ◽  
Manu Saraswat ◽  
Chloé Bessière ◽  
Christophe Menichelli ◽  
Jordan A. Ramilowski ◽  
...  

Abstract Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long-read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism.


Gene ◽  
2008 ◽  
Vol 410 (1) ◽  
pp. 18-25 ◽  
Author(s):  
Edit Kassai-Jáger ◽  
Csaba Ortutay ◽  
Gábor Tóth ◽  
Tibor Vellai ◽  
Zoltán Gáspári

2017 ◽  
Author(s):  
Marcus M. Dillon ◽  
Way Sung ◽  
Michael Lynch ◽  
Vaughn S. Cooper

ABSTRACT The causes and consequences of spatiotemporal variation in mutation rates remain to be explored in nearly all organisms. Here we examine relationships between local mutation rates and replication timing in three bacterial species whose genomes have multiple chromosomes: Vibrio fischeri, Vibrio cholerae, and Burkholderia cenocepacia. Following five evolution experiments with these bacteria conducted in the near-absence of natural selection, the genomes of clones from each lineage were sequenced and analyzed to identify variation in mutation rates and spectra. In lineages lacking mismatch repair, base-substitution mutation rates vary in a mirrored wave-like pattern on opposing replichores of the large chromosome of V. fischeri and V. cholerae, where concurrently replicated regions experience similar base-substitution mutation rates. The base-substitution mutation rates on the small chromosome are less variable in both species but are similar to those of the concurrently replicated regions of the large chromosome. Neither nucleotide composition nor frequency of nucleotide motifs differed among regions experiencing high and low base-substitution rates, which along with the inferred ~800 Kb wave period suggests that the source of the periodicity is not sequence-specific but rather a systematic process related to the cell cycle. These results support the notion that base-substitution mutation rates are likely to vary systematically across many bacterial genomes, which exposes certain genes to elevated deleterious mutational load.


2020 ◽  
Author(s):  
Mathys Grapotte ◽  
Manu Saraswat ◽  
Chloé Bessière ◽  
Christophe Menichelli ◽  
Jordan A. Ramilowski ◽  
...  

Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of Transcription Start Sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probed these unassigned TSSs and showed that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we developed Cap Trap RNA-seq, a technology which combines cap trapping and long-read MinION sequencing. We trained sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveiled the importance of STR surrounding sequences not only to distinguish STR classes, as defined by the repeated DNA motif, from one another, but also to predict their transcription. Excitingly, our models predicted that genetic variants linked to human diseases affect STR-associated transcription and correspond precisely to the key positions identified by our models to predict transcription. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism.


Genetics ◽  
1998 ◽  
Vol 148 (2) ◽  
pp. 839-850
Author(s):  
William P Young ◽  
Paul A Wheeler ◽  
Virginia H Coryell ◽  
Paul Keim ◽  
Gary H Thorgaard

Abstract We report the first detailed genetic linkage map of rainbow trout (Oncorhynchus mykiss). The segregation analysis was performed using 76 doubled haploid rainbow trout produced by androgenesis from a hybrid between the “OSU” and “Arlee” androgenetically derived homozygous lines. Four hundred and seventy-six markers segregated into 31 major linkage groups and 11 small groups (<5 markers/group). The minimum genome size is estimated to be 2627.5 cM in length. The sex-determining locus segregated to a distal position on one of the linkage groups. We analyzed the chromosomal distribution of three classes of markers: (1) amplified fragment length polymorphisms, (2) variable number of tandem repeats, and (3) markers obtained using probes homologous to the 5′ or 3′ end of salmonid-specific small interspersed nuclear elements. Many of the first class of markers were clustered in regions that appear to correspond to centromeres. The second class of markers were more telomeric in distribution, and the third class were intermediate. Tetrasomic inheritance, apparently related to the tetraploid ancestry of salmonid fishes, was detected at one simple sequence repeat locus and suggested by the presence of one extremely large linkage group that appeared to consist of two smaller groups linked at their tips. The double haploid rainbow trout lines and linkage map present a foundation for further genomic studies.


2020 ◽  
Author(s):  
Orestis Nousias ◽  
Federica Montesanto

Abstract Microbial communities play a fundamental role in association with marine algae; they are recognized to be actively involved in algal growth and morphogenesis. Porphyra purpurea is a red alga commonly found in the intertidal zone with a high economic value; indeed, several species belonging to the genus Porphyra are intensively cultivated in East Asian countries. Moreover, P. purpurea is widely used as a model species in different fields, mainly due to its peculiar life cycle. Despite that, little is known about the microbial community associated with this species. Here we report the microbial diversity associated with P. purpurea in four different localities (Ireland, Italy, United Kingdom and USA) through the analysis of eight metagenomic datasets obtained from the publicly available metagenomic nucleotide database (https://www.ebi.ac.uk/ena/). The metagenomic datasets were quality controlled with FastQC version 0.11.8, pre-processed with Trimmomatic version 0.39 and analysed with MetaPhlAn 3.0, with a reference database containing clade-specific marker genes from ~99,500 bacterial genomes, following the pan-genome approach, in order to identify the putative bacterial taxonomies and their relative abundances. Furthermore, we compared the results to the 16S rRNA metagenomic analysis pipeline of the MGnify database to evaluate the effectiveness of the two methods. Of the 43 bacterial species identified with MetaPhlAn 3.0, only 5 were shared with the MGnify results, and of the 21 genera, only 9 were shared. This approach highlighted the different taxonomic resolution of a 16S rRNA OTU-based method in contrast to the pan-genome approach deployed by MetaPhlAn 3.0.

