Flanker: a tool for comparative genomics of gene flanking regions

2021 ◽  
Author(s):  
William Matlock ◽  
Samuel Lipworth ◽  
Bede Constantinides ◽  
Timothy E.A. Peto ◽  
A. Sarah Walker ◽  
...  

Abstract Analysing the flanking sequences surrounding genes of interest is often highly relevant to understanding the role of mobile genetic elements (MGEs) in horizontal gene transfer, particularly for antimicrobial resistance genes. Here, we present Flanker, a Python package which performs alignment-free clustering of gene flanking sequences in a consistent format, allowing investigation of MGEs without prior knowledge of their structure. Flanker clusters flanking sequences based on Mash distances, allowing for easy comparison of similarity and the extent of this similarity across sequences. Additionally, Flanker can be flexibly parameterised to fine-tune outputs by characterising upstream and downstream regions separately and investigating variable lengths of flanking sequence. We apply Flanker to two recent datasets describing plasmid-associated carriage of important carbapenemase genes (blaOXA-48 and blaKPC-2/3) and show that it successfully identifies distinct clusters of flanking sequences (flank patterns), including both known and previously uncharacterised structural variants. We demonstrate that flank patterns are linked to geographical regions and carbapenem phenotypes, suggesting they may be useful as epidemiological markers. Flanker is freely available under an MIT license at https://github.com/wtmatlock/flanker. Data Summary NCBI accession numbers for all sequencing data used in this study are provided in Supplementary Table 1. The analysis performed in this manuscript can be reproduced in a binder environment provided on the Flanker GitHub page (https://github.com/wtmatlock/flanker).
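The idea of characterising upstream and downstream regions separately, over variable flank lengths, can be illustrated with a minimal sketch. This is not Flanker's own code or API: the function name `extract_flanks`, the 0-based half-open coordinates, and the assumption of a linear (non-circular) sequence are all choices made here for illustration.

```python
def extract_flanks(seq, gene_start, gene_end, window):
    """Return (upstream, downstream) flanks of at most `window` bases.

    Hypothetical helper: coordinates are 0-based and half-open, and the
    sequence is treated as linear, so flanks are clipped at the ends.
    """
    if not (0 <= gene_start < gene_end <= len(seq)):
        raise ValueError("gene coordinates out of range")
    if window < 0:
        raise ValueError("window must be non-negative")
    upstream = seq[max(0, gene_start - window):gene_start]
    downstream = seq[gene_end:gene_end + window]
    return upstream, downstream


# Toy contig with a "gene" at positions 8..12 (the GGGG run).
contig = "AAAACCCCGGGGTTTTACGT"
for window in (4, 10):
    print(window, extract_flanks(contig, 8, 12, window))
```

Varying `window` mimics investigating different lengths of flanking sequence; a real plasmid analysis would also need to handle circular sequences and gene orientation, which this sketch ignores.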

2021 ◽  
Vol 7 (9) ◽  
Author(s):  
William Matlock ◽  
Samuel Lipworth ◽  
Bede Constantinides ◽  
Timothy E. A. Peto ◽  
A. Sarah Walker ◽  
...  

Analysing the flanking sequences surrounding genes of interest is often highly relevant to understanding the role of mobile genetic elements (MGEs) in horizontal gene transfer, particularly for antimicrobial-resistance genes. Here, we present Flanker, a Python package that performs alignment-free clustering of gene flanking sequences in a consistent format, allowing investigation of MGEs without prior knowledge of their structure. These clusters, known as ‘flank patterns’ (FPs), are based on Mash distances, allowing for easy comparison of similarity across sequences. Additionally, Flanker can be flexibly parameterized to fine-tune outputs by characterizing upstream and downstream regions separately, and investigating variable lengths of flanking sequence. We apply Flanker to two recent datasets describing plasmid-associated carriage of important carbapenemase genes (bla OXA-48 and bla KPC-2/3) and show that it successfully identifies distinct clusters of FPs, including both known and previously uncharacterized structural variants. For example, Flanker identified four Tn4401 profiles that could not be sufficiently characterized using TETyper or MobileElementFinder, demonstrating the utility of Flanker for flanking-gene characterization. Similarly, using a large (n=226) European isolate dataset, we confirm findings from a previous smaller study demonstrating an association between Tn1999.2 and bla OXA-48 upregulation and demonstrate 17 FPs (compared to the 5 previously identified). More generally, the demonstration in this study that FPs are associated with geographical regions and antibiotic-susceptibility phenotypes suggests that they may be useful as epidemiological markers. Flanker is freely available under an MIT license at https://github.com/wtmatlock/flanker.
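Clustering on Mash distances can be sketched as follows. The distance formula D = -(1/k) ln(2j / (1 + j)), with j the Jaccard index of the two k-mer sets, is the one used by Mash; everything else here is an assumption for illustration. In particular, the Jaccard index is computed exactly over all k-mers rather than estimated from MinHash sketches, canonical (reverse-complement) k-mers are ignored, and the single-linkage threshold grouping in `cluster` is a stand-in and not necessarily the clustering Flanker performs.

```python
import math
from itertools import combinations


def kmers(seq, k):
    """Set of all k-mers in seq (no reverse-complement handling)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def mash_distance(a, b, k=5):
    """Mash distance from an exact Jaccard index (no MinHash sketching)."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        raise ValueError("both sequences are shorter than k")
    j = len(ka & kb) / len(union)
    if j == 0:
        return 1.0  # no shared k-mers: maximal distance
    d = -math.log(2 * j / (1 + j)) / k
    return min(1.0, max(0.0, d))


def cluster(seqs, threshold, k=5):
    """Single-linkage grouping: link any pair with distance <= threshold."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if mash_distance(seqs[i], seqs[j], k) <= threshold:
            parent[find(i)] = find(j)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(seqs))]


flanks = ["ACGTACGTAC", "ACGTACGTAC", "TTTTTTTTTT"]
print(cluster(flanks, threshold=0.1))
```

Each integer label plays the role of a flank pattern in this toy setting; the small `k` is only suitable for the short example strings.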


2021 ◽  
Vol 7 (7) ◽  
Author(s):  
Casper Jamin ◽  
Sien De Koster ◽  
Stefanie van Koeveringe ◽  
Dieter De Coninck ◽  
Klaas Mensaert ◽  
...  

Whole-genome sequencing (WGS) is becoming the de facto standard for bacterial typing and outbreak surveillance of resistant bacterial pathogens. However, interoperability for WGS of bacterial outbreaks is poorly understood. We hypothesized that harmonization of WGS for outbreak surveillance is achievable through the use of identical protocols for both data generation and data analysis. A set of 30 bacterial isolates, comprising various species belonging to the Enterobacteriaceae family and the Enterococcus genus, were selected and sequenced using the same protocol on the Illumina MiSeq platform in each individual centre. All generated sequencing data were analysed by one centre using BioNumerics (6.7.3) for (i) genotyping origins of replication and antimicrobial resistance genes, and (ii) core-genome multi-locus sequence typing (cgMLST) for Escherichia coli and Klebsiella pneumoniae and whole-genome multi-locus sequence typing (wgMLST) for all species. Additionally, a split k-mer analysis was performed to determine the number of SNPs between samples. A precision of 99.0% and an accuracy of 99.2% were achieved for genotyping. Based on cgMLST, a discrepant allele was called only in 2/27 and 3/15 comparisons between two genomes, for E. coli and K. pneumoniae, respectively. Based on wgMLST, the number of discrepant alleles ranged from 0 to 7 (average 1.6). For SNPs, this ranged from 0 to 11 SNPs (average 3.4). Furthermore, we demonstrate that using different de novo assemblers to analyse the same dataset introduces up to 150 SNPs, which surpasses most thresholds for bacterial outbreaks. This shows the importance of harmonization of data-processing surveillance of bacterial outbreaks. In summary, multi-centre WGS for bacterial surveillance is achievable, but only if protocols are harmonized.
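Counting discrepant alleles between two genomes typed under the same cgMLST or wgMLST scheme can be sketched as below. This is an illustrative assumption about how such a comparison might be made, not the procedure implemented in BioNumerics: the function name, the dictionary representation of a profile, and the rule of skipping loci with a missing call (`None`) are all hypothetical.

```python
def allele_differences(profile_a, profile_b):
    """Count loci called in both profiles whose allele identifiers differ.

    Each profile maps locus name -> allele identifier, with None meaning
    "no call". Loci missing or uncalled in either profile are ignored.
    """
    shared = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


# Made-up profiles for the same isolate sequenced at two centres.
centre_1 = {"locus1": 1, "locus2": 2, "locus3": None}
centre_2 = {"locus1": 1, "locus2": 5, "locus3": 7, "locus4": 3}
print(allele_differences(centre_1, centre_2))
```

How uncalled loci are treated changes the count, so any harmonized protocol would need to fix that rule across centres.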


2017 ◽  
Author(s):  
Yasutsugu Suzuki ◽  
Lionel Frangeul ◽  
Laura B. Dickson ◽  
Hervé Blanc ◽  
Yann Verdier ◽  
...  

Abstract Endogenous viral elements derived from non-retroviral RNA viruses were described in various animal genomes. Whether they have a biological function such as host immune protection against related viruses is a field of intense study. Here, we investigated the repertoire of endogenous flaviviral elements (EFVEs) in Aedes mosquitoes, the vectors of arboviruses such as dengue and chikungunya viruses. Previous studies identified three EFVEs from Ae. albopictus and one from Ae. aegypti cell lines. However, in-depth characterization of EFVEs in wild-type mosquito populations and individuals in vivo has not been performed. We detected the full-length DNA sequence of the previously described EFVEs and their respective transcripts in several Ae. albopictus and Ae. aegypti populations from geographically distinct areas. However, EFVE-derived proteins were not detected by mass spectrometry. Using deep sequencing, we detected the production of piRNA-like small RNAs in antisense orientation, targeting the EFVEs and their flanking regions in vivo. The EFVEs were integrated in repetitive regions of the mosquito genomes, and their flanking sequences varied among mosquito populations from different geographical regions. We bioinformatically predicted several new EFVEs from a Vietnamese Ae. albopictus population and observed variation in the occurrence of those elements among mosquito populations. Phylogenetic analysis of an Ae. aegypti EFVE suggested that it integrated prior to the global expansion of the species and subsequently diverged among and within populations. Together, this study revealed substantial structural and nucleotide diversity of flaviviral integrations in Aedes genomes. Unraveling this diversity will help to elucidate the potential biological function of these EFVEs. Importance Endogenous viral elements (EVEs) are whole or partial viral sequences integrated in host genomes. Interestingly, some EVEs have important functions for host fitness and antiviral defense. Because mosquitoes also have EVEs in their genomes, we decided to thoroughly characterize them to lay the foundation for the potential use of these EVEs to manipulate the mosquito antiviral response. Here, we focused on EVEs related to the Flavivirus genus, to which dengue and Zika viruses belong, in Aedes mosquito individuals from geographically distinct areas. We showed the existence in vivo of flaviviral EVEs previously identified in mosquito cell lines and we detected new ones. We showed that EVEs have evolved differently in each mosquito population. They produced transcripts and small RNAs, but not proteins, suggesting a function at the RNA level. Our study uncovers the diverse repertoire of flaviviral EVEs in Aedes mosquito populations and suggests a role in the host antiviral system.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Flavia Mascagni ◽  
Gabriele Usai ◽  
Andrea Cavallini ◽  
Andrea Porceddu

Abstract We identified and characterized the pseudogene complements of five plant species: four dicots (Arabidopsis thaliana, Vitis vinifera, Populus trichocarpa and Phaseolus vulgaris) and one monocot (Oryza sativa). Retroposition was considered of modest importance for pseudogene formation in all investigated species except V. vinifera, which showed an unusually high number of retro-pseudogenes in non-coding genic regions. By using a pipeline for the classification of sequence duplicates in plant genomes, we compared the relative importance of whole genome, tandem, proximal, transposed and dispersed duplication modes in the pseudo and functional gene complements. Pseudogenes showed higher tendencies than functional genes to genomic dispersion. Dispersed pseudogenes were prevalently fragmented and showed high sequence divergence at flanking regions. On the contrary, those deriving from whole genome duplication were proportionally fewer than expected based on observations on functional loci and showed higher levels of flanking sequence conservation than dispersed pseudogenes. Pseudogenes deriving from tandem and proximal duplications were in excess compared to functional loci, probably reflecting the high evolutionary rate associated with these duplication modes in plant genomes. These data are compatible with high rates of sequence turnover at neutral sites and with duplication mechanisms mediated by double-strand break repair.


2021 ◽  
Vol 22 (5) ◽  
pp. 2409
Author(s):  
Anastasia A. Bizyaeva ◽  
Dmitry A. Bunin ◽  
Valeria L. Moiseenko ◽  
Alexandra S. Gambaryan ◽  
Sonja Balk ◽  
...  

Nucleic acid aptamers are generally accepted as promising elements for the specific and high-affinity binding of various biomolecules. It has been shown for a number of aptamers that the complexes with several related proteins may possess a similar affinity. An outstanding example is the G-quadruplex DNA aptamer RHA0385, which binds to the hemagglutinins of various influenza A virus strains. These hemagglutinins have homologous tertiary structures but moderate-to-low amino acid sequence identities. Here, the experiment was inverted, targeting the same protein using a set of related, parallel G-quadruplexes. The 5′- and 3′-flanking sequences of RHA0385 were truncated to yield a parallel G-quadruplex with three propeller loops that were 7, 1, and 1 nucleotides in length. Next, a set of minimal, parallel G-quadruplexes with three single-nucleotide loops was tested. These G-quadruplexes were characterized both structurally and functionally. All parallel G-quadruplexes had affinities for both recombinant hemagglutinin and influenza virions. In summary, the parallel G-quadruplex represents a minimal core structure with functional activity that binds influenza A hemagglutinin. The flanking sequences and loops represent additional features that can be used to modulate the affinity. Thus, the RHA0385–hemagglutinin complex serves as an excellent example of the hypothesis of a core structure that is decorated with additional recognizing elements capable of improving the binding properties of the aptamer.


Genetics ◽  
1991 ◽  
Vol 129 (4) ◽  
pp. 1021-1032 ◽  
Author(s):  
M J Mahan ◽  
J R Roth

Abstract Homologous recombination between sequences present in inverse order within the same chromosome can result in inversion formation. We have previously shown that inverse order sequences at some sites (permissive) recombine to generate the expected inversion; no inversions are found when the same inverse order sequences flank other (nonpermissive) regions of the chromosome. In hopes of defining how permissive and nonpermissive intervals are determined, we have constructed a strain that carries a large chromosomal inversion. Using this inversion mutant as the parent strain, we have determined the "permissivity" of a series of chromosomal sites for secondary inversions. For the set of intervals tested, permissivity seems to be dictated by the nature of the genetic material present within the chromosomal interval being tested rather than the flanking sequences or orientation of this material in the chromosome. Almost all permissive intervals include the origin or terminus of replication. We suggest that the rules for recovery of inversions reflect mechanistic restrictions on the occurrence of inversions rather than lethal consequences of the completed rearrangement.


GigaScience ◽  
2021 ◽  
Vol 10 (5) ◽  
Author(s):  
Colin Farrell ◽  
Michael Thompson ◽  
Anela Tosevska ◽  
Adewale Oyetunde ◽  
Matteo Pellegrini

Abstract Background Bisulfite sequencing is commonly used to measure DNA methylation. Processing bisulfite sequencing data is often challenging owing to the computational demands of mapping a low-complexity, asymmetrical library and the lack of a unified processing toolset to produce an analysis-ready methylation matrix from read alignments. To address these shortcomings, we have developed BiSulfite Bolt (BSBolt), a fast and scalable bisulfite sequencing analysis platform. BSBolt performs a pre-alignment sequencing read assessment step to improve efficiency when handling asymmetrical bisulfite sequencing libraries. Findings We evaluated BSBolt against simulated and real bisulfite sequencing libraries. We found that BSBolt provides accurate and fast bisulfite sequencing alignments and methylation calls. We also compared BSBolt to several existing bisulfite alignment tools and found BSBolt outperforms Bismark, BSSeeker2, BISCUIT, and BWA-Meth based on alignment accuracy and methylation calling accuracy. Conclusion BSBolt offers streamlined processing of bisulfite sequencing data through an integrated toolset that offers support for simulation, alignment, methylation calling, and data aggregation. BSBolt is implemented as a Python package and command line utility for flexibility when building informatics pipelines. BSBolt is available at https://github.com/NuttyLogic/BSBolt under an MIT license.


Trials ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Mojca Jensterle ◽  
Simona Ferjan ◽  
Tadej Battelino ◽  
Jernej Kovač ◽  
Saba Battelino ◽  
...  

Abstract Background Preclinical studies demonstrated that glucagon-like peptide 1 (GLP-1) is locally synthesized in taste bud cells and that GLP-1 receptor exists on the gustatory nerves in close proximity to GLP-1-containing taste bud cells. This local paracrine GLP-1 signalling seems to be specifically involved in the perception of sweets. However, the role of GLP-1 in taste perception remains largely unaddressed in clinical studies. Whether any weight-reducing effects of GLP-1 receptor agonists are mediated through the modulation of taste perception is currently unknown. Methods and analysis This is an investigator-initiated, randomized, single-blind, placebo-controlled clinical trial. We will enrol 30 women with obesity and polycystic ovary syndrome (PCOS). Participants will be randomized in a 1:1 ratio to either semaglutide 1.0 mg or placebo for 16 weeks. The primary endpoints are alteration of transcriptomic profile of tongue tissue as changes in expression level from baseline to follow-up after 16 weeks of treatment, measured by RNA sequencing, and change in taste sensitivity as detected by chemical gustometry. Secondary endpoints include change in neural response to visual food cues and to sweet-tasting substances as assessed by functional MRI, change in body weight, change in fat mass and change in eating behaviour and food intake. Discussion This is the first study to investigate the role of semaglutide on taste perception, along with a neural response to visual food cues in reward processing regions. The study may identify the tongue and the taste perception as a novel target for GLP-1 receptor agonists. Ethics and dissemination The study has been approved by the Slovene National Medical Ethics Committee and will be conducted in accordance with the Declaration of Helsinki and Good Clinical Practice guidelines. Results will be submitted for publication in an international peer-reviewed scientific journal. Trial registration ClinicalTrials.gov NCT04263415. Retrospectively registered on 10 February 2020.


Author(s):  
Christie M. Ballantyne ◽  
Harold Bays ◽  
Alberico L. Catapano ◽  
Anne Goldberg ◽  
Kausik K. Ray ◽  
...  

A Correction to this paper has been published: 10.1007/s10557-021-07188-w

