Whole genome sequencing of Borrelia miyamotoi isolate Izh-4: reference for a complex bacterial genome

Abstract Background: The genus Borrelia comprises spirochaetal bacteria maintained in natural transmission cycles by tick vectors and vertebrate reservoir hosts. The main groups are a species complex that includes the causative agents of Lyme borreliosis, and the relapsing fever group Borrelia. Borrelia miyamotoi belongs to the relapsing fever group of spirochetes and forms distinct populations in North America, Asia, and Europe. Like all Borrelia species, B. miyamotoi possesses an unusual and complex genome consisting of a linear chromosome and a number of linear and circular plasmids. The species is considered an emerging human pathogen, and an increasing number of human cases are being described in the Northern hemisphere. The aim of this study was to produce a high-quality reference genome that will facilitate future studies into genetic differences between populations and the genome plasticity of B. miyamotoi. Results: We used multiple available sequencing methods, including Pacific Biosciences single-molecule real-time technology (SMRT) and Oxford Nanopore technology (ONT) supplemented with highly accurate Illumina sequences, to explore their suitability for whole-genome assembly of the Russian B. miyamotoi isolate Izh-4. Plasmids were typed according to their potential plasmid partitioning genes (PF32, 49, 50, 57/62). Comparing and combining results of both long-read (SMRT and ONT) and short-read (Illumina) methods, we determined that the genome of isolate Izh-4 consisted of one linear chromosome, 12 linear plasmids, and two circular plasmids. Whilst the majority of plasmids had corresponding contigs in the Asian B. miyamotoi isolate FR64b, only four matched plasmids of the North American isolate CT13-2396, indicating differences between B. miyamotoi populations. Several plasmids, e.g. lp41, lp29, lp23, and lp24, were found to carry variable major proteins. Among these were variable large proteins (Vlp) of subtypes Vlp-α, Vlp-γ, Vlp-δ, and Vlp-β. Phylogenetic analysis of common plasmid types showed the uniqueness of Russian/Asian isolates of B. miyamotoi compared to other isolates. Conclusions: We here describe the genome of a Russian B. miyamotoi clinical isolate, providing a solid basis for future comparative genomics of B. miyamotoi isolates. This will be a great impetus for further basic, molecular, and epidemiological research on this emerging tick-borne pathogen.

Whole genome sequencing of Borrelia miyamotoi isolate Izh-4: reference for a complex bacterial genome

DOI: 10.21203/rs.2.16381/v2
Year: 2019
Author(s): Konstantin V. Kuleshov, Gabriele Margos, Volker Fingerle, Joris Koetsveld, Irina A. Goptar, ...
Keyword(s): Single Molecule, Bacterial Genome, Borrelia Miyamotoi, Whole Genome, Relapsing Fever, Linear Chromosome, The North, Long Read, Variable Major Proteins, Tick Vectors

Abstract Background: The genus Borrelia comprises spirochaetal bacteria maintained in natural transmission cycles by tick vectors and vertebrate reservoir hosts. The main groups are a species complex that includes the causative agents of Lyme borreliosis, and the relapsing fever group Borrelia. Borrelia miyamotoi belongs to the relapsing fever group of spirochetes and forms distinct populations in North America, Asia, and Europe. Like all Borrelia species, B. miyamotoi possesses an unusual and complex genome consisting of a linear chromosome and a number of linear and circular plasmids. The species is considered an emerging human pathogen, and an increasing number of human cases are being described in the Northern hemisphere. The aim of this study was to produce a high-quality reference genome that will facilitate future studies into genetic differences between populations and the genome plasticity of B. miyamotoi. Results: We used multiple available sequencing methods, including Pacific Biosciences single-molecule real-time technology (SMRT) and Oxford Nanopore technology (ONT) supplemented with highly accurate Illumina sequences, to explore their suitability for whole-genome assembly of the Russian B. miyamotoi isolate Izh-4. Plasmids were typed according to their potential plasmid partitioning genes (PF32, 49, 50, 57/62). Comparing and combining results of both long-read (SMRT and ONT) and short-read (Illumina) methods, we determined that the genome of isolate Izh-4 consisted of one linear chromosome, 12 linear plasmids, and two circular plasmids. Whilst the majority of plasmids had corresponding contigs in the Asian B. miyamotoi isolate FR64b, only four matched plasmids of the North American isolate CT13-2396, indicating differences between B. miyamotoi populations. Several plasmids, e.g. lp41, lp29, lp23, and lp24, were found to carry variable major proteins. Among these were variable large proteins (Vlp) of subtypes Vlp-α, Vlp-γ, Vlp-δ, and Vlp-β. Phylogenetic analysis of common plasmid types showed the uniqueness of Russian/Asian isolates of B. miyamotoi compared to other isolates. Conclusions: We here describe the genome of a Russian B. miyamotoi clinical isolate, providing a solid basis for future comparative genomics of B. miyamotoi isolates. This will be a great impetus for further basic, molecular, and epidemiological research on this emerging tick-borne pathogen.

Whole genome sequencing of Borrelia miyamotoi isolate Izh-4: reference for a complex bacterial genome

DOI: 10.21203/rs.2.16381/v1
Year: 2019
Author(s): Konstantin V. Kuleshov, Gabriele Margos, Volker Fingerle, Joris Koetsveld, Irina A. Goptar, ...
Keyword(s): Single Molecule, Bacterial Genome, Borrelia Miyamotoi, Whole Genome, Relapsing Fever, Linear Chromosome, The North, Long Read, Variable Major Proteins, Tick Vectors

Abstract Background: The genus Borrelia comprises spirochaetal bacteria maintained in natural transmission cycles by tick vectors and vertebrate reservoir hosts. The main groups are a species complex that includes the causative agents of Lyme borreliosis, and the relapsing fever group Borrelia. Borrelia miyamotoi belongs to the relapsing fever group of spirochetes and forms distinct populations in North America, Asia, and Europe. Like all Borrelia species, B. miyamotoi possesses an unusual and complex genome consisting of a linear chromosome and a number of linear and circular plasmids. The species is considered a relatively new human pathogen, and an increasing number of human cases are being described in the Northern hemisphere. The aim of this study was to produce a high-quality reference genome that will facilitate future studies into genetic differences between populations and the genome plasticity of B. miyamotoi. Results: We used multiple available sequencing methods, including Pacific Biosciences single-molecule real-time technology (SMRT) and Oxford Nanopore technology (ONT) supplemented with highly accurate Illumina sequences, to explore their suitability for whole-genome assembly of the Russian B. miyamotoi isolate Izh-4. Plasmids were typed according to their potential plasmid partitioning genes (PF32, 49, 50, 57/62). Comparing and combining results of both long-read methods (SMRT and ONT), we determined that the genome of isolate Izh-4 consisted of one linear chromosome, 12 linear plasmids, and two circular plasmids. Whilst the majority of plasmids had corresponding assembly fragments in the Asian B. miyamotoi isolate FR64b, only four matched plasmids of the North American isolate CT13-2396, indicating differences between B. miyamotoi populations. Several plasmids, e.g. lp41, lp29, lp23, and lp24, were found to carry variable major proteins. Among these were variable large proteins (Vlp) of subtypes Vlp-α, Vlp-γ, Vlp-δ, and Vlp-β. Phylogenetic analysis of common plasmid types showed the uniqueness of Russian/Asian isolates of B. miyamotoi compared to other isolates. Conclusions: We here describe the genome of a Russian B. miyamotoi clinical isolate, providing a solid basis for future comparative genomics of B. miyamotoi isolates. This will be a great impetus for further basic, molecular, and epidemiological research on this emerging tick-borne pathogen.

Cas9-Assisted Targeting of CHromosome segments (CATCH) for targeted nanopore sequencing and optical genome mapping

DOI: 10.1101/110163
Year: 2017
Cited By: ~ 5
Author(s): Tslil Gabrieli, Hila Sharim, Yael Michaeli, Yuval Ebenstein
Keyword(s): Single Molecule, Genome Mapping, Single Point, Read Length, Whole Genome, Sequencing Analysis, Nanopore Sequencing, Sequencing Data, Whole Genome Analysis, Long Read

Abstract Variations in the genetic code, from single point mutations to large structural or copy number alterations, influence susceptibility, onset, and progression of genetic diseases and tumor transformation. Next-generation sequencing analysis is unable to reliably capture aberrations larger than the typical sequencing read length of several hundred bases. Long-read, single-molecule sequencing methods such as SMRT and nanopore sequencing can address larger variations, but require costly whole genome analysis. Here we describe a method for isolation and enrichment of a large genomic region of interest for targeted analysis based on Cas9 excision of two sites flanking the target region and isolation of the excised DNA segment by pulsed field gel electrophoresis. The isolated target remains intact and is ideally suited for optical genome mapping and long-read sequencing at high coverage. In addition, analysis is performed directly on native genomic DNA that retains genetic and epigenetic composition without amplification bias. This method enables detection of mutations and structural variants as well as detailed analysis by generation of hybrid scaffolds composed of optical maps and sequencing data at a fraction of the cost of whole genome sequencing.
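The excision step described in this abstract can be illustrated in silico. The following is a minimal sketch, not the authors' protocol: it assumes both guides target the forward strand, that each 20-nt protospacer is immediately followed by an NGG PAM, and that Cas9 cuts 3 bp upstream of the PAM. The function names and the toy sequence are hypothetical.

```python
import re


def cut_position(genome: str, protospacer: str) -> int:
    """Return the index of the first base after the Cas9 cut.

    Assumption: forward-strand target only, protospacer directly
    followed by an NGG PAM, blunt cut 3 bp upstream of the PAM.
    """
    match = re.search(re.escape(protospacer) + "[ACGT]GG", genome)
    if match is None:
        raise ValueError(f"no protospacer+PAM site found for {protospacer}")
    return match.start() + len(protospacer) - 3


def excise_target(genome: str, left_guide: str, right_guide: str) -> str:
    """Return the segment released by cutting at both flanking sites."""
    left_cut = cut_position(genome, left_guide)
    right_cut = cut_position(genome, right_guide)
    if left_cut >= right_cut:
        raise ValueError("left guide must cut upstream of right guide")
    return genome[left_cut:right_cut]


# Toy example (made-up sequence, not real genomic data).
left_guide = "ACGTACGTACGTACGTACGT"
right_guide = "GATTACAGATTACAGATTAC"
toy_genome = "TTTT" + left_guide + "AGG" + "C" * 10 + right_guide + "TGG" + "AAAA"
fragment = excise_target(toy_genome, left_guide, right_guide)
print(len(fragment), fragment)
```

A real design tool would also scan the reverse strand and check off-target sites; this sketch only shows how two flanking cut positions define the excised segment.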

High contiguity de novo genome assembly and DNA modification analyses for the fungus fly, Sciara coprophila, using single-molecule sequencing

Journal: BMC Genomics, Vol 22 (1)
DOI: 10.1186/s12864-021-07926-2
Year: 2021
Author(s): John M. Urban, Michael S. Foulk, Jacob E. Bliss, C. Michelle Coleman, Nanyan Lu, ...
Keyword(s): Genome Sequence, Single Molecule, De Novo, Bacterial Genome, Draft Genome, Dna Amplification, Chromosome Elimination, Paternal Chromosome, De Novo Genome Assembly, Long Read

Abstract Background: The lower Dipteran fungus fly, Sciara coprophila, has many unique biological features that challenge the rule of genome DNA constancy. For example, Sciara undergoes paternal chromosome elimination and maternal X chromosome nondisjunction during spermatogenesis, paternal X elimination during embryogenesis, intrachromosomal DNA amplification of DNA puff loci during larval development, and germline-limited chromosome elimination from all somatic cells. Paternal chromosome elimination in Sciara was the first observation of imprinting, though the mechanism remains a mystery. Here, we present the first draft genome sequence for Sciara coprophila to take a large step forward in addressing these features. Results: We assembled the Sciara genome using PacBio, Nanopore, and Illumina sequencing. To find an optimal assembly using these datasets, we generated 44 short-read and 50 long-read assemblies. We ranked assemblies using 27 metrics assessing contiguity, gene content, and dataset concordance. The highest-ranking assemblies were scaffolded using BioNano optical maps. RNA-seq datasets from multiple life stages and both sexes facilitated genome annotation. A set of 66 metrics was used to select the first draft assembly for Sciara. Nearly half of the Sciara genome sequence was anchored into chromosomes, and all scaffolds were classified as X-linked or autosomal by coverage. Conclusions: We determined that X-linked genes in Sciara males undergo dosage compensation. An entire bacterial genome from the Rickettsia genus, a group known to be endosymbionts in insects, was co-assembled with the Sciara genome, opening the possibility that Rickettsia may function in sex determination in Sciara. Finally, the signal level of the PacBio and Nanopore data supports the presence of cytosine and adenine modifications in the Sciara genome, consistent with a possible role in imprinting.
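Ranking many candidate assemblies across several metrics, as this abstract describes, can be sketched with a simple mean-rank aggregation. The abstract does not state how the authors combined their 27 metrics, so the scheme below is an assumption chosen for illustration, and the assembly names, metric names, and values are made up.

```python
from statistics import mean


def rank_assemblies(scores, higher_is_better):
    """Order assemblies by their mean rank across all metrics.

    scores: {assembly_name: {metric_name: value}}
    higher_is_better: {metric_name: bool}
    Ties are not handled specially; tied assemblies keep input order.
    """
    names = list(scores)
    ranks = {name: [] for name in names}
    for metric, prefer_high in higher_is_better.items():
        ordered = sorted(names, key=lambda n: scores[n][metric], reverse=prefer_high)
        for position, name in enumerate(ordered, start=1):
            ranks[name].append(position)
    mean_rank = {name: mean(values) for name, values in ranks.items()}
    return sorted(names, key=lambda n: mean_rank[n]), mean_rank


# Hypothetical metrics for three hypothetical assemblies.
scores = {
    "asm_a": {"contig_n50": 500, "busco_complete": 90.0, "misassemblies": 12},
    "asm_b": {"contig_n50": 800, "busco_complete": 95.0, "misassemblies": 7},
    "asm_c": {"contig_n50": 300, "busco_complete": 92.0, "misassemblies": 20},
}
higher_is_better = {"contig_n50": True, "busco_complete": True, "misassemblies": False}

order, mean_rank = rank_assemblies(scores, higher_is_better)
print(order)
```

Mean rank treats every metric equally and ignores the size of the differences between assemblies; a weighted or score-based scheme would be a reasonable alternative.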

Chromosome and Plasmids of the Tick-Borne Relapsing Fever Agent Borrelia hermsii

Journal: Genome Announcements, Vol 4 (3)
DOI: 10.1128/genomea.00528-16
Year: 2016
Cited By: ~ 5
Author(s): Alan G. Barbour
Keyword(s): Next Generation Sequencing, Linear Plasmids, Relapsing Fever, Borrelia Hermsii, Linear Chromosome, Short Read, Zoonotic Pathogen, Long Read, Complete Sequences, Generation Sequencing

The zoonotic pathogen Borrelia hermsii bears its multiple paralogous genes for variable antigens on several linear plasmids. Application of combined long-read and short-read next-generation sequencing provided complete sequences for antigen-encoding plasmids as well as other linear and circular plasmids and the linear chromosome of the genome.

DNA methylation-calling tools for Oxford Nanopore sequencing: a survey and human epigenome-wide evaluation

Journal: Genome Biology, Vol 22 (1)
DOI: 10.1186/s13059-021-02510-z
Year: 2021
Author(s): Yang Liu, Wojciech Rosikiewicz, Ziwei Pan, Nathaniel Jillette, Ping Wang, ...
Keyword(s): Dna Methylation, Single Molecule, Evaluation Criteria, Systematic Evaluation, Whole Genome, Nanopore Sequencing, Sequencing Data, Long Read, Genome Scale, Analytical Tools

Abstract Background: Nanopore long-read sequencing technology greatly expands the capacity of long-range, single-molecule DNA-modification detection. A growing number of analytical tools have been developed to detect DNA methylation from nanopore sequencing reads. Here, we assess the performance of different methylation-calling tools to provide a systematic evaluation to guide researchers performing human epigenome-wide studies. Results: We compare seven analytic tools for detecting DNA methylation from nanopore long-read sequencing data generated from human natural DNA at a whole-genome scale. We evaluate the per-read and per-site performance of CpG methylation prediction across different genomic contexts, CpG site coverage, and computational resources consumed by each tool. The seven tools exhibit different performances across the evaluation criteria. We show that methylation prediction at regions with discordant DNA methylation patterns, intergenic regions, low CG density regions, and repetitive regions has room for improvement across all tools. Furthermore, we demonstrate that 5hmC levels at least partly contribute to the discrepancy between bisulfite and nanopore sequencing. Lastly, we provide an online DNA methylation database (https://nanome.jax.org) to display the DNA methylation levels detected by nanopore sequencing and bisulfite sequencing data across different genomic contexts. Conclusions: Our study is the first systematic benchmark of computational methods for detection of mammalian whole-genome DNA modifications in nanopore sequencing. We provide a broad foundation for cross-platform standardization and an evaluation of analytical tools designed for genome-scale modified base detection using nanopore sequencing.

Longshot enables accurate variant calling in diploid genomes from single-molecule long read sequencing

Journal: Nature Communications, Vol 10 (1)
DOI: 10.1038/s41467-019-12493-y
Year: 2019
Cited By: ~ 26
Author(s): Peter Edge, Vikas Bansal
Keyword(s): Single Molecule, Variant Calling, Small Scale, Whole Genome, Limited Information, Single Nucleotide Variants, Pacific Biosciences, Sequencing Technologies, Long Reads, Long Read

Abstract Whole-genome sequencing using sequencing technologies such as Illumina enables the accurate detection of small-scale variants but provides limited information about haplotypes and variants in repetitive regions of the human genome. Single-molecule sequencing (SMS) technologies such as Pacific Biosciences and Oxford Nanopore generate long reads that can potentially address the limitations of short-read sequencing. However, the high error rate of SMS reads makes it challenging to detect small-scale variants in diploid genomes. We introduce a variant calling method, Longshot, which leverages the haplotype information present in SMS reads to accurately detect and phase single-nucleotide variants (SNVs) in diploid genomes. We demonstrate that Longshot achieves very high accuracy for SNV detection using whole-genome Pacific Biosciences data, outperforms existing variant calling methods, and enables variant detection in duplicated regions of the genome that cannot be mapped using short reads.

Complete Genome Sequence of the African Strain AXO1947 of Xanthomonas oryzae pv. oryzae

Journal: Genome Announcements, Vol 4 (1)
DOI: 10.1128/genomea.01730-15
Year: 2016
Cited By: ~ 8
Author(s): J. C. Huguet-Tapia, Z. Peng, B. Yang, Z. Yin, S. Liu, ...
Keyword(s): Single Molecule, Bacterial Genome, Xanthomonas Oryzae, Sequencing Technology, Tal Effectors, Single Chromosome, Long Read, Target Characterization, African Clade

Xanthomonas oryzae pv. oryzae is the etiological agent of bacterial rice blight. Three distinct clades of X. oryzae pv. oryzae are known. We present the complete annotated genome of the African clade strain AXO1947 using long-read single-molecule PacBio sequencing technology. The genome comprises a single chromosome of 4,674,975 bp and encodes nine transcriptional activator-like (TAL) effectors. The approach and data presented in this announcement provide information for complex bacterial genome organization and the discovery of new virulence effectors, and they facilitate target characterization of TAL effectors.

Long-read whole genome sequencing and comparative analysis of six strains of the human pathogen Orientia tsutsugamushi

DOI: 10.1101/280958
Year: 2018
Cited By: ~ 2
Author(s): Elizabeth M. Batty, Suwittra Chaemchuen, Stuart D. Blacksell, Daniel Paris, Rory Bowden, ...
Keyword(s): Comparative Genomics, Single Molecule, Genomic Variation, Intracellular Bacteria, Whole Genome, Orientia Tsutsugamushi, Bacterial Genomes, Obligate Intracellular, Long Read, Obligate Intracellular Bacteria

Abstract Background: Orientia tsutsugamushi is a clinically important but neglected obligate intracellular bacterial pathogen of the Rickettsiaceae family that causes the potentially life-threatening human disease scrub typhus. In contrast to the genome reduction seen in many obligate intracellular bacteria, early genetic studies of Orientia have revealed one of the most repetitive bacterial genomes sequenced to date. The dramatic expansion of mobile elements has hampered efforts to generate complete genome sequences using short read sequencing methodologies, and consequently there have been few studies of the comparative genomics of this neglected species. Results: We report new high-quality genomes of Orientia tsutsugamushi, generated using PacBio single molecule long read sequencing, for six strains: Karp, Kato, Gilliam, TA686, UT76 and UT176. In comparative genomics analyses of these strains together with existing reference genomes from Ikeda and Boryong strains, we identify a relatively small core genome of 657 genes, grouped into core gene ‘islands’ and separated by repeat regions, and use the core genes to infer the first whole-genome phylogeny of Orientia. Conclusions: Complete assemblies of multiple Orientia genomes verify initial suggestions that these are remarkable organisms. They have large genomes with widespread amplification of repeat elements and massive chromosomal rearrangements between strains. At the gene level, Orientia has a relatively small set of universally conserved genes, similar to other obligate intracellular bacteria, and the relative expansion in genome size can be accounted for by gene duplication and repeat amplification. Our study demonstrates the utility of long read sequencing to investigate complex bacterial genomes and characterise genomic variation.
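The notion of a core genome used in this abstract, the genes shared by every strain, reduces to a set intersection once genes have been grouped into ortholog clusters. The sketch below assumes that clustering has already been done; it is not the authors' pipeline, and the ortholog IDs and gene contents are invented for illustration (only the strain names come from the abstract).

```python
def core_genome(gene_content):
    """Return the ortholog groups present in every strain.

    gene_content: {strain_name: iterable of ortholog-group IDs}
    """
    gene_sets = [set(genes) for genes in gene_content.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Toy input: hypothetical ortholog-group IDs per strain.
gene_content = {
    "Karp": {"og1", "og2", "og3", "og5"},
    "Kato": {"og1", "og2", "og4", "og5"},
    "Gilliam": {"og1", "og2", "og3", "og4"},
}

core = core_genome(gene_content)
print(sorted(core))
```

With real data the result depends heavily on how orthologs are called, which is especially delicate in a genome as repetitive as the one described here.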

Long-read, whole genome shotgun sequence data for five model organisms

DOI: 10.1101/008037
Year: 2014
Cited By: ~ 2
Author(s): Kristi E Kim, Paul Peluso, Primo Baybayan, Patricia Jane Yeadon, Charles Yu, ...
Keyword(s): Single Molecule, De Novo, Sequence Data, Genome Structure, Model Systems, Model Organisms, Biological Research, Whole Genome, De Novo Genome Assembly, Long Read

Single molecule, real-time (SMRT) sequencing from Pacific Biosciences is increasingly used in many areas of biological research including de novo genome assembly, structural-variant identification, haplotype phasing, mRNA isoform discovery, and base-modification analyses. High-quality, public datasets of SMRT sequences can spur development of analytic tools that can accommodate unique characteristics of SMRT data (long read lengths, lack of GC or amplification bias, and a random error profile leading to high consensus accuracy). In this paper, we describe eight high-coverage SMRT sequence datasets from five organisms (Escherichia coli, Saccharomyces cerevisiae, Neurospora crassa, Arabidopsis thaliana, and Drosophila melanogaster) that have been publicly released to the general scientific community (NCBI Sequence Read Archive ID SRP040522). Data were generated using two sequencing chemistries (P4-C2 and P5-C3) on the PacBio RS II instrument. The datasets reported here can be used without restriction by the research community to generate whole-genome assemblies, test new algorithms, investigate genome structure and evolution, and identify base modifications in some of the most widely studied model systems in biological research.
