Complete Genome Sequence of the Wolbachia wAlbB Endosymbiont of Aedes albopictus
Abstract
Wolbachia, an alpha-proteobacterium closely related to Rickettsia, is a maternally transmitted, intracellular symbiont of arthropods and nematodes. Aedes albopictus mosquitoes are naturally infected with Wolbachia strains wAlbA and wAlbB. The cell line Aa23, established from Ae. albopictus embryos, retains only wAlbB and is a key model for studying host-endosymbiont interactions. We have assembled the complete circular genome of wAlbB from the Aa23 cell line using long-read PacBio sequencing at 500X median coverage. The assembled circular chromosome is 1.48 megabases in size, an increase of more than 300 kb over the published draft wAlbB genome. Annotation of the genome identified 1,205 protein-coding genes, 34 tRNA, 3 rRNA, 1 tmRNA and 3 other ncRNA loci. The long reads enabled sequencing across complex repeat regions that are difficult to resolve with short-read sequencing. Thirteen percent of the genome consists of IS elements distributed throughout the genome, some of which cause pseudogenization. Prophage WO genes encoding some essential components of phage particle assembly are missing, while the remainder are scattered around the genome. Orthology analysis identified a core proteome of 536 orthogroups across all completed Wolbachia genomes. The majority of proteins could be annotated using Pfam and eggNOG analyses, including ankyrins and components of the T4SS. KEGG analysis revealed the absence of 5 genes in wAlbB that are present in other Wolbachia. The availability of a complete circular chromosome from wAlbB will enable further biochemical, molecular and genetic analyses of this strain and related Wolbachia.

Data deposition
Raw data from PacBio sequencing have been deposited in the NCBI SRA database under BioProject accession number PRJNA454708, as runs SRR7784284, SRR7784285, SRR7784286 and SRR7784287. The paired-end reads from the Illumina library used for indel correction are available from the NCBI SRA database under accession SRR7623731.
The assembled genome and annotations have been submitted to the NCBI GenBank database with accession number CP031221.
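
As an illustration of how the deposited assembly might be retrieved and its length checked, here is a minimal sketch that uses the public NCBI E-utilities EFetch endpoint. The helper names (`efetch_url`, `fasta_length`, `download_length`) are hypothetical and are not part of the original work. The sketch assumes that accession CP031221 resolves in the `nuccore` database and that network access is available.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities EFetch endpoint.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="fasta"):
    """Build an EFetch URL for a nucleotide record (hypothetical helper)."""
    params = {
        "db": "nuccore",      # assumption: the record is served from nuccore
        "id": accession,
        "rettype": rettype,
        "retmode": "text",
    }
    return EFETCH_BASE + "?" + urlencode(params)


def fasta_length(fasta_text):
    """Count sequence characters in FASTA text, skipping header lines."""
    return sum(
        len(line.strip())
        for line in fasta_text.splitlines()
        if not line.startswith(">")
    )


def download_length(accession):
    """Fetch a record as FASTA and return its total sequence length."""
    with urlopen(efetch_url(accession)) as response:
        return fasta_length(response.read().decode("utf-8"))
```

Calling `download_length("CP031221")` should return a value close to the 1.48 megabases reported above if the record is retrieved as expected. NCBI rate-limits unauthenticated requests, so repeated calls may need an API key.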