UK circulating strains of human parainfluenza 3: an amplicon based next generation sequencing method and phylogenetic analysis

Background: Human parainfluenza viruses type 3 (HPIV3) are a prominent cause of respiratory infection with a significant impact in both pediatric and transplant patient cohorts. Currently there is a paucity of whole genome sequence data that would allow for detailed epidemiological and phylogenetic analysis of circulating strains in the UK. Although it is known that HPIV3 peaks annually in the UK, to date there are no whole genome sequences of HPIV3 UK strains available. Methods: Clinical strains were obtained from HPIV3 positive respiratory patient samples collected between 2011 and 2015. These were then amplified using an amplicon based method, sequenced on the Illumina platform and assembled using a new robust bioinformatics pipeline. Phylogenetic analysis was carried out in the context of other epidemiological studies and whole genome sequence data currently available with stringent exclusion of significantly culture-adapted strains of HPIV3. Results: In the current paper we have presented twenty full genome sequences of UK circulating strains of HPIV3 and a detailed phylogenetic analysis thereof. We have analysed the variability along the HPIV3 genome and identified a short hypervariable region in the non-coding segment between the M (matrix) and F (fusion) genes. The epidemiological classifications obtained by using this region and whole genome data were then compared and found to be identical. Conclusions: The majority of HPIV3 strains were observed at different geographical locations and with a wide temporal spread, reflecting the global distribution of HPIV3. Consistent with previous data, a particular subcluster or strain was not identified as specific to the UK, suggesting that a number of genetically diverse strains circulate at any one time. A small hypervariable region in the HPIV3 genome was identified and it was shown that, in the absence of full genome data, this region could be used for epidemiological surveillance of HPIV3.
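The search for a hypervariable region described in this abstract can be illustrated with a minimal sketch. The abstract does not state which variability measure or window size the authors used, so everything here is an assumption: the per-column score (fraction of sequences that differ from the most common base), the fixed-width sliding window, and the function names `column_variability` and `most_variable_window` are all hypothetical.

```python
from collections import Counter


def column_variability(alignment):
    """Score each alignment column as 1 - (frequency of the most common base).

    `alignment` is a list of equal-length aligned sequences (assumed input).
    Gaps ('-') and ambiguous bases ('N') are ignored within a column.
    """
    scores = []
    for column in zip(*alignment):
        bases = [b for b in column if b not in "-N"]
        if not bases:
            scores.append(0.0)
            continue
        top_count = Counter(bases).most_common(1)[0][1]
        scores.append(1 - top_count / len(bases))
    return scores


def most_variable_window(alignment, window):
    """Return (start, mean_score) of the first window with the highest mean variability."""
    scores = column_variability(alignment)
    best_start, best_mean = 0, -1.0
    for start in range(len(scores) - window + 1):
        window_mean = sum(scores[start:start + window]) / window
        if window_mean > best_mean:
            best_start, best_mean = start, window_mean
    return best_start, best_mean


# Toy alignment: only columns 4 and 5 differ between the three sequences.
toy = ["ACGTACGT", "ACGTTGGT", "ACGTCAGT"]
start, score = most_variable_window(toy, window=2)
```

On real data the alignment would come from the assembled genomes, and the window position would be mapped back to genome coordinates to check whether it falls between the M and F genes.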


Wellcome Open Research ◽

10.12688/wellcomeopenres.14730.2 ◽

2018 ◽

Vol 3 ◽

pp. 118 ◽

Cited By ~ 2

Author(s):

Anna Smielewska ◽

Edward Emmott ◽

Kyriaki Ranellou ◽

Ashley Popay ◽

Ian Goodfellow ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Genome Sequence ◽

Sequence Data ◽

Hypervariable Region ◽

Whole Genome Sequence ◽

Whole Genome ◽

Full Genome ◽

Genome Sequences ◽

Genome Data ◽

The UK

Taxonomic revision of Harveyi clade bacteria (family Vibrionaceae) based on analysis of whole genome sequences

INTERNATIONAL JOURNAL OF SYSTEMATIC AND EVOLUTIONARY MICROBIOLOGY ◽

10.1099/ijs.0.051110-0 ◽

2013 ◽

Vol 63 (Pt_7) ◽

pp. 2742-2751 ◽

Cited By ~ 38

Author(s):

Henryk Urbanczyk ◽

Yoshitoshi Ogura ◽

Tetsuya Hayashi

Keyword(s):

Genome Sequence ◽

Sequence Data ◽

Draft Genome ◽

Taxonomic Revision ◽

Whole Genome Sequence ◽

Whole Genome ◽

Genome Sequences ◽

Genome Sequence Data ◽

The Family

Use of inadequate methods for classification of bacteria in the so-called Harveyi clade (family Vibrionaceae, Gammaproteobacteria) has led to incorrect assignment of strains and proliferation of synonymous species. In order to resolve taxonomic ambiguities within the Harveyi clade and to test usefulness of whole genome sequence data for classification of Vibrionaceae, draft genome sequences of 12 strains were determined and analysed. The sequencing included type strains of seven species: Vibrio sagamiensis NBRC 104589T, Vibrio azureus NBRC 104587T, Vibrio harveyi NBRC 15634T, Vibrio rotiferianus LMG 21460T, Vibrio campbellii NBRC 15631T, Vibrio jasicida LMG 25398T, and Vibrio owensii LMG 25443T. Draft genome sequences of strain LMG 25430, previously designated the type strain of [Vibrio communis], and two strains (MWB 21 and 090810c) from the ‘beijerinckii’ lineage were also determined. Whole genomes of two additional strains (ATCC 25919 and 200612B) that previously could not be assigned to any Harveyi clade species were also sequenced. Analysis of the genome sequence data revealed a clear case of synonymy between V. owensii and [V. communis], confirming an earlier proposal to synonymize both species. Both strains from the ‘beijerinckii’ lineage were classified as V. jasicida, while the strains ATCC 25919 and 200612B were classified as V. owensii and V. campbellii, respectively. We also found that two strains, AND4 and Ex25, are closely related to Harveyi clade bacteria, but could not be assigned to any species of the family Vibrionaceae. The use of whole genome sequence data for the taxonomic classification of the Harveyi clade bacteria and other members of the family Vibrionaceae is also discussed.

Identification of Klebsiella capsule synthesis loci from whole genome data

10.1101/071415 ◽

2016 ◽

Cited By ~ 3

Author(s):

Kelly L. Wyres ◽

Ryan R. Wick ◽

Claire Gorrie ◽

Adam Jenney ◽

Rainer Follador ◽

...

Keyword(s):

DNA Sequences ◽

Sequence Data ◽

Gene Clusters ◽

Whole Genome Sequence ◽

Multi Drug Resistance ◽

Reference Database ◽

Whole Genome ◽

Genome Sequences ◽

Protein Coding ◽

Genome Data

Background: Klebsiella pneumoniae and close relatives are a growing cause of healthcare-associated infections for which increasing rates of multi-drug resistance are a major concern. The Klebsiella polysaccharide capsule is a major virulence determinant and epidemiological marker. However, little is known about capsule epidemiology since serological typing is not widely accessible, and many isolates are serologically non-typeable. Molecular methods for capsular typing are needed, but existing methods lack sensitivity and specificity and fail to take advantage of the information available in whole-genome sequence data, which is increasingly being generated for surveillance and investigation of Klebsiella. Methods: We investigated the diversity of capsule synthesis loci (K loci) among a large, diverse collection of 2503 genome sequences of K. pneumoniae and closely related species. We incorporated analyses of both full-length K locus DNA sequences and clustered protein coding sequences to identify, annotate and compare K locus structures, and we propose a novel method for identifying K loci based on full locus information extracted from whole genome sequences. Results: A total of 134 distinct K loci were identified, including 31 novel types. Comparative analysis of K locus gene content detected 508 unique protein coding gene clusters that appear to reassort via homologous recombination, generating novel K locus types. Extensive nucleotide diversity was detected among the wzi and wzc genes, both within and between K loci, indicating that current typing schemes based on these genes are inadequate. As a solution, we introduce Kaptive, a novel software tool that automates the process of identifying K loci from large sets of Klebsiella genomes based on full locus information. Conclusions: This work highlights the extensive diversity of Klebsiella K loci and the proteins that they encode. We propose a standardised K locus nomenclature for Klebsiella, present a curated reference database of all known K loci, and introduce a tool for identifying K loci from genome data (https://github.com/katholt/Kaptive). These developments constitute important new resources for the Klebsiella community for use in genomic surveillance and epidemiology.

Faculty Opinions recommendation of Optimal algorithms for haplotype assembly from whole-genome sequence data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.13339986.14707085 ◽

2011 ◽

Author(s):

Alejandro Schaffer

Keyword(s):

Genome Sequence ◽

Sequence Data ◽

Whole Genome Sequence ◽

Whole Genome ◽

Optimal Algorithms ◽

Genome Sequence Data ◽

Haplotype Assembly

TIGER: inferring DNA replication timing from whole-genome sequence data

Bioinformatics ◽

10.1093/bioinformatics/btab166 ◽

2021 ◽

Cited By ~ 1

Author(s):

Amnon Koren ◽

Dashiell J Massey ◽

Alexa N Bracci

Keyword(s):

DNA Replication ◽

Genome Sequence ◽

Genomic DNA ◽

Sequence Data ◽

Replication Timing ◽

Whole Genome Sequence ◽

Supplementary Information ◽

Whole Genome ◽

Genome Sequence Data ◽

DNA Replication Timing

Motivation: Genomic DNA replicates according to a reproducible spatiotemporal program, with some loci replicating early in S phase while others replicate late. Despite being a central cellular process, DNA replication timing studies have been limited in scale due to technical challenges. Results: We present TIGER (Timing Inferred from Genome Replication), a computational approach for extracting DNA replication timing information from whole genome sequence data obtained from proliferating cell samples. The presence of replicating cells in a biological specimen leads to non-uniform representation of genomic DNA that depends on the timing of replication of different genomic loci. Replication dynamics can hence be observed in genome sequence data by analyzing DNA copy number along chromosomes while accounting for other sources of sequence coverage variation. TIGER is applicable to any species with a contiguous genome assembly and rivals the quality of experimental measurements of DNA replication timing. It provides a straightforward approach for measuring replication timing and can readily be applied at scale. Availability and Implementation: TIGER is available at https://github.com/TheKorenLab/TIGER. Supplementary information: Supplementary data are available at Bioinformatics online.

Whole genome sequence data of Bacillus australimaris strain B28A, isolated from Marine Water in India

Data in Brief ◽

10.1016/j.dib.2021.107240 ◽

2021 ◽

pp. 107240

Author(s):

Wael Ali Mohammed Hadi ◽

Boby T Edwin ◽

A Jayakumaran Nair

Keyword(s):

Genome Sequence ◽

Sequence Data ◽

Marine Water ◽

Whole Genome Sequence ◽

Whole Genome ◽

Genome Sequence Data

Whole genome sequence data of Mycobacterium tuberculosis XDR strain, isolated from patient in Kazakhstan

Data in Brief ◽

10.1016/j.dib.2020.106416 ◽

2020 ◽

Vol 33 ◽

pp. 106416

Author(s):

Asset Daniyarov ◽

Askhat Molkenov ◽

Saule Rakhimova ◽

Ainur Akhmetova ◽

Zhannur Nurkina ◽

...

Keyword(s):

Mycobacterium Tuberculosis ◽

Genome Sequence ◽

Sequence Data ◽

Whole Genome Sequence ◽

Whole Genome ◽

Genome Sequence Data

Elucidating the genetic basis of an oligogenic birth defect using whole genome sequence data in a non-model organism, Bubalus bubalis

Scientific Reports ◽

10.1038/srep39719 ◽

2017 ◽

Vol 7 (1) ◽

Cited By ~ 10

Author(s):

Lynsey K. Whitacre ◽

Jesse L. Hoff ◽

Robert D. Schnabel ◽

Sara Albarella ◽

Francesca Ciotola ◽

...

Keyword(s):

Genome Sequence ◽

Birth Defect ◽

Genetic Basis ◽

Sequence Data ◽

Model Organism ◽

Bubalus Bubalis ◽

Whole Genome Sequence ◽

Whole Genome ◽

Genome Sequence Data

Whole genome characterization of strains belonging to the Ralstonia solanacearum species complex and in silico analysis of TaqMan assays for detection in this heterogenous species complex

European Journal of Plant Pathology ◽

10.1007/s10658-020-02190-8 ◽

2021 ◽

Author(s):

Viola Kurm ◽

Ilse Houwers ◽

Claudia E. Coipan ◽

Peter Bonants ◽

Cees Waalwijk ◽

...

Keyword(s):

Ralstonia Solanacearum ◽

In Silico ◽

Species Complex ◽

Sequence Data ◽

In Silico Analysis ◽

Whole Genome Sequence ◽

Whole Genome ◽

Genome Sequences ◽

PCR Assays

Identification and classification of members of the Ralstonia solanacearum species complex (RSSC) is challenging due to the heterogeneity of this complex. Whole genome sequence data of 225 strains were used to classify strains based on average nucleotide identity (ANI) and multilocus sequence analysis (MLSA). Based on the ANI score (>95%), 191 out of 192 (99.5%) RSSC strains could be grouped into the three species R. solanacearum, R. pseudosolanacearum, and R. syzygii, and into the four phylotypes within the RSSC (I, II, III, and IV). R. solanacearum phylotype II could be split in two groups (IIA and IIB), from which IIB clustered in three subgroups (IIBa, IIBb and IIBc). This division by ANI was in accordance with MLSA. The IIB subgroups found by ANI and MLSA also differed in the number of SNPs in the primer and probe sites of various assays. An in-silico analysis of eight TaqMan and 11 conventional PCR assays was performed using the whole genome sequences. Based on this analysis several cases of potential false positives or false negatives can be expected upon the use of these assays for their intended target organisms. Two TaqMan assays and two PCR assays targeting the 16S rDNA sequence should be able to detect all phylotypes of the RSSC. We conclude that the increasing availability of whole genome sequences is not only useful for classification of strains, but also shows potential for selection and evaluation of clade specific nucleic acid-based amplification methods within the RSSC.
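The ANI-based grouping mentioned in this abstract can be illustrated with a small sketch. The abstract gives only the threshold (ANI > 95%), not the clustering procedure, so the single-linkage grouping below is an assumption, as are the function name `group_by_ani`, the strain labels, and the ANI values, which are invented purely for illustration.

```python
def group_by_ani(strains, ani, threshold=95.0):
    """Group strains by single linkage: any pair with ANI > threshold shares a group.

    `ani` maps (strain_a, strain_b) pairs to a percentage (assumed input format).
    Single linkage is an assumption; the abstract only states the >95% cut-off.
    """
    parent = {s: s for s in strains}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for (a, b), value in ani.items():
        if value > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in strains:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical strains and ANI values, not taken from the paper.
strains = ["A", "B", "C", "D"]
ani = {
    ("A", "B"): 98.2, ("A", "C"): 91.0, ("A", "D"): 90.0,
    ("B", "C"): 90.5, ("B", "D"): 90.2, ("C", "D"): 96.1,
}
species_groups = group_by_ani(strains, ani)
```

In practice the pairwise ANI values would come from a dedicated ANI tool run on the genome assemblies; this sketch covers only the thresholding and grouping step.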

46 Footprints of Selection in Angus and Hanwoo Beef Cattle Using Imputed Whole Genome Sequence Data

Journal of Animal Science ◽

10.1093/jas/skab235.042 ◽

2021 ◽

Vol 99 (Supplement_3) ◽

pp. 25-25

Author(s):

Muhammad Yasir Nawaz ◽

Rodrigo Pelicioni Savegnago ◽

Cedric Gondro

Keyword(s):

Beef Cattle ◽

Genome Sequence ◽

Sequence Data ◽

Whole Genome Sequence ◽

Fixation Index ◽

Whole Genome ◽

Extended Haplotype Homozygosity ◽

Extended Haplotype ◽

Genome Sequence Data ◽

Genomic Regions

In this study, we detected genome wide footprints of selection in Hanwoo and Angus beef cattle using different allele frequency and haplotype-based methods based on imputed whole genome sequence data. Our dataset included 13,202 Angus and 10,437 Hanwoo animals with 10,057,633 and 13,241,550 imputed SNPs, respectively. A subset of data with 6,873,624 common SNPs between the two populations was used to estimate signatures of selection parameters, both within (runs of homozygosity and extended haplotype homozygosity) and between (allele fixation index, extended haplotype homozygosity) the breeds in order to infer evidence of selection. We observed that correlations between various measures of selection ranged from 0.01 to 0.42. Assuming these parameters were complementary to each other, we combined them into a composite selection signal to identify regions under selection in both beef breeds. The composite signal was based on the average of fractional ranks of individual selection measures for every SNP. We identified some selection signatures that were common between the breeds while others were independent. We also observed that more genomic regions were selected in Angus as compared to Hanwoo. Candidate genes within significant genomic regions may help explain mechanisms of adaptation, domestication history and loci for important traits in Angus and Hanwoo cattle. In the future, we will use the top SNPs under selection for genomic prediction of carcass traits in both breeds.
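The composite selection signal described in this abstract, the average of fractional ranks of the individual selection measures for every SNP, can be sketched as follows. The abstract does not say how ranks are normalised or how ties are handled, so this sketch assumes rank divided by the number of SNPs, average ranks for ties, and that a larger score means stronger evidence of selection. The function names and the toy scores are hypothetical.

```python
from statistics import mean


def fractional_ranks(values):
    """Return rank / n for each value (1-based ranks, ties share their average rank)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank / n
        i = j + 1
    return ranks


def composite_signal(measures):
    """Average the fractional ranks across measures, per SNP.

    `measures` maps a measure name to a list of per-SNP scores, all lists
    having the same length and SNP order (assumed input format).
    """
    ranked = [fractional_ranks(scores) for scores in measures.values()]
    return [mean(per_snp) for per_snp in zip(*ranked)]


# Toy scores for three SNPs under two hypothetical measures.
toy_measures = {"fst": [0.1, 0.5, 0.3], "ehh": [2.0, 1.0, 3.0]}
signal = composite_signal(toy_measures)
```

Ranking before averaging puts measures with different scales on a common footing, which fits the abstract's observation that the individual measures are only weakly correlated with one another.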
