NCBI Short Read Archive: Latest Research Papers

The complete genome sequences of two species of seventeen-year cicadas: Magicicada septendecim and Magicicada septendecula

F1000Research ◽

10.12688/f1000research.27309.1 ◽

2021 ◽

Vol 10 ◽

pp. 215

Author(s):

Harold B. White ◽

Stacy Pirro

Keyword(s):

North America ◽

Related Species ◽

De Novo ◽

Eastern North America ◽

Whole Genome ◽

Genome Sequences ◽

Short Read ◽

Short Read Archive ◽

Periodical Cicadas ◽

NCBI Short Read Archive

The genus Magicicada (Hemiptera: Cicadidae) includes the periodical cicadas of Eastern North America. After spending the majority of their long lives underground, the cicadas emerge every 13 or 17 years to spend 4-6 weeks as adults and mate. We present the whole genome sequences of two species of 17-year cicadas, Magicicada septendecim and Magicicada septendecula. The reads were assembled by a de novo method followed by alignments to related species. Annotation was performed by GeneMark-ES. The raw and assembled data are available via the NCBI Short Read Archive and Assembly databases.

Phylogenomics of 8,839 Clostridioides difficile genomes reveals recombination-driven evolution and diversification of toxin A and B

PLoS Pathogens ◽

10.1371/journal.ppat.1009181 ◽

2020 ◽

Vol 16 (12) ◽

pp. e1009181

Author(s):

Michael J. Mansfield ◽

Benjamin J-M Tremblay ◽

Ji Zeng ◽

Xin Wei ◽

Harold Hodgins ◽

...

Keyword(s):

Selective Advantage ◽

Phylogenomic Analysis ◽

Toxin A ◽

Diagnostic Assays ◽

Surface Patches ◽

Clostridioides Difficile ◽

Distinct Cell ◽

NCBI Short Read Archive ◽

Toxin Sequence ◽

Entire Sequence

Clostridioides difficile is the major worldwide cause of antibiotic-associated gastrointestinal infection. A pathogenicity locus (PaLoc) encoding one or two homologous toxins, toxin A (TcdA) and toxin B (TcdB), is essential for C. difficile pathogenicity. However, toxin sequence variation poses major challenges for the development of diagnostic assays, therapeutics, and vaccines. Here, we present a comprehensive phylogenomic analysis of 8,839 C. difficile strains and their toxins, including 6,492 genomes that we assembled from the NCBI Short Read Archive. A total of 5,175 tcdA and 8,022 tcdB genes clustered into 7 (A1-A7) and 12 (B1-B12) distinct subtypes, which form the basis of a new method for toxin-based subtyping of C. difficile. We developed a haplotype coloring algorithm to visualize amino acid variation across all toxin sequences, which revealed that TcdB has diversified through extensive homologous recombination throughout its entire sequence, and formed new subtypes through distinct recombination events. In contrast, TcdA varies mainly in the number of repeats in its C-terminal repetitive region, suggesting that recombination-mediated diversification of TcdB provides a selective advantage in C. difficile evolution. The application of toxin subtyping is then validated by classifying 351 C. difficile clinical isolates from Brigham and Women’s Hospital in Boston, demonstrating its clinical utility. Subtyping partitions TcdB into binary functional and antigenic groups generated by intragenic recombinations, including two distinct cell-rounding phenotypes, whether it recognizes frizzled proteins as receptors, and whether it can be efficiently neutralized by the monoclonal antibody bezlotoxumab, the only FDA-approved therapeutic antibody. Our analysis also identifies eight universally conserved surface patches across the TcdB structure, representing ideal targets for developing broad-spectrum therapeutics. Finally, we established an open online database (DiffBase) as a central hub for collection and classification of C. difficile toxins, which will help clinicians decide on therapeutic strategies targeting specific toxin variants, and allow researchers to monitor the ongoing evolution and diversification of C. difficile.

Phylogenomics of 8,839 Clostridioides difficile genomes reveals recombination-driven evolution and diversification of toxin A and B

10.1101/2020.07.09.194449 ◽

2020 ◽

Author(s):

Michael J. Mansfield ◽

Benjamin J-M Tremblay ◽

Ji Zeng ◽

Xin Wei ◽

Harold Hodgins ◽

...

Keyword(s):

Selective Advantage ◽

Phylogenomic Analysis ◽

Toxin A ◽

Diagnostic Assays ◽

Surface Patches ◽

Clostridioides Difficile ◽

Distinct Cell ◽

NCBI Short Read Archive ◽

Toxin Sequence ◽

Entire Sequence

Abstract: Clostridioides difficile is the major worldwide cause of antibiotic-associated gastrointestinal infection. A pathogenicity locus (PaLoc) encoding one or two homologous toxins, toxin A (TcdA) and toxin B (TcdB), is essential for C. difficile pathogenicity. However, toxin sequence variation poses major challenges for the development of diagnostic assays, therapeutics, and vaccines. Here, we present a comprehensive phylogenomic analysis of 8,839 C. difficile strains and their toxins, including 6,492 genomes that we assembled from the NCBI Short Read Archive. A total of 5,175 tcdA and 8,022 tcdB genes clustered into 7 (A1-A7) and 12 (B1-B12) distinct subtypes, which form the basis of a new method for toxin-based subtyping of C. difficile. We developed a haplotype coloring algorithm to visualize amino acid variation across all toxin sequences, which revealed that TcdB has diversified through extensive homologous recombination throughout its entire sequence, and formed new subtypes through distinct recombination events. In contrast, TcdA varies mainly in the number of repeats in its C-terminal repetitive region, suggesting that recombination-mediated diversification of TcdB provides a selective advantage in C. difficile evolution. The application of toxin subtyping is then validated by classifying 351 C. difficile clinical isolates from Brigham and Women’s Hospital in Boston, demonstrating its clinical utility. Subtyping partitions TcdB into binary functional and antigenic groups generated by intragenic recombinations, including two distinct cell-rounding phenotypes, whether it recognizes frizzled proteins as receptors, and whether it can be efficiently neutralized by the monoclonal antibody bezlotoxumab, the only FDA-approved therapeutic antibody. Our analysis also identifies eight universally conserved surface patches across the TcdB structure, representing ideal targets for developing broad-spectrum therapeutics. Finally, we established an open online database (DiffBase) as a central hub for collection and classification of C. difficile toxins, which will help clinicians decide on therapeutic strategies targeting specific toxin variants, and allow researchers to monitor the ongoing evolution and diversification of C. difficile.

Cryptic prophages within a Streptococcus pyogenes genotype emm4 lineage

10.1101/2020.05.19.103838 ◽

2020 ◽

Author(s):

Alex Remmington ◽

Samuel Haywood ◽

Julia Edgar ◽

Claire E. Turner

Keyword(s):

Streptococcus Pyogenes ◽

Gene Loss ◽

Sequence Data ◽

Genome Data ◽

The United Kingdom ◽

NCBI Short Read Archive ◽

Genes Encoding ◽

Variable Extent ◽

The USA ◽

The UK

Abstract: The major human pathogen Streptococcus pyogenes shares an intimate evolutionary history with mobile genetic elements, which, in many cases, carry genes encoding bacterial virulence factors. During recent whole genome sequencing of a longitudinal sample of S. pyogenes isolates in the United Kingdom, we identified a lineage within emm4 that clustered with the reference genome MEW427. Like MEW427, this lineage was characterised by substantial gene loss within all three prophage regions, compared to MGAS10750 and isolates outside of the MEW427-like lineage. Gene loss primarily affected lysogeny, replicatory and regulatory modules, and to a lesser and more variable extent, structural genes. Importantly, prophage-encoded superantigen and DNase genes were retained in all isolates. In isolates where the prophage elements were complete, like MGAS10750, they could be induced experimentally, but not in MEW427-like isolates with degraded prophages. We also found gene loss within the chromosomal island SpyCIM4 of MEW427-like isolates, although surprisingly, the SpyCIM4 element could not be experimentally induced in either MGAS10750-like or MEW427-like isolates. This did not, however, appear to abolish expression of the mismatch repair operon, within which this element resides. The inclusion of further emm4 genomes in our analyses ratified our observations and revealed an international emm4 lineage characterised by prophage degradation. Intriguingly, the USA population of emm4 S. pyogenes appeared to constitute predominantly MEW427-like isolates, whereas the UK comprised both MEW427-like and MGAS10750-like strains. The degradation and cryptic nature of these elements may have important phenotypic ramifications for emm4 S. pyogenes, and the geographical distribution of this lineage raises interesting questions on the population dynamics of the genotype.

Data summary: All raw sequence data used in this study have been previously published and were obtained from the NCBI Short Read Archive. Accession numbers and citations for the genome data for each individual isolate are provided in Supplementary Table 1.

Emergence and diversification of a highly invasive chestnut pathogen lineage across south-eastern Europe

10.1101/2020.02.15.950170 ◽

2020 ◽

Cited By ~ 3

Author(s):

Lea Stauber ◽

Thomas Badet ◽

Simone Prospero ◽

Daniel Croll

Keyword(s):

Eastern Europe ◽

Cryphonectria Parasitica ◽

Data Availability ◽

Sequencing Data ◽

South Eastern ◽

Chestnut Blight Fungus ◽

Fitness Advantage ◽

European Chestnut ◽

NCBI Short Read Archive ◽

South Eastern Europe

Abstract: Invasive microbial species constitute a major threat to biodiversity, agricultural production and human health. Invasions are often dominated by one or a small number of genotypes, yet the underlying factors driving invasions are poorly understood. The chestnut blight fungus Cryphonectria parasitica first decimated the American chestnut, and a recent outbreak threatens European chestnut trees. To unravel the mechanisms underpinning the invasion of south-eastern Europe, we sequenced 188 genomes of predominantly European strains. Genotypes outside of the invasion zone showed high levels of diversity with evidence for frequent and ongoing recombination. The invasive lineage emerged from the highly diverse European genotype pool rather than a secondary introduction from Asia. The expansion across south-eastern Europe was mostly clonal and is dominated by a single mating type, suggesting a fitness advantage of asexual reproduction. Our findings show how an intermediary, highly diverse bridgehead population gave rise to an invasive, largely clonally expanding pathogen.

Data availability: All raw sequencing data are available on the NCBI Short Read Archive (BioProject PRJNA604575).

Next-generation sequencing of double stranded RNA is greatly improved by treatment with the inexpensive denaturing reagent DMSO

10.1101/644591 ◽

2019 ◽

Author(s):

Alexander H. Wilcox ◽

Eric Delwart ◽

Samuel L. Díaz Muñoz

Keyword(s):

Next Generation Sequencing ◽

Limit Of Detection ◽

Genetic Material ◽

dsRNA Virus ◽

Next Generation ◽

Short Read ◽

Double Stranded RNA ◽

NCBI Short Read Archive ◽

DMSO Treatment ◽

Generation Sequencing

Abstract: Double stranded RNA (dsRNA) is the genetic material of important viruses and a key component of RNA interference-based immunity in eukaryotes. Previous studies have noted difficulties in determining the sequence of dsRNA molecules that have affected studies of immune function and estimates of viral diversity in nature. Dimethyl sulfoxide (DMSO) has been used to denature dsRNA prior to the reverse transcription stage to improve RT-PCR and Sanger sequencing. We systematically tested the utility of DMSO to improve sequencing yield of a dsRNA virus (Φ6) in a short-read next generation sequencing platform. DMSO treatment improved sequencing read recovery by over two orders of magnitude, even when RNA and cDNA concentrations were below the limit of detection. We also tested the effects of DMSO on a mock eukaryotic viral community and found that dsRNA virus reads increased with DMSO treatment. Furthermore, we provide evidence that DMSO treatment does not adversely affect recovery of reads from a single-stranded RNA viral genome (Influenza A/California/07/2009). We suggest that up to 50% DMSO treatment be used prior to cDNA synthesis when samples of interest are composed of or may contain dsRNA.

Data summary: Sequence data were deposited in the NCBI Short Read Archive (accession numbers: PRJNA527100, PRJNA527101, PRJNA527098). Data and code for analysis are available on GitHub (https://github.com/awilcox83/dsRNA-sequencing/, doi:10.5281/zenodo.1453423). The protocol for dsRNA sequencing is posted on protocols.io (doi:10.17504/protocols.io.ugnetve).

Extensive horizontal exchange of transposable elements in the Drosophila pseudoobscura group

10.1101/284117 ◽

2018 ◽

Author(s):

Tom Hill ◽

Andrea J. Betancourt

Keyword(s):

Horizontal Transfer ◽

Species Group ◽

Data Availability ◽

Drosophila Pseudoobscura ◽

Chromosome Size ◽

Short Read ◽

Short Read Archive ◽

NCBI Short Read Archive ◽

Mobile Component ◽

Different Levels

Abstract: While the horizontal transfer of a parasitic element can be potentially catastrophic, it is increasingly recognized as a common occurrence. The horizontal exchange, or lack of exchange, of TE content between species results in different levels of divergence among a species group in the mobile component of their genomes. Here, we examine differences in the TE content of the Drosophila pseudoobscura species group. We identify several putative horizontal transfer events, and examine the role that horizontal transfer plays in the spread of TE families to new species and the homogenization of TE content in these species. Despite rampant exchange of TE families between species, we find that TE content differs hugely across the group, likely due to differing activity of each TE family and differing suppression of TEs due to divergence in Y chromosome size and its resulting effects on TE regulation. Overall, we show that TE content is highly dynamic in this species group, and that it plays a large role in shaping the differences seen between species.

Data availability: All data used in this study (summarized in table S1) are freely available online through the NCBI Short Read Archive (NCBI SRA: ERR127385, SRR330416, SRR330418, SRR1925723, SRR330426, SRR330420, SRR330423, SRR617430-74). All genomes used are available through either flybase.org or popoolation.at.
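The accession list above uses a compact range, "SRR617430-74". A minimal Python sketch of how such a list could be expanded into individual run accessions follows. It assumes (this is not stated in the abstract) that the suffix after the hyphen replaces the trailing digits of the first accession, so the range runs from SRR617430 to SRR617474. The function name expand_accession_range is hypothetical.

```python
import re


def expand_accession_range(token: str) -> list[str]:
    """Expand a token such as 'SRR617430-74' into individual accessions.

    Assumption: the digits after the hyphen replace the trailing digits
    of the starting number. Tokens without a range are returned unchanged.
    """
    match = re.fullmatch(r"([A-Z]+)(\d+)-(\d+)", token)
    if match is None:
        return [token]
    prefix, start, suffix = match.groups()
    # Build the full end number by overwriting the tail of the start number.
    end = start[:-len(suffix)] + suffix if len(suffix) < len(start) else suffix
    if int(end) < int(start):
        raise ValueError(f"range end precedes start in {token!r}")
    width = len(start)
    return [f"{prefix}{n:0{width}d}" for n in range(int(start), int(end) + 1)]


# The accession list as quoted in the data availability statement.
listed = ("ERR127385, SRR330416, SRR330418, SRR1925723, SRR330426, "
          "SRR330420, SRR330423, SRR617430-74")
accessions = [acc for token in listed.split(", ")
              for acc in expand_accession_range(token)]
print(len(accessions), accessions[-1])
```

Each resulting accession could then be passed to a download tool of the reader's choice. Whether all accessions in the assumed range belong to this study should be checked against table S1 of the paper.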

Whole genome resequencing of a laboratory-adapted Drosophila melanogaster population sample

F1000Research ◽

10.12688/f1000research.9912.3 ◽

2016 ◽

Vol 5 ◽

pp. 2644 ◽

Cited By ~ 1

Author(s):

William P. Gilks ◽

Tanya M. Pennell ◽

Ilona Flis ◽

Matthew T. Webster ◽

Edward H. Morrow

Keyword(s):

Drosophila Melanogaster ◽

Complex Traits ◽

High Throughput Sequencing ◽

Population Sample ◽

Genomic Variation ◽

Genotype Data ◽

Whole Genome ◽

Short Read ◽

Short Read Archive ◽

NCBI Short Read Archive

As part of a study into the molecular genetics of sexually dimorphic complex traits, we used high-throughput sequencing to obtain data on genomic variation in an outbred laboratory-adapted fruit fly (Drosophila melanogaster) population. We successfully resequenced the whole genome of 220 hemiclonal females that were heterozygous for the same Berkeley reference line genome (BDGP6/dm6), and a unique haplotype from the outbred base population (LHM). The use of a static and known genetic background enabled us to obtain sequences from whole-genome phased haplotypes. We used a BWA-Picard-GATK pipeline for mapping sequence reads to the dm6 reference genome assembly, at a median depth of coverage of 31X, and have made the resulting data publicly available in the NCBI Short Read Archive (Accession number SRP058502). We used Haplotype Caller to discover and genotype 1,726,931 small genomic variants (SNPs and indels, <200bp). Additionally, we detected and genotyped 167 large structural variants (1-100Kb in size) using GenomeStrip/2.0. Sequence and genotype data are publicly available at the corresponding NCBI databases: Short Read Archive, dbSNP and dbVar (BioProject PRJNA282591). We have also released the unfiltered genotype data, and the code and logs for data processing and summary statistics (https://zenodo.org/communities/sussex_drosophila_sequencing/).

Whole genome resequencing of a laboratory-adapted Drosophila melanogaster

F1000Research ◽

10.12688/f1000research.9912.2 ◽

2016 ◽

Vol 5 ◽

pp. 2644

Author(s):

William P. Gilks ◽

Tanya M. Pennell ◽

Ilona Flis ◽

Matthew T. Webster ◽

Edward H. Morrow

Keyword(s):

Drosophila Melanogaster ◽

Complex Traits ◽

High Throughput Sequencing ◽

Genomic Variation ◽

Genotype Data ◽

Whole Genome ◽

Unique Haplotype ◽

Short Read ◽

Short Read Archive ◽

NCBI Short Read Archive

As part of a study into the molecular genetics of sexually dimorphic complex traits, we used high-throughput sequencing to obtain data on genomic variation in an outbred laboratory-adapted fruit fly (Drosophila melanogaster) population. We successfully resequenced the whole genome of 220 hemiclonal females that were heterozygous for the same Berkeley reference line genome (BDGP6/dm6), and a unique haplotype from the outbred base population (LHM). The use of a static and known genetic background enabled us to obtain sequences from whole-genome phased haplotypes. We used a BWA-Picard-GATK pipeline for mapping sequence reads to the dm6 reference genome assembly, at a median depth of coverage of 31X, and have made the resulting data publicly available in the NCBI Short Read Archive (Accession number SRP058502). We used Haplotype Caller to discover and genotype 1,726,931 small genomic variants (SNPs and indels, <200bp). Additionally, we detected and genotyped 167 large structural variants (1-100Kb in size) using GenomeStrip/2.0. Sequence and genotype data are publicly available at the corresponding NCBI databases: Short Read Archive, dbSNP and dbVar (BioProject PRJNA282591). We have also released the unfiltered genotype data, and the code and logs for data processing and summary statistics (https://zenodo.org/communities/sussex_drosophila_sequencing/).

Whole genome resequencing of a laboratory-adapted Drosophila melanogaster population sample

F1000Research ◽

10.12688/f1000research.9912.1 ◽

2016 ◽

Vol 5 ◽

pp. 2644 ◽

Cited By ~ 1

Author(s):

William P. Gilks ◽

Tanya M. Pennell ◽

Ilona Flis ◽

Matthew T. Webster ◽

Edward H. Morrow

Keyword(s):

Drosophila Melanogaster ◽

Complex Traits ◽

Population Sample ◽

Genomic Variation ◽

Genotype Data ◽

Whole Genome ◽

Unique Haplotype ◽

Short Read ◽

Short Read Archive ◽

NCBI Short Read Archive

As part of a study into the molecular genetics of sexually dimorphic complex traits, we used next-generation sequencing to obtain data on genomic variation in an outbred laboratory-adapted fruit fly (Drosophila melanogaster) population. We successfully resequenced the whole genome of 220 hemiclonal females that were heterozygous for the same Berkeley reference line genome (BDGP6/dm6), and a unique haplotype from the outbred base population (LHM). The use of a static and known genetic background enabled us to obtain sequences from whole genome phased haplotypes. We used a BWA-Picard-GATK pipeline for mapping sequence reads to the dm6 reference genome assembly, at a median depth of coverage of 31X, and have made the resulting data publicly available in the NCBI Short Read Archive (Accession number SRP058502). We used Haplotype Caller to discover and genotype 1,726,931 small genomic variants (SNPs and indels, <200bp). Additionally, we detected and genotyped 167 large structural variants (1-100Kb in size) using GenomeStrip/2.0. Sequence and genotype data are publicly available at the corresponding NCBI databases: Short Read Archive, dbSNP and dbVar (BioProject PRJNA282591). We have also released the unfiltered genotype data, and the code and logs for data processing and summary statistics (https://zenodo.org/communities/sussex_drosophila_sequencing/).
