NCBI submission protocol for microbial pathogen surveillance v2

Mapping Intimacies, 10.17504/protocols.io.bz7cp9iw, 2021

Author(s): Ruth E Timme, Maria Balkey, Robyn Randolph, Julie Haendiges, Sai Laxmi Gubbala Venkata, ...

Keyword(s): Active Surveillance, Genome Sequence, Large Volume, Pathogen Detection, Sequence Data, Whole Genome Sequence, Whole Genome, Web Interface, Data Submission

Set Up

PURPOSE: Step-by-step instructions for submitting pathogen whole genome sequence data to NCBI and to the NCBI Pathogen Detection portal. This protocol covers the steps needed to establish a new NCBI submission environment for your laboratory, including the creation of new BioProject(s) and submission groups. Once these are set up, the protocol walks through the process for submitting raw reads to SRA and sample metadata to BioSample through the Submission Portal.

SCOPE: For use by any laboratory submitting WGS data for species under active surveillance within NCBI's Pathogen Detection. (This includes US laboratories in GenomeTrakr, NARMS, Vet-LIRN, and PulseNet, as well as non-US networks and submitters.)

New submitters need to establish quite a bit of groundwork before a laboratory can start its first data submission. We recommend that one person in the laboratory take a few days to get everything set up in advance of the first expected data submission. If you need a pipeline for frequent or large-volume submissions, follow Step 1 to get your NCBI submission environment established, then contact [email protected] to set up an account for submitting through the API. This protocol covers submission using NCBI's Submission Portal web interface.

Version history:

V5: Linked directly to the metadata template guidance instead of including duplicate copies of the files in this protocol. Updated the screenshot for choosing the pathogen template to reflect changes at NCBI.

V4: Updated screenshots to reflect NCBI Submission Portal changes. Updated the custom BioSample template.


Faculty Opinions recommendation of Optimal algorithms for haplotype assembly from whole-genome sequence data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature, 10.3410/f.13339986.14707085, 2011

Author(s): Alejandro Schaffer

Keyword(s): Genome Sequence, Sequence Data, Whole Genome Sequence, Whole Genome, Optimal Algorithms, Genome Sequence Data, Haplotype Assembly


TIGER: inferring DNA replication timing from whole-genome sequence data

Bioinformatics, 10.1093/bioinformatics/btab166, 2021, Cited By ~ 1

Author(s): Amnon Koren, Dashiell J Massey, Alexa N Bracci

Keyword(s): DNA Replication, Genome Sequence, Genomic DNA, Sequence Data, Replication Timing, Whole Genome Sequence, Supplementary Information, Whole Genome, Genome Sequence Data, DNA Replication Timing

Abstract

Motivation: Genomic DNA replicates according to a reproducible spatiotemporal program, with some loci replicating early in S phase while others replicate late. Despite being a central cellular process, DNA replication timing studies have been limited in scale due to technical challenges.

Results: We present TIGER (Timing Inferred from Genome Replication), a computational approach for extracting DNA replication timing information from whole genome sequence data obtained from proliferating cell samples. The presence of replicating cells in a biological specimen leads to non-uniform representation of genomic DNA that depends on the timing of replication of different genomic loci. Replication dynamics can hence be observed in genome sequence data by analyzing DNA copy number along chromosomes while accounting for other sources of sequence coverage variation. TIGER is applicable to any species with a contiguous genome assembly and rivals the quality of experimental measurements of DNA replication timing. It provides a straightforward approach for measuring replication timing and can readily be applied at scale.

Availability and Implementation: TIGER is available at https://github.com/TheKorenLab/TIGER.

Supplementary information: Supplementary data are available at Bioinformatics online.


Whole genome sequence data of Bacillus australimaris strain B28A, isolated from Marine Water in India

Data in Brief, 10.1016/j.dib.2021.107240, 2021, pp. 107240

Author(s): Wael Ali Mohammed Hadi, Boby T Edwin, A Jayakumaran Nair

Keyword(s): Genome Sequence, Sequence Data, Marine Water, Whole Genome Sequence, Whole Genome, Genome Sequence Data


Whole genome sequence data of Mycobacterium tuberculosis XDR strain, isolated from patient in Kazakhstan

Data in Brief, 10.1016/j.dib.2020.106416, 2020, Vol 33, pp. 106416

Author(s): Asset Daniyarov, Askhat Molkenov, Saule Rakhimova, Ainur Akhmetova, Zhannur Nurkina, ...

Keyword(s): Mycobacterium Tuberculosis, Genome Sequence, Sequence Data, Whole Genome Sequence, Whole Genome, Genome Sequence Data


Elucidating the genetic basis of an oligogenic birth defect using whole genome sequence data in a non-model organism, Bubalus bubalis

Scientific Reports, 10.1038/srep39719, 2017, Vol 7 (1), Cited By ~ 10

Author(s): Lynsey K. Whitacre, Jesse L. Hoff, Robert D. Schnabel, Sara Albarella, Francesca Ciotola, ...

Keyword(s): Genome Sequence, Birth Defect, Genetic Basis, Sequence Data, Model Organism, Bubalus Bubalis, Whole Genome Sequence, Whole Genome, Genome Sequence Data


46 Footprints of Selection in Angus and Hanwoo Beef Cattle Using Imputed Whole Genome Sequence Data

Journal of Animal Science, 10.1093/jas/skab235.042, 2021, Vol 99 (Supplement_3), pp. 25-25

Author(s): Muhammad Yasir Nawaz, Rodrigo Pelicioni Savegnago, Cedric Gondro

Keyword(s): Beef Cattle, Genome Sequence, Sequence Data, Whole Genome Sequence, Fixation Index, Whole Genome, Extended Haplotype Homozygosity, Extended Haplotype, Genome Sequence Data, Genomic Regions

Abstract: In this study, we detected genome-wide footprints of selection in Hanwoo and Angus beef cattle using different allele frequency and haplotype-based methods applied to imputed whole genome sequence data. Our dataset included 13,202 Angus and 10,437 Hanwoo animals with 10,057,633 and 13,241,550 imputed SNPs, respectively. A subset of data with 6,873,624 SNPs common to the two populations was used to estimate signatures of selection, both within breeds (runs of homozygosity and extended haplotype homozygosity) and between breeds (allele fixation index, extended haplotype homozygosity), in order to infer evidence of selection. We observed that correlations between the various measures of selection ranged from 0.01 to 0.42. Assuming these parameters were complementary to each other, we combined them into a composite selection signal to identify regions under selection in both beef breeds. The composite signal was based on the average of fractional ranks of individual selection measures for every SNP. We identified some selection signatures that were common between the breeds while others were independent. We also observed that more genomic regions were selected in Angus than in Hanwoo. Candidate genes within significant genomic regions may help explain mechanisms of adaptation, domestication history and loci for important traits in Angus and Hanwoo cattle. In the future, we will use the top SNPs under selection for genomic prediction of carcass traits in both breeds.
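The composite signal described in this abstract (the average of fractional ranks across selection measures, per SNP) can be illustrated with a minimal Python sketch. This is not the authors' code. The abstract does not specify the rank scaling or tie handling, so the choices here are assumptions: ranks are ascending, ties share their average rank, ranks are divided by the number of SNPs, and a larger value of each measure is taken to mean stronger evidence of selection. The measure names `fst` and `ehh` and all values are hypothetical.

```python
from typing import Dict, List


def fractional_ranks(values: List[float]) -> List[float]:
    """Ascending ranks scaled to (0, 1]; tied values share their average rank.

    Assumption: rank / n scaling. The abstract does not state the exact scaling.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j over the group of values tied with position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank / n
        i = j + 1
    return ranks


def composite_signal(measures: Dict[str, List[float]]) -> List[float]:
    """Average the fractional ranks of every measure, SNP by SNP."""
    if not measures:
        raise ValueError("at least one selection measure is required")
    lengths = {len(v) for v in measures.values()}
    if len(lengths) != 1:
        raise ValueError("all measures must cover the same SNPs")
    per_measure = [fractional_ranks(v) for v in measures.values()]
    n_snps = lengths.pop()
    return [
        sum(r[s] for r in per_measure) / len(per_measure)
        for s in range(n_snps)
    ]


# Hypothetical values for three SNPs and two measures.
example = {"fst": [0.1, 0.5, 0.3], "ehh": [2.0, 1.0, 3.0]}
scores = composite_signal(example)
```

Averaging ranks rather than raw values puts measures with different units on a common scale, which is one plausible reason for the design described in the abstract.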


ALPHLARD: a Bayesian method for analyzing HLA genes from whole genome sequence data

BMC Genomics, 10.1186/s12864-018-5169-9, 2018, Vol 19 (1), Cited By ~ 6

Author(s): Shuto Hayashi, Rui Yamaguchi, Shinichi Mizuno, Mitsuhiro Komura, Satoru Miyano, ...

Keyword(s): Genome Sequence, Bayesian Method, Sequence Data, Whole Genome Sequence, Whole Genome, Genome Sequence Data, HLA Genes


Peer Review #2 of "An evaluation of alternative methods for constructing phylogenies from whole genome sequence data: a case study with Salmonella (v0.1)"

10.7287/peerj.620v0.1/reviews/2, 2014

Author(s): NJ Loman

Keyword(s): Peer Review, Genome Sequence, Sequence Data, Alternative Methods, Whole Genome Sequence, Whole Genome, Genome Sequence Data


Peer Review #1 of "An evaluation of alternative methods for constructing phylogenies from whole genome sequence data: a case study with Salmonella (v0.1)"

10.7287/peerj.620v0.1/reviews/1, 2014

Author(s): AE Darling

Keyword(s): Peer Review, Genome Sequence, Sequence Data, Alternative Methods, Whole Genome Sequence, Whole Genome, Genome Sequence Data


Emergence and expansion of highly infectious spike protein D614G mutant SARS-CoV-2 in central India

Scientific Reports, 10.1038/s41598-021-95822-w, 2021, Vol 11 (1)

Author(s): Shashi Sharma, Paban Kumar Dash, Sushil Kumar Sharma, Ambuj Srivastava, Jyoti S. Kumar, ...

Keyword(s): Genome Sequence, Sequence Data, Central India, Madhya Pradesh, Whole Genome Sequence, Health Economy, Whole Genome, Evolutionary Patterns, First Case, Multiple Introduction

Abstract: COVID-19 has emerged as a global pandemic, causing great damage to public health, the economy and the human psyche. The genome sequence data obtained during the ongoing pandemic are valuable for understanding the evolutionary patterns of the virus and its spread across the globe. Increased availability of genome information for circulating SARS-CoV-2 strains in India will enable the scientific community to understand the emergence of new variants and their impact on human health. The first case of COVID-19 in the Chambal region of Madhya Pradesh state was detected in mid-March 2020, followed by multiple introduction events and an expansion of cases within the next three months. More than 5000 suspected COVID-19 samples referred to the Defence Research and Development Establishment, Gwalior, Madhya Pradesh, were analyzed during the nation-wide lockdown and unlock period. A total of 136 cases were found positive over a span of three months that included virus introduction to the region and its further spread. Whole genome sequences were generated using Oxford Nanopore technology for 26 SARS-CoV-2 strains circulating in 10 different districts of Madhya Pradesh state, India. This period witnessed index cases with multiple travel histories responsible for the introduction of COVID-19, followed by a remarkable expansion of the virus. Genome-wide substitutions, including those in important viral proteins, were identified. Detailed phylogenetic analysis revealed that the circulating SARS-CoV-2 clustered in multiple clades, including A2a, A4 and B. Cluster-wise segregation was observed, suggesting multiple introduction links and subsequent evolution of the virus in the region. This is the first comprehensive whole genome sequence analysis from central India, and it reveals the emergence and evolution of SARS-CoV-2 during the nation-wide lockdown and unlock period.
