Genome sequencing of a single tardigrade Hypsibius dujardini individual

2016 ◽  
Author(s):  
Kazuharu Arakawa ◽  
Yuki Yoshida ◽  
Masaru Tomita

Abstract: Tardigrades are ubiquitous microscopic animals that play an important role in the study of metazoan phylogeny. Most terrestrial tardigrades can withstand extreme environments by entering an ametabolic desiccated state termed anhydrobiosis. Due to their small size and the non-axenic nature of laboratory cultures, molecular studies of tardigrades are prone to contamination. To minimize the possibility of microbial contamination and to obtain high-quality genomic information, we have developed an ultra-low input library sequencing protocol to enable the genome sequencing of a single tardigrade Hypsibius dujardini individual. Here, we describe the details of our sequencing data and the ultra-low input library preparation methodologies.

2014 ◽  
Vol 2014 ◽  
pp. 1-8
Author(s):  
Momchilo Vuyisich ◽  
Ayesha Arefin ◽  
Karen Davenport ◽  
Shihai Feng ◽  
Cheryl Gleasner ◽  
...  

Sequencing bacterial genomes has traditionally required large amounts of genomic DNA (~1 μg). There have been few studies to determine the effects of the input DNA amount or library preparation method on the quality of sequencing data. Several new commercially available library preparation methods enable shotgun sequencing from as little as 1 ng of input DNA. In this study, we evaluated the NEBNext Ultra library preparation reagents for sequencing bacterial genomes. We have evaluated the utility of NEBNext Ultra for resequencing and de novo assembly of four bacterial genomes and compared its performance with the TruSeq library preparation kit. The NEBNext Ultra reagents enable high-quality resequencing and de novo assembly of a variety of bacterial genomes when using 100 ng of input genomic DNA. For the two most challenging genomes (Burkholderia spp.), which have the highest GC content and are the longest, we also show that the quality of both resequencing and de novo assembly is not decreased when only 10 ng of input genomic DNA is used.


2014 ◽  
Author(s):  
Han Fang ◽  
Yiyang Wu ◽  
Giuseppe Narzisi ◽  
Jason A. O'Rawe ◽  
Laura T. Jimenez Barrón ◽  
...  

Background: INDELs, especially those disrupting protein-coding regions of the genome, have been strongly associated with human diseases. However, there are still many errors with INDEL variant calling, driven by library preparation, sequencing biases, and algorithm artifacts. Methods: We characterized whole genome sequencing (WGS), whole exome sequencing (WES), and PCR-free sequencing data from the same samples to investigate the sources of INDEL errors. We also developed a classification scheme based on the coverage and composition to rank high- and low-quality INDEL calls. We performed a large-scale validation experiment on 600 loci, and found high-quality INDELs to have a substantially lower error rate than low-quality INDELs (7% vs. 51%). Results: Simulation and experimental data show that assembly-based callers are significantly more sensitive and robust for detecting large INDELs (>5 bp) than alignment-based callers, consistent with published data. The concordance of INDEL detection between WGS and WES is low (52%), and WGS data uniquely identifies 10.8-fold more high-quality INDELs. The validation rate for WGS-specific INDELs is also much higher than that for WES-specific INDELs (85% vs. 54%), and WES misses many large INDELs. In addition, the concordance for INDEL detection between standard WGS and PCR-free sequencing is 71%, and standard WGS data uniquely identifies 6.3-fold more low-quality INDELs. Furthermore, accurate detection with Scalpel of heterozygous INDELs requires 1.2-fold higher coverage than that for homozygous INDELs. Lastly, homopolymer A/T INDELs are a major source of low-quality INDEL calls, and they are highly enriched in the WES data. Conclusions: Overall, we show that accuracy of INDEL detection with WGS is much greater than WES even in the targeted region. We calculated that 60X WGS depth of coverage from the HiSeq platform is needed to recover 95% of INDELs detected by Scalpel. While this is higher than current sequencing practice, the deeper coverage may save total project costs because of the greater accuracy and sensitivity. Finally, we investigate sources of INDEL errors (e.g. capture deficiency, PCR amplification, homopolymers) with various data that will serve as a guideline to effectively reduce INDEL errors in genome sequencing.


2019 ◽  
Vol 37 (15_suppl) ◽  
pp. e13016-e13016
Author(s):  
Shannon Terrell Bailey ◽  
Belynda Hicks ◽  
Bin Zhu ◽  
Nan Hu ◽  
Phil R. Taylor ◽  
...  

e13016 Background: Whole-genome sequencing (WGS) of formalin-fixed, paraffin-embedded (FFPE) samples could enable novel insights from archival sample collections, yet robust FFPE WGS is challenged by fragmented DNA, uneven genomic coverage & sequencing artifacts attributed to FFPE fixation. We report our proprietary extraction & library preparation methodology (SeqPlus) with high quality, uniform WGS sequencing performance comparable to that from fresh-frozen samples. Methods: We analyzed 20 paired esophageal carcinoma (EC) samples i.e., primary tumors & matched germline samples to assess SeqPlus performance on 10-15-year-old FFPE tissues, measure variant concordance between WGS and a high-depth sequencing panel (269 genes, 400x coverage) & identify novel genomic features. Results: At a targeted 70x WGS tumor sequencing depth, 93% of the genome was covered by ≥ 20 reads, 99% of bases had 10x coverage & average duplicate reads were 31%. We noted similar transition/transversion ratios & mutational spectra as from fresh-frozen EC specimens, suggesting that extraction & library preparation contributes to prior FFPE artifacts. Concordance of tumor-specific SNVs & indels derived from WGS & targeted panel was high at 86%. All 76 targeted panel-detected variants above the WGS limit of detection (mutant allele frequency [MAF] > 10%) were detected by WGS, 2 variants (2 tumors) were detected only by WGS, and 12 variants at MAF ≤ 6% (9 tumors) were only detected by the targeted panel. Tumor WGS yielded SNV, indels & CNV findings beyond variants detected by targeted sequencing. WGS enabled detection of 10.4 putative cancer variants per tumor compared to 12 variants per patient from frozen specimens and a median of 7 (up to 16) cancer-associated variants in genes outside the targeted panel. WGS copy number analysis revealed CCND1, EGFR, TP63, and SOX2 amplification, CDKN2A/B deletion and additional unrecognized genomic aberrations.
Conclusions: Our study reinforces the utility of high-quality, uniform WGS sequencing of archival FFPE cancer samples with SeqPlus and unlocks the potential for massive-scale retrospective genomic analysis of archived pathology samples with associated clinical & outcomes data.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Jared O’Connell ◽  
Taedong Yun ◽  
Meghan Moreno ◽  
Helen Li ◽  
Nadia Litterman ◽  
...  

Abstract: There is currently a dearth of accessible whole genome sequencing (WGS) data for individuals residing in the Americas with Sub-Saharan African ancestry. We generated whole genome sequencing data at intermediate (15×) coverage for 2,294 individuals with large amounts of Sub-Saharan African ancestry, predominantly Atlantic African admixed with varying amounts of European and American ancestry. We performed extensive comparisons of variant callers, phasing algorithms, and variant filtration on these data to construct a high quality imputation panel containing data from 2,269 unrelated individuals. With the exception of the TOPMed imputation server (which notably cannot be downloaded), our panel substantially outperformed other available panels when imputing African American individuals. The raw sequencing data, variant calls and imputation panel for this cohort are all freely available via dbGaP and should prove an invaluable resource for further study of admixed African genetics.


2020 ◽  
Author(s):  
René A.M. Dirks ◽  
Peter Thomas ◽  
Robert C. Jones ◽  
Hendrik G. Stunnenberg ◽  
Hendrik Marks

Abstract: Epigenetic profiling by ChIP-Seq has become a powerful tool for genome-wide identification of regulatory elements, for defining transcriptional regulatory networks and for screening for biomarkers. However, the ChIP-Seq protocol for low-input samples is laborious, time-consuming and suffers from experimental variation, resulting in poor reproducibility and low throughput. Although prototypic microfluidic ChIP-Seq platforms have been developed, these are poorly transferable as they require sophisticated custom-made equipment and in-depth microfluidic and ChIP expertise, while lacking parallelisation. To enable standardized, automated ChIP-Seq profiling of low-input samples, we constructed PDMS-based plates containing microfluidic Integrated Fluidic Circuits capable of performing 24 sensitive ChIP reactions within 30 minutes hands-on time. These disposable plates can conveniently be loaded into a widely available controller for pneumatics and thermocycling, making the ChIP-Seq procedure Plug and Play (PnP). We demonstrate high-quality ChIP-Seq on hundreds to a few thousand cells for multiple widely-profiled post-translational histone modifications, together allowing genome-wide identification of regulatory elements. As proof of principle, we managed to generate high-quality epigenetic profiles of rare totipotent subpopulations of mESCs using our platform. In light of the ready-to-go ChIP plates and the automated workflow, we named our procedure PnP-ChIP-Seq. PnP-ChIP-Seq allows non-expert labs worldwide to conveniently run robust, standardized ChIP-Seq, while its high throughput, consistency and sensitivity pave the way towards large-scale profiling of precious sample types such as rare subpopulations of cells or biopsies. Reviewer link to data: All sequencing data has been submitted to the NCBI GEO database. Reviewer link: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=klwnocicrpaxrkv&acc=GSE120673


2018 ◽  
Author(s):  
Moez Sanaa ◽  
Régis Pouillot ◽  
Francisco J Garces-Vega ◽  
Errol Strain ◽  
Jane M Van Doren

Food safety risk assessments and large-scale epidemiological investigations have the potential to provide better and new types of information when whole genome sequence (WGS) data are effectively integrated. Today, the NCBI Pathogen Detection database WGS collections have grown significantly through improvements in technology, coordination, and collaboration, such as the GenomeTrakr and PulseNet networks. However, high-quality genomic data is not often coupled with high-quality epidemiological or food chain metadata. We have created a set of tools for cleaning, curation, integration, analysis and visualization of microbial genome sequencing data. It has been tested using Salmonella enterica and Listeria monocytogenes data sets provided by NCBI Pathogen Detection (160,000 sequenced isolates). GenomeGraphR presents foodborne pathogen WGS data and associated curated metadata in a user-friendly interface that allows a user to explore a variety of research questions, such as transmission sources and dynamics, global reach, and persistence of genotypes associated with contamination in the food supply and foodborne illness across time or space. The application is freely available (https://fda-riskmodels.foodrisk.org/genomegraphr/).


2021 ◽  
Author(s):  
Julie Haendiges ◽  
Narjol Gonzalez-Escalona ◽  
Ruth E Timme ◽  
Maria Balkey

This procedure outlines the protocol for whole genome sequencing of bacterial organisms using the Illumina DNA Prep library preparation kit for sequencing on an Illumina MiSeq sequencer. This document applies to all laboratory personnel in the Division of Microbiology (DM) as well as laboratories in the GenomeTrakr Network. Complete in order:
1. DNA Extraction (Manual DNA Extraction or Automated DNA Extraction using the QIAcube): step-by-step procedures to obtain high-quality DNA from isolates in TSB for whole genome sequencing
2. DNA Quantitation: quantitation of extracted DNA using the Qubit Fluorometer
3. Library Preparation for WGS (Included SOP or Library Preparation using Illumina Nextera XT): library preparation using Nextera XT or Illumina DNA Prep (previously Nextera DNA Flex)
4. Sequencing using Illumina MiSeq
5. Data Quality Checks and NCBI Submission


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sung Yong Park ◽  
Gina Faraci ◽  
Pamela M. Ward ◽  
Jane F. Emerson ◽  
Ha Youn Lee

Abstract: COVID-19 global cases have climbed to more than 33 million, with over a million total deaths, as of September 2020. Real-time massive SARS-CoV-2 whole genome sequencing is key to tracking chains of transmission and estimating the origin of disease outbreaks. Yet no methods have simultaneously achieved high precision, simple workflow, and low cost. We developed a high-precision, cost-efficient SARS-CoV-2 whole genome sequencing platform for COVID-19 genomic surveillance, CorvGenSurv (Coronavirus Genomic Surveillance). CorvGenSurv directly amplified viral RNA from COVID-19 patients’ Nasopharyngeal/Oropharyngeal (NP/OP) swab specimens and sequenced the SARS-CoV-2 whole genome in three segments by long-read, high-throughput sequencing. Sequencing of the whole genome in three segments significantly reduced sequencing data waste, thereby preventing dropouts in genome coverage. We validated the precision of our pipeline by both control genomic RNA sequencing and Sanger sequencing. We produced near full-length whole genome sequences from individuals who were COVID-19 test positive during April to June 2020 in Los Angeles County, California, USA. These sequences were highly diverse in the G clade with nine novel amino acid mutations including NSP12-M755I and ORF8-V117F. With its readily adaptable design, CorvGenSurv grants wide access to genomic surveillance, permitting immediate public health response to sudden threats.

