A High-Throughput Skim-sequencing Approach for Genotyping, Dosage Estimation and Identifying Translocations

Abstract: The development of next generation sequencing (NGS) enabled a shift from array-based genotyping to high-throughput genotyping by directly sequencing genomic libraries. Even though whole genome sequencing was initially too costly for routine analysis in large populations, such as those utilized for breeding or genetic studies, continued advancements in genome sequencing and bioinformatics have provided the opportunity to utilize whole-genome information. As new sequencing platforms can routinely provide high-quality sequencing data for sufficient genome coverage, the remaining limitation is the time and high cost of library construction when multiplexing a large number of samples. Here we describe a high-throughput whole-genome skim-sequencing (skim-seq) approach that can be utilized for a broad range of genotyping and genomic characterization. Using optimized low-volume Illumina Nextera chemistry, we developed a skim-seq method and combined up to 960 samples in one multiplex library using dual-index barcoding. With dual-index barcoding, the number of samples for multiplexing can be adjusted depending on the amount of data required and extended to 3,072 samples or more. Panels of double haploid wheat lines (Triticum aestivum, CDC Stanley x CDC Landmark), wheat-barley (T. aestivum x Hordeum vulgare) and wheat-wheatgrass (Triticum durum x Thinopyrum intermedium) introgression lines, as well as known monosomic wheat stocks, were genotyped using the skim-seq approach. Bioinformatics pipelines were developed for various applications where sequencing coverage ranged from 1x down to 0.01x per sample. Using reference genomes, we detected chromosome dosage, identified aneuploidy, and karyotyped introgression lines from the low-coverage skim-seq data. Leveraging the recent advancements in genome sequencing, skim-seq provides an effective and low-cost tool for routine genotyping and genetic analysis, which can track and identify introgressions and genomic regions of interest in genetics research and applied breeding programs.
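The abstract does not describe how chromosome dosage is computed, so the following is only a minimal sketch of one common read-depth approach, not the authors' pipeline. It assumes per-chromosome mapped read counts and chromosome lengths are already available; the function names `estimate_dosage` and `flag_aneuploids`, the `tolerance` value, and the toy numbers are all hypothetical.

```python
from statistics import median

def estimate_dosage(read_counts, chrom_lengths, expected_copies=2):
    """Estimate copy number per chromosome from low-coverage read counts.

    Assumption: reads map uniformly, so reads per base pair scale with
    copy number. The median density across chromosomes is taken as the
    euploid baseline (hypothetical choice, not from the source).
    """
    density = {c: read_counts[c] / chrom_lengths[c] for c in read_counts}
    baseline = median(density.values())
    return {c: expected_copies * d / baseline for c, d in density.items()}

def flag_aneuploids(dosage, expected_copies=2, tolerance=0.25):
    """Return chromosomes whose estimated dosage deviates from the expectation."""
    return {c: v for c, v in dosage.items() if abs(v - expected_copies) > tolerance}

# Toy example: chromosome 1D has half the read density of the others,
# as would be expected for a monosomic line.
counts = {"1A": 1000, "1B": 2000, "1D": 500}
lengths = {"1A": 100, "1B": 200, "1D": 100}
dosage = estimate_dosage(counts, lengths)
flagged = flag_aneuploids(dosage)
print(flagged)
```

In practice a real pipeline would likely bin reads along each chromosome and normalize against a euploid control rather than against raw length, since mappability varies across the genome.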


Fast and low-cost decentralized surveillance of transmission of tuberculosis based on strain-specific PCRs tailored from whole genome sequencing data: a pilot study

Clinical Microbiology and Infection ◽ 10.1016/j.cmi.2014.10.003 ◽ 2015 ◽ Vol 21 (3) ◽ pp. 249.e1-249.e9 ◽ Cited By ~ 12

Author(s): L. Pérez-Lago ◽ M. Martínez Lirola ◽ M. Herranz ◽ I. Comas ◽ E. Bouza ◽ ...

Keyword(s): Pilot Study ◽ Whole Genome Sequencing ◽ Genome Sequencing ◽ Low Cost ◽ Whole Genome Sequencing Data ◽ Whole Genome ◽ Sequencing Data


Pathogen-Oriented Low-cost Assembly & Re-sequencing (POLAR): A highly sensitive and high-throughput SARS-CoV-2 diagnostic based on whole genome sequencing v1 (protocols.io.bearjad6)

protocols.io ◽ 10.17504/protocols.io.bearjad6 ◽ 2020

Author(s): Brian Glenn ◽ Neva C ◽ Namita Mitra ◽ Saul Godinez ◽ Ragini Mahajan ◽ ...

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ High Throughput ◽ Low Cost ◽ Whole Genome ◽ Highly Sensitive


In-House, Rapid, and Low-Cost SARS-CoV-2 Spike Gene Sequencing Protocol to Identify Variants of Concern Using Sanger Sequencing

10.1101/2021.08.09.21261723 ◽ 2021

Author(s): Fatimah Alhamlan ◽ Dana Bakheet ◽ Marie Bohol ◽ Madain Alsanea ◽ Basma Alahaideb ◽ ...

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ High Throughput ◽ Sanger Sequencing ◽ High Throughput Sequencing ◽ Low Cost ◽ Rna Extraction ◽ Whole Genome ◽ Diagnostic Laboratory ◽ Spike Gene

Background: Active genomic sequencing surveillance is critical for rapidly identifying circulating SARS-CoV-2 variants of concern (VOCs). However, increased global demand has led to a shortage of commercial SARS-CoV-2 sequencing kits, and not every country has the technological capability or the funds for high-throughput sequencing platforms. Therefore, this study aimed to develop and validate a rapid, cost-efficient genome sequencing protocol that uses supplies, equipment, and methodologic expertise available in standard molecular or diagnostic laboratories to identify circulating SARS-CoV-2 variants of concern. Methods: Sets of primers flanking the SARS-CoV-2 spike gene were designed using SARS-CoV-2 genome sequences retrieved from the Global Initiative on Sharing Avian Influenza Data (GISAID) Database and synthesized in-house. Primer specificity and final sequences were verified using online prediction analyses with BLAST. The primers were validated using 282 nasopharyngeal samples collected from patients assessed as positive for SARS-CoV-2 at the diagnostic laboratory of the hospital using a Rotor-Gene PCR cycler with an Altona Diagnostics SARS-CoV-2 kit. The patient samples were subjected to RNA extraction followed by cDNA synthesis, conventional polymerase chain reaction, and Sanger sequencing. Protocol specificity was confirmed by comparing these results with SARS-CoV-2 whole genome sequencing of the same samples. Results: Sanger sequencing using the newly designed primers and next-generation whole genome sequencing of 282 patient samples yielded identical variant-of-concern results: 123 samples contained the alpha variant (B.1.1.7); 78, beta (B.1.351); 0, gamma (P.1); and 13, delta (B.1.617.2). The remaining samples were non-VOC, belonging to none of these variants, and had 99.97% identity with the reference genome. Only four samples had poor sequence quality by Sanger sequencing owing to a low viral count (Ct value >38). Therefore, mutation calls were >98% accurate. Conclusions: The Sanger sequencing method using in-house primers is an alternative approach that can be used in facilities with existing equipment to mitigate limitations in the high-throughput supplies required to identify SARS-CoV-2 variants of concern during the COVID-19 pandemic. This protocol is easily adaptable for detection of emerging variants.
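The abstract does not say how the ">98% accurate" figure was derived. As a rough check under one plausible reading (an assumption: the figure reflects the share of the 282 samples that were not among the four poor-quality Sanger results), the reported counts can be tallied as follows. Variable names are illustrative only.

```python
# Counts taken from the Results above; the interpretation of "accuracy"
# as (samples without poor Sanger quality) / (all samples) is an assumption.
total_samples = 282
voc_counts = {"alpha": 123, "beta": 78, "gamma": 0, "delta": 13}
poor_quality = 4

# Samples not assigned to any of the four listed variants of concern.
non_voc = total_samples - sum(voc_counts.values())

# Fraction of samples with usable Sanger sequence.
usable_fraction = (total_samples - poor_quality) / total_samples

print(non_voc)                    # 68
print(f"{usable_fraction:.1%}")   # 98.6%
```

Under this reading, 278 of 282 samples gives about 98.6%, which is consistent with the ">98%" stated in the abstract.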


The Sequencing-Based Mapping Method for Effectively Cloning Plant Mutated Genes

International Journal of Molecular Sciences ◽ 10.3390/ijms22126224 ◽ 2021 ◽ Vol 22 (12) ◽ pp. 6224

Author(s): Li Yu ◽ Yanshen Nie ◽ Jinxia Jiao ◽ Liufang Jian ◽ Jie Zhao

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Low Cost ◽ Causal Variant ◽ Mapping Method ◽ Small Population ◽ Whole Genome Sequencing Data ◽ Whole Genome ◽ Sequencing Data ◽ Plant Genes

A forward genetic approach is a powerful tool for identifying the genes underlying phenotypes of interest. However, the conventional map-based cloning method is lengthy and requires a large mapping population and confirmation of many candidate genes in a broad genetic region to clone the causal variant. The whole-genome sequencing method has a certain probability of failing to clone variants for multiple reasons, especially in heterozygotes, and cannot be used to clone mutations caused by epigenetic modifications. Here, we combined the highly complementary characteristics of these two methods and developed a sequencing-based mapping method (SBM) for effectively identifying the location of plant variants with a small population and at low cost, which makes it accessible to most laboratories. This method uses the whole-genome sequencing data of two pooled populations to screen out enough markers. These markers are used to identify and narrow the candidate region by analyzing the marker indexes and recombinants. Finally, the possible mutational sites are identified using the whole-genome sequencing data and verified in individual mutants. To illustrate the new method, we present the cloning process in one Arabidopsis heterozygous mutant and two rice homozygous mutants. Thus, the sequencing-based mapping method can effectively clone different types of plant mutations and is a powerful tool for studying the functions of plant genes in species with known genomic sequences.


From whole genome sequencing data toward a simple genotyping tool: application to the animal pathogen Mycobacterium bovis

10.26226/morressier.56d5ba2ad462b80296c965c0 ◽ 2016

Author(s): Lorraine Michelet

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Mycobacterium Bovis ◽ Whole Genome Sequencing Data ◽ Whole Genome ◽ Sequencing Data


Plasmids or no plasmids? A comparison between the Agilent TapeStation and whole-genome sequencing data in a large-scale bacterial sequencing project

10.26226/morressier.56d5ba27d462b80296c95fe7 ◽ 2016

Author(s): Sarah Alexander

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Large Scale ◽ Whole Genome Sequencing Data ◽ Whole Genome ◽ Sequencing Data ◽ Sequencing Project


Faculty Opinions recommendation of Performance comparison of whole-genome sequencing platforms.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽ 10.3410/f.13440958.15521064 ◽ 2012

Author(s): François Cambien

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Performance Comparison ◽ Whole Genome ◽ Sequencing Platforms


Faculty Opinions recommendation of Performance comparison of whole-genome sequencing platforms.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽ 10.3410/f.13440958.14815057 ◽ 2012

Author(s): Tom Tullius ◽ Stephen Parker

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Performance Comparison ◽ Whole Genome ◽ Sequencing Platforms


High-precision and cost-efficient sequencing for real-time COVID-19 surveillance

Scientific Reports ◽ 10.1038/s41598-021-93145-4 ◽ 2021 ◽ Vol 11 (1)

Author(s): Sung Yong Park ◽ Gina Faraci ◽ Pamela M. Ward ◽ Jane F. Emerson ◽ Ha Youn Lee

Keyword(s): Los Angeles ◽ Whole Genome Sequencing ◽ Real Time ◽ Genome Sequencing ◽ High Precision ◽ High Throughput Sequencing ◽ Whole Genome ◽ Sequencing Data ◽ Public Health Response ◽ Cost Efficient

Abstract: COVID-19 global cases have climbed to more than 33 million, with over a million total deaths, as of September 2020. Real-time massive SARS-CoV-2 whole genome sequencing is key to tracking chains of transmission and estimating the origin of disease outbreaks. Yet no methods have simultaneously achieved high precision, simple workflow, and low cost. We developed a high-precision, cost-efficient SARS-CoV-2 whole genome sequencing platform for COVID-19 genomic surveillance, CorvGenSurv (Coronavirus Genomic Surveillance). CorvGenSurv directly amplified viral RNA from COVID-19 patients’ nasopharyngeal/oropharyngeal (NP/OP) swab specimens and sequenced the SARS-CoV-2 whole genome in three segments by long-read, high-throughput sequencing. Sequencing of the whole genome in three segments significantly reduced sequencing data waste, thereby preventing dropouts in genome coverage. We validated the precision of our pipeline by both control genomic RNA sequencing and Sanger sequencing. We produced near full-length whole genome sequences from individuals who tested positive for COVID-19 from April to June 2020 in Los Angeles County, California, USA. These sequences were highly diverse in the G clade, with nine novel amino acid mutations including NSP12-M755I and ORF8-V117F. With its readily adaptable design, CorvGenSurv grants wide access to genomic surveillance, permitting immediate public health response to sudden threats.
