Integrated Addressable Dynamic Droplet Array (aDDA) as Sub‐Nanoliter Reactors for High‐Coverage Genome Sequencing of Single Yeast Cells (Small 37/2021)

Bleomycin (BLM) is a widely used chemotherapeutic drug. BLM-treated cells showed an elevated rate of mutations, but the underlying mechanisms remained unclear. In this study, the global genomic alterations in BLM-treated cells were explored in the yeast Saccharomyces cerevisiae. Using genetic assays and whole-genome sequencing, we found that the mutation rate could be greatly elevated in S. cerevisiae cells that underwent Zeocin™ (a BLM member) treatment. One-base deletions and T-to-G substitutions at the 5’-GT-3’ motif were the most striking signature of Zeocin™-induced mutations. This was mainly the result of translesion DNA synthesis involving Rev1 and polymerase ζ. Zeocin™ treatment led to frequent loss of heterozygosity and chromosomal rearrangements in the diploid strains. The breakpoints of recombination events were significantly associated with certain chromosomal elements. Lastly, we identified multiple genomic alterations that contributed to BLM resistance in the Zeocin™-treated mutants. Overall, this study provides new insights into the genotoxicity and evolutionary effects of BLM.

Importance: Bleomycin is an antitumor antibiotic that can mutate genomic DNA. Using yeast models in combination with genome sequencing, we disclose the mutational signatures of Zeocin™ (a member of the bleomycin family). Translesion-synthesis polymerases are crucial for the viability of Zeocin™-treated yeast cells, at the cost of a higher mutation rate. We also confirmed that multiple genomic alterations were associated with improved resistance to Zeocin™, providing novel insights into how bleomycin resistance develops in cells.

BarleyVarDB: a database of barley genomic variation

Database ◽ 10.1093/database/baaa091 ◽ 2020 ◽ Vol 2020

Author(s): Cong Tan ◽ Brett Chapman ◽ Penghao Wang ◽ Qisen Zhang ◽ Gaofeng Zhou ◽ ...

Keyword(s): Genome Sequencing ◽ Reference Genome ◽ Agronomic Traits ◽ Genetic Research ◽ PCR Primers ◽ Genomic Variation ◽ Whole Genome ◽ Hordeum vulgare L. ◽ High Coverage ◽ Barley Genome

Abstract Barley (Hordeum vulgare L.) is one of the first domesticated grain crops and represents the fourth most important cereal source for human and animal consumption. BarleyVarDB is a database of barley genomic variation. It is publicly accessible at http://146.118.64.11/BarleyVar. The database provides three main sets of information. First, it includes 57 754 224 single nucleotide polymorphisms (SNPs) and 3 600 663 insertions or deletions (InDels), which were identified from high-coverage whole genome sequencing of 21 barley germplasm accessions, including 8 wild barley accessions from 3 barley evolutionary centers of origin and 13 barley landraces from different continents. Second, it uses the latest publicly accessible barley reference genome and its annotation, produced by the International Barley Genome Sequencing Consortium (IBSC). Third, it includes 522 212 genome-wide microsatellites/simple sequence repeats (SSRs), which were identified in the reference barley pseudo-molecular genome sequence. Additionally, several useful web-based applications are provided, including JBrowse, BLAST and Primer3. Users can design PCR primers to assess polymorphic variants deposited in this database and use a user-friendly interface for accessing the barley reference genome. We envisage that BarleyVarDB will benefit the barley genetic research community by providing access to all publicly available barley genomic variation information and the barley reference genome, as well as an ultra-high density of SNP and InDel markers for molecular breeding and for identifying functional genes associated with important agronomic traits in barley. Database URL: http://146.118.64.11/BarleyVar

Estimation of Nucleotide Diversity, Disequilibrium Coefficients, and Mutation Rates from High-Coverage Genome-Sequencing Projects

Molecular Biology and Evolution ◽ 10.1093/molbev/msn185 ◽ 2008 ◽ Vol 25 (11) ◽ pp. 2409-2419 ◽ Cited By ~ 80

Author(s): M. Lynch

Keyword(s): Genome Sequencing ◽ Nucleotide Diversity ◽ Mutation Rates ◽ High Coverage

Progress in plant genome sequencing: research directions

Vavilov Journal of Genetics and Breeding ◽ 10.18699/vj19.459 ◽ 2019 ◽ Vol 23 (1) ◽ pp. 38-48 ◽ Cited By ~ 1

Author(s): M. K. Bragina ◽ D. A. Afonnikov ◽ E. A. Salina

Keyword(s): Genome Sequencing ◽ Plant Traits ◽ Plant Genome ◽ Targeted Sequencing ◽ Genome Sequences ◽ Crop Species ◽ High Coverage ◽ Protein Coding ◽ Sequencing Technologies ◽ A Genome

Since the first plant genome, that of Arabidopsis thaliana, was sequenced and published, genome sequencing technologies have undergone significant changes. New algorithms, sequencing technologies and bioinformatic approaches were adopted to obtain genome, transcriptome and exome sequences for model and crop species, which have permitted deep inferences into plant biology. As a result of improved genome assembly and analysis methods, genome sequencing costs have plummeted and the number of high-quality plant genome sequences is constantly growing. Consequently, more than 300 plant genome sequences have been published over the past twenty years. Although many of the published genomes are considered incomplete, they have proved to be a valuable tool for identifying genes involved in the formation of economically valuable plant traits, for marker-assisted and genomic selection, and for comparative analysis of plant genomes in order to determine the basic patterns of origin of various plant species. Since high coverage and resolution of a genome sequence are not enough to detect all changes in complex samples, targeted sequencing, which consists of the isolation and sequencing of a specific region of the genome, has begun to develop. Targeted sequencing has a higher detection power (the ability to identify new differences/variants) and resolution (down to a single base). In addition, exome sequencing (sequencing only the protein-coding regions of genes) is being actively developed, which allows for the sequencing of non-expressed alleles and genes that cannot be found with RNA-seq. This review analyzes the development of sequencing technologies and the construction of “reference” genomes of plants, and compares targeted sequencing methods that rely on a reference DNA sequence.

Fast and inexpensive whole genome sequencing library preparation from intact yeast cells

10.1101/2020.09.03.280990 ◽ 2020

Author(s): Sibylle C Vonesch ◽ Shengdi Li ◽ Chelsea Szu Tu ◽ Bianca P Hennig ◽ Nikolay Dobrev ◽ ...

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Genomic DNA ◽ Large Scale ◽ Massively Parallel Sequencing ◽ Yeast Cells ◽ Whole Genome ◽ High Quality ◽ Rapid Preparation ◽ Yeast Cultures

Abstract Through the increase in the capacity of sequencing machines, massively parallel sequencing of thousands of samples in a single run is now possible. With the improved throughput and resulting drop in the price of sequencing, the cost and time for preparation of sequencing libraries have become the major bottleneck in large-scale experiments. Methods using a hyperactive variant of the Tn5 transposase efficiently generate libraries starting from cDNA or genomic DNA in a few hours and are highly scalable. For genome sequencing, however, the time and effort spent on genomic DNA isolation limit the practicability of sequencing large numbers of samples. Here, we describe a highly scalable method for preparing high-quality whole-genome sequencing libraries directly from yeast cultures in less than three hours at 34 cents per sample. We skip the rate-limiting step of genomic DNA extraction by directly tagmenting yeast spheroplasts and add a nucleosome release step prior to enrichment PCR to improve the evenness of genomic coverage. Resulting libraries do not show any GC bias and are comparable in quality to libraries processed from genomic DNA with a commercially available Tn5-based kit. We use our protocol to investigate CRISPR/Cas9 on- and off-target edits and reliably detect edited variants and shared polymorphisms between strains. Our protocol enables rapid preparation of unbiased and high-quality, sequencing-ready indexed libraries for hundreds of yeast strains in a single day at a low price. By adjusting individual steps of our workflow, we expect that our protocol can be adapted to other organisms.

Comprehensive population-based genome sequencing provides insight into hematopoietic regulatory mechanisms

10.1101/067934 ◽ 2016

Author(s): Michael Guo ◽ Satish K. Nandakumar ◽ Jacob C. Ulirsch ◽ Seyedeh Maryam Zekavat ◽ Jason D. Buenrostro ◽ ...

Keyword(s): Blood Cell ◽ Genome Sequencing ◽ Association Studies ◽ Population Based ◽ Regulatory Mechanisms ◽ Data Sets ◽ Lineage Specification ◽ High Coverage ◽ Hematopoietic Stem ◽ Stem And Progenitor Cells

Abstract Genetic variants affecting hematopoiesis can influence commonly measured blood cell traits. To identify factors that affect hematopoiesis, we performed association studies for blood cell traits in the population-based Estonian Biobank using high coverage whole genome sequencing (WGS) in 2,284 samples and SNP genotyping in an additional ~17,000 samples. Our analyses identified 17 associations across 14 blood cell traits. Integration of WGS-based fine-mapping and complementary epigenomic data sets provided evidence for causal mechanisms at several loci, including at a novel basophil count-associated locus near the master hematopoietic transcription factor CEBPA. The fine-mapped variant at this basophil count association near CEBPA overlapped an enhancer active in common myeloid progenitors and influenced its activity. In situ perturbation of this enhancer by CRISPR/Cas9 mutagenesis in hematopoietic stem and progenitor cells demonstrated that it is necessary for and specifically regulates CEBPA expression during basophil differentiation. We additionally identified basophil count-associated variation at another more pleiotropic myeloid enhancer near GATA2, highlighting regulatory mechanisms for ordered expression of master hematopoietic regulators during lineage specification. Our study illustrates how population-based genetic studies can provide key insights into poorly understood cell differentiation processes of considerable physiologic relevance.

High coverage whole genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios

10.1101/2021.02.06.430068 ◽ 2021 ◽ Cited By ~ 4

Author(s): Marta Byrska-Bishop ◽ Uday S. Evani ◽ Xuefang Zhao ◽ Anna O. Basile ◽ Haley J. Abel ◽ ...

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Sequence Data ◽ Whole Genome ◽ 1000 Genomes Project ◽ Phase 3 ◽ High Coverage ◽ Entire Cohort ◽ 1000 Genomes ◽ Low Coverage

Abstract The 1000 Genomes Project (1kGP), launched in 2008, is the largest fully open resource of whole genome sequencing (WGS) data consented for public distribution of raw sequence data without access or use restrictions. The final (phase 3) 2015 release of 1kGP included 2,504 unrelated samples from 26 populations, representing five continental regions of the world and was based on a combination of technologies including low coverage WGS (mean depth 7.4X), high coverage whole exome sequencing (mean depth 65.7X), and microarray genotyping. Here, we present a new, high coverage WGS resource encompassing the original 2,504 1kGP samples, as well as an additional 698 related samples that result in 602 complete trios in the 1kGP cohort. We sequenced this expanded 1kGP cohort of 3,202 samples to a targeted depth of 30X using Illumina NovaSeq 6000 instruments. We performed SNV/INDEL calling against the GRCh38 reference using GATK’s HaplotypeCaller, and generated a comprehensive set of SVs by integrating multiple analytic methods through a sophisticated machine learning model, upgrading the 1kGP dataset to current state-of-the-art standards. Using this strategy, we defined over 111 million SNVs, 14 million INDELs, and ∼170 thousand SVs across the entire cohort of 3,202 samples with estimated false discovery rate (FDR) of 0.3%, 1.0%, and 1.8%, respectively. By comparison to the low-coverage phase 3 callset, we observed substantial improvements in variant discovery and estimated FDR that were facilitated by high coverage re-sequencing and expansion of the cohort. Specifically, we called 7% more SNVs, 59% more INDELs, and 170% more SVs per genome than the phase 3 callset. Moreover, we leveraged the presence of families in the cohort to achieve superior haplotype phasing accuracy and we demonstrate improvements that the high coverage panel brings especially for INDEL imputation. We make all the data generated as part of this project publicly available and we envision this updated version of the 1kGP callset to become the new de facto public resource for the worldwide scientific community working on genomics and genetics.
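As a rough reading aid for the figures quoted in this abstract, the sketch below multiplies each call count by its estimated FDR to give an order-of-magnitude number of expected false calls. The counts are the rounded values from the abstract ("over 111 million", "∼170 thousand"), so the results are approximations only; the names `CALLS`, `FDR`, `expected_false_calls` and `summarize` are hypothetical and not taken from the paper.

```python
# Rounded call counts and estimated FDRs as quoted in the abstract
# (approximate: the abstract says "over 111 million" and "~170 thousand").
CALLS = {"SNV": 111_000_000, "INDEL": 14_000_000, "SV": 170_000}
FDR = {"SNV": 0.003, "INDEL": 0.010, "SV": 0.018}


def expected_false_calls(n_calls: int, fdr: float) -> int:
    """Expected number of false discoveries: call count times FDR, rounded."""
    return round(n_calls * fdr)


def summarize() -> dict:
    """Map each variant class to its approximate expected false-call count."""
    return {kind: expected_false_calls(CALLS[kind], FDR[kind]) for kind in CALLS}


if __name__ == "__main__":
    for kind, n_false in summarize().items():
        print(f"{kind}: ~{n_false:,} expected false calls")
```

Because the inputs are rounded, the output should be read as a scale estimate and not as a figure reported by the authors.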

Faculty Opinions recommendation of Estimation of nucleotide diversity, disequilibrium coefficients, and mutation rates from high-coverage genome-sequencing projects.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽ 10.3410/f.1121769.578841 ◽ 2008

Author(s): Joshua Plotkin ◽ Sergey Kryazhimskiy

Keyword(s): Genome Sequencing ◽ Nucleotide Diversity ◽ Mutation Rates ◽ High Coverage

Fast and inexpensive whole-genome sequencing library preparation from intact yeast cells

G3 Genes|Genomes|Genetics ◽ 10.1093/g3journal/jkaa009 ◽ 2020 ◽ Vol 11 (1) ◽ pp. 1-12

Author(s): Sibylle C Vonesch ◽ Shengdi Li ◽ Chelsea Szu Tu ◽ Bianca P Hennig ◽ Nikolay Dobrev ◽ ...

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Genomic DNA ◽ Large Scale ◽ Massively Parallel Sequencing ◽ Yeast Cells ◽ Whole Genome ◽ High Quality ◽ Rapid Preparation ◽ Genomic DNA Isolation

Abstract Through the increase in the capacity of sequencing machines, massively parallel sequencing of thousands of samples in a single run is now possible. With the improved throughput and resulting drop in the price of sequencing, the cost and time for preparation of sequencing libraries have become the major bottleneck in large-scale experiments. Methods using a hyperactive variant of the Tn5 transposase efficiently generate libraries starting from cDNA or genomic DNA in a few hours and are highly scalable. For genome sequencing, however, the time and effort spent on genomic DNA isolation limit the practicability of sequencing large numbers of samples. Here, we describe a highly scalable method for preparing high-quality whole-genome sequencing libraries directly from Saccharomyces cerevisiae cultures in less than 3 h at 34 cents per sample. We skip the rate-limiting step of genomic DNA extraction by directly tagmenting lysed yeast spheroplasts and add a nucleosome release step prior to enrichment PCR to improve the evenness of genomic coverage. Resulting libraries do not show any GC bias and are comparable in quality to libraries processed from genomic DNA with a commercially available Tn5-based kit. We use our protocol to investigate CRISPR/Cas9 on- and off-target edits and reliably detect edited variants and shared polymorphisms between strains. Our protocol enables rapid preparation of unbiased and high-quality, sequencing-ready indexed libraries for hundreds of yeast strains in a single day at a low price. By adjusting individual steps of our workflow, we expect that our protocol can be adapted to other organisms.
