Comprehensive evaluation of SNP identification with the Restriction Enzyme-based Reduced Representation Library (RRL) method

BMC Genomics ◽  
2012 ◽  
Vol 13 (1) ◽  
pp. 77 ◽  
Author(s):  
Ye Du ◽  
Hui Jiang ◽  
Ying Chen ◽  
Cong Li ◽  
Meiru Zhao ◽  
...  
2016 ◽  
Author(s):  
Thadeous J Kacmarczyk ◽  
Mame P. Fall ◽  
Xihui Zhang ◽  
Yuan Xin ◽  
Yushan Li ◽  
...  

ABSTRACT

Background: DNA methylation in CpG context is fundamental to the epigenetic regulation of gene expression in higher eukaryotes. Disorganization of methylation status is implicated in many diseases, cellular differentiation, imprinting, and other biological processes. Techniques that enrich for biologically relevant genomic regions with high CpG content are desired, since, depending on the size of an organism’s methylome, the depth of sequencing required to cover all CpGs can be prohibitively expensive. Currently, restriction enzyme-based reduced representation bisulfite sequencing and its modified protocols are widely used to study methylation differences. Recently, Agilent Technologies and Roche NimbleGen have ventured to both reduce sequencing costs and capture CpGs of known biological relevance by marketing in-solution custom-capture hybridization platforms. We aimed to evaluate the similarities and differences of these three methods, considering that each targets approximately 10-13% of the human methylome.

Results: Overall, the regions covered per platform were as expected: targeted capture-based methods covered >95% of their designed regions, whereas the restriction enzyme-based method covered >70% of the expected fragments. While the total number of CpG loci shared by all methods was low, ~30% of any platform, the methylation levels of CpGs common across platforms were concordant. Annotation of CpG loci with genomic features revealed roughly the same proportions of feature annotations across the three platforms. Targeted capture methods encompass similar amounts of annotations, with the restriction enzyme-based method covering fewer promoters (~9%) and shores (~8%) and more unannotated loci (7-14%).

Conclusions: Although all methods are largely consistent in terms of covered CpG loci and cover similar proportions of annotated CpG loci, the restriction-based enrichment results in more unannotated regions and the commercially available capture methods result in fewer off-target regions. DNA quality is very important for restriction-based enrichment, although the amount of starting material can be low. Conversely, quality of the starting material is less important for capture methods, but at least twice the amount of starting material is required. Pricing is marginally lower for restriction-based enrichment, and the number of samples to be prepared is not restricted to the number of samples a kit supports. The one advantage of capture libraries is the ability to custom design areas of interest. The choice of technique should be decided by the number of samples, the quality and quantity of DNA available, and the biological areas of interest, since comparable data are obtained from all platforms.


2017 ◽  
Author(s):  
Giancarlo Bonora ◽  
Liudmilla Rubbi ◽  
Marco Morselli ◽  
Constantinos Chronis ◽  
Kathrin Plath ◽  
...  

ABSTRACT

Whole-genome bisulfite sequencing (WGBS) and reduced representation bisulfite sequencing (RRBS) are widely used for measuring DNA methylation levels on a genome-wide scale (1). Both methods have limitations: WGBS is expensive and prohibitive for most large-scale projects; RRBS only interrogates 6-12% of the CpGs in the human genome (16,19). Here, we introduce methylation-sensitive restriction enzyme bisulfite sequencing (MREBS), which has the reduced sequencing requirements of RRBS but significantly expands the coverage of CpG sites in the genome. We built a multiple regression model that combines the two features of MREBS: the bisulfite conversion ratios of single cytosines (as in WGBS and RRBS) as well as the number of reads that cover each locus (as in MRE-seq (12)). This combined approach allowed us to estimate differential methylation across 60% of the genome using read count data alone, and where counts were sufficiently high in both samples (about 1.5% of the genome), our estimates were significantly improved by the single CpG conversion information. We show that differential DNA methylation values based on MREBS data correlate well with those based on WGBS and RRBS. This newly developed technique combines the sequencing cost of RRBS and DNA methylation estimates on a portion of the genome similar to WGBS, making it ideal for large-scale projects of mammalian genomes.


2018 ◽  
Author(s):  
Erin O Campbell ◽  
Bryan M T Brunet ◽  
Julian R Dupuis ◽  
Felix A H Sperling

ABSTRACT

Sampling markers throughout a genome with restriction enzymes emerged in the 2000s as reduced representation shotgun sequencing (RRS). Rapid advances in sequencing technology have since spurred modifications of RRS, giving rise to many derivatives with unique names, such as RADseq. But naming conventions have often been more creative than consistent, with unclear criteria for recognition as a unique method resulting in a proliferation of names characterized by ambiguity. We conducted a literature review to assess methodological and etymological relationships among 36 restriction enzyme-based methods, as well as rates of correct referencing of commonly-used methods. We identify several instances of methodological convergence or misattribution in the literature, and note that many published derivatives have modified only minor elements of parent protocols. We urge greater restraint in naming derivative methods, to strike a better balance between clarity, recognition of scientific innovation, and correct attribution.


2018 ◽  
Author(s):  
Jaime A. Osorio-Guarín ◽  
Corey R. Quackenbush ◽  
Omar E. Cornejo

ABSTRACT

As the source of chocolate, cacao has become one of the most important crops in the world. The identification of molecular markers to understand the demographic history, genetic diversity and population structure plays a pivotal role in cacao breeding programs. Here, we report the use of a modified genotyping-by-sequencing (GBS) approach for large-scale single nucleotide polymorphism (SNP) discovery and allele ancestry mapping. We identified 12,357 bi-allelic SNPs after filtering, of which 7,009 variants were ancestry informative. The GBS approach proved to be rapid, cost-effective, and highly informative for ancestry assignment in this species.


PLoS ONE ◽  
2020 ◽  
Vol 15 (1) ◽  
pp. e0226608 ◽  
Author(s):  
Carly F. Graham ◽  
Douglas R. Boreham ◽  
Richard G. Manzon ◽  
Wendylee Stott ◽  
Joanna Y. Wilson ◽  
...  

Genomics ◽  
2016 ◽  
Vol 107 (4) ◽  
pp. 109-119 ◽  
Author(s):  
Sophie A. Kirschner ◽  
Oliver Hunewald ◽  
Sophie B. Mériaux ◽  
Regina Brunnhoefer ◽  
Claude P. Muller ◽  
...  

2014 ◽  
Author(s):  
Santiago Herrera ◽  
Paula H. Reyes-Herrera ◽  
Timothy M. Shank

High-throughput sequencing of reduced representation libraries obtained through digestion with restriction enzymes, generically known as restriction-site associated DNA sequencing (RAD-seq), is a common strategy to generate genome-wide genotypic and sequence data from eukaryotes. A critical design element of any RAD-seq study is a knowledge of the approximate number of genetic markers that can be obtained for a taxon using different restriction enzymes, as this number determines the scope of a project, and ultimately defines its success. This number can only be directly determined if a reference genome sequence is available, or it can be estimated if the genome size and restriction recognition sequence probabilities are known. However, both scenarios are uncommon for non-model species. Here, we performed systematic in silico surveys of recognition sequences, for diverse and commonly used type II restriction enzymes across the eukaryotic tree of life. Our observations reveal that recognition-sequence frequencies for a given restriction enzyme are strikingly variable among broad eukaryotic taxonomic groups, being largely determined by phylogenetic relatedness. We demonstrate that genome sizes can be predicted from cleavage frequency data obtained with restriction enzymes targeting "neutral" elements. Models based on genomic compositions are also effective tools to accurately calculate probabilities of recognition sequences across taxa, and can be applied to species for which reduced-representation data is available (including transcriptomes and "neutral" RAD-seq datasets). The analytical pipeline developed in this study, PredRAD (https://github.com/phrh/PredRAD), and the resulting databases constitute valuable resources that will help guide the design of any study using RAD-seq or related methods.
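
The estimate described above (genome size combined with a recognition-sequence probability) can be illustrated with a minimal Python sketch. It is not the PredRAD implementation: it assumes the simplest possible composition model, in which bases are independent and only the GC content is known, and the function names are hypothetical. Degenerate recognition sequences (IUPAC ambiguity codes) are not handled.

```python
from math import prod


def site_probability(site: str, gc: float) -> float:
    """Probability of a recognition sequence at one position, assuming
    independent bases with P(G) = P(C) = gc/2 and P(A) = P(T) = (1-gc)/2.
    (Assumed model for illustration; only A, C, G, T are accepted.)"""
    base_p = {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}
    return prod(base_p[b] for b in site.upper())


def expected_sites(genome_size: int, site: str, gc: float) -> float:
    """Expected number of recognition sites on one strand of a genome of
    the given size (number of start positions times the site probability)."""
    positions = genome_size - len(site) + 1
    return positions * site_probability(site, gc)


def count_sites(sequence: str, site: str) -> int:
    """Observed number of (possibly overlapping) recognition sites in a
    known sequence, i.e. a direct in silico count."""
    sequence, site = sequence.upper(), site.upper()
    count = 0
    start = sequence.find(site)
    while start != -1:
        count += 1
        start = sequence.find(site, start + 1)
    return count


if __name__ == "__main__":
    # EcoRI recognizes GAATTC; a 6-base site at 50% GC has probability 1/4096.
    print(site_probability("GAATTC", 0.5))
    print(expected_sites(1_000_000, "GAATTC", 0.5))
    print(count_sites("ttGAATTCaaGAATTCgg", "GAATTC"))
```

Comparing `count_sites` on a reference sequence against `expected_sites` for the same length and GC content is one simple way to see how far a real genome departs from this independent-base assumption.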

