Predicting genome sizes and restriction enzyme recognition-sequence probabilities across the eukaryotic tree of life

2014 ◽  
Author(s):  
Santiago Herrera ◽  
Paula H. Reyes-Herrera ◽  
Timothy M. Shank

High-throughput sequencing of reduced representation libraries obtained through digestion with restriction enzymes, generically known as restriction-site associated DNA sequencing (RAD-seq), is a common strategy to generate genome-wide genotypic and sequence data from eukaryotes. A critical design element of any RAD-seq study is a knowledge of the approximate number of genetic markers that can be obtained for a taxon using different restriction enzymes, as this number determines the scope of a project, and ultimately defines its success. This number can only be directly determined if a reference genome sequence is available, or it can be estimated if the genome size and restriction recognition sequence probabilities are known. However, both scenarios are uncommon for non-model species. Here, we performed systematic in silico surveys of recognition sequences, for diverse and commonly used type II restriction enzymes across the eukaryotic tree of life. Our observations reveal that recognition-sequence frequencies for a given restriction enzyme are strikingly variable among broad eukaryotic taxonomic groups, being largely determined by phylogenetic relatedness. We demonstrate that genome sizes can be predicted from cleavage frequency data obtained with restriction enzymes targeting “neutral” elements. Models based on genomic compositions are also effective tools to accurately calculate probabilities of recognition sequences across taxa, and can be applied to species for which reduced-representation data is available (including transcriptomes and “neutral” RAD-seq datasets). The analytical pipeline developed in this study, PredRAD (https://github.com/phrh/PredRAD), and the resulting databases constitute valuable resources that will help guide the design of any study using RAD-seq or related methods.
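The estimate described in this abstract (marker count from genome size and recognition-sequence probability, and genome size from cut counts) can be illustrated with a minimal Python sketch. It assumes the simplest possible composition model, in which bases are independent and only GC content matters. That model, and every function name below, is an illustrative assumption and not the PredRAD pipeline.

```python
def recognition_probability(site: str, gc_content: float) -> float:
    """Probability that a recognition sequence starts at a given position.

    Assumes independent bases: G and C each occur with probability
    gc_content / 2, A and T each with (1 - gc_content) / 2.
    """
    if not 0.0 <= gc_content <= 1.0:
        raise ValueError("gc_content must be between 0 and 1")
    p_gc = gc_content / 2.0
    p_at = (1.0 - gc_content) / 2.0
    prob = 1.0
    for base in site.upper():
        if base in "GC":
            prob *= p_gc
        elif base in "AT":
            prob *= p_at
        else:
            raise ValueError(f"unsupported base: {base!r}")
    return prob


def expected_sites(genome_size_bp: int, site: str, gc_content: float) -> float:
    """Expected number of recognition sites in a genome of the given size."""
    return genome_size_bp * recognition_probability(site, gc_content)


def estimated_genome_size(observed_sites: int, site: str, gc_content: float) -> float:
    """Invert the relation: genome size implied by an observed site count."""
    return observed_sites / recognition_probability(site, gc_content)


if __name__ == "__main__":
    # EcoRI site (GAATTC) in a hypothetical 1 Gb genome with 40% GC content.
    print(expected_sites(1_000_000_000, "GAATTC", 0.40))
```

Because the abstract reports that recognition-sequence frequencies vary strongly among taxonomic groups, a single GC-content model like this one should be treated as a rough first approximation only.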

2018 ◽  
Author(s):  
Erin O Campbell ◽  
Bryan M T Brunet ◽  
Julian R Dupuis ◽  
Felix A H Sperling

Sampling markers throughout a genome with restriction enzymes emerged in the 2000s as reduced representation shotgun sequencing (RRS). Rapid advances in sequencing technology have since spurred modifications of RRS, giving rise to many derivatives with unique names, such as RADseq. But naming conventions have often been more creative than consistent, with unclear criteria for recognition as a unique method resulting in a proliferation of names characterized by ambiguity. We conducted a literature review to assess methodological and etymological relationships among 36 restriction enzyme-based methods, as well as rates of correct referencing of commonly-used methods. We identify several instances of methodological convergence or misattribution in the literature, and note that many published derivatives have modified only minor elements of parent protocols. We urge greater restraint in naming derivative methods, to strike a better balance between clarity, recognition of scientific innovation, and correct attribution.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Philipp Kirschner ◽  
Wolfgang Arthofer ◽  
Stefanie Pfeifenberger ◽  
Eliška Záveská ◽  
...  

Multi-locus genetic data are pivotal in phylogenetics. Today, high-throughput sequencing (HTS) allows scientists to generate an unprecedented amount of such data from any organism. However, HTS is resource intense and may not be accessible to wide parts of the scientific community. In phylogeography, the use of HTS has concentrated on a few taxonomic groups, and the amount of data used to resolve a phylogeographic pattern often seems arbitrary. We explore the performance of two genetic marker sampling strategies and the effect of marker quantity in a comparative phylogeographic framework focusing on six species (arthropods and plants). The same analyses were applied to data inferred from amplified fragment length polymorphism fingerprinting (AFLP), a cheap, non-HTS based technique that is able to straightforwardly produce several hundred markers, and from restriction site associated DNA sequencing (RADseq), a more expensive, HTS-based technique that produces thousands of single nucleotide polymorphisms. We show that in four of six study species, AFLP leads to results comparable with those of RADseq. While we do not aim to contest the advantages of HTS techniques, we also show that AFLP is a robust technique to delimit evolutionary entities in both plants and animals. The demonstrated similarity of results from the two techniques also strengthens biological conclusions that were based on AFLP data in the past, an important finding given the wide utilization of AFLP over the last decades. We emphasize that whenever the delimitation of evolutionary entities is the central goal, as it is in many fields of biodiversity research, AFLP is still an adequate technique.


2020 ◽  
Author(s):  
Katherine Silliman ◽  
Jane L. Indorf ◽  
Nancy Knowlton ◽  
William E. Browne ◽  
Carla Hurt

The formation of the Isthmus of Panama and final closure of the Central American Seaway (CAS) provides an independent calibration point for examining the rate of DNA substitutions. This vicariant event has been widely used to estimate the substitution rate across mitochondrial genomes and to date evolutionary events in other taxonomic groups. Nuclear sequence data is increasingly being used to complement mitochondrial datasets for phylogenetic and evolutionary investigations; these studies would benefit from information regarding the rate and pattern of DNA substitutions derived from the nuclear genome. To estimate this genomewide neutral mutation rate (μ), genotype-by-sequencing (GBS) datasets were generated for three transisthmian species pairs in Alpheus snapping shrimp. Using a Bayesian coalescent approach (G-PhoCS) applied to 44,960 GBS loci, we estimated μ to be 2.64E-9 substitutions/site/year, when calibrated with the closure of the CAS at 3 Ma. This estimate is remarkably similar to experimentally derived mutation rates in model arthropod systems, strengthening the argument for a recent closure of the CAS. To our knowledge this is the first use of transisthmian species pairs to calibrate the rate of molecular evolution from GBS data.


2021 ◽  
Author(s):  
Brian P. Anton ◽  
Alexey Fomenkov ◽  
Victoria Wu ◽  
Richard J. Roberts

Single-molecule Real-Time (SMRT) sequencing can easily identify sites of N6-methyladenine and N4-methylcytosine within DNA sequences, but similar identification of 5-methylcytosine sites is not as straightforward. In prokaryotic DNA, methylation typically occurs within specific sequence contexts, or motifs, that are a property of the methyltransferases that “write” these epigenetic marks. We present here a straightforward, cost-effective alternative to both SMRT and bisulfite sequencing for the determination of prokaryotic 5-methylcytosine methylation motifs. The method, called MFRE-Seq, relies on excision and isolation of fully methylated fragments of predictable size using MspJI-Family Restriction Enzymes (MFREs), which depend on the presence of 5-methylcytosine for cleavage. We demonstrate that MFRE-Seq is compatible with both Illumina and Ion Torrent sequencing platforms and requires only a digestion step and simple column purification of size-selected digest fragments prior to standard library preparation procedures. We applied MFRE-Seq to numerous bacterial and archaeal genomic DNA preparations and successfully confirmed known motifs and identified novel ones. This method should be a useful complement to existing methodologies for studying prokaryotic methylomes and characterizing the contributing methyltransferases.


In DNA splicing systems, restriction enzymes and ligases cleave and recombine DNA molecules based on the cleavage pattern of the restriction enzymes. The set of molecules resulting from a splicing system is described by a splicing language. In this research, an algorithm for DNA splicing systems is developed using C++ visual programming. The splicing languages, which have been characterised through theorems based on the crossings and sequences of the restriction enzymes, are generated as the output of this computation. To generate the splicing languages, the algorithm detects and counts the cutting sites of the restriction enzymes found in the initial molecules, determines whether the recognition sequence of each restriction enzyme is a palindrome, and determines whether the restriction enzymes have the same or different crossings. The results of this research reproduce the splicing languages obtained from manual computations, which contributes to the development of computational software in DNA computing.
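Two of the checks listed in this abstract, counting cutting sites in an initial molecule and testing whether a recognition sequence is a palindrome, can be sketched in a few lines. The abstract describes a C++ implementation; the Python below is only an illustrative sketch with hypothetical function names, and it assumes "palindrome" means that the sequence equals its own reverse complement. It does not attempt the crossing comparison or the generation of splicing languages.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True if the site reads the same on both strands (5' to 3')."""
    return site.upper() == reverse_complement(site)


def find_cut_sites(molecule: str, site: str) -> list[int]:
    """Return 0-based start positions of every occurrence of the site.

    Overlapping occurrences are reported as well.
    """
    molecule, site = molecule.upper(), site.upper()
    positions = []
    start = molecule.find(site)
    while start != -1:
        positions.append(start)
        start = molecule.find(site, start + 1)
    return positions


if __name__ == "__main__":
    dna = "AAGAATTCTTGAATTCAA"  # made-up initial molecule
    print(is_palindromic("GAATTC"), find_cut_sites(dna, "GAATTC"))
```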


PLoS ONE ◽  
2021 ◽  
Vol 16 (5) ◽  
pp. e0247541
Author(s):  
Brian P. Anton ◽  
Alexey Fomenkov ◽  
Victoria Wu ◽  
Richard J. Roberts

Single-molecule Real-Time (SMRT) sequencing can easily identify sites of N6-methyladenine and N4-methylcytosine within DNA sequences, but similar identification of 5-methylcytosine sites is not as straightforward. In prokaryotic DNA, methylation typically occurs within specific sequence contexts, or motifs, that are a property of the methyltransferases that “write” these epigenetic marks. We present here a straightforward, cost-effective alternative to both SMRT and bisulfite sequencing for the determination of prokaryotic 5-methylcytosine methylation motifs. The method, called MFRE-Seq, relies on excision and isolation of fully methylated fragments of predictable size using MspJI-Family Restriction Enzymes (MFREs), which depend on the presence of 5-methylcytosine for cleavage. We demonstrate that MFRE-Seq is compatible with both Illumina and Ion Torrent sequencing platforms and requires only a digestion step and simple column purification of size-selected digest fragments prior to standard library preparation procedures. We applied MFRE-Seq to numerous bacterial and archaeal genomic DNA preparations and successfully confirmed known motifs and identified novel ones. This method should be a useful complement to existing methodologies for studying prokaryotic methylomes and characterizing the contributing methyltransferases.


2014 ◽  
Author(s):  
Xiaofei Lv ◽  
Yuping Wu ◽  
Bin Ma

The structural pattern of the tree of life offers clues to key ecological issues; hence, determining its fractal dimension is fundamental to understanding the tree of life. Yet the fractal dimension of the tree of life remains unclear, since the scale of the tree of life has grown rapidly in recent years. Here we show that the tree of life displays consistent power-law rules for inter- and intra-taxonomic levels, but that the fractal dimensions differ among kingdoms. The fractal dimension of hierarchical structure (Dr) is 0.873 for the entire tree of life, which is smaller than the values of Dr for Animalia and Plantae but greater than the values of Dr for Fungi, Chromista, and Protozoa. The hierarchical fractal dimension values for prokaryotic kingdoms are lower than those for other kingdoms. The Dr value for Viruses was lower than that of most eukaryotic kingdoms but greater than that of prokaryotes. The distribution of taxa size is governed by fractal diversity but skewed by overdominating taxa with large subtaxa size. The proportion of subtaxa in taxa with small and large sizes was greater than in taxa with intermediate size. Our results suggest that the distribution of subtaxa in taxa can be predicted with the fractal dimension for the accumulating taxa abundance rather than the taxa abundance. Our study determined the fractal dimensions for inter- and intra-taxonomic levels of the present tree of life. These results emphasize the need for further theoretical studies, as well as predictive modelling, to interpret the different fractal dimensions for different taxonomic groups and the skewness of taxa with large subtaxa size.


2018 ◽  
Vol 2 (2) ◽  
pp. 196-208
Author(s):  
Ayad Ismaeel

An important therapeutic approach targets a disease-causing gene sequence by repairing or recombining the mutated gene (gene transfer) using restriction enzymes in the laboratory. Even when problems outside the biological laboratory are ruled out, this approach is accompanied by multiple problems inside it, such as digested DNA running as a smear on an agarose gel, incomplete restriction enzyme digestion, and extra bands in the gel. This paper suggests a new therapeutic approach that repairs or replaces the disease-causing mutated gene by detecting primers and finding restriction enzymes using bioinformatics tools, software, and packages, and then carrying out the repair or recombination of mutations before going to the biological laboratory (out-lab), to avoid the problems associated with these laboratories. Applying the proposed approach to the TP53 gene (reported to cause more than 50% of human cancers), after confirming that there are mutations in the P53 tumor protein, shows that the methodology is cost-effective, user-friendly, and comprehensive.


2021 ◽  
Vol 888 (1) ◽  
pp. 012024
Author(s):  
P W Prihandini ◽  
A Primasari ◽  
M Luthfi ◽  
D Pamungkas ◽  
A P Z N L Sari ◽  
...  

The restriction enzyme is important for genotyping using the PCR-RFLP technique. Therefore, this study aims to identify the restriction enzyme mapping in the partial sequence of the follicle-stimulating hormone receptor (FSHR) gene in Indonesian local cattle. A total of 29 samples, each 306 bp long, were aligned with GenBank sequence accession no. NC_032660, yielding three polymorphic sites, namely g.193G>C, g.227T>C, and g.275A>C. Furthermore, restriction mapping analysis using the NEBcutter program V2.0 showed that no enzyme recognized the SNP g.275A>C, while the SNPs g.193G>C and g.227T>C were recognized by the AluI and MscI enzymes, respectively. The AluI enzyme cuts at two positions (193 bp and 243 bp) in the G allele sample, producing three fragments of 50 bp, 63 bp, and 193 bp; in the C allele, AluI cuts only at position 243 bp, so the fragment products are 63 bp and 243 bp. In contrast, the MscI enzyme recognized a site only in the T allele, producing fragments of 77 bp and 229 bp, and found no restriction site in the PCR product of the C allele. Based on these results, the SNPs (g.193G>C and g.227T>C) and restriction enzymes (AluI and MscI) are applicable for genotyping local Indonesian cattle using the PCR-RFLP technique in future studies.
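The fragment sizes quoted in this abstract follow directly from the cut positions and the 306 bp amplicon length. A minimal Python sketch of that arithmetic is shown below; the function name is hypothetical, and cut positions are assumed to be measured in bp from the start of the PCR product.

```python
def fragment_sizes(amplicon_length: int, cut_positions: list[int]) -> list[int]:
    """Return fragment lengths, in order along the amplicon, after digestion.

    cut_positions are distances in bp from the start of the amplicon and
    must fall strictly inside it.
    """
    cuts = sorted(set(cut_positions))
    if any(c <= 0 or c >= amplicon_length for c in cuts):
        raise ValueError("cut positions must lie inside the amplicon")
    boundaries = [0] + cuts + [amplicon_length]
    return [end - start for start, end in zip(boundaries, boundaries[1:])]


if __name__ == "__main__":
    # AluI on the G allele: cuts at 193 bp and 243 bp.
    print(fragment_sizes(306, [193, 243]))  # [193, 50, 63]
    # AluI on the C allele: only the cut at 243 bp remains.
    print(fragment_sizes(306, [243]))       # [243, 63]
    # An allele with no recognition site stays uncut.
    print(fragment_sizes(306, []))          # [306]
```

The two AluI results reproduce the fragment sets reported in the abstract (50, 63, and 193 bp for the G allele; 63 and 243 bp for the C allele).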


2018 ◽  
Vol 14 (2) ◽  
pp. 188-192
Author(s):  
Nurul Izzaty Ismail ◽  
Wan Heng Fong ◽  
Nor Haniza Sarmin

In a DNA splicing system, the potential effects of sets of restriction enzymes and a ligase that allow DNA molecules to be cleaved and reassociated to produce further molecules are studied. A splicing language depicts the molecules resulting from a splicing system. In this research, C++ code is developed for DNA splicing systems with one palindromic restriction enzyme and one or two (non-overlapping) cutting sites. A graphical user interface (GUI) is then designed to allow the user to enter the initial DNA string and restriction enzymes and to generate the splicing languages computed by the C++ program. This interface displays the resulting splicing languages, which depict the results from in vitro experiments of the respective splicing system. The results of this research simplify the lengthy manual computation of the resulting splicing languages of DNA splicing systems with one palindromic restriction enzyme.

