Predicting genome sizes and restriction enzyme recognition-sequence probabilities across the eukaryotic tree of life

High-throughput sequencing of reduced representation libraries obtained through digestion with restriction enzymes, generically known as restriction-site associated DNA sequencing (RAD-seq), is a common strategy to generate genome-wide genotypic and sequence data from eukaryotes. A critical design element of any RAD-seq study is knowledge of the approximate number of genetic markers that can be obtained for a taxon using different restriction enzymes, as this number determines the scope of a project, and ultimately defines its success. This number can only be directly determined if a reference genome sequence is available, or it can be estimated if the genome size and restriction recognition sequence probabilities are known. However, both scenarios are uncommon for non-model species. Here, we performed systematic in silico surveys of recognition sequences for diverse and commonly used type II restriction enzymes across the eukaryotic tree of life. Our observations reveal that recognition-sequence frequencies for a given restriction enzyme are strikingly variable among broad eukaryotic taxonomic groups, being largely determined by phylogenetic relatedness. We demonstrate that genome sizes can be predicted from cleavage frequency data obtained with restriction enzymes targeting “neutral” elements. Models based on genomic compositions are also effective tools to accurately calculate probabilities of recognition sequences across taxa, and can be applied to species for which reduced-representation data is available (including transcriptomes and “neutral” RAD-seq datasets). The analytical pipeline developed in this study, PredRAD (https://github.com/phrh/PredRAD), and the resulting databases constitute valuable resources that will help guide the design of any study using RAD-seq or related methods.
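The estimation route mentioned in the abstract (genome size combined with a recognition-sequence probability) can be illustrated with a minimal sketch. It is not the PredRAD implementation: it assumes the simplest possible composition model, in which bases are independent and their frequencies are set by GC content alone, and the function names are hypothetical.

```python
def site_probability(recognition_seq: str, gc_content: float) -> float:
    """Probability of the recognition sequence at one position, assuming
    independent bases with P(G)=P(C)=gc/2 and P(A)=P(T)=(1-gc)/2."""
    if not 0.0 <= gc_content <= 1.0:
        raise ValueError("gc_content must be between 0 and 1")
    base_prob = {
        "G": gc_content / 2, "C": gc_content / 2,
        "A": (1 - gc_content) / 2, "T": (1 - gc_content) / 2,
    }
    p = 1.0
    for base in recognition_seq.upper():
        p *= base_prob[base]  # KeyError on ambiguity codes: only A/C/G/T handled
    return p


def expected_sites(recognition_seq: str, gc_content: float, genome_size_bp: int) -> float:
    """Expected number of recognition sites in a genome of the given size."""
    windows = max(genome_size_bp - len(recognition_seq) + 1, 0)
    return site_probability(recognition_seq, gc_content) * windows


def estimate_genome_size(observed_sites: float, recognition_seq: str, gc_content: float) -> float:
    """Inverse use of the same model: genome size implied by an observed site count."""
    return observed_sites / site_probability(recognition_seq, gc_content)


if __name__ == "__main__":
    # EcoRI (GAATTC) and SbfI (CCTGCAGG) in a hypothetical 1 Gb genome with 40% GC.
    for enzyme, site in (("EcoRI", "GAATTC"), ("SbfI", "CCTGCAGG")):
        print(enzyme, round(expected_sites(site, 0.40, 1_000_000_000)))
```

Because the abstract reports that real recognition-sequence frequencies vary strongly among taxonomic groups, an independent-base model like this one should be read only as a first approximation.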

Would an RRS by any other name sound as RAD?

10.1101/283085 ◽

2018 ◽

Author(s):

Erin O Campbell ◽

Bryan M T Brunet ◽

Julian R Dupuis ◽

Felix A H Sperling

Keyword(s):

Literature Review ◽

Restriction Enzyme ◽

Restriction Enzymes ◽

Shotgun Sequencing ◽

Sequencing Technology ◽

Reduced Representation ◽

Minor Elements ◽

Scientific Innovation ◽

A Genome ◽

Unique Method

ABSTRACT Sampling markers throughout a genome with restriction enzymes emerged in the 2000s as reduced representation shotgun sequencing (RRS). Rapid advances in sequencing technology have since spurred modifications of RRS, giving rise to many derivatives with unique names, such as RADseq. But naming conventions have often been more creative than consistent, with unclear criteria for recognition as a unique method resulting in a proliferation of names characterized by ambiguity. We conducted a literature review to assess methodological and etymological relationships among 36 restriction enzyme-based methods, as well as rates of correct referencing of commonly used methods. We identify several instances of methodological convergence or misattribution in the literature, and note that many published derivatives have modified only minor elements of parent protocols. We urge greater restraint in naming derivative methods, to strike a better balance between clarity, recognition of scientific innovation, and correct attribution.

Performance comparison of two reduced-representation based genome-wide marker-discovery strategies in a multi-taxon phylogeographic framework

Scientific Reports ◽

10.1038/s41598-020-79778-x ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Philipp Kirschner ◽

Wolfgang Arthofer ◽

Stefanie Pfeifenberger ◽

Eliška Záveská ◽

...

Keyword(s):

High Throughput Sequencing ◽

Performance Comparison ◽

Nucleotide Polymorphisms ◽

AFLP Data ◽

Single Nucleotide ◽

Reduced Representation ◽

Genome Wide ◽

Taxonomic Groups ◽

Marker Discovery ◽

Study Species

Abstract Multi-locus genetic data are pivotal in phylogenetics. Today, high-throughput sequencing (HTS) allows scientists to generate an unprecedented amount of such data from any organism. However, HTS is resource intensive and may not be accessible to wide parts of the scientific community. In phylogeography, the use of HTS has concentrated on a few taxonomic groups, and the amount of data used to resolve a phylogeographic pattern often seems arbitrary. We explore the performance of two genetic marker sampling strategies and the effect of marker quantity in a comparative phylogeographic framework focusing on six species (arthropods and plants). The same analyses were applied to data inferred from amplified fragment length polymorphism fingerprinting (AFLP), a cheap, non-HTS based technique that is able to straightforwardly produce several hundred markers, and from restriction site associated DNA sequencing (RADseq), a more expensive, HTS-based technique that produces thousands of single nucleotide polymorphisms. We show that in four of six study species, AFLP leads to results comparable with those of RADseq. While we do not aim to contest the advantages of HTS techniques, we also show that AFLP is a robust technique to delimit evolutionary entities in both plants and animals. The demonstrated similarity of results from the two techniques also strengthens biological conclusions that were based on AFLP data in the past, an important finding given the wide utilization of AFLP over the last decades. We emphasize that whenever the delimitation of evolutionary entities is the central goal, as it is in many fields of biodiversity research, AFLP is still an adequate technique.

Base-substitution mutation rate across the nuclear genome of Alpheus snapping shrimp and the timing of isolation by the Isthmus of Panama

10.1101/2020.11.25.396556 ◽

2020 ◽

Author(s):

Katherine Silliman ◽

Jane L. Indorf ◽

Nancy Knowlton ◽

William E. Browne ◽

Carla Hurt

Keyword(s):

Mutation Rate ◽

Sequence Data ◽

Nuclear Genome ◽

Base Substitution ◽

Snapping Shrimp ◽

Isthmus Of Panama ◽

Species Pairs ◽

Genotype By Sequencing ◽

Independent Calibration ◽

Taxonomic Groups

Abstract The formation of the Isthmus of Panama and final closure of the Central American Seaway (CAS) provides an independent calibration point for examining the rate of DNA substitutions. This vicariant event has been widely used to estimate the substitution rate across mitochondrial genomes and to date evolutionary events in other taxonomic groups. Nuclear sequence data is increasingly being used to complement mitochondrial datasets for phylogenetic and evolutionary investigations; these studies would benefit from information regarding the rate and pattern of DNA substitutions derived from the nuclear genome. To estimate this genome-wide neutral mutation rate (μ), genotype-by-sequencing (GBS) datasets were generated for three transisthmian species pairs in Alpheus snapping shrimp. Using a Bayesian coalescent approach (G-PhoCS) applied to 44,960 GBS loci, we estimated μ to be 2.64E-9 substitutions/site/year, when calibrated with the closure of the CAS at 3 Ma. This estimate is remarkably similar to experimentally derived mutation rates in model arthropod systems, strengthening the argument for a recent closure of the CAS. To our knowledge, this is the first use of transisthmian species pairs to calibrate the rate of molecular evolution from GBS data.
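The calibration step can be sketched as simple arithmetic, under the assumption that the coalescent analysis yields a divergence time τ in expected substitutions per site, so that μ = τ / T for a calibration age T in years. The abstract does not report τ; the value below is back-calculated from the reported μ and the 3 Ma calibration purely for illustration, and the function names are hypothetical.

```python
def substitution_rate(tau: float, divergence_time_years: float) -> float:
    """Per-site, per-year rate from a mutation-scaled divergence time tau
    (expected substitutions per site) and a calibration age in years."""
    if divergence_time_years <= 0:
        raise ValueError("divergence_time_years must be positive")
    return tau / divergence_time_years


def implied_tau(mu: float, divergence_time_years: float) -> float:
    """Mutation-scaled divergence implied by a rate and a calibration age."""
    return mu * divergence_time_years


if __name__ == "__main__":
    # Back-calculated illustration only: 2.64e-9 per site per year over 3 million years.
    tau = implied_tau(2.64e-9, 3_000_000)
    print(tau, substitution_rate(tau, 3_000_000))
```

The same relationship shows how sensitive the estimate is to the calibration: for a fixed τ, assuming an older closure date lowers the inferred rate in proportion.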

Genome-Wide Identification of 5-Methylcytosine Sites in Bacterial Genomes By High-Throughput Sequencing of MspJI Restriction Fragments

10.1101/2021.02.10.430591 ◽

2021 ◽

Author(s):

Brian P. Anton ◽

Alexey Fomenkov ◽

Victoria Wu ◽

Richard J. Roberts

Keyword(s):

Single Molecule ◽

DNA Sequences ◽

High Throughput Sequencing ◽

Cost Effective ◽

Restriction Enzymes ◽

Specific Sequence ◽

Genome Wide ◽

Cost Effective Alternative ◽

Simple Column ◽

Sequencing Platforms

ABSTRACT Single-molecule Real-Time (SMRT) sequencing can easily identify sites of N6-methyladenine and N4-methylcytosine within DNA sequences, but similar identification of 5-methylcytosine sites is not as straightforward. In prokaryotic DNA, methylation typically occurs within specific sequence contexts, or motifs, that are a property of the methyltransferases that “write” these epigenetic marks. We present here a straightforward, cost-effective alternative to both SMRT and bisulfite sequencing for the determination of prokaryotic 5-methylcytosine methylation motifs. The method, called MFRE-Seq, relies on excision and isolation of fully methylated fragments of predictable size using MspJI-Family Restriction Enzymes (MFREs), which depend on the presence of 5-methylcytosine for cleavage. We demonstrate that MFRE-Seq is compatible with both Illumina and Ion Torrent sequencing platforms and requires only a digestion step and simple column purification of size-selected digest fragments prior to standard library preparation procedures. We applied MFRE-Seq to numerous bacterial and archaeal genomic DNA preparations and successfully confirmed known motifs and identified novel ones. This method should be a useful complement to existing methodologies for studying prokaryotic methylomes and characterizing the contributing methyltransferases.

Computation of Splicing Languages from DNA Splicing System Based on Sequences of Restriction Enzymes

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f1006.0986s319 ◽

2019 ◽

Vol 8 (6S3) ◽

pp. 31-42

Keyword(s):

Restriction Enzyme ◽

DNA Computing ◽

Restriction Enzymes ◽

Visual Programming ◽

Cleavage Pattern ◽

DNA Molecules ◽

Splicing Systems ◽

DNA Splicing

In DNA splicing systems, restriction enzymes and ligases cleave and recombine DNA molecules based on the cleavage pattern of the restriction enzymes. The set of molecules resulting from the splicing system depicts a splicing language. In this research, an algorithm for DNA splicing systems is developed using C++ visual programming. The splicing languages, which have been characterised through some theorems based on the crossings and sequences of the restriction enzymes, are generated as the output of this computation. In order to generate the splicing languages, the algorithm detects and counts the cutting sites of the restriction enzymes found in the initial molecules, determines whether the sequence of each restriction enzyme is a palindrome, and determines whether the restriction enzymes have the same or different crossings. The results from this research match the splicing languages obtained from manual computations, which contributes to the development of computational software in DNA computing.
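Two of the checks listed in the abstract, counting cutting sites and testing whether a recognition sequence is a palindrome, can be sketched as follows. The paper's algorithm is written in C++ and is not reproduced here; this Python sketch uses hypothetical function names, treats "palindrome" in the usual molecular sense (the sequence equals its own reverse complement), and does not cover the comparison of crossings.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True if the recognition sequence reads the same on both strands."""
    return site.upper() == reverse_complement(site)


def count_sites(molecule: str, site: str) -> int:
    """Count occurrences of the recognition sequence in the molecule,
    including overlapping ones, by scanning one position at a time."""
    molecule, site = molecule.upper(), site.upper()
    count = 0
    start = molecule.find(site)
    while start != -1:
        count += 1
        start = molecule.find(site, start + 1)
    return count


if __name__ == "__main__":
    molecule = "AAGAATTCGGGAATTCTT"
    print(is_palindromic("GAATTC"), count_sites(molecule, "GAATTC"))
```

Scanning with `find` from `start + 1` rather than using `str.count` is a deliberate choice here, because `str.count` skips overlapping matches.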

Genome-wide identification of 5-methylcytosine sites in bacterial genomes by high-throughput sequencing of MspJI restriction fragments

PLoS ONE ◽

10.1371/journal.pone.0247541 ◽

2021 ◽

Vol 16 (5) ◽

pp. e0247541

Author(s):

Brian P. Anton ◽

Alexey Fomenkov ◽

Victoria Wu ◽

Richard J. Roberts

Keyword(s):

Single Molecule ◽

DNA Sequences ◽

High Throughput Sequencing ◽

Cost Effective ◽

Restriction Enzymes ◽

Specific Sequence ◽

Genome Wide ◽

Cost Effective Alternative ◽

Simple Column ◽

Sequencing Platforms

Single-molecule Real-Time (SMRT) sequencing can easily identify sites of N6-methyladenine and N4-methylcytosine within DNA sequences, but similar identification of 5-methylcytosine sites is not as straightforward. In prokaryotic DNA, methylation typically occurs within specific sequence contexts, or motifs, that are a property of the methyltransferases that “write” these epigenetic marks. We present here a straightforward, cost-effective alternative to both SMRT and bisulfite sequencing for the determination of prokaryotic 5-methylcytosine methylation motifs. The method, called MFRE-Seq, relies on excision and isolation of fully methylated fragments of predictable size using MspJI-Family Restriction Enzymes (MFREs), which depend on the presence of 5-methylcytosine for cleavage. We demonstrate that MFRE-Seq is compatible with both Illumina and Ion Torrent sequencing platforms and requires only a digestion step and simple column purification of size-selected digest fragments prior to standard library preparation procedures. We applied MFRE-Seq to numerous bacterial and archaeal genomic DNA preparations and successfully confirmed known motifs and identified novel ones. This method should be a useful complement to existing methodologies for studying prokaryotic methylomes and characterizing the contributing methyltransferases.

The fractal dimension of the tree of life

10.7287/peerj.preprints.198v3 ◽

2014 ◽

Author(s):

Xiaofei Lv ◽

Yuping Wu ◽

Bin Ma

Keyword(s):

Fractal Dimension ◽

Power Law ◽

Fractal Dimensions ◽

Predictive Modelling ◽

Tree Of Life ◽

Fundamental Question ◽

Theoretical Studies ◽

Structure Pattern ◽

Intermediate Size ◽

Taxonomic Groups

The structural pattern of the tree of life offers clues to key ecological issues; hence, determining its fractal dimension is a fundamental question in understanding the tree of life. Yet the fractal dimension of the tree of life remains unclear, since the scale of the tree of life has grown rapidly in recent years. Here we show that the tree of life displays consistent power-law rules for inter- and intra-taxonomic levels, but that the fractal dimensions differ among kingdoms. The fractal dimension of hierarchical structure (Dr) is 0.873 for the entire tree of life, which is smaller than the values of Dr for Animalia and Plantae but greater than the values of Dr for Fungi, Chromista, and Protozoa. The hierarchical fractal dimension values for prokaryotic kingdoms are lower than for other kingdoms. The Dr value for Viruses was lower than for most eukaryotic kingdoms, but greater than for prokaryotes. The distribution of taxa size is governed by fractal diversity but skewed by overdominating taxa with large subtaxa size. The proportion of subtaxa in taxa with small and large sizes was greater than in taxa with intermediate size. Our results suggest that the distribution of subtaxa in taxa can be predicted with fractal dimension for the accumulating taxa abundance rather than the taxa abundance. Our study determined the fractal dimensions for inter- and intra-taxonomic levels of the present tree of life. These results emphasize the need for further theoretical studies, as well as predictive modelling, to interpret the different fractal dimensions for different taxonomic groups and the skewness of taxa with large subtaxa size.

Out-Lab Therapy Approach Based on Elected A Restriction Enzyme to Transfer Target Gene

Al-Kitab Journal for Pure Sciences ◽

10.32441/kjps.02.02.p13 ◽

2018 ◽

Vol 2 (2) ◽

pp. 196-208

Author(s):

Ayad Ismaeel

Keyword(s):

Restriction Enzyme ◽

Target Gene ◽

Restriction Enzymes ◽

TP53 Gene ◽

Restriction Enzyme Digestion ◽

Therapy Approach ◽

Software Packages ◽

Important Approach ◽

Effective Cost ◽

Target Gene Sequence

An important therapeutic approach for a target gene sequence that causes disease is to repair or recombine the mutated gene (gene transfer) using restriction enzymes in the laboratory. Even when problems originating outside the laboratory are ruled out, this approach is prone to several laboratory problems, such as digested DNA running as a smear on an agarose gel, incomplete restriction enzyme digestion, and extra bands in the gel. This paper proposes a new therapy approach that repairs or replaces the disease-causing mutated gene by detecting primers and finding restriction enzymes with bioinformatics tools and software packages, so that the repair or recombination of mutations is worked out before going to the biological laboratory (out-lab), avoiding the problems associated with laboratory work. Applying the proposed approach to the TP53 gene (mutated in more than 50% of human cancers), after confirming that mutations are present in the P53 tumor protein, shows it to be a cost-effective, user-friendly, and comprehensive therapy methodology.

Identification of restriction enzyme in the FSHR gene of indonesian local cattle

IOP Conference Series Earth and Environmental Science ◽

10.1088/1755-1315/888/1/012024 ◽

2021 ◽

Vol 888 (1) ◽

pp. 012024

Author(s):

P W Prihandini ◽

A Primasari ◽

M Luthfi ◽

D Pamungkas ◽

A P Z N L Sari ◽

...

Keyword(s):

Restriction Enzyme ◽

Restriction Site ◽

Follicle Stimulating Hormone ◽

Restriction Enzymes ◽

Future Studies ◽

FSHR Gene ◽

PCR-RFLP ◽

Mapping Analysis ◽

PCR Products ◽

Enzyme Mapping

Abstract The restriction enzyme is important for genotyping using the PCR-RFLP technique. Therefore, this study aims to map restriction enzyme sites in a partial sequence of the follicle-stimulating hormone receptor (FSHR) gene in Indonesian local cattle. A total of 29 samples, each 306 bp in size, were aligned with GenBank sequence acc. no. NC_032660, revealing three polymorphic sites, namely g.193G>C, g.227T>C, and g.275A>C. Furthermore, restriction mapping analysis using the NEBcutter program V2.0 showed that no enzyme recognized the SNP g.275A>C, while the SNPs g.193G>C and g.227T>C were recognized by the AluI and MscI enzymes, respectively. The AluI enzyme cuts at two positions (193 bp and 243 bp) in the G allele sample, producing three fragments of 50 bp, 63 bp, and 193 bp; in the C allele, AluI cuts only at position 243 bp, so the fragment products are 63 bp and 243 bp. In contrast, the MscI site was recognized only in the T allele, producing fragments of 77 bp and 229 bp, and no MscI restriction site was found in the PCR products of the C allele. Based on the results, the SNPs (g.193G>C and g.227T>C) and restriction enzymes (AluI and MscI) are applicable for genotyping local Indonesian cattle using the PCR-RFLP technique in future studies.
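The fragment sizes quoted in the abstract follow directly from the cut positions on the 306 bp amplicon, as the short sketch below shows. The function name is hypothetical, and the MscI cut position of 229 bp is an assumption inferred from the reported 77 bp and 229 bp fragments rather than a value stated in the abstract.

```python
def fragment_sizes(amplicon_length: int, cut_positions: list[int]) -> list[int]:
    """Fragment lengths, in order along the amplicon, after cutting a linear
    PCR product at the given positions (bp from the start of the product)."""
    cuts = sorted(set(cut_positions))
    for cut in cuts:
        if not 0 < cut < amplicon_length:
            raise ValueError(f"cut position {cut} lies outside the amplicon")
    bounds = [0] + cuts + [amplicon_length]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    print("AluI, G allele:", fragment_sizes(306, [193, 243]))
    print("AluI, C allele:", fragment_sizes(306, [243]))
    print("MscI, T allele:", fragment_sizes(306, [229]))  # assumed cut position
    print("MscI, C allele:", fragment_sizes(306, []))     # no site, product uncut
```

In each case the fragment lengths sum to 306 bp, which is a quick consistency check on any proposed PCR-RFLP banding pattern.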

Computation of splicing languages from DNA splicing system with one palindromic restriction enzyme

Malaysian Journal of Fundamental and Applied Sciences ◽

10.11113/mjfas.v14n2.879 ◽

2018 ◽

Vol 14 (2) ◽

pp. 188-192

Author(s):

Nurul Izzaty Ismail ◽

Wan Heng Fong ◽

Nor Haniza Sarmin

Keyword(s):

User Interface ◽

Graphical User Interface ◽

Restriction Enzyme ◽

Restriction Enzymes ◽

In Vitro Experiments ◽

DNA Molecules ◽

C++ Programming ◽

Splicing Systems ◽

DNA Splicing

In a DNA splicing system, the potential effects of sets of restriction enzymes and a ligase that allow DNA molecules to be cleaved and reassociated to produce further molecules are studied. A splicing language depicts the molecules resulting from a splicing system. In this research, C++ code is developed for a DNA splicing system with one palindromic restriction enzyme and one or two (non-overlapping) cutting sites. A graphical user interface (GUI) is then designed to allow the user to enter the initial DNA string and restriction enzymes and generate the splicing languages computed by the C++ program. This interface displays the resulting splicing languages, which depict the results from in vitro experiments of the respective splicing system. The results from this research simplify the lengthy manual computation of the resulting splicing languages of DNA splicing systems with one palindromic restriction enzyme.
