Low guanine content and biased nucleotide distribution in vertebrate mtDNA can cause overestimation of non-CpG methylation

2022 ◽  
Vol 4 (1) ◽  
Author(s):  
Takashi Okada ◽  
Xin Sun ◽  
Stephen McIlfatrick ◽  
Justin C St. John

ABSTRACT Mitochondrial DNA (mtDNA) methylation in vertebrates has been hotly debated for over 40 years. Most contrasting results have been reported following bisulfite sequencing (BS-seq) analyses. We addressed whether BS-seq experimental and analysis conditions influenced the estimation of the levels of methylation in specific mtDNA sequences. We found false positive non-CpG methylation in the CHH context (fpCHH) using unmethylated Sus scrofa mtDNA BS-seq data. fpCHH methylation was detected on the top/plus strand of mtDNA within low guanine content regions. These top/plus strand sequences of fpCHH regions would become extremely AT-rich sequences after BS-conversion, whilst bottom/minus strand sequences remained almost unchanged. These unique sequences caused BS-seq aligners to falsely assign the origin of each strand in fpCHH regions, resulting in false methylation calls. fpCHH methylation detection was enhanced by short sequence reads, short library inserts, skewed top/bottom read ratios and non-directional read mapping modes. We confirmed no detectable CHH methylation in fpCHH regions by BS-amplicon sequencing. The fpCHH peaks were located in the D-loop, ATP6, ND2, ND4L, ND5 and ND6 regions and identified in our S. scrofa ovary and oocyte data and human BS-seq data sets. We conclude that non-CpG methylation could potentially be overestimated in specific sequence regions by BS-seq analysis.
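The strand asymmetry described in this abstract can be illustrated with a minimal in silico sketch. It assumes complete conversion of fully unmethylated DNA, and the 30-base sequence is a made-up low-guanine example, not taken from S. scrofa mtDNA; the function names are hypothetical and do not belong to any BS-seq aligner.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def bisulfite_convert(seq: str) -> str:
    """Simulate complete conversion of unmethylated DNA: every C reads as T."""
    return seq.replace("C", "T")


def at_fraction(seq: str) -> float:
    """Fraction of bases that are A or T."""
    return sum(base in "AT" for base in seq) / len(seq)


# Hypothetical top/plus strand with a single guanine (illustrative only).
top = "ACCTCACCATCCTACTACCCATGACCTCAC"
bottom = reverse_complement(top)

top_bs = bisulfite_convert(top)        # C-rich strand: many C->T changes
bottom_bs = bisulfite_convert(bottom)  # C-poor strand: few changes

top_changes = sum(a != b for a, b in zip(top, top_bs))
bottom_changes = sum(a != b for a, b in zip(bottom, bottom_bs))

print(f"top:    AT fraction {at_fraction(top):.2f} -> {at_fraction(top_bs):.2f}, "
      f"{top_changes} bases changed")
print(f"bottom: AT fraction {at_fraction(bottom):.2f} -> {at_fraction(bottom_bs):.2f}, "
      f"{bottom_changes} bases changed")
```

In this toy case the converted top strand consists almost entirely of A and T, while the bottom strand changes at a single position, which is the kind of asymmetry the abstract links to strand misassignment by aligners.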

2017 ◽  
Author(s):  
Julian R. Dupuis ◽  
Forest T. Bremer ◽  
Angela Kauwe ◽  
Michael San Jose ◽  
Luc Leblanc ◽  
...  

ABSTRACT High-throughput sequencing has fundamentally changed how molecular phylogenetic datasets are assembled, and phylogenomic datasets commonly contain 50-100-fold more loci than those generated using traditional Sanger-based approaches. Here, we demonstrate a new approach for building phylogenomic datasets using single tube, highly multiplexed amplicon sequencing, which we name HiMAP (Highly Multiplexed Amplicon-based Phylogenomics), and present bioinformatic pipelines for locus selection based on genomic and transcriptomic data resources and post-sequencing consensus calling and alignment. This method is inexpensive and amenable to sequencing a large number (hundreds) of taxa simultaneously, requires minimal hands-on time at the bench (<1/2 day), and data analysis can be accomplished without the need for read mapping or assembly. We demonstrate this approach by sequencing 878 amplicons in single reactions for 82 species of tephritid fruit flies across seven genera (384 individuals), including some of the most economically important agricultural insect pests. The resulting dataset (>150,000 bp concatenated alignment) contained >40,000 phylogenetically informative characters, and although some discordance was observed between analyses, it provided unparalleled resolution of many phylogenetic relationships in this group. Most notably, we found high support for the generic status of Zeugodacus and the sister relationship between Dacus and Zeugodacus. We discuss HiMAP with regard to its molecular and bioinformatic strengths, and the insight the resulting dataset provides into relationships of this diverse insect group.


2016 ◽  
Vol 37 (1) ◽  
pp. 1-8 ◽  
Author(s):  
Khaled Merabet ◽  
Eugenia Sanchez ◽  
Abdelhak Dahmana ◽  
Sergé Bogaerts ◽  
David Donaire ◽  
...  

The North African fire salamander, Salamandra algira, is distributed in Algeria, Morocco and Ceuta (Spanish territory located on the north coast of Africa), but until now rather limited information has been available on the populations across the Algerian part of its range. We here provide a first analysis of the phylogeography of this species in Algeria, based on 44 samples from populations distributed across 15 localities in Central Algeria. We sequenced three segments of mitochondrial DNA, covering 12S rRNA, cytochrome b (Cytb) and the D-loop. The mtDNA sequences of the Algerian samples were strongly different from the Moroccan populations occurring west of the Moulouya River (corresponding to the subspecies S. a. tingitana and S. a. splendens) but sister to the genetically rather similar population from the Beni Snassen Massif in eastern Morocco (subspecies S. algira spelaea). Among the Algerian specimens studied, those from the westernmost site, Chrea Massif, were the sister clade to the remaining populations, and the overall genetic divergence was low, with a maximum of five mutational steps in a 295 bp fragment of cytochrome b.


2016 ◽  
pp. btw490 ◽  
Author(s):  
Haifeng Chen ◽  
Andrew D. Smith ◽  
Ting Chen

2020 ◽  
Author(s):  
Zaka Wing-Sze Yuen ◽  
Akanksha Srivastava ◽  
Runa Daniel ◽  
Dennis McNevin ◽  
Cameron Jack ◽  
...  

Abstract DNA methylation plays a fundamental role in the control of gene expression and genome integrity. Although there are multiple tools that enable its detection from Nanopore sequencing, their accuracy remains largely unknown. Here, we present a systematic benchmarking of tools for the detection of CpG methylation from Nanopore sequencing using individual reads, control mixtures of methylated and unmethylated reads, and bisulfite sequencing. We found that tools have a tradeoff between false positives and false negatives, and present a high dispersion with respect to the expected methylation frequency values. We described various strategies to improve the accuracy of these tools, including a new consensus approach, METEORE (https://github.com/comprna/METEORE), based on the combination of the predictions from two or more tools that shows improved accuracy over individual tools. Snakemake pipelines are provided for reproducibility and to enable the systematic application of our analyses to other datasets.


2021 ◽  
Vol 3 (4) ◽  
Author(s):  
Guilherme de Sena Brandine ◽  
Andrew D Smith

Abstract DNA cytosine methylation is an important epigenomic mark with a wide range of functions in many organisms. Whole genome bisulfite sequencing is the gold standard to interrogate cytosine methylation genome-wide. Algorithms used to map bisulfite-converted reads often encode the four-base DNA alphabet with three letters by reducing two bases to a common letter. This encoding substantially reduces the entropy of nucleotide frequencies in the resulting reference genome. Within the paradigm of read mapping by first filtering possible candidate alignments, reduced entropy in the sequence space can increase the required computing effort. We introduce another bisulfite mapping algorithm (abismal), based on the idea of encoding a four-letter DNA sequence as only two letters, one for purines and one for pyrimidines. We show that this encoding can lead to greater specificity compared to existing encodings used to map bisulfite sequencing reads. Through the two-letter encoding, the abismal software tool maps reads in less time and using less memory than most bisulfite sequencing read mapping software tools, while attaining similar accuracy. This allows in silico methylation analysis to be performed in a wider range of computing machines with limited hardware settings.
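The two encodings mentioned in this abstract can be sketched in a few lines. This is only an illustration of the idea, not abismal's implementation: the function names, the `R`/`Y` letters, and the 8-base sequences are assumptions made for the example.

```python
def three_letter(seq: str) -> str:
    """Three-letter encoding: C and T collapse to one letter (here T)."""
    return seq.replace("C", "T")


def two_letter(seq: str) -> str:
    """Two-letter encoding: purines (A, G) -> R, pyrimidines (C, T) -> Y."""
    table = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y"})
    return seq.translate(table)


# Hypothetical reference fragment and reads derived from it.
reference = "ACGTTCGA"
read_ct = reference.replace("C", "T")  # fully converted read, C->T
read_ga = reference.replace("G", "A")  # complementary-strand view, G->A

# A C->T converted read matches the reference under both encodings.
print(three_letter(read_ct) == three_letter(reference))
print(two_letter(read_ct) == two_letter(reference))

# The purine/pyrimidine encoding also hides G->A changes, because
# A and G map to the same letter.
print(two_letter(read_ga) == two_letter(reference))
```

Under these assumptions, a single two-letter encoding is unaffected by either conversion type, whereas a C-to-T three-letter encoding would need a separate G-to-A counterpart for reads from the complementary strand.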


2021 ◽  
Author(s):  
Iacopo Bicci ◽  
Claudia Calabrese ◽  
Zoe J. Golder ◽  
Aurora Gomez-Duran ◽  
Patrick F Chinnery

Summary Methylation on CpG residues is one of the most important epigenetic modifications of nuclear DNA, regulating gene expression. Methylation of mitochondrial DNA (mtDNA) has been studied using whole genome bisulfite sequencing (WGBS), but recent evidence has uncovered major technical issues which introduce a potential bias during methylation quantification. Here, we validate the technical concerns with WGBS, and then develop and assess the accuracy of a protocol for variant-specific methylation identification using long-read Oxford Nanopore Sequencing. Our approach circumvents mtDNA-specific confounders, while enriching for native full-length molecules over nuclear DNA. Variant calling analysis against Illumina deep re-sequencing showed that all expected mtDNA variants can be reliably identified. Methylation calling revealed negligible mtDNA methylation levels in multiple human primary and cancer cell lines. In conclusion, our protocol enables the reliable analysis of epigenetic modifications of mtDNA at single-molecule level at single base resolution, with potential applications beyond methylation.

Motivation Although whole genome bisulfite sequencing (WGBS) is the gold-standard approach to determine base-level CpG methylation in the nuclear genome, emerging technical issues raise questions about its reliability for evaluating mitochondrial DNA (mtDNA) methylation. Concerns include mtDNA strand asymmetry rendering the C-rich light strand disproportionately vulnerable to the chemical modifications introduced with WGBS. Also, short-read sequencing can result in a co-amplification of nuclear sequences originating from ancestral mtDNA with a high nucleotide similarity. Lastly, calling mtDNA alleles with varying proportions (heteroplasmy) is complicated by the C-to-T conversion introduced by WGBS on unmethylated CpGs. Here, we propose an alternative protocol to quantify methyl-CpGs in mtDNA, at single-molecule level, using Oxford Nanopore Sequencing (ONS). By optimizing the standard ONS library preparation, we achieved selective enrichment of native mtDNA and accurate single nucleotide variant and CpG methylation calling, thus overcoming previous limitations.


2013 ◽  
Vol 84 (2) ◽  
pp. 163-168 ◽  
Author(s):  
Chihiro N. HAYASHI ◽  
Yoshikazu ADACHI ◽  
Kumiko TAKEDA ◽  
Yukiko ABE ◽  
Takeshi YASUE ◽  
...  
