SMART: Statistical Mitogenome Assembly with Repeats

2019 ◽  
Author(s):  
Fahad Alqahtani ◽  
Ion I. Măndoiu

Abstract By using next-generation sequencing technologies it is possible to quickly and inexpensively generate large numbers of relatively short reads from both the nuclear and mitochondrial DNA contained in a biological sample. Unfortunately, assembling such whole-genome sequencing (WGS) data with standard de novo assemblers often fails to generate high-quality mitochondrial genome sequences due to the large difference in copy number (and hence sequencing depth) between the mitochondrial and nuclear genomes. Assembly of complete mitochondrial genome sequences is further complicated by the fact that many de novo assemblers are not designed for circular genomes, and by the presence of repeats in the mitochondrial genomes of some species. In this paper we describe the Statistical Mitogenome Assembly with Repeats (SMART) pipeline for automated assembly of complete circular mitochondrial genomes from WGS data. SMART uses an efficient coverage-based filter to first select a subset of reads enriched in mtDNA sequences. Contigs produced by an initial assembly step are filtered using BLAST searches against a comprehensive mitochondrial genome database, and used as “baits” for an alignment-based filter that produces the set of reads used in a second de novo assembly and scaffolding step. In the presence of repeats, the possible paths through the assembly graph are evaluated using a maximum-likelihood model. Additionally, the assembly process is repeated a user-specified number of times on re-sampled subsets of reads to select for annotation the reconstructed sequences with highest bootstrap support. Experiments on WGS datasets from a variety of species show that the SMART pipeline produces complete circular mitochondrial genome sequences with a higher success rate than current state-of-the-art tools, even from low-coverage WGS data. The pipeline is available through an easy-to-use web interface at https://neo.engr.uconn.edu/?tool_id=SMART.
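The abstract does not spell out how the coverage-based filter works, so the following is only an illustrative sketch of the general idea: because mtDNA is present at a much higher copy number than nuclear DNA, reads whose k-mers are unusually abundant are more likely to be mitochondrial. The function names, the k-mer size, and the median-count threshold are all assumptions made for this example and are not taken from SMART.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of a sequence, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def filter_high_coverage_reads(reads, k=5, min_median_count=3):
    """Keep reads whose median k-mer abundance meets a threshold.

    Hypothetical stand-in for a coverage-based filter: k and
    min_median_count are illustrative values, not SMART parameters.
    """
    # Count every k-mer across the whole read set.
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))

    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        # Reads shorter than k have no k-mers and are dropped.
        if read_kmers and median(counts[km] for km in read_kmers) >= min_median_count:
            kept.append(read)
    return kept


# Toy data: four copies of a "high-copy" read and one "low-copy" read.
reads = ["ACGTACGTAC"] * 4 + ["TTGGCCAATT"]
enriched = filter_high_coverage_reads(reads)
```

On real data the threshold would have to be chosen from the observed k-mer abundance distribution rather than fixed in advance.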

2017 ◽  
Vol 37 (03) ◽  
pp. 125-136
Author(s):  
Tolulope A. Agunbiade ◽  
Brad S. Coates ◽  
Weilin Sun ◽  
Mu-Rou Tsai ◽  
Maria Carmen Valero ◽  
...  

Abstract Maruca vitrata (Fabricius, 1787) is a cryptic pantropical species of Lepidoptera comprising two unique strains that inhabit the American continents (New World strain) and regions spanning from Africa through to Southeast Asia and Northern Australia (Old World strain). In this study, we de novo assembled the complete mitochondrial genome sequence of the New World legume pod borer, M. vitrata, from shotgun sequence data generated on an Illumina HiSeq 2000. Phylogenomic comparisons were made with other previously published mitochondrial genome sequences from crambid moths, including the Old World strain of M. vitrata. The 15,385 bp M. vitrata (New World) sequence has an 80.7% A+T content and encodes the 13 protein-coding, 2 ribosomal RNA and 22 transfer RNA genes in the typical orientation and arrangement of lepidopteran mitochondrial DNAs. Mitochondrial genome-wide comparison between the New and Old World strains of M. vitrata detected 476 polymorphic sites (4.23% nucleotide divergence) with an excess of synonymous substitutions as a result of purifying selection. Furthermore, this level of sequence variation suggests that these strains diverged ~1.83 to 2.12 million years ago, assuming a linear rate of short-term substitution. The de novo assemblies of mitochondrial genomes from next-generation sequencing (NGS) reads provide readily available data for similar comparative studies.


PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3901 ◽  
Author(s):  
Zachary R. Hanna ◽  
James B. Henderson ◽  
Anna B. Sellas ◽  
Jérôme Fuchs ◽  
Rauri C.K. Bowie ◽  
...  

We report here the successful assembly of the complete mitochondrial genomes of the northern spotted owl (Strix occidentalis caurina) and the barred owl (S. varia). We utilized sequence data from two sequencing methodologies: Illumina paired-end sequence data, with insert lengths ranging from approximately 250 nucleotides (nt) to 9,600 nt and read lengths from 100–375 nt, and Sanger-derived sequences. We employed multiple assemblers and alignment methods to generate the final assemblies. The circular genomes of S. o. caurina and S. varia comprise 19,948 nt and 18,975 nt, respectively. Both code for two rRNAs, twenty-two tRNAs, and thirteen polypeptides. They both have duplicated control region sequences with complex repeat structures. We were not able to assemble the control regions solely using Illumina paired-end sequence data. By fully spanning the control regions, Sanger-derived sequences enabled accurate and complete assembly of these mitochondrial genomes. These are the first complete mitochondrial genome sequences of owls (Aves: Strigiformes) possessing duplicated control regions. We searched the nuclear genome of S. o. caurina for copies of mitochondrial genes and found at least nine separate stretches of nuclear copies of gene sequences originating in the mitochondrial genome (Numts). The Numts ranged from 226–19,522 nt in length and included copies of all mitochondrial genes except tRNAPro, ND6, and tRNAGlu. Strix occidentalis caurina and S. varia exhibited an average of 10.74% (8.68% uncorrected p-distance) divergence across the non-tRNA mitochondrial genes.
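The uncorrected p-distance quoted above is simply the proportion of aligned sites at which two sequences differ. A minimal sketch of that calculation follows; the function name and the choice to skip positions containing a gap or an ambiguous `N` are assumptions for illustration, and the abstract does not say how such sites were treated in the study.

```python
def p_distance(seq_a, seq_b, skip_chars="-N"):
    """Uncorrected p-distance between two aligned sequences.

    Positions where either sequence has a gap or N are ignored
    (an assumption of this sketch, not a detail from the study).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differences += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Two of ten aligned sites differ, so the p-distance is 0.2.
example = p_distance("ACGTACGTAC", "ACGTTCGTAA")
```

Corrected distances apply a substitution model on top of this raw proportion to account for multiple substitutions at the same site, which is why they come out larger than the uncorrected value.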


Genes ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 1567
Author(s):  
Haifeng Tian ◽  
Qiaomu Hu ◽  
Hongyi Lu ◽  
Zhong Li

Asian swamp eel (Monopterus albus, Zuiew 1793) is a commercially important fish due to its nutritional value in Eastern and Southeastern Asia. One local strain of M. albus distributed in the Jianghan Plain of China has been subjected to a selection breeding program because of its preferred body color and superiority of growth and fecundity. Some members of the genus Monopterus have been reclassified into other genera recently. These classifications require further phylogenetic analyses. In this study, the complete mitochondrial genomes of the breeds of M. albus were decoded using both PacBio and Illumina sequencing technologies, then phylogenetic analyses were carried out, including sampling of M. albus at five different sites and 14 species of Synbranchiformes with complete mitochondrial genomes. The total length of the mitogenome is 16,621 bp, which is one nucleotide shorter than that of four mitogenomes of M. albus sampled from four provinces in China, as well as one with an unknown sampling site. The gene content, gene order, and overall base compositions are almost identical to the five reported ones. The results of maximum likelihood (ML) and Bayesian inference analyses of the complete mitochondrial genome and 13 protein-coding genes (PCGs) were consistent. The phylogenetic trees indicated that the selecting breed formed the deepest branch in the clade of all Asian swamp eels, confirmed the phylogenetic relationships of four genera of the family Synbranchidae, also providing systematic phylogenetic relationships for the order Synbranchiformes. The divergence time analyses showed that all Asian swamp eels diverged about 0.49 million years ago (MYA) and their common ancestor split from other species about 45.96 MYA in the middle of the Miocene epoch. Altogether, the complete mitogenome of this breed of M. albus would serve as an important dataset for germplasm identification and breeding programs for this species, in addition to providing great help in identifying the phylogenetic relationships of the order Synbranchiformes.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ying-ying Ye ◽  
Jing Miao ◽  
Ya-hong Guo ◽  
Li Gong ◽  
Li-hua Jiang ◽  
...  

Abstract The complete mitochondrial genome (mitogenome) of animals can provide useful information for evolutionary and phylogenetic analyses. The mitogenome of the genus Exhippolysmata (i.e., Exhippolysmata ensirostris) was sequenced and annotated for the first time, and its phylogenetic relationship with selected members from the infraorder Caridea was investigated. The 16,350 bp mitogenome contains the entire set of 37 common genes. The mitogenome composition was highly A + T biased at 64.43% with positive AT skew (0.009) and negative GC skew (−0.199). All tRNA genes in the E. ensirostris mitogenome had a typical cloverleaf secondary structure, except for trnS1 (AGN), which appeared to lack the dihydrouridine arm. The gene order in the E. ensirostris mitogenome was rearranged compared with those of ancestral decapod taxa: the gene order of trnL2-cox2 changed to cox2-trnL2. The tandem duplication-random loss model is the most likely mechanism for the observed gene rearrangement of E. ensirostris. The ML and BI phylogenetic analyses place all Caridea species into one group with strong bootstrap support. The family Lysmatidae is most closely related to Alpheidae and Palaemonidae. These results will help to better understand the gene rearrangements and evolutionary position of E. ensirostris and lay a foundation for further phylogenetic studies of Caridea.
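The composition figures above can be reproduced from a sequence with a few base counts. The sketch below assumes the conventional definitions, AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C), which the abstract does not state explicitly; the function name and the returned dictionary keys are hypothetical.

```python
def composition_stats(seq):
    """Return A+T content, AT skew, and GC skew for a nucleotide sequence.

    Assumes the conventional skew definitions:
      AT skew = (A - T) / (A + T),  GC skew = (G - C) / (G + C).
    Characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    a, c, g, t = (seq.count(base) for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")

    return {
        "at_content": (a + t) / total,
        # Skews are undefined when the denominator is zero.
        "at_skew": (a - t) / (a + t) if (a + t) else None,
        "gc_skew": (g - c) / (g + c) if (g + c) else None,
    }


# Toy sequence with A=3, T=2, G=1, C=2.
stats = composition_stats("AAATTGCC")
```

A positive AT skew means more A than T on the reported strand, and a negative GC skew means more C than G, matching the signs reported for E. ensirostris.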


2020 ◽  
Author(s):  
Graham Etherington

De novo assembly of 49 mustelid whole mitochondrial genomes


2018 ◽  
Vol 6 (16) ◽  
pp. e00234-18
Author(s):  
Amanda S. Kahn ◽  
Jonathan B. Geller

ABSTRACT We announce the nearly complete mitochondrial genome sequences of two hexactinellid sponges, Bathydorus laniger and Docosaccus maculatus. A contiguous region of over 15,000 bp was sequenced from each genome. An uncommon structural element was identified as a series of repetitive elements with sequences matching cob in the genome of D. maculatus.


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e10364
Author(s):  
Natalia I. Abramson ◽  
Fedor N. Golenishchev ◽  
Semen Yu. Bodrov ◽  
Olga V. Bondareva ◽  
Evgeny A. Genelt-Yanovskiy ◽  
...  

In this article, we present the nearly complete mitochondrial genome of the Subalpine Kashmir vole Hyperacrius fertilis (Arvicolinae, Cricetidae, Rodentia), assembled using data from Illumina next-generation sequencing (NGS) of the DNA from a century-old museum specimen. De novo assembly consisted of 16,341 bp and included all mitogenome protein-coding genes as well as 12S and 16S RNAs, tRNAs and D-loop. Using the alignment of protein-coding genes of 14 previously published Arvicolini tribe mitogenomes, seven Clethrionomyini mitogenomes, and also Ondatra and Dicrostonyx outgroups, we conducted phylogenetic reconstructions based on a dataset of 13 protein-coding genes (PCGs) under maximum likelihood and Bayesian inference. Phylogenetic analyses robustly supported the phylogenetic position of this species within the tribe Arvicolini. Among the Arvicolini, Hyperacrius represents one of the early-diverged lineages. This result of phylogenetic analysis altered the conventional view on phylogenetic relatedness between Hyperacrius and Alticola and prompted the revision of morphological characters underlying the former assumption. Morphological analysis performed here confirmed molecular data and provided additional evidence for taxonomic replacement of the genus Hyperacrius from the tribe Clethrionomyini to the tribe Arvicolini.

