DARTS: An Algorithm for Domain-Associated Retrotransposon Search in Genome Assemblies

Genes ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 9
Author(s):  
Mikhail Biryukov ◽  
Kirill Ustyantsev

Retrotransposons comprise a substantial fraction of eukaryotic genomes, reaching the highest proportions in plants. Therefore, identification and annotation of retrotransposons is an important task in studying the regulation and evolution of plant genomes. The majority of computational tools for mining transposable elements (TEs) are designed for subsequent genome repeat masking, often leaving aside the element lineage classification and its protein domain composition. Additionally, studies focused on the diversity and evolution of a particular group of retrotransposons often require substantial customization efforts from researchers to adapt existing software to their needs. Here, we developed a computational pipeline to mine sequences of protein-coding retrotransposons based on the sequences of their conserved protein domains—DARTS (Domain-Associated Retrotransposon Search). Using the most abundant group of TEs in plants—long terminal repeat (LTR) retrotransposons (LTR-RTs)—we show that DARTS has radically higher sensitivity for LTR-RT identification compared to the widely accepted tool LTRharvest. DARTS can be easily customized for specific user needs. As a result, DARTS returns a set of structurally annotated nucleotide and amino acid sequences which can be readily used in subsequent comparative and phylogenetic analyses. DARTS may facilitate researchers interested in the discovery and detailed analysis of the diversity and evolution of retrotransposons, LTR-RTs, and other protein-coding TEs.

2021 ◽  
Author(s):  
Mikhail Biryukov ◽  
Kirill Ustyantsev

Abstract: Retrotransposons comprise a substantial fraction of eukaryotic genomes, reaching the highest proportions in plants. Therefore, identification and annotation of retrotransposons is an important task in studying the regulation and evolution of plant genomes. The majority of computational tools for mining transposable elements (TEs) are designed for subsequent genome repeat masking, often leaving aside the element lineage classification and its protein domain composition. Additionally, studies focused on the diversity and evolution of a particular group of retrotransposons often require substantial customization efforts from researchers to adapt existing software to their needs. Here, we developed DARTS, a computational pipeline to mine sequences of protein-coding retrotransposons based on the sequences of their conserved protein domains. Using the most abundant group of TEs in plants, long terminal repeat (LTR) retrotransposons (LTR-RTs), we show that DARTS has radically higher sensitivity for LTR-RT identification than the widely accepted tool LTRharvest. DARTS can be easily customized for specific user needs. As a result, DARTS returns a set of structurally annotated nucleotide and amino acid sequences which can be readily used in subsequent comparative and phylogenetic analyses. DARTS should help researchers interested in the discovery and detailed analysis of the diversity and evolution of retrotransposons, LTR-RTs, and other protein-coding TEs.


Parasitology ◽  
2006 ◽  
Vol 134 (5) ◽  
pp. 749-759 ◽  
Author(s):  
J.-K. PARK ◽  
K.-H. KIM ◽  
S. KANG ◽  
H. K. JEON ◽  
J.-H. KIM ◽  
...  

SUMMARY: The complete nucleotide sequence of the mitochondrial genome was determined for the fish tapeworm Diphyllobothrium latum. This genome is 13 608 bp in length and encodes 12 protein-coding genes (it lacks atp8), 22 transfer RNA (tRNA) genes and 2 ribosomal RNA (rRNA) genes, corresponding to the gene complement found thus far in other flatworm mitochondrial (mt) DNAs. The gene arrangement of this pseudophyllidean cestode is the same as that of the 6 cyclophyllidean cestodes characterized to date, with only minor variation in structure among these other genomes; the relative position of trnS2 and trnL1 is switched in Hymenolepis diminuta. Phylogenetic analyses of the concatenated amino acid sequences for 12 protein-coding genes of all complete cestode mtDNAs confirmed taxonomic and previous phylogenetic assessments, with D. latum being a sister taxon to the cyclophyllideans. High nodal support and phylogenetic congruence between different methods suggest that mt genomes may be of utility in resolving ordinal relationships within the cestodes. All species of Diphyllobothrium infect fish-eating vertebrates, and D. latum commonly infects humans through the ingestion of raw, poorly cooked or pickled fish. The complete mitochondrial genome provides a wealth of genetic markers which could be useful for identifying different life-cycle stages and for investigating their population genetics, ecology and epidemiology.


2019 ◽  
Vol 75 (05) ◽  
pp. 6248-2019
Author(s):  
ZEYNEP AKKUTAY-YOLDAR ◽  
TAYLAN KOÇ B. ◽  
ÇIĞDEM OĞUZOĞLU T.

Canine kobuvirus (CaKV) is a newly emerging virus detected in dogs in several countries. However, kobuvirus infection has not yet been described in domestic carnivores in Turkey. In this study, we tested blood and rectal swab samples from a dog with clinical symptoms for the presence of kobuvirus by reverse transcription-polymerase chain reaction (RT-PCR), using primers targeting the 3D (RNA polymerase) region of canine kobuviruses. To provide molecular characterization data, the Maximum Likelihood (ML) method was used for the phylogenetic analyses. The PCR product of the partial protein-coding region of the 3D protein gene from the rectal swab was amplified, purified, and sequenced. Phylogenetic analysis of amino acid sequences suggests that our CaKV strain is closely related to US-CaKVs and is placed in a monophyletic clade as a sister branch within the CaKV cluster. These results indicate that CaKV exists in dogs in Turkey. The strain shares 94.2–96.1% similarity with other CaKVs. To our knowledge, this is the first report of CaKV infection in a dog in Turkey. Further studies are needed to determine its role in dog gastrointestinal infections.


2020 ◽  
Author(s):  
Yi-Tian Fu ◽  
Yu Nie ◽  
De-Yong Duan ◽  
Guo-Hua Liu

Abstract Background: The family Hoplopleuridae contains at least 183 species of blood-sucking lice, which widely parasitize both mice and rats. Fragmented mitochondrial (mt) genomes have been reported in two rat lice (Hoplopleura kitti and H. akanezumi) from this family, but some minichromosomes were unidentified in their mt genomes. Methods: We sequenced the mt genome of the rat louse Hoplopleura sp. with an Illumina platform and compared its mt genome organization with those of H. kitti and H. akanezumi. Results: The fragmented mt genome of the rat louse Hoplopleura sp. contains 37 genes on 12 circular mt minichromosomes. Each mt minichromosome is 1.8–2.7 kb long and contains 1–5 genes and one large non-coding region. The gene content and arrangement of mt minichromosomes of Hoplopleura sp. (n = 3) and H. kitti (n = 3) are different from those in H. akanezumi (n = 3). Phylogenetic analyses based on the deduced amino acid sequences of the eight protein-coding genes showed that Hoplopleura sp. was more closely related to H. akanezumi than to H. kitti, and together they formed a monophyletic group. Conclusions: Comparison among the three rat lice revealed variation in the composition of mt minichromosomes within the genus Hoplopleura. Hoplopleura sp. is the first species from the family Hoplopleuridae for which a complete fragmented mt genome has been sequenced. The new data provide useful genetic markers for studying the population genetics, molecular systematics and phylogenetics of blood-sucking lice.


2013 ◽  
Vol 2013 ◽  
pp. 1-6 ◽  
Author(s):  
Eran Elhaik ◽  
Dan Graur

Eukaryotic genomes, particularly animal genomes, have a complex, nonuniform, and nonrandom internal compositional organization. The compositional organization of animal genomes can be described as a mosaic of discrete genomic regions, called “compositional domains,” each with a distinct GC content that significantly differs from those of its upstream and downstream neighboring domains. A typical animal genome consists of a mixture of compositionally homogeneous and nonhomogeneous domains of varying lengths and nucleotide compositions that are interspersed with one another. We have devised IsoPlotter, an unbiased segmentation algorithm for inferring the compositional organization of genomes. IsoPlotter has become an indispensable tool for describing genomic composition and has been used in the analysis of more than a dozen genomes. Applications include describing new genomes, correlating domain composition with gene composition and density, studying the evolution of genomes, testing phylogenomic hypotheses, and detecting regions of potential interbreeding between human and extinct hominines. To extend the use of IsoPlotter, we designed a completely automated pipeline, called IsoPlotter+, to carry out all segmentation analyses, including graphical display, and built a repository for compositional domain maps of all fully sequenced vertebrate and invertebrate genomes. The IsoPlotter+ pipeline and repository offer a comprehensive solution to the study of genome compositional architecture. Here, we demonstrate IsoPlotter+ by applying it to human and insect genomes. The computational tools and data repository are available online.
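
The notion of regions with distinct GC content can be illustrated with a minimal Python sketch. This is not the IsoPlotter algorithm: it uses fixed, non-overlapping windows as a simplifying assumption, and the function names `gc_content` and `windowed_gc` are hypothetical, introduced here only for illustration.

```python
from typing import List, Tuple


def gc_content(seq: str) -> float:
    """Return the fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    # Sequences with no unambiguous bases (e.g. all N) yield 0.0.
    return gc / total if total else 0.0


def windowed_gc(seq: str, window: int) -> List[Tuple[int, float]]:
    """GC content of consecutive non-overlapping windows as (start, gc) pairs.

    Fixed windows are an illustrative simplification; a segmentation
    algorithm would infer the domain boundaries from the data instead.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq), window)
    ]


if __name__ == "__main__":
    # A toy sequence with a GC-rich half followed by an AT-rich half.
    for start, gc in windowed_gc("GGCCGCGCAATTATAT", 8):
        print(start, gc)
```

Adjacent windows whose GC fractions differ sharply hint at a compositional boundary somewhere between them, which is the kind of signal a segmentation method would locate precisely.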


2020 ◽  
Author(s):  
Yi-Tian Fu ◽  
Yu Nie ◽  
De-Yong Duan ◽  
Guo-Hua Liu

Abstract Background: The family Hoplopleuridae contains at least 183 species of blood-sucking lice, which widely parasitize both mice and rats. Fragmented mitochondrial (mt) genomes have been reported in two rat lice (Hoplopleura kitti and H. akanezumi) from this family, but some minichromosomes were unidentified in their mt genomes. Methods: We sequenced the mt genome of the rat louse Hoplopleura sp. with an Illumina Hiseq platform and compared its mt genome organization with those of H. kitti and H. akanezumi. Results: The fragmented mt genome of the rat louse Hoplopleura sp. contains 37 genes on 12 circular mt minichromosomes. Each mt minichromosome is 1.8–2.7 kb long and contains 1–5 genes and one large non-coding region. The gene content and arrangement of three mt minichromosomes of Hoplopleura sp. and H. kitti differ from those of the three mt minichromosomes of H. akanezumi. Phylogenetic analyses based on the deduced amino acid sequences of the eight protein-coding genes showed that Hoplopleura sp. was more closely related to H. akanezumi than to H. kitti, and together they form a monophyletic group. Conclusions: Comparison among the three rat lice revealed variation in the composition of mt minichromosomes within the genus Hoplopleura. Hoplopleura sp. is the first species from the family Hoplopleuridae for which a complete fragmented mt genome has been sequenced. The new data provide useful genetic markers for studying the population genetics, molecular systematics and phylogenetics of blood-sucking lice.


2019 ◽  
Vol 17 (03) ◽  
pp. 245-254 ◽  
Author(s):  
Su-Young Hong ◽  
Kyeong-Sik Cheon ◽  
Ki-Oug Yoo ◽  
Hyun-Oh Lee ◽  
Manjulatha Mekapogu ◽  
...  

Abstract: The complete chloroplast (cp) genome sequences of three Amaranthus species (Amaranthus hypochondriacus, A. cruentus and A. caudatus) were determined by next-generation sequencing. The cp genome sequences of A. hypochondriacus, A. cruentus and A. caudatus were 150,523, 150,757 and 150,523 bp in length, respectively, each containing 84 genes with identical contents and orders. Expansion or contraction of the inverted repeat region was not observed among the three Amaranthus species. The coding regions were highly conserved with 99.3% homology in nucleotide and amino acid sequences. Five genes – matK, accD, ndhJ, ccsA and ndhF – showed relatively high non-synonymous/synonymous values (Ka/Ks > 0.1). Sequence comparison identified two insertions/deletions (InDels) greater than 40 bp in length, and polymerase chain reaction markers that could amplify these InDel regions were applied to diverse Korean Genbank accessions, which could discriminate the three Amaranthus species. Phylogenetic analyses based on 62 protein-coding genes showed that the core Caryophyllales were monophyletic and Amaranthoideae formed a sister group with the Betoideae and Chenopodioideae clade. Comparison of each homologous locus among the three Amaranthus species identified eight regions with high Pi values (>0.03). Seven of these loci, all except rps19-trnH (GUG), were considered to be useful molecular markers for further phylogenetic studies.
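
Assuming Pi here denotes nucleotide diversity (the average number of differences per site between pairs of aligned sequences), the quantity can be sketched in a few lines of Python. The function name `nucleotide_diversity` and the toy alignment are hypothetical; the abstract does not state which software or window settings the authors used, and this sketch naively counts any differing characters, including gaps, as differences.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(aligned: Sequence[str]) -> float:
    """Average pairwise differences per site across aligned sequences.

    Simplifying assumptions: all sequences have the same length, and every
    mismatching column (including gap characters) counts as one difference.
    """
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and of equal length")

    pairs = list(combinations(aligned, 2))
    differences = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in pairs
    )
    return differences / (len(pairs) * length)


if __name__ == "__main__":
    # Toy alignment of one locus from three hypothetical accessions.
    locus = ["ACGT", "ACGA", "ACGT"]
    print(nucleotide_diversity(locus))
```

A locus would then be flagged as highly variable when this value exceeds a chosen cutoff, such as the 0.03 mentioned in the abstract.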


Zootaxa ◽  
2020 ◽  
Vol 4810 (2) ◽  
pp. 351-360
Author(s):  
CHAO DU ◽  
LI LIU ◽  
YUNPENG LIU ◽  
ZHAOHUI FU

The Eurasian Wryneck is a species of wryneck woodpecker breeding in temperate regions of Europe and Asia. We sequenced the mitochondrial genome of Jynx torquilla (Aves, Piciformes, Picidae) using next-generation sequencing. The circular genome is 16,832 bp long, encoding 13 protein-coding genes (PCGs), 22 transfer RNAs (tRNAs), two ribosomal RNAs (rRNAs), and two control regions. Gene order and orientation are similar to the most common type suggested as ancestral for birds, but the genome has a 1,221 bp control region and a 60 bp remnant control region. Phylogenetic analyses of 17 piciform taxa, based on both nucleotide and amino acid sequences of mitochondrial PCGs, strongly support the monophyly of Picidae. All phylogenetic trees indicate that the subfamily Jynginae is a monophyletic lineage sister to other woodpeckers, including monophyletic Picinae. Only the Bayesian tree inferred from the nucleotide dataset recovered Picumninae as monophyletic. These findings will be helpful for understanding the phylogeny and evolution of Picidae.


Genome ◽  
2012 ◽  
Vol 55 (3) ◽  
pp. 222-233 ◽  
Author(s):  
Natuo Kômoto ◽  
Kenji Yukuhiro ◽  
Shuichiro Tomita

Webspinners (order Embioptera) are polyneopteran insects characterized by enlarged foretarsi with silk glands, whose silk is used to produce galleries in which the insects live gregariously. The phylogenetic position of webspinners has been debated. In the present study, an almost complete mitochondrial DNA (mtDNA) sequence of Embioptera is reported for the first time. The mtDNA of a webspinner, Aposthonia japonica, has the 13 protein-coding genes (PCGs) generally found in metazoan mtDNA sequences. There is a translocation of a large region including atp6, atp8, cox3, nad3, and nad5 as well as a duplication of the 12S rRNA gene. The rearrangement does not seem to affect nucleotide composition, although amino acid composition in some parts of the mtDNA is biased compared with other Polyneoptera species. Based on phylogenetic analyses using nucleotide sequences of all PCGs concatenated with two rRNA genes and the amino acid sequences of all PCGs, A. japonica is sister to Verophasmatodea, a suborder of typical stick and leaf insects.

