De novo assembly and annotation of the mangrove cricket genome

2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Aya Satoh ◽  
Miwako Takasu ◽  
Kentaro Yano ◽  
Yohey Terai

Abstract Objectives The mangrove cricket, Apteronemobius asahinai, shows endogenous activity rhythms that synchronize with the tidal cycle (i.e., a free-running rhythm with a period of ~12.4 h [the circatidal rhythm]). Little is known about the molecular mechanisms underlying the circatidal rhythm. We present the draft genome of the mangrove cricket to facilitate future studies of the molecular mechanisms behind this rhythm. Data description The draft genome contains 151,060 scaffolds with a total length of 1.68 Gb (N50: 27 kb) and 92% BUSCO completeness. We obtained 28,831 predicted genes, of which 19,896 (69%) were successfully annotated using at least one of two databases (UniProtKB/Swiss-Prot and Pfam).
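The N50 reported above is a standard contiguity statistic: the length of the scaffold at which the running total, counted from the longest scaffold downward, first reaches half of the assembly length. The sketch below shows one common way to compute it, together with the companion L50 count. The function name and the toy lengths are illustrative assumptions and are not taken from the cricket assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold lengths in bp.

    N50: length of the scaffold at which the cumulative length,
         summed from the longest scaffold down, first reaches
         half of the total assembly length.
    L50: number of scaffolds needed to reach that point.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy scaffold lengths (hypothetical, not from any real assembly).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)
```

A low N50 relative to the total length, as in a 1.68 Gb assembly with an N50 of 27 kb, indicates that the sequence is spread over many short scaffolds.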

2022 ◽  
Author(s):  
Shinichi Morita ◽  
Tomoko F. Shibata ◽  
Tomoaki Nishiyama ◽  
Yuuki Kobayashi ◽  
Katsushi Yamaguchi ◽  
...  

Beetles are the largest insect order and one of the most successful animal groups in terms of number of species. The Japanese rhinoceros beetle Trypoxylus dichotomus (Coleoptera, Scarabaeidae, Dynastini) is a giant beetle with distinctive exaggerated horns present on the head and prothoracic regions of the male. T. dichotomus has been used as a research model in various fields such as evolutionary developmental biology, ecology, ethology, biomimetics, and drug discovery. In this study, a de novo assembly of 615 Mb, representing 80% of the genome size estimated by flow cytometry, was obtained using the 10x Chromium platform. The scaffold N50 length of the genome assembly was 8.02 Mb, with repetitive elements predicted to comprise 49.5% of the assembly. In total, 23,987 protein-coding genes were predicted in the genome. In addition, de novo assembly of the mitochondrial genome yielded a contig of 20,217 bp. We also analyzed the transcriptome by generating 16 RNA-seq libraries from a variety of tissues of both sexes and developmental stages, which allowed us to identify 13 co-expressed gene modules. The detailed genomic and transcriptomic information of T. dichotomus is the most comprehensive among those reported for any species of Dynastinae. This genomic information will be an excellent resource for further functional and evolutionary analyses, including the evolutionary origin and genetic regulation of beetle horns and the molecular mechanisms underlying sexual dimorphism.


2019 ◽  
Vol 11 (7) ◽  
pp. 1965-1970 ◽  
Author(s):  
Nikola Palevich ◽  
Paul H Maclean ◽  
Abdul Baten ◽  
Richard W Scott ◽  
David M Leathwick

Abstract Internal parasitic nematodes are a global animal health issue causing drastic losses in livestock. Here, we report a representative draft genome of Haemonchus contortus to serve as a genetic resource for the scientific community and to support future experimental research on molecular mechanisms in related parasites. A de novo hybrid assembly was generated from PCR-free whole-genome sequence data, resulting in a chromosome-level assembly that is 465 Mb in size and encodes 22,341 genes. The genome sequence presented here is consistent with the genome architecture of the existing Haemonchus species and is a valuable resource for future studies of the population genetic structure of parasitic nematodes. Additionally, comparative pan-genomics with other species of economically important parasitic nematodes has revealed highly open genomes and strong collinearities within the phylum Nematoda.


2019 ◽  
Vol 20 (18) ◽  
pp. 4334 ◽  
Author(s):  
Fradj ◽  
Gonçalves dos Santos ◽  
de Montigny ◽  
Awwad ◽  
Boumghar ◽  
...  

Chaga (Inonotus obliquus) is a medicinal fungus used in the traditional medicine of Native American and North Eurasian cultures. Several studies have demonstrated the medicinal properties of chaga’s bioactive molecules. For example, several terpenoids (e.g., betulin, betulinic acid and inotodiol) isolated from I. obliquus cells have proven effective against different types of tumor cells. However, the molecular mechanisms and regulation underlying the biosynthesis of chaga terpenoids remain unknown. In this study, we report on the optimization of growing conditions for cultured I. obliquus in the presence of different betulin sources (e.g., betulin or white birch bark). The best results were obtained with a liquid culture at pH 6.2 and 28 °C. In addition, a de novo assembly and characterization of the I. obliquus transcriptome under these growth conditions was performed using Illumina technology. A total of 219,288,500 clean reads were generated, allowing for the identification of 20,072 transcripts of I. obliquus, including transcripts involved in terpenoid biosynthesis. The differential expression of these genes was confirmed by quantitative PCR. This study provides new insights into the molecular mechanisms and regulation of I. obliquus terpenoid production. It also contributes useful molecular resources for gene prediction or the development of biotechnologies for the alternative production of terpenoids.


GigaScience ◽  
2020 ◽  
Vol 9 (1) ◽  
Author(s):  
Yujing Suo ◽  
Peng Sun ◽  
Huihui Cheng ◽  
Weijuan Han ◽  
Songfeng Diao ◽  
...  

Abstract Background Diospyros oleifera Cheng, of the family Ebenaceae, is an economically important tree. Phylogenetic analyses indicate that D. oleifera is closely related to Diospyros kaki Thunb. and could be used as a model plant for studies of D. kaki. Therefore, development of genomic resources of D. oleifera will facilitate auxiliary assembly of the hexaploid persimmon genome and elucidate the molecular mechanisms of important traits. Findings The D. oleifera genome was assembled with 443.6 Gb of raw reads using the Pacific Biosciences Sequel and Illumina HiSeq X Ten platforms. The final draft genome was ∼812.3 Mb and had a high level of continuity with N50 of 3.36 Mb. Fifteen scaffolds corresponding to the 15 chromosomes were assembled to a final size of 721.5 Mb using 332 scaffolds, accounting for 88.81% of the genome. Repeat sequences accounted for 54.8% of the genome. By de novo sequencing and analysis of homology with other plant species, 30,530 protein-coding genes with an average transcript size of 7,105.40 bp were annotated; of these, 28,580 protein-coding genes (93.61%) had conserved functional motifs or terms. In addition, 171 candidate genes involved in tannin synthesis and deastringency in persimmon were identified; of these, chalcone synthase (CHS) genes were expanded in the D. oleifera genome compared with Diospyros lotus, Camellia sinensis, and Vitis vinifera. Moreover, 186 positively selected genes were identified, including the chalcone isomerase (CHI) gene, which encodes a key enzyme in the flavonoid-anthocyanin pathway. Phylogenetic tree analysis indicated that the split of D. oleifera and D. lotus likely occurred 9.0 million years ago. In addition to the ancient γ event, a second whole-genome duplication event occurred in D. oleifera and D. lotus. Conclusions We generated a high-quality chromosome-level draft genome for D. oleifera, which will facilitate assembly of the hexaploid persimmon genome and further studies of major economic traits in the genus Diospyros.


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e9762
Author(s):  
Andres Benavides ◽  
Friman Sanchez ◽  
Juan F. Alzate ◽  
Felipe Cabarcas

Background A prime objective in metagenomics is to classify DNA sequence fragments into taxonomic units. It usually requires several stages: read quality control, de novo assembly, contig annotation, gene prediction, etc. These stages need very efficient programs because of the number of reads generated by such projects. Furthermore, the complexity of metagenomes requires efficient and automatic tools that orchestrate the different stages. Method DATMA is a pipeline for fast metagenomic analysis that orchestrates the following: sequencing quality control, 16S rRNA identification, read binning, de novo assembly and evaluation, gene prediction, and taxonomic annotation. Its distributed computing model can use multiple computing resources to reduce the analysis time. Results We used a controlled experiment to show DATMA functionality. We used two pre-annotated metagenomes to compare its accuracy and speed against other metagenomic frameworks. Then, with DATMA we recovered a draft genome of a novel Anaerolineaceae from a biosolid metagenome. Conclusions DATMA is a bioinformatics tool that automatically analyzes complex metagenomes. It is faster than similar tools and, in some cases, it can extract genomes that the other tools do not. DATMA is freely available at https://github.com/andvides/DATMA.


2021 ◽  
Vol 19 (3) ◽  
pp. e32
Author(s):  
Jeong-An Gim ◽  
Kyung-Wan Baek ◽  
Young-Sool Hah ◽  
Ho Jin Choo ◽  
Ji-Seok Kim ◽  
...  

Semisulcospira libertina, a species of freshwater snail, is widespread in East Asia. It is important as a food source. Additionally, it is a vector of clonorchiasis, paragonimiasis, metagonimiasis, and other parasitic diseases. Although S. libertina has ecological, commercial, and clinical importance, its whole genome has not been reported yet. Here, we revealed the genome of S. libertina through de novo assembly. We assembled the whole genome of S. libertina and determined its transcriptome for the first time using the Illumina NovaSeq 6000 platform. According to the k-mer analysis, the genome size of S. libertina was estimated to be 3.04 Gb. Using RepeatMasker, repeats were found to make up 53.68% of the genome assembly. The genome data of S. libertina reported in this study will be useful for the identification and conservation of S. libertina in East Asia.
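The abstract above does not say how the k-mer analysis was carried out. A common textbook approach, sketched here as an assumption rather than as the authors' method, divides the total number of k-mer occurrences by the depth at the main peak of the k-mer frequency histogram. The function name, the `min_depth` cutoff for discarding likely sequencing errors, and the toy histogram are all hypothetical.

```python
def estimate_genome_size(kmer_histogram, min_depth=2):
    """Estimate genome size from a k-mer depth histogram.

    kmer_histogram maps depth -> number of distinct k-mers seen
    at that depth. Depths below min_depth are treated as likely
    sequencing errors and ignored (an assumption of this sketch).
    """
    usable = {d: n for d, n in kmer_histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # The main peak approximates the sequencing depth of unique sequence.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences divided by peak depth gives the size estimate.
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth


# Toy histogram (hypothetical values, not from the S. libertina data).
toy_histogram = {1: 1000, 9: 50, 10: 100, 11: 50}
size = estimate_genome_size(toy_histogram)
print(size)
```

Dedicated tools fit a statistical model to the histogram rather than reading off a single peak, so this simple ratio should be taken only as a rough first estimate.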


Author(s):  
Pei-Ling Yu ◽  
James C. Fulton ◽  
Sandra L. Carmona ◽  
Diana Burbano-David ◽  
Luz Stella Barrero ◽  
...  

We report a draft genome assembly of the causal agent of tomato vascular wilt, Fusarium oxysporum f. sp. lycopersici isolate 59, obtained from the Andean region in Colombia.


BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Min Shi ◽  
Zhizhi Wang ◽  
Xiqian Ye ◽  
Hongqing Xie ◽  
Fei Li ◽  
...  

Abstract Background Parasitic insects are well-known biological control agents for arthropod pests worldwide. They are capable of regulating their host’s physiology, development and behaviour. However, many of the molecular mechanisms involved in host-parasitoid interaction remain unknown. Results We sequenced the genomes of two parasitic wasps (Cotesia vestalis and Diadromus collaris) that parasitize the diamondback moth Plutella xylostella using Illumina and PacBio sequencing platforms. Genome assembly using SOAPdenovo produced a 178 Mb draft genome for C. vestalis and a 399 Mb draft genome for D. collaris. Totals of 11,278 and 15,328 protein-coding genes for C. vestalis and D. collaris, respectively, were predicted using evidence-based (homology-based and transcriptome-based) and de novo prediction methods. Phylogenetic analysis showed that the braconid C. vestalis and the ichneumonid D. collaris diverged approximately 124 million years ago. These two wasps exhibit gene gains and losses that in some cases reflect their shared life history as parasitic wasps and in other cases are unique to particular species. Gene families with functions in development, nutrient acquisition from hosts, and metabolism have expanded in each wasp species, while genes required for biosynthesis of some amino acids and steroids have been lost, since these nutrients can be directly obtained from the host. Both wasp species encode a relatively higher number of neprilysins (NEPs) than thus far reported in arthropod genomes, while several genes encoding immune-related proteins and detoxification enzymes were lost in both wasp genomes. Conclusions We present the annotated genome sequences of two parasitic wasps, C. vestalis and D. collaris, which parasitize a common host, the diamondback moth, P. xylostella. These data will provide a fundamental source for studying the mechanism of host control and will be used in parasitoid comparative genomics to study the origin and diversification of the parasitic lifestyle.


Author(s):  
Nikolay Alabi ◽  
Yihan Wu ◽  
Oliver Bossdorf ◽  
Loren H Rieseberg ◽  
Robert I Colautti

Abstract The emerging field of invasion genetics examines the genetic causes and consequences of biological invasions, but few study systems are available that integrate deep ecological knowledge with genomic tools. Here we report on the de novo assembly and annotation of a genome for the biennial herb Alliaria petiolata (M. Bieb.) Cavara & Grande (Brassicaceae), which is widespread in Eurasia and invasive across much of temperate North America. Our goal was to sequence and annotate a genome to complement resources available from hundreds of published ecological studies, a global field survey, and hundreds of genetic lines maintained in Germany and Canada. We sequenced paired-end and mate-pair libraries at ∼70× coverage from a genotype (EFCC3-3-20) collected from the native range near Venice, Italy. A de novo assembly resulted in a highly continuous draft genome (N50 = 121 Mb; L50 = 2) with 99.7% of the 1.1 Gb genome mapping to scaffolds of at least 50 kb in length. A total of 64,770 predicted genes in the annotated genome include 99% of plant BUSCO genes and 98% of transcriptome reads. Consistent with previous reports of (auto)hexaploidy in western Europe, we found that almost one third of BUSCO genes (390/1440) mapped to two or more scaffolds despite < 2% genome-wide average heterozygosity. The continuity and gene space quality of our draft assembly will enable molecular and functional genomic studies of A. petiolata to address questions relevant to invasion genetics and conservation strategies.


Genome ◽  
2017 ◽  
Vol 60 (9) ◽  
pp. 743-755 ◽  
Author(s):  
Sorel Fitz-Gibbon ◽  
Andrew L. Hipp ◽  
Kasey K. Pham ◽  
Paul S. Manos ◽  
Victoria L. Sork

The emergence of next-generation sequencing has increased by several orders of magnitude the amount of data available for phylogenetics. Reduced representation approaches, such as restriction-site associated DNA sequencing (RADseq), have proven useful for phylogenetic studies of non-model species at a wide range of phylogenetic depths. However, analysis of these datasets is not uniform and we know little about the potential benefits and drawbacks of de novo assembly versus assembly by mapping to a reference genome. Using RADseq data for 83 oak samples representing 16 taxa, we identified variants via three pipelines: mapping sequence reads to a recently published draft genome of Quercus lobata, and de novo assembly under two sets of locus filters. For each pipeline, we inferred the maximum likelihood phylogeny. All pipelines produced similar trees, with minor shifts in relationships within well-supported clades, despite the fact that they yielded different numbers of loci (68 000 – 111 000 loci) and different degrees of overlap with the reference genome. We conclude that both the reference-aligned and de novo assembly pipelines yield reliable results, and that advantages and disadvantages of these approaches pertain mainly to downstream uses of RADseq data, not to phylogenetic inference per se.

