First annotated draft genomes of non-marine ostracods (Ostracoda, Crustacea) with different reproductive modes

2020 ◽  
Author(s):  
Patrick Tran Van ◽  
Yoann Anselmetti ◽  
Jens Bast ◽  
Zoé Dumas ◽  
Nicolas Galtier ◽  
...  

Abstract Ostracods are one of the oldest crustacean groups with an excellent fossil record and high importance for phylogenetic analyses, but genome resources for this class are still lacking. We have successfully assembled and annotated the first reference genomes for three species of non-marine ostracods: two with obligate sexual reproduction (Cyprideis torosa and Notodromas monacha) and the putative ancient asexual Darwinula stevensoni. This kind of genomic research has so far been impeded by the small size of most ostracods and the absence of genetic resources such as linkage maps or BAC libraries that were available for other crustaceans. For genome assembly, we used an Illumina-based sequencing technology, resulting in assemblies of similar sizes for the three species (335-382 Mb) and with scaffold numbers and their N50 (19-56 kb) in the same orders of magnitude. Gene annotations were guided by transcriptome data from each species. The three assemblies are relatively complete with BUSCO scores of 92-96%, and thus exceed the quality of several other published crustacean genomes obtained with similar techniques. The number of predicted genes (13,771-17,776) is in the same range as Branchiopoda genomes but lower than in most malacostracan genomes. These three reference genomes from non-marine ostracods provide the urgently needed basis to further develop ostracods as models for evolutionary and ecological research.
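The scaffold N50 quoted in the abstract is a standard contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name `n50` and the toy lengths are illustrative assumptions, not values or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Walk from the longest scaffold downwards, accumulating length,
    # and stop once half of the total assembly length is covered.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths for illustration only (not taken from the ostracod assemblies).
toy_scaffolds = [80, 70, 50, 40, 30, 20]
print(n50(toy_scaffolds))
```

In practice the lengths would be read from the assembly FASTA file; that step is left out here to keep the sketch self-contained.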

Author(s):  
Patrick Tran Van ◽  
Yoann Anselmetti ◽  
Jens Bast ◽  
Zoé Dumas ◽  
Nicolas Galtier ◽  
...  

Abstract Ostracods are one of the oldest crustacean groups with an excellent fossil record and high importance for phylogenetic analyses, but genome resources for this class are still lacking. We have successfully assembled and annotated the first reference genomes for three species of non-marine ostracods: two with obligate sexual reproduction (Cyprideis torosa and Notodromas monacha) and the putative ancient asexual Darwinula stevensoni. This kind of genomic research has so far been impeded by the small size of most ostracods and the absence of genetic resources such as linkage maps or BAC libraries that were available for other crustaceans. For genome assembly, we used an Illumina-based sequencing technology, resulting in assemblies of similar sizes for the three species (335-382 Mb) and with scaffold numbers and their N50 (19-56 kb) in the same orders of magnitude. Gene annotations were guided by transcriptome data from each species. The three assemblies are relatively complete with BUSCO scores of 92-96%. The number of predicted genes (13,771-17,776) is in the same range as Branchiopoda genomes but lower than in most malacostracan genomes. These three reference genomes from non-marine ostracods provide the urgently needed basis to further develop ostracods as models for evolutionary and ecological research.


Plant Science ◽  
2006 ◽  
Vol 170 (4) ◽  
pp. 889-896
Author(s):  
Yann-Rong Lin ◽  
Teh-Yuan Chow ◽  
Meizhong Luo ◽  
Dave Kudrna ◽  
Chih-Chi Lin ◽  
...  

2020 ◽  
Vol 36 (10) ◽  
pp. 3011-3017 ◽  
Author(s):  
Olga Mineeva ◽  
Mateo Rojas-Carulla ◽  
Ruth E Ley ◽  
Bernhard Schölkopf ◽  
Nicholas D Youngblut

Abstract Motivation: Methodological advances in metagenome assembly are rapidly increasing the number of published metagenome assemblies. However, identifying misassemblies is challenging due to a lack of closely related reference genomes that can act as pseudo ground truth. Existing reference-free methods are no longer maintained, can make strong assumptions that may not hold across a diversity of research projects, and have not been validated on large-scale metagenome assemblies. Results: We present DeepMAsED, a deep learning approach for identifying misassembled contigs without the need for reference genomes. Moreover, we provide an in silico pipeline for generating large-scale, realistic metagenome assemblies for comprehensive model training and testing. DeepMAsED accuracy substantially exceeds the state of the art when applied to large and complex metagenome assemblies. Our model estimates a 1% contig misassembly rate in two recent large-scale metagenome assembly publications. Conclusions: DeepMAsED accurately identifies misassemblies in metagenome-assembled contigs from a broad diversity of bacteria and archaea without the need for reference genomes or strong modeling assumptions. Running DeepMAsED is straightforward, as is model re-training with our dataset generation pipeline. Therefore, DeepMAsED is a flexible misassembly classifier that can be applied to a wide range of metagenome assembly projects. Availability and implementation: DeepMAsED is available from GitHub at https://github.com/leylabmpi/DeepMAsED. Supplementary information: Supplementary data are available at Bioinformatics online.


Insects ◽  
2019 ◽  
Vol 10 (4) ◽  
pp. 91
Author(s):  
Weidong Huang ◽  
Xiufeng Xie ◽  
Xinyue Liang ◽  
Xingmin Wang ◽  
Xiaosheng Chen

Obtaining genetic information from museum specimens is a fundamental component of many fields of research, including DNA barcoding, population genetics, conservation genetics, and phylogenetic analysis. However, acquiring genetic information from museum specimens is challenging because DNA damage and degradation make the target sequences difficult to amplify. Different pretreatments can significantly impact the purity and concentration of genomic DNA from museum specimens. Here, we assessed four pretreatment methods (0.9% NaCl buffer, phosphate-buffered saline (PBS), Saline Tris-EDTA (STE) buffer, and sterile water) to determine which pretreatment is most suitable for DNA extraction from dried specimens of ladybird beetles. We completed a comprehensive phylogenetic analysis to test whether the sequences obtained from dried specimens enable proper phylogenetic inference. Our results showed that pretreatment can improve the quality of DNA from dried specimens. The pretreatment effects of 0.9% NaCl buffer and STE buffer were better than those of PBS buffer and sterile water. The results of the phylogenetic analyses showed that museum specimens can be used to generate cogent phylogenetic inferences. We report the optimum pretreatment methods for DNA extraction from dried ladybird beetle specimens and provide evidence that phylogenetic relationships can be accurately determined for museum specimens.


BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Marina Athanasouli ◽  
Hanh Witte ◽  
Christian Weiler ◽  
Tobias Loschko ◽  
Gabi Eberhardt ◽  
...  

Abstract Background: Nematode model organisms such as Caenorhabditis elegans and Pristionchus pacificus are powerful systems for studying the evolution of gene function at a mechanistic level. However, the identification of P. pacificus orthologs of candidate genes known from C. elegans is complicated by the discrepancy in the quality of gene annotations, a common problem in nematode and invertebrate genomics. Results: Here, we combine comparative genomic screens for suspicious gene models with community-based curation to further improve the quality of gene annotations in P. pacificus. We extend previous curations of one-to-one orthologs to larger gene families and also orphan genes. Cross-species comparisons of protein lengths, screens for atypical domain combinations and species-specific orphan genes resulted in 4311 candidate genes that were subject to community-based curation. Corrections for 2946 gene models were implemented in a new version of the P. pacificus gene annotations. The new set of gene annotations contains 28,896 genes and has a single-copy ortholog completeness level of 97.6%. Conclusions: Our work demonstrates the effectiveness of comparative genomic screens to identify suspicious gene models and the scalability of community-based approaches to improve the quality of thousands of gene models. Similar community-based approaches can help to improve the quality of gene annotations in other invertebrate species, including parasitic nematodes.
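One of the screens named in the abstract, the cross-species comparison of protein lengths, can be illustrated with a rough sketch. The abstract does not give the criteria used, so everything here is an assumption for illustration: the function name `flag_length_outliers`, the ratio cut-offs `low` and `high`, and the toy gene identifiers and lengths are all hypothetical.

```python
def flag_length_outliers(pairs, low=0.5, high=2.0):
    """Flag ortholog pairs whose protein-length ratio is unusual.

    pairs: iterable of (gene_id, query_length, reference_length).
    low/high: hypothetical ratio cut-offs, not taken from the paper.
    """
    flagged = []
    for gene_id, query_len, ref_len in pairs:
        if ref_len <= 0:
            continue  # skip records without a usable reference length
        ratio = query_len / ref_len
        if ratio < low or ratio > high:
            flagged.append((gene_id, round(ratio, 2)))
    return flagged


# Toy records: (hypothetical gene id, query protein length, reference length).
toy_pairs = [
    ("gene_a", 100, 100),  # similar lengths, not flagged
    ("gene_b", 30, 100),   # much shorter than the reference
    ("gene_c", 450, 200),  # much longer than the reference
]
print(flag_length_outliers(toy_pairs))
```

A gene model flagged this way is only a candidate for inspection; in the study, candidates went on to community-based curation rather than being corrected automatically.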


BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Gokhan Yavas ◽  
Huixiao Hong ◽  
Wenming Xiao

Abstract Background: Accurate de novo genome assembly has become a reality with the advancements in sequencing technology. With the ever-increasing number of de novo genome assembly tools, assessing the quality of assemblies has become of great importance in genome research. Although many quality metrics have been proposed and software tools for calculating those metrics have been developed, the existing tools do not produce a unified measure to reflect the overall quality of an assembly. Results: To address this issue, we developed the de novo Assembly Quality Evaluation Tool (dnAQET) that generates a unified metric for benchmarking the quality assessment of assemblies. Our framework first calculates individual quality scores for the scaffolds/contigs of an assembly by aligning them to a reference genome. Next, it computes a quality score for the assembly using its overall reference genome coverage, the quality score distribution of its scaffolds and the redundancy identified in it. Using synthetic assemblies randomly generated from the latest human genome build, various builds of the reference genomes for five organisms and six de novo assemblies for sample NA24385, we tested dnAQET to assess its capability for benchmarking quality evaluation of genome assemblies. For synthetic data, our quality score increased with decreasing number of misassemblies and redundancy and increasing average contig length and coverage, as expected. For genome builds, the dnAQET quality score calculated for a more recent reference genome was better than the score for an older version. To compare with some of the most frequently used measures, 13 other quality measures were calculated. The quality score from dnAQET was found to be better than all other measures in terms of consistency with the known quality of the reference genomes, indicating that dnAQET is reliable for benchmarking quality assessment of de novo genome assemblies.
Conclusions: dnAQET is a scalable framework designed to evaluate a de novo genome assembly based on the aggregated quality of its scaffolds (or contigs). Our results demonstrated that the dnAQET quality score is reliable for benchmarking quality assessment of genome assemblies. dnAQET can help researchers to identify the most suitable assembly tools and to select high-quality assemblies.


2021 ◽  
Vol 2 (3) ◽  
pp. 4014-4028
Author(s):  
Chenghao Du

The novel coronavirus disease 2019 (COVID-19), originally identified in December 2019 in Wuhan, China, has grown into a worldwide pandemic, causing many cases of death and morbidity. Since COVID-19 vaccines are still in experimental stages without public access, different types of testing and detection that ensure rapid and accurate results are urgently required to avoid delays in isolating infected patients. The traditional diagnostic and analytical methods for COVID-19 have relied heavily on nucleic acid and antibody-antigen methods, but these are subject to assembly bias, are restricted by read length, show some false positive/negative results and have a long turnaround time. Hence, three styles of nanopore sequencing techniques are introduced as complementary tools for COVID-19 diagnosis and analysis. Long-read nanopore sequencing technology has recently been adopted in metagenomic and pathological studies of the virosphere, including SARS-CoV-2, by metagenomically, directly or indirectly sequencing the viral genomic RNA of SARS-CoV-2 in real time. These approaches are used to detect infected specimens for early isolation and treatment, and to investigate the transmission and evolutionary routes of SARS-CoV-2 as well as its pathogenicity and epidemiology. This article focuses on Nanopore-Based Metagenomic Sequencing, Direct RNA Nanopore Sequencing (DRS), and Nanopore Targeted Sequencing (NTS) as novel sequencing-based analytical methods for detecting COVID-19, and discusses them in comparison with other traditional and popular diagnostic methods. Finally, the different types of nanopore sequencing platforms developed by Oxford Nanopore Technologies (ONT) for various purposes and demands in viral genomic research are briefly discussed.


2018 ◽  
Author(s):  
Hongxing Qiao ◽  
Xiaojing Zhang ◽  
Hongtao Shi ◽  
Yuzhen Song ◽  
Chuanzhou Bian

Background. Astragalus is a well-known traditional herbal medicine, widely used in humans, livestock and poultry in China and East Asia. Fermentation by probiotics could improve its health-promoting biological substances. Methods. We investigated Astragalus that was fermented using probiotics including Enterococcus faecium, Lactobacillus plantarum and Enterococcus faecium + Lactobacillus plantarum, and applied the PacBio single molecule, real-time sequencing technology (SMRT) to evaluate the quality of Astragalus fermentation production. Results. We found the production rates of acetic acid, methylacetic acid, ethylacetic acid and lactic acid using E. faecium + L. plantarum fermentation were 1866.24 mg/kg on day 15, 203.80 mg/kg on day 30, 996.04 mg/kg on day 15 and 3081.99 mg/kg on day 20, respectively. Other production rates were: polysaccharides, 9.43%, 8.51% and 7.59% on day 10; saponins, 19.6912 mg/g, 21.6630 mg/g and 20.2084 mg/g on day 15; and flavonoids, 1.9032 mg/g, 2.0835 mg/g and 1.7086 mg/g on day 20 using E. faecium, L. plantarum and E. faecium + L. plantarum, respectively. According to SMRT analysis of the microbial compositions of nine Astragalus samples, we found that after fermentation on day 3, E. faecium and L. plantarum became the most prevalent species. Moreover, E. faecium + L. plantarum gave more positive effects than single strains in the Astragalus solid state fermentation process. Conclusion. Our data have demonstrated that the SMRT sequencing platform is applicable to assessing the quality of Astragalus fermentation.


2017 ◽  
Author(s):  
Thahmina Ali ◽  
Baekdoo Kim ◽  
Carlos Lijeron ◽  
Olorunseun O Ogunwobi ◽  
Raja Mazumder ◽  
...  

In translational medicine, the technology of RNA sequencing (RNA-seq) continues to prove powerful, and transforming the RNA-seq data into biological insights has become increasingly imperative. We present the Transcriptomics profiler for Easy Discovery (TED) toolkit, a comprehensive approach to processing and analyzing RNA-seq data. TED is divided into three major modules: data quality control, transcriptome data analysis, and data discovery, with eleven pipelines in total. These pipelines perform the preliminary steps from assessing and correcting the quality of the RNA-seq data, to the simultaneous analysis of five transcriptomic features (differentially expressed coding, non-coding, novel isoform genes, gene fusions, alternative splicing events, genetic variants of somatic and germline mutations) and ultimately translating the RNA-seq analysis findings into actionable, clinically-relevant reports. TED was evaluated using previously published prostate cancer transcriptome data where we observed previously studied outcomes, and also created a knowledge database of highly-integrated, biologically relevant reports demonstrating that it is well-positioned for clinical applications. TED is implemented on an instance of the Galaxy platform (Galaxy page: http://galaxy.hunter.cuny.edu/u/bioitcore/p/transcriptomics-profiler-for-easy-discovery-ted-toolkit, Documentation Manual: http://ted.readthedocs.io/en/latest/index.html) as intuitive and reproducible pipelines providing a manageable strategy for conducting substantial transcriptome analysis in a routine and sustainable fashion for bioinformatics researchers and clinicians alike.

