assembly pipeline
Recently Published Documents


TOTAL DOCUMENTS

71
(FIVE YEARS 36)

H-INDEX

10
(FIVE YEARS 3)

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Zack Saud ◽  
Matthew D. Hitchings ◽  
Tariq M. Butt

Abstract: DNA viruses can exploit host cellular epigenetic processes to their advantage; however, the epigenome status of most DNA viruses remains undetermined. Third-generation sequencing technologies allow for the identification of modified nucleotides from sequencing experiments without specialized sample preparation, permitting the detection of non-canonical epigenetic modifications that may distinguish viral nucleic acid from that of their host, thus identifying attractive targets for advanced therapeutics and diagnostics. We present a novel nanopore de novo assembly pipeline used to assemble a misidentified Camelpox vaccine. Two confirmed deletions of this vaccine strain in comparison to the closely related Vaccinia virus strain Modified Vaccinia Ankara make it one of the smallest non-vector derived orthopoxvirus genomes to be reported. Annotation of the assembly revealed a previously unreported signal peptide at the start of protein A38 and several predicted signal peptides that were found to differ from those previously described. Putative epigenetic modifications around various motifs have been identified and the assembly confirmed previous work showing the vaccine genome to most closely resemble that of Vaccinia virus strain Modified Vaccinia Ankara. The pipeline may be used for other DNA viruses, increasing the understanding of DNA virus evolution, virulence, host preference, and epigenomics.


2021 ◽  
Author(s):  
Philipp Matthias Schäfer ◽  
Franz Steinmetz ◽  
Stefan Schneyer ◽  
Timo Bachmann ◽  
Thomas Eiband ◽  
...  

Technology has sufficiently matured to enable, in principle, flexible and autonomous robotic assembly systems. In practice, however, this requires making all the relevant (implicit) knowledge that system engineers and workers have about products to be assembled, tasks to be performed, and robots and their skills explicitly available to the system. Only then can the planning and execution components of a robotic assembly pipeline communicate with each other in the same language and solve tasks autonomously without human intervention. This is why we have developed the Factory of the Future (FoF) ontology. At its core, this ontology models the tasks that are necessary to assemble a product and the robotic skills that can be employed to complete those tasks. The FoF ontology is based on existing standards. We started with theoretical considerations and iteratively adapted the ontology based on practical experience gained from incorporating more and more of the components required for automated planning and assembly. Furthermore, we propose tools to extend the ontology for specific scenarios with knowledge about parts, robots, tools, and skills from various sources. The resulting scenario ontology serves as a world model for the robotic systems and other components of the assembly process. A central runtime interface to this world model provides fast and easy access to the knowledge during execution. In this work, we also show the integration of a graphical user front-end, an assembly planner, a workspace reconfigurator, and further components of the assembly pipeline that all communicate with the help of the FoF ontology. Overall, our integration of the FoF ontology with the other components of a robotic assembly pipeline shows that using an ontology is a practical method to establish a common language and understanding between the involved components.


2021 ◽  
Vol 17 (8) ◽  
pp. e1009304
Author(s):  
Nikolas Dovrolis ◽  
Katerina Kassela ◽  
Konstantinos Konstantinidis ◽  
Adamantia Kouvela ◽  
Stavroula Veletza ◽  
...  

Viral metagenomics, also known as virome studies, has yielded an unprecedented number of novel sequences, essential in recognizing and characterizing the etiological agent and the origin of emerging infectious diseases. Several tools and pipelines have been developed to date for the identification and assembly of viral genomes. Assembly pipelines often result in viral genomes contaminated with host genetic material, some of which are currently deposited in public databases. In the current report, we present a group of deposited sequences that contain ribosomal RNA (rRNA) contamination. We highlight the detrimental role of chimeric next-generation sequencing reads, formed between host rRNA sequences and viral sequences, in virus genome assembly, and we present the hindrances these reads may pose to current methodologies. We have further developed a refining pipeline, the Zero Waste Algorithm (ZWA), that assists in the assembly of low-abundance viral genomes. ZWA performs context-dependent trimming of chimeric reads, precisely removing their rRNA moiety. These otherwise discarded reads were fed to the assembly pipeline and assisted in the construction of larger and cleaner contigs, making a substantial impact on current assembly methodologies. The ZWA pipeline may significantly enhance virus genome assembly from low-abundance samples and virus metagenomics approaches in which a small number of reads determine genome quality and integrity.
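The idea of trimming the rRNA moiety from a chimeric read, rather than discarding the whole read, can be illustrated with a minimal sketch. This is not the ZWA implementation: the function name `trim_chimeric_read`, the `min_len` threshold, and the assumption that the rRNA portion is already known as a half-open interval on the read (for example from a prior alignment step) are all hypothetical choices made for illustration.

```python
from typing import Optional


def trim_chimeric_read(seq: str, rrna_start: int, rrna_end: int,
                       min_len: int = 30) -> Optional[str]:
    """Remove the rRNA interval [rrna_start, rrna_end) from a read.

    Hypothetical helper: returns the longer non-rRNA flank if it is at
    least min_len bases long, otherwise None (the read is dropped).
    """
    if not (0 <= rrna_start <= rrna_end <= len(seq)):
        raise ValueError("rRNA interval must lie within the read")

    left_flank = seq[:rrna_start]    # bases before the rRNA match
    right_flank = seq[rrna_end:]     # bases after the rRNA match

    # Keep whichever flank carries more (presumably viral) sequence.
    best = left_flank if len(left_flank) >= len(right_flank) else right_flank
    return best if len(best) >= min_len else None


if __name__ == "__main__":
    # Toy read: 10 bases assumed to be rRNA, followed by 10 other bases.
    read = "AAAAACCCCC" + "GGGGGGGGGG"
    print(trim_chimeric_read(read, 0, 10, min_len=5))
```

A real pipeline would also need to trim the matching quality string and handle reads where both flanks are worth keeping; the sketch only shows the core step of salvaging the non-rRNA portion.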


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Valentine Murigneux ◽  
Leah W. Roberts ◽  
Brian M. Forde ◽  
Minh-Duy Phan ◽  
Nguyen Thi Khanh Nhu ◽  
...  

Background: Oxford Nanopore Technology (ONT) long-read sequencing has become a popular platform for microbial researchers due to the accessibility and affordability of its devices. However, easy and automated construction of high-quality bacterial genomes using nanopore reads remains challenging. Here we aimed to create a reproducible end-to-end bacterial genome assembly pipeline using ONT in combination with Illumina sequencing. Results: We evaluated the performance of several popular tools used during genome reconstruction, including base-calling, filtering, assembly, and polishing. We also assessed overall genome accuracy using ONT both natively and with Illumina. All steps were validated using the high-quality complete reference genome for the Escherichia coli sequence type (ST)131 strain EC958. The software chosen at each stage was incorporated into our final pipeline, MicroPIPE. Further validation of MicroPIPE was carried out using 11 additional ST131 E. coli isolates, which demonstrated that complete circularised chromosomes and plasmids could be achieved without manual intervention. Twelve publicly available Gram-negative and Gram-positive bacterial genomes (with available raw ONT data and matched complete genomes) were also assembled using MicroPIPE. We found that revised basecalling and updated assembly of the majority of these genomes resulted in improved accuracy compared to the current publicly available complete genomes. Conclusions: MicroPIPE is built in modules using Singularity container images and the bioinformatics workflow manager Nextflow, allowing changes and adjustments to be made in response to future tool development. Overall, MicroPIPE provides an easy-access, end-to-end solution for attaining high-quality bacterial genomes. MicroPIPE is available at https://github.com/BeatsonLab-MicrobialGenomics/micropipe.


F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 289
Author(s):  
Xiao Ma ◽  
Jeanine L. Olsen ◽  
Thorsten B.H. Reusch ◽  
Gabriele Procaccini ◽  
Dave Kudrna ◽  
...  

Background: Seagrasses (Alismatales) are the only fully marine angiosperms. Zostera marina (eelgrass) plays a crucial role in the functioning of coastal marine ecosystems and global carbon sequestration. It is the most widely studied seagrass and has become a marine model system for exploring adaptation under rapid climate change. The original draft genome (v.1.0) of the seagrass Z. marina (L.) was based on a combination of Illumina mate-pair libraries and fosmid-ends. A total of 25.55 Gb of Illumina and 0.14 Gb of Sanger sequence was obtained, representing 47.7× genomic coverage. The assembly resulted in ~2000 unordered scaffolds (L50 of 486 Kb), a final genome assembly size of 203 Mb, 20,450 protein-coding genes and 63% TE content. Here, we present an upgraded chromosome-scale genome assembly and compare v.1.0 and the new v.3.1, reconfirming previous results from Olsen et al. (2016), as well as pointing out new findings. Methods: The same high molecular weight DNA used in the original sequencing of the Finnish clone was used. A high-quality reference genome was assembled with the MECAT assembly pipeline combining PacBio long-read sequencing and Hi-C scaffolding. Results: In total, 75.97 Gb of PacBio data was produced. The final assembly comprises six pseudo-chromosomes and 304 unanchored scaffolds with a total length of 260.5 Mb and an N50 of 34.6 Mb, showing high contiguity and few gaps (~0.5%). 21,483 protein-encoding genes are annotated in this assembly, of which 20,665 (96.2%) obtained at least one functional assignment based on similarity to known proteins. Conclusions: As an important marine angiosperm, the improved Z. marina genome assembly will further assist evolutionary, ecological, and comparative genomics at the chromosome level. The new genome assembly will further our understanding of the structural and physiological adaptations from land to marine life.
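The N50 contiguity figure quoted above follows the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that calculation is given below; the function name `n50` is an illustrative choice, and the lengths in the example are toy values, not data from the Z. marina assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)  # longest first
    if not ordered:
        raise ValueError("at least one length is required")

    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the running sum
        # to at least half of the total assembly length.
        if 2 * running >= total:
            return length
    return ordered[-1]  # not reached for non-negative lengths


if __name__ == "__main__":
    # Toy example: total length 20, so the half-way point is 10.
    print(n50([2, 3, 4, 5, 6]))
```

Comparing `2 * running` with `total` avoids floating-point division when the total length is odd.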


Author(s):  
Jin Sun ◽  
Runsheng Li ◽  
Chong Chen ◽  
Julia D. Sigwart ◽  
Kevin M. Kocot

Choosing the optimum assembly approach is essential to achieving a high-quality genome assembly suitable for comparative and evolutionary genomic investigations. Significant recent progress in long-read sequencing technologies such as PacBio and Oxford Nanopore Technologies (ONT) has also brought about a large variety of assemblers. Although these have been extensively tested on model species such as Homo sapiens and Drosophila melanogaster, such benchmarking has not been done in Mollusca, which lacks widely adopted model species. Molluscan genomes are notoriously rich in repeats and are often highly heterozygous, making their assembly challenging. Here, we benchmarked 10 assemblers based on ONT raw reads from two published molluscan genomes of differing properties, the gastropod Chrysomallon squamiferum (356.6 Mb, 1.59% heterozygosity) and the bivalve Mytilus coruscus (1593 Mb, 1.94% heterozygosity). By optimizing the assembly pipeline, we greatly improved both genomes from previously published versions. Our results suggested that 40–50X of ONT reads are sufficient for high-quality genomes, with Flye being the recommended assembler for compact and less heterozygous genomes exemplified by C. squamiferum, while NextDenovo excelled for more repetitive and heterozygous molluscan genomes exemplified by M. coruscus. A phylogenomic analysis using the two updated genomes with 32 other published high-quality lophotrochozoan genomes resulted in maximum support across all nodes, and we show that improved genome quality also leads to more complete matrices for phylogenomic inferences. Our benchmarking will ensure efficiency in future assemblies for molluscs and perhaps also for other marine phyla with few genomes available. This article is part of the Theo Murphy meeting issue ‘Molluscan genomics: broad insights and future directions for a neglected phylum’.


2021 ◽  
Vol 10 (10) ◽  
Author(s):  
John M. Farrow ◽  
Everett C. Pesci ◽  
Daniel J. Slade

Abstract: Here, we report a complete genome sequence for Acinetobacter baumannii strain ATCC 17961, with plasmid sequences, and a high-quality (>98% complete) build for A. baumannii strain AB09-003. These genome sequences were generated by combining short-read Illumina and long-read Oxford Nanopore MinION sequencing data using the Unicycler hybrid assembly pipeline.


2021 ◽  
Author(s):  
Valentine Murigneux ◽  
Leah W. Roberts ◽  
Brian M. Forde ◽  
Minh-Duy Phan ◽  
Nguyen Thi Khanh Nhu ◽  
...  

Abstract: Oxford Nanopore Technology (ONT) long-read sequencing has become a popular platform for microbial researchers; however, easy and automated construction of high-quality bacterial genomes remains challenging. Here we present MicroPIPE: a reproducible end-to-end bacterial genome assembly pipeline for ONT and Illumina sequencing. To construct MicroPIPE, we evaluated the performance of several tools for genome reconstruction and assessed overall genome accuracy using ONT both natively and with Illumina. Further validation of MicroPIPE was carried out using 11 sequence type (ST)131 Escherichia coli and eight publicly available Gram-negative and Gram-positive bacterial isolates. MicroPIPE uses Singularity containers and the workflow manager Nextflow and is available at https://github.com/BeatsonLab-MicrobialGenomics/micropipe.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Zack Saud ◽  
Alexandra M. Kortsinoglou ◽  
Vassili N. Kouvelis ◽  
Tariq M. Butt

Background: More accurate and complete reference genomes have improved understanding of gene function, biology, and evolutionary mechanisms. Hybrid genome assembly approaches leverage the benefits of both long, relatively error-prone reads from third-generation sequencing technologies and short, accurate reads from second-generation sequencing technologies to produce more accurate and contiguous de novo genome assemblies than either technology used independently. In this study, we present a novel hybrid assembly pipeline that allowed for both mitogenome de novo assembly and telomere-length de novo assembly of all 7 chromosomes of the model entomopathogenic fungus Metarhizium brunneum. Results: The improved assembly allowed for better ab initio gene prediction, and a more BUSCO-complete proteome set has been generated in comparison to the eight current NCBI reference Metarhizium spp. genomes. Remarkably, we note that including the mitogenome in ab initio gene prediction training improved overall gene prediction. The assembly was further validated by comparing contig assembly agreement across various assemblers, assessing the assembly performance of each tool. Genomic synteny and orthologous protein clusters were compared between Metarhizium brunneum and three other Hypocreales species with complete genomes, identifying core proteins and listing orthologous protein clusters shared uniquely between the two entomopathogenic fungal species, so as to further facilitate the understanding of molecular mechanisms underpinning fungal-insect pathogenesis. Conclusions: The novel assembly pipeline may be used for other haploid fungal species, addressing the need to produce high-quality reference fungal genomes and leading to a better understanding of fungal genomic evolution, chromosome structuring and gene regulation.

