hybpiper-rbgv and yang-and-smith-rbgv: Containerization and additional options for assembly and paralog detection in target enrichment data

PREMISE: The HybPiper pipeline has become one of the most widely used tools for the assembly of target enrichment (sequence capture) data for phylogenomic analysis. Between the production of locus sequences and phylogenetic analysis, the identification of paralogs is a critical step ensuring accurate inference of evolutionary relationships. Algorithmic approaches using gene tree topologies for the inference of ortholog groups are computationally efficient and broadly applicable to non-model organisms, especially in the absence of a known species tree. Unfortunately, software compatibility issues, unfamiliarity with relevant programming languages, and the complexity involved in running numerous subsequent analysis steps continue to limit the broad uptake of these approaches and constrain their application in practice. METHODS AND RESULTS: We updated the scripts constituting HybPiper and a pipeline for the inference of ortholog groups ("Yang and Smith") to provide novel options for the treatment of supercontigs, remove bugs, and seamlessly use the outputs of the former as inputs for the latter. The pipelines were containerised using Singularity and implemented via two Nextflow pipelines for easier deployment and to vastly reduce the number of commands required for their use. We tested the pipelines with several datasets, one of which is presented for demonstration. CONCLUSIONS: hybpiper-rbgv and yang-and-smith-rbgv provide easy installation, user-friendly experience, and robust results to the phylogenetic community. They are presently used as the analysis pipeline of the Australian Angiosperm Tree of Life project. The pipelines are available at https://github.com/chrisjackson-pellicle.

Simulation and Emulation Tools for Fog Computing

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813999201002152003 ◽

2020 ◽

Vol 13 ◽

Author(s):

Simar Preet Singh ◽

Rajesh Kumar ◽

Anju Sharma ◽

S. Raji Reddy ◽

Priyanka Vashisht

Keyword(s):

Quality Of Service ◽

Programming Languages ◽

Traffic Management ◽

Fog Computing ◽

Research Work ◽

Research Experience ◽

Computing Paradigm ◽

Quality Of Service Parameters ◽

User Friendly

Background: The fog computing paradigm has recently emerged and gained considerable attention in the present era of the Internet of Things. The growing number of connected devices produces packet flows across the whole Internet. To cope with this and to provide computation at the network edge, fog computing is needed: it improves traffic management and helps avoid critical situations such as jams and congestion. Methods: For research purposes, fog computing scenarios can be implemented in several ways, e.g., real-time implementation, emulators, or simulators. The present study describes the various simulation and emulation tools for implementing fog computing scenarios. Results: The review shows that iFogSim is the simulator that most researchers use in their work. Among emulators, EmuFog is being adopted faster than the other available emulators. This might be due to the ease of implementation and user-friendly nature of these tools and to the language on which they are based. The use of such tools improves the research experience and leads to improved quality of service parameters (such as bandwidth, network, and security). Conclusion: There are many fog computing simulators and emulators, built on different platforms and using different programming languages. The paper concludes that the two main simulation and emulation tools in the area of fog computing are iFogSim and EmuFog. The accessibility and ease of use of these tools improve the research experience and lead to improved quality of service parameters.

BREC: an R package/Shiny app for automatically identifying heterochromatin boundaries and estimating local recombination rates along chromosomes

BMC Bioinformatics ◽

10.1186/s12859-021-04233-1 ◽

2021 ◽

Vol 22 (S6) ◽

Author(s):

Yasmine Mansour ◽

Annie Chateau ◽

Anna-Sophie Fiston-Lavier

Keyword(s):

Data Quality ◽

Data Science ◽

Fruit Fly ◽

R Package ◽

Model Organisms ◽

Data Quality Control ◽

Recombination Rates ◽

Functional Dynamics ◽

Shiny App ◽

User Friendly

Abstract Background: Meiotic recombination is a vital biological process that plays an essential role in a genome's structural and functional dynamics. Genomes exhibit highly variable recombination profiles along chromosomes, associated with several chromatin states. However, eu-heterochromatin boundaries are neither available nor easily obtained for non-model organisms, especially newly sequenced ones. Hence, we lack the accurate local recombination rates needed to address evolutionary questions. Results: Here, we propose an automated computational tool, based on the Marey maps method, that identifies heterochromatin boundaries along chromosomes and estimates local recombination rates. Our method, called BREC (heterochromatin Boundaries and RECombination rate estimates), is non-genome-specific, running even on non-model genomes as long as genetic and physical maps are available. BREC is based on pure statistics and is data-driven, implying that good input data quality remains a strong requirement. Therefore, a data pre-processing module (data quality control and cleaning) is provided. Experiments show that BREC handles different marker density and distribution issues. Conclusions: BREC's heterochromatin boundaries have been validated against cytological equivalents experimentally generated on the fruit fly Drosophila melanogaster genome, for which BREC returns congruent values. BREC's recombination rates have also been compared with previously reported estimates. Based on the promising results, we believe our tool has the potential to help bring data science into the service of genome biology and evolution. We introduce BREC as an R package and a user-friendly, web-based Shiny application, yielding a fast, easy-to-use, and broadly accessible resource. The BREC R package is available at the GitHub repository https://github.com/GenomeStructureOrganization.
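As a rough illustration of the Marey map idea mentioned in this abstract, the sketch below estimates local recombination rates as the slope of genetic position (cM) against physical position (Mb) between consecutive markers. It is a plain finite-difference simplification and not BREC's actual algorithm; the function name and the marker values are hypothetical.

```python
def local_recombination_rates(physical_mb, genetic_cm):
    """Estimate local recombination rates from a Marey map.

    Returns a list of (midpoint_mb, rate_cm_per_mb) tuples, one per pair
    of adjacent markers, using the slope between the two markers.
    """
    if len(physical_mb) != len(genetic_cm):
        raise ValueError("physical and genetic maps must have the same length")

    # Order markers by physical position along the chromosome.
    markers = sorted(zip(physical_mb, genetic_cm))

    rates = []
    for (p0, g0), (p1, g1) in zip(markers, markers[1:]):
        if p1 == p0:
            # Two markers at the same physical position give no slope.
            continue
        midpoint = (p0 + p1) / 2
        rate = (g1 - g0) / (p1 - p0)  # cM per Mb
        rates.append((midpoint, rate))
    return rates


# Hypothetical markers: recombination is high near the chromosome start
# and drops toward the end, as it might when approaching heterochromatin.
physical = [0.0, 1.0, 2.0, 4.0]   # Mb
genetic = [0.0, 2.0, 2.5, 3.0]    # cM

for midpoint, rate in local_recombination_rates(physical, genetic):
    print(f"{midpoint:.1f} Mb: {rate:.2f} cM/Mb")
```

A slope close to zero over a long physical interval is the kind of signal a boundary-detection method could look for, although how BREC itself locates boundaries is not specified in the abstract.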

A Universal, Genomewide GuideFinder for CRISPR/Cas9 Targeting in Microbial Genomes

mSphere ◽

10.1128/msphere.00086-20 ◽

2020 ◽

Vol 5 (1) ◽

Author(s):

Michelle Spoto ◽

Changhui Guan ◽

Elizabeth Fleming ◽

Julia Oh

Keyword(s):

Gene Function ◽

Large Scale ◽

Essential Gene ◽

Bacterial Species ◽

Bacterial Genome ◽

Model Organisms ◽

Design Parameters ◽

Bacterial Genomes ◽

Wide Range ◽

User Friendly

ABSTRACT The CRISPR/Cas system has significant potential to facilitate gene editing in a variety of bacterial species. CRISPR interference (CRISPRi) and CRISPR activation (CRISPRa) represent modifications of the CRISPR/Cas9 system utilizing a catalytically inactive Cas9 protein for transcription repression and activation, respectively. While CRISPRi and CRISPRa have tremendous potential to systematically investigate gene function in bacteria, few programs are specifically tailored to identify guides in draft bacterial genomes genomewide. Furthermore, few programs offer open-source code with flexible design parameters for bacterial targeting. To address these limitations, we created GuideFinder, a customizable, user-friendly program that can design guides for any annotated bacterial genome. GuideFinder designs guides from NGG protospacer-adjacent motif (PAM) sites for any number of genes by the use of an annotated genome and FASTA file input by the user. Guides are filtered according to user-defined design parameters and removed if they contain any off-target matches. Iteration with lowered parameter thresholds allows the program to design guides for genes that did not produce guides with the more stringent parameters, one of several features unique to GuideFinder. GuideFinder can also identify paired guides for targeting multiplicity, whose validity we tested experimentally. GuideFinder has been tested on a variety of diverse bacterial genomes, finding guides for 95% of genes on average. Moreover, guides designed by the program are functionally useful—focusing on CRISPRi as a potential application—as demonstrated by essential gene knockdown in two staphylococcal species. Through the large-scale generation of guides, this open-access software will improve accessibility to CRISPR/Cas studies of a variety of bacterial species. 
IMPORTANCE With the explosion in our understanding of human and environmental microbial diversity, corresponding efforts to understand gene function in these organisms are strongly needed. CRISPR/Cas9 technology has revolutionized interrogation of gene function in a wide variety of model organisms. Efficient CRISPR guide design is required for systematic gene targeting. However, existing tools are not adapted for the broad needs of microbial targeting, which include extraordinary species and subspecies genetic diversity, the overwhelming majority of which is characterized by draft genomes. In addition, flexibility in guide design parameters is important to consider the wide range of factors that can affect guide efficacy, many of which can be species and strain specific. We designed GuideFinder, a customizable, user-friendly program that addresses the limitations of existing software and that can design guides for any annotated bacterial genome with numerous features that facilitate guide design in a wide variety of microorganisms.
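The guide-design step described in this abstract can be sketched in a few lines. The example below scans only the forward strand for 20-nt protospacers followed by an NGG PAM and applies a GC-content filter as one example of a user-defined design parameter. It is a hypothetical simplification, not GuideFinder's code: the function name, the default thresholds, and the test sequence are assumptions, and no off-target check is performed.

```python
def find_ngg_guides(sequence, guide_length=20, min_gc=0.35, max_gc=0.75):
    """Return (start, guide, pam) tuples for forward-strand NGG sites.

    A candidate is kept when the guide contains only A/C/G/T, is followed
    by a 3-nt PAM of the form NGG, and its GC fraction lies within
    [min_gc, max_gc].
    """
    seq = sequence.upper()
    guides = []
    # The last valid start leaves room for the guide plus a 3-nt PAM.
    for start in range(len(seq) - guide_length - 2):
        guide = seq[start:start + guide_length]
        pam = seq[start + guide_length:start + guide_length + 3]
        if any(base not in "ACGT" for base in guide):
            continue
        if pam[0] not in "ACGT" or pam[1:] != "GG":
            continue
        gc_fraction = (guide.count("G") + guide.count("C")) / guide_length
        if min_gc <= gc_fraction <= max_gc:
            guides.append((start, guide, pam))
    return guides


# Hypothetical sequence: one 20-nt guide (50% GC) followed by a TGG PAM.
example = "ATGCATGCATGCATGCATGC" + "TGG" + "AAAA"
for start, guide, pam in find_ngg_guides(example):
    print(start, guide, pam)
```

Relaxing `min_gc` and `max_gc` and rerunning the scan for genes that returned no guides would mirror, in spirit, the iteration with lowered thresholds that the abstract describes.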

miR2Diabetes: A Literature-Curated Database of microRNA Expression Patterns, in Diabetic Microvascular Complications

Genes ◽

10.3390/genes10100784 ◽

2019 ◽

Vol 10 (10) ◽

pp. 784 ◽

Cited By ~ 1

Author(s):

Sungjin Park ◽

SeongRyeol Moon ◽

Kiyoung Lee ◽

Ie Byung Park ◽

Dae Ho Lee ◽

...

Keyword(s):

Association Studies ◽

Microvascular Complications ◽

Expression Patterns ◽

Disease Model ◽

Model Organisms ◽

Web Interface ◽

Diabetic Microvascular Complications ◽

User Friendly ◽

Rats And Mice ◽

Kidney Liver

microRNAs (miRNAs) have been established as critical regulators of the pathogenesis of diabetes mellitus (DM) and diabetic microvascular complications (DMCs). However, manually curated databases of association studies between miRNAs and DM (including DMCs) have yet to be established. Here, we constructed a user-friendly database, “miR2Diabetes,” equipped with a graphical web interface for simple browsing or searching of manually curated annotations. The annotations in our database cover 14 DM and DMC phenotypes, involving 156 miRNAs, and can be browsed by diverse sample origins (e.g., blood, kidney, liver, and other tissues). Additionally, we provide miRNA annotations for disease-model organisms of DM and DMCs (including rats and mice), for the purpose of improving knowledge of the biological complexity of these pathologies. We assert that our database will be a comprehensive resource for miRNA biomarker studies, as well as for prioritizing miRNAs for functional validation, in DM and DMCs, with likely extension to other diseases.

BREC: An R package/Shiny app for automatically identifying heterochromatin boundaries and estimating local recombination rates along chromosomes

10.1101/2020.06.29.178095 ◽

2020 ◽

Author(s):

Yasmine Mansour ◽

Annie Chateau ◽

Anna-Sophie Fiston-Lavier

Keyword(s):

Data Quality ◽

Data Science ◽

Fruit Fly ◽

R Package ◽

Model Organisms ◽

Data Quality Control ◽

Recombination Rates ◽

Functional Dynamics ◽

Shiny App ◽

User Friendly

Abstract Motivation: Meiotic recombination is a vital biological process that plays an essential role in a genome's structural and functional dynamics. Genomes exhibit highly variable recombination profiles along chromosomes, associated with several chromatin states. However, eu-heterochromatin boundaries are neither available nor easily obtained for non-model organisms, especially newly sequenced ones. Hence, we lack the accurate local recombination rates needed to address evolutionary questions. Results: Here, we propose an automated computational tool, based on the Marey maps method, that identifies heterochromatin boundaries along chromosomes and estimates local recombination rates. Our method, called BREC (heterochromatin Boundaries and RECombination rate estimates), is non-genome-specific, running even on non-model genomes as long as genetic and physical maps are available. BREC is based on pure statistics and is data-driven, implying that good input data quality remains a strong requirement. Therefore, a data pre-processing module (data quality control and cleaning) is provided. Experiments show that BREC handles different marker density and distribution issues. BREC’s heterochromatin boundaries have been validated against cytological equivalents experimentally generated on the fruit fly Drosophila melanogaster genome, for which BREC returns congruent values. BREC’s recombination rates have also been compared with previously reported estimates. Based on the promising results, we believe our tool has the potential to help bring data science into the service of genome biology and evolution. We introduce BREC as an R package and a user-friendly, web-based Shiny application, yielding a fast, easy-to-use, and broadly accessible resource. Availability: The BREC R package is available at the GitHub repository https://github.com/ymansour21/BREC.

ADVANTAGES AND DISADVANTAGES OF DELPHI BORLAND MUHAMMAD DENNY PRAYOGA 165100046

10.31219/osf.io/83j9b ◽

2018 ◽

Author(s):

Muhammad denny prayoga

Keyword(s):

Programming Languages ◽

Programming Language ◽

Sql Server ◽

Database Applications ◽

Complex Component ◽

Software Applications ◽

Advantages And Disadvantages ◽

User Friendly ◽

Visual Interface

Delphi is an IDE and compiler for the Pascal programming language. Borland Delphi is a programming language that attracted computer programmers as soon as it was first launched. This is because Delphi provides facilities for creating applications with an easy visual interface and gives satisfactory results. However, like any product, it has shortcomings. Here are the advantages and disadvantages of Delphi. Advantages: 1. It is freeware. 2. It has a user-friendly design for beginner programmers. 3. It has a fast compilation speed. 4. It has very complex components for building software, from applications to databases. 5. It has a default database plugin application (BDE). 6. The version is always updated; it has now reached Delphi version 2009. 7. The resulting application can be a portable executable or an installable executable file. 8. It is very easy to connect to various database applications, for example BDE, Access, MySQL, SQL Server, Oracle, and other databases. Disadvantages: 1. One of Delphi's shortcomings is the result of compilation: the *.exe file will definitely overwhelm the memory.

Phylogenomics reveals an extensive history of genome duplication in diatoms (Bacillariophyta)

10.1101/181115 ◽

2017 ◽

Author(s):

Matthew Parks ◽

Teofil Nakov ◽

Elizabeth Ruck ◽

Norman J. Wickett ◽

Andrew J. Alverson

Keyword(s):

Whole Genome Duplication ◽

Large Scale ◽

Genome Duplication ◽

Gene Tree ◽

Genomic Diversity ◽

Phylogenomic Analysis ◽

Whole Genome ◽

Gene Trees ◽

Angiosperm Evolution ◽

Polyploid Formation

ABSTRACT Premise of the study: Diatoms are one of the most species-rich lineages of microbial eukaryotes. Similarities in clade age, species richness, and contributions to primary production motivate comparisons to flowering plants, whose genomes have been inordinately shaped by whole genome duplication (WGD). These events have been linked to speciation and increased rates of lineage diversification, identifying WGDs as a principal driver of angiosperm evolution. We synthesized a relatively large but scattered body of evidence that, taken together, suggests that polyploidy may be common in diatoms. Methods: We used data from gene counts, gene trees, and patterns of synonymous divergence to carry out the first large-scale phylogenomic analysis of genome-scale duplication histories for a phylogenetically diverse set of 37 diatom taxa. Key results: Several methods identified WGD events of varying age across diatoms, though determining the exact number and placement of events and, more broadly, inferences of WGD at all, were greatly impacted by gene-tree uncertainty. Gene-tree reconciliations supported allopolyploidy as the predominant mode of polyploid formation, with particularly strong evidence for ancient allopolyploid events in the thalassiosiroid and pennate diatom clades. Conclusions: Whole genome duplication appears to have been an important driver of genome evolution in diatoms. Denser taxon sampling will better pinpoint the timing of WGDs and likely reveal many more of them. We outline potential challenges in reconstructing paleopolyploid events in diatoms that, together with these results, offer a framework for understanding the evolutionary roles of genome duplication in a group that likely harbors substantial genomic diversity.

Generalised Complementarity Analysis: identifying the most precious places for the conservation of Species, Functional and Phylogenetic Diversity

10.1101/189837 ◽

2017 ◽

Author(s):

David Anthony Nipperess

Keyword(s):

Conservation Planning ◽

Phylogenetic Diversity ◽

Exact Analytical Solution ◽

Area Under The Curve ◽

Null Model ◽

Mathematical Framework ◽

Computationally Efficient ◽

Complementarity Analysis ◽

Functional And Phylogenetic Diversity ◽

Algorithmic Approaches

Abstract: The most precious places for conservation are those that make the largest contribution to regional, national or global biodiversity. The two key concepts for determining the contribution of a specific site are Complementarity (the gain in diversity achieved when adding that site to a set of other sites) and Irreplaceability (here defined as the overall complementarity of that site when compared to a range of possible combinations of other sites). Generalised Complementarity Analysis (GCA) is a mathematical framework that provides an exact analytical solution for the expected complementarity (gain in diversity) of a focal site, when added to a set of other sites of a given size (m). Diversity is defined very generally to allow for complementarity to be calculated for species richness, Functional Diversity or Phylogenetic Diversity. The expected irreplaceability of a focal site is then defined in GCA as the area under the curve of expected complementarity values for all possible values of m. GCA is much more computationally efficient than existing algorithmic approaches and is scalable to very large numbers of sites. Because complementarity and irreplaceability are calculated for all possible combinations of sites, GCA serves as a null model for systematic conservation planning algorithms that seek to optimise site selection. However, because truly irreplaceable sites remain so under all possible site selections, GCA is a powerful conservation planning tool in its own right, providing an efficient means of identifying the world’s most precious places for conservation.
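To make the notion of expected complementarity concrete, here is a minimal sketch for the species-richness case only. It assumes the m other sites are drawn uniformly at random without replacement, so a species in the focal site counts as a gain with the hypergeometric probability that none of the m chosen sites contains it. This is a standard combinatorial identity used for illustration, not necessarily the formulation in the paper. The function names, the trapezoidal area over m = 0..N, and the toy sites are all assumptions.

```python
from itertools import combinations
from math import comb


def expected_complementarity(focal, others, m):
    """Expected number of focal-site species absent from m random other sites."""
    n = len(others)
    if not 0 <= m <= n:
        raise ValueError("m must be between 0 and the number of other sites")
    total = 0.0
    for species in focal:
        # k = number of other sites that already contain this species.
        k = sum(1 for site in others if species in site)
        # Probability that all m chosen sites come from the n - k sites
        # lacking the species (comb returns 0 when n - k < m).
        total += comb(n - k, m) / comb(n, m)
    return total


def brute_force_complementarity(focal, others, m):
    """Same quantity by enumerating every combination (small inputs only)."""
    gains = []
    for subset in combinations(others, m):
        covered = set().union(*subset)
        gains.append(len(focal - covered))
    return sum(gains) / len(gains)


def expected_irreplaceability(focal, others):
    """Trapezoidal area under expected complementarity for m = 0..N (assumed)."""
    values = [expected_complementarity(focal, others, m)
              for m in range(len(others) + 1)]
    return sum((a + b) / 2 for a, b in zip(values, values[1:]))


# Hypothetical toy data: species are letters, sites are sets of species.
focal_site = {"a", "b", "c"}
other_sites = [{"a"}, {"a", "b"}, {"d"}]

for m in range(len(other_sites) + 1):
    exact = expected_complementarity(focal_site, other_sites, m)
    brute = brute_force_complementarity(focal_site, other_sites, m)
    print(m, round(exact, 4), round(brute, 4))
print("irreplaceability:", round(expected_irreplaceability(focal_site, other_sites), 4))
```

The closed-form version needs one pass over the focal site's species per value of m, whereas the enumeration grows with the number of site combinations, which illustrates why an analytical solution scales to many more sites.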

MicrobeAnnotator: a user-friendly, comprehensive functional annotation pipeline for microbial genomes

BMC Bioinformatics ◽

10.1186/s12859-020-03940-5 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Carlos A. Ruiz-Perez ◽

Roth E. Conrad ◽

Konstantinos T. Konstantinidis

Keyword(s):

High Throughput ◽

Functional Annotation ◽

High Throughput Sequencing ◽

Gene Annotation ◽

Single Cells ◽

Computationally Efficient ◽

High Throughput Analysis ◽

Microbial Genomes ◽

Reference Protein ◽

User Friendly

Abstract Background High-throughput sequencing has increased the number of available microbial genomes recovered from isolates, single cells, and metagenomes. Accordingly, fast and comprehensive functional gene annotation pipelines are needed to analyze and compare these genomes. Although several approaches exist for genome annotation, these are typically not designed for easy incorporation into analysis pipelines, do not combine results from different annotation databases or offer easy-to-use summaries of metabolic reconstructions, and typically require large amounts of computing power for high-throughput analysis not available to the average user. Results Here, we introduce MicrobeAnnotator, a fully automated, easy-to-use pipeline for the comprehensive functional annotation of microbial genomes that combines results from several reference protein databases and returns the matching annotations together with key metadata such as the interlinked identifiers of matching reference proteins from multiple databases [KEGG Orthology (KO), Enzyme Commission (E.C.), Gene Ontology (GO), Pfam, and InterPro]. Further, the functional annotations are summarized into Kyoto Encyclopedia of Genes and Genomes (KEGG) modules as part of a graphical output (heatmap) that allows the user to quickly detect differences among (multiple) query genomes and cluster the genomes based on their metabolic similarity. MicrobeAnnotator is implemented in Python 3 and is freely available under an open-source Artistic License 2.0 from https://github.com/cruizperez/MicrobeAnnotator. Conclusions We demonstrated the capabilities of MicrobeAnnotator by annotating 100 Escherichia coli and 78 environmental Candidate Phyla Radiation (CPR) bacterial genomes and comparing the results to those of other popular tools. 
We showed that the use of multiple annotation databases allows MicrobeAnnotator to recover more annotations per genome than faster tools that use reduced databases, and that it is computationally efficient enough for use on personal computers. The output of MicrobeAnnotator can be easily incorporated into other analysis pipelines, while the results of other annotation tools can be seamlessly incorporated into MicrobeAnnotator to generate summary plots.

ZAF – An Open Source Fully Automated Feeder for Aquatic Facilities

10.1101/2021.04.28.441879 ◽

2021 ◽

Author(s):

Merlin Lange ◽

AhmetCan Solak ◽

Shruthi Vijay Kumar ◽

Hirofumi Kobayashi ◽

Bin Yang ◽

...

Keyword(s):

Open Source ◽

Food Distribution ◽

Model Organisms ◽

Precise Control ◽

Automatic Feeder ◽

Animal Feeding ◽

Fully Automatic ◽

Cost Efficient ◽

User Friendly ◽

Advanced Version

In the past few decades, aquatic animals have become popular model organisms in biology, spurring a growing need for establishing aquatic facilities. Zebrafish are widely studied and relatively easy to culture using commercial systems. However, a challenging aspect of maintaining aquatic facilities is animal feeding, which is both time- and resource-consuming. We have developed an open-source, fully automatic daily feeding system, Zebrafish Automatic Feeder (ZAF). ZAF is reliable, provides a standardized amount of food to every tank, is cost-efficient and easy to build, and has a user-friendly interface. The advanced version, ZAF+, allows for the precise control of food distribution as a function of fish density per tank. Both ZAF and ZAF+ are adaptable to any laboratory environment and can help facilitate the implementation of aquatic colonies. Here we provide all blueprints and instructions for building the mechanics, electronics, and fluidics, as well as for setting up the control software and its user-friendly graphical interface. Importantly, the design is modular and can be scaled to meet different user needs. Furthermore, our results show that ZAF and ZAF+ do not adversely affect zebrafish culture, enabling fully automatic feeding for any aquatic facility.
