GenIE-Sys: Genome Integrative Explorer System

Abstract: There is an ever-increasing number of genomes being sequenced, many of which have associated RNA sequencing and other genomics data. The availability of user-friendly, web-accessible mining tools ensures that these data repositories provide maximum benefit to the community. However, there are relatively few options available for setting up such standalone frameworks. We developed the Genome Integrative Explorer System (GenIE-Sys) to set up web resources that enable search, visualization and exploration of genomics data typically generated by a genome project. GenIE-Sys is implemented in PHP, JavaScript and Python and is freely available under the GNU GPL 3 public license. All source code is available at the GenIE-Sys website (https://geniesys.org) or GitHub (http://github.com/plantgenie/geniesys.git). Documentation is available at http://geniesys.readthedocs.io.

PhyloCSF++: A fast and user-friendly implementation of PhyloCSF with annotation tools

10.1101/2021.03.10.434297 ◽ 2021

Author(s): Christopher Pockrandt ◽ Martin Steinegger ◽ Steven L. Salzberg

Keyword(s): Source Code ◽ File Format ◽ Sequence Alignments ◽ Multiple Sequence ◽ Protein Coding ◽ Multiple Sequence Alignments ◽ Coding Regions ◽ Link Type ◽ A Genome ◽ User Friendly

Abstract: Summary: PhyloCSF++ is an efficient and parallelized C++ implementation of the popular PhyloCSF method to distinguish protein-coding and non-coding regions in a genome based on multiple sequence alignments. It can score alignments or produce browser tracks for entire genomes in the wig file format. Additionally, PhyloCSF++ annotates coding sequences in GFF/GTF files using precomputed tracks or computes and scores multiple sequence alignments on the fly with MMseqs. Availability: PhyloCSF++ is released under the AGPLv3 license. Binaries and source code are available at https://github.com/cpockrandt/PhyloCSFpp. The software can be installed through bioconda. A variety of tracks can be accessed through ftp://ftp.ccb.jhu.edu/pub/software/phylocsf++/

ASaiM: a Galaxy-based framework to analyze raw shotgun data from microbiota

10.1101/183970 ◽ 2017 ◽ Cited By ~ 2

Author(s): Bérénice Batut ◽ Kévin Gravouil ◽ Clémence Defois ◽ Saskia Hiltemann ◽ Jean-François Brugère ◽ ...

Keyword(s): Technological Progress ◽ Source Code ◽ Command Line ◽ Bioinformatic Tools ◽ Link Type ◽ Data Analyses ◽ The Galaxy ◽ Sequencing Platforms ◽ User Friendly ◽ New Generation

Abstract: Background: A new generation of sequencing platforms, coupled with numerous bioinformatics tools, has led to rapid technological progress in metagenomics and metatranscriptomics for investigating complex microorganism communities. Nevertheless, a combination of different bioinformatic tools remains necessary to draw conclusions from microbiota studies. Modular and user-friendly tools would greatly improve such studies. Findings: We therefore developed ASaiM, an open-source Galaxy-based framework dedicated to microbiota data analyses. ASaiM provides a curated collection of tools to explore and visualize taxonomic and functional information from raw amplicon, metagenomic or metatranscriptomic sequences. To guide different analyses, several customizable workflows are included. All workflows are supported by tutorials and Galaxy interactive tours that guide users through the analyses step by step. ASaiM is implemented as a Galaxy Docker flavour. It scales to many thousands of datasets, but can also be used on a normal PC. The associated source code is available under the Apache 2 license at https://github.com/ASaiM/framework and documentation can be found online (http://asaim.readthedocs.io/). Conclusions: Based on the Galaxy framework, ASaiM offers sophisticated analyses to scientists without command-line knowledge. ASaiM provides a powerful framework to easily and quickly explore microbiota data in a reproducible and transparent environment.

PhotoModPlus: A webserver for photosynthetic protein prediction from a genome neighborhood feature

10.1101/2020.05.10.087635 ◽ 2020

Author(s): Apiwat Sangphukieo ◽ Teeraphan Laomettachit ◽ Marasri Ruengjitchatchawalya

Keyword(s): Machine Learning ◽ New Model ◽ Link Type ◽ Photosynthetic Proteins ◽ Machine Learning Model ◽ Protein Prediction ◽ A Genome ◽ User Friendly ◽ Go Terms ◽ Better Than

Abstract: Identification of photosynthetic proteins and their functions is essential for understanding and improving photosynthetic efficiency. We present here a new webserver called PhotoModPlus as a platform to predict photosynthetic proteins via genome neighborhood networks (GNN) and a machine learning method. GNN lets users visualize an overview of the conserved neighboring genes from multiple photosynthetic prokaryotic genomes and provides functional guidance for the query input. We also integrated into the webserver a newly developed machine learning model, PhotoModGO, for predicting photosynthesis-specific functions based on 24 prokaryotic photosynthesis-related GO terms. The new model was developed using a multi-label classification approach and genome neighborhood features. In nested five-fold cross-validation, the new model achieved an F1 measure of up to 0.872, which was better than the sequence-based approaches evaluated. Finally, we demonstrated the applications of the webserver and the new model in the identification of novel photosynthetic proteins. The server has a user-friendly design, is compatible with all devices, and is available at http://bicep.kmutt.ac.th/photomod or http://bicep2.kmutt.ac.th/photomod.
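As an illustration of the F1 measure reported for the multi-label model, the following minimal sketch computes a micro-averaged F1 score over sets of predicted labels. The abstract does not state which averaging scheme the authors used, so micro-averaging is an assumption here, and the function name `micro_f1` and the toy label sets are hypothetical and not taken from PhotoModPlus.

```python
from typing import Sequence, Set


def micro_f1(true_labels: Sequence[Set[str]],
             predicted_labels: Sequence[Set[str]]) -> float:
    """Micro-averaged F1 pooled over all (protein, label) pairs.

    Assumption: micro-averaging; the source abstract does not specify
    how its F1 measure was averaged across labels.
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        tp += len(truth & pred)   # labels predicted and correct
        fp += len(pred - truth)   # labels predicted but wrong
        fn += len(truth - pred)   # labels missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: each protein carries a set of GO-term labels.
truth = [{"GO:A"}, {"GO:A", "GO:B"}, {"GO:B"}]
pred = [{"GO:A"}, {"GO:A"}, {"GO:B", "GO:A"}]
score = micro_f1(truth, pred)
print(f"micro-F1 = {score:.3f}")
```

In this toy case there are three correct labels, one spurious label and one missed label, so the score is 2·3 / (2·3 + 1 + 1) = 0.75.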

MOSGA: Modular Open-Source Genome Annotator

Bioinformatics ◽ 10.1093/bioinformatics/btaa1003 ◽ 2020

Author(s): Roman Martin ◽ Thomas Hackl ◽ Georges Hattab ◽ Matthias G Fischer ◽ Dominik Heider

Keyword(s): Open Source ◽ Source Code ◽ Supplementary Information ◽ Web Interface ◽ Fully Integrated ◽ Sequencing Technologies ◽ A Genome ◽ Wide Range ◽ User Friendly ◽ Eukaryotic Genomes

Abstract: Motivation: The generation of high-quality assemblies, even for large eukaryotic genomes, has become a routine task for many biologists thanks to recent advances in sequencing technologies. However, the annotation of these assemblies, a crucial step toward unlocking the biology of the organism of interest, has remained a complex challenge that often requires advanced bioinformatics expertise. Results: Here, we present MOSGA (Modular Open-Source Genome Annotator), a genome annotation framework for eukaryotic genomes with a user-friendly web interface that generates and integrates annotations from various tools. The aggregated results can be analyzed with a fully integrated genome browser and are provided in a format ready for submission to NCBI. MOSGA is built on a portable, customizable and easily extendible Snakemake backend and can thus be tailored to a wide range of users and projects. Availability and implementation: We provide MOSGA as a web service at https://mosga.mathematik.uni-marburg.de and as a Docker container at registry.gitlab.com/mosga/mosga:latest. Source code can be found at https://gitlab.com/mosga/mosga. Supplementary information: Supplementary data are available at Bioinformatics online.

GenomeChronicler: The Personal Genome Project UK Genomic Report Generator Pipeline

10.1101/2020.01.06.873026 ◽ 2020

Author(s): José Afonso Guerra-Assunção ◽ Lucia Conde ◽ Ismail Moghul ◽ Amy P. Webster ◽ Simone Ecker ◽ ...

Keyword(s): Service Providers ◽ Genome Project ◽ Phenotypic Traits ◽ Personal Genome ◽ Whole Genome ◽ Sequencing Data ◽ Potential Health ◽ Personal Genome Project ◽ Link Type ◽ A Genome

Abstract: In recent years, there has been a significant increase in whole genome sequencing data of individual genomes produced by research projects as well as direct-to-consumer service providers. While many of these sources provide their users with an interpretation of the data, there is a lack of free, open tools for generating reports that explore the data in an easy-to-understand manner. GenomeChronicler was developed as part of the Personal Genome Project UK (PGP-UK) to address this need. PGP-UK provides genomic, transcriptomic, epigenomic and self-reported phenotypic data under an open-access model with full ethical approval. As a result, the reports generated by GenomeChronicler are intended for research purposes only and include information relating to potentially beneficial and potentially harmful variants, but without clinical curation. GenomeChronicler can be used with data from whole genome or whole exome sequencing, producing a genome report containing information on variant statistics, ancestry and known associated phenotypic traits. Example reports are available from the PGP-UK data page (personalgenomes.org.uk/data). The objective of this method is to leverage existing resources to find known phenotypes associated with the genotypes detected in each sample. The provided trait data is based primarily upon information available in SNPedia, but also collates data from ClinVar, GETevidence and gnomAD to provide additional details on potential health implications, the presence of each genotype in other PGP participants and the population frequency of each genotype. The analysis can be run in a self-contained environment without requiring internet access, making it a good choice for cases where privacy is essential or desired: any third-party project can embed GenomeChronicler within its off-line safe-haven environment. GenomeChronicler can be run for one sample at a time, or in parallel using the Nextflow workflow manager. The source code is available from GitHub (https://github.com/PGP-UK/GenomeChronicler), container recipes are available for Docker and Singularity, and a pre-built container is available from SingularityHub (https://singularity-hub.org/collections/3664), enabling easy deployment in a variety of settings. Users without access to computational resources to run GenomeChronicler can access the software from the Lifebit CloudOS platform (https://lifebit.ai/cloudos), enabling the production of reports and variant calls from raw sequencing data in a scalable fashion.

Find research data repositories for the humanities - the data deposit recommendation service

International Journal of Digital Humanities ◽ 10.1007/s42803-021-00030-7 ◽ 2021

Author(s): Stefan Buddenbohm ◽ Maaike de Jong ◽ Jean-Luc Minel ◽ Yoann Moranville

Keyword(s): Research Data ◽ Legal Requirements ◽ Research Project ◽ Data Repositories ◽ Specific Research ◽ Link Type ◽ Set Up

Abstract: How can researchers identify suitable research data repositories for the deposit of their research data? Which repository best matches the technical and legal requirements of a specific research project? To this end, and from a humanities perspective, the Data Deposit Recommendation Service (DDRS) has been developed as a prototype. It not only serves as a functional service for selecting humanities research data repositories but is, in particular, a technical demonstrator illustrating the potential of re-using an already existing infrastructure (in this case re3data) and the feasibility of setting up this kind of service for other research disciplines. The documentation and the code of this project can be found in the DARIAH GitHub repository: https://dariah-eric.github.io/ddrs/.

Pedigree and Pedigree Import Wizard

HortScience ◽ 10.21273/hortsci.33.3.552g ◽ 1998 ◽ Vol 33 (3) ◽ pp. 552g-553

Author(s): Shahrokh Khandizadeh

Keyword(s): Additional Data ◽ File Format ◽ Fruit Crops ◽ Operating Environment ◽ Agronomic Characteristics ◽ Link Type ◽ Plant Characteristics ◽ User Friendly

Pedigree for Windows is a user-friendly program that allows the user to trace agronomic characteristics, draw pedigrees, and view images of several fruit crops, including more than 1400 apple, 800 strawberry, 800 almond, 100 blackberry, 80 blueberry, 790 pear, and 200 raspberry examples. Pedigree Import Wizard®© for Windows is add-on software for users who are interested in importing their research or breeding data records of fruit, flower, and plant characteristics, and any related images, into Pedigree for Windows. Pedigree for Windows and Pedigree Import Wizard have been designed so that a user familiar with the Windows operating environment should have little need to refer to the documentation provided with the program. Pedigree Import Wizard uses a comma-separated value (csv) file format under the MS Excel environment. This option allows the user to add or import into the existing database additional data that are already stored in other software, such as Lotus, Excel, Access, QuattroPro, WordPerfect, and MS Word tables, as long as that software works under the Windows environment. A free demo version of Pedigree and Pedigree Import Wizard for Windows is available from http://www.pgris.com.

metaXplor: an interactive viral and microbial metagenomic data manager

GigaScience ◽ 10.1093/gigascience/giab001 ◽ 2021 ◽ Vol 10 (2)

Author(s): Guilhem Sempéré ◽ Adrien Pétel ◽ Magsen Abbé ◽ Pierre Lefeuvre ◽ Philippe Roumagnac ◽ ...

Keyword(s): Heterogeneous Data ◽ Metagenomic Data ◽ Online Data ◽ Data Repositories ◽ Ongoing Research ◽ Efficient Management ◽ Public Data ◽ Reference Databases ◽ Interactive Data ◽ User Friendly

Abstract: Background: Efficiently managing large, heterogeneous data in a structured yet flexible way is a challenge to research laboratories working with genomic data. Specifically regarding both shotgun- and metabarcoding-based metagenomics, while online reference databases and user-friendly tools exist for running various types of analyses (e.g., Qiime, Mothur, Megan, IMG/VR, Anvi'o, Qiita, MetaVir), scientists lack comprehensive software for easily building scalable, searchable, online data repositories on which they can rely during their ongoing research. Results: metaXplor is a scalable, distributable, fully web-interfaced application for managing, sharing, and exploring metagenomic data. Being based on a flexible NoSQL data model, it has few constraints regarding dataset contents and thus proves useful for handling outputs from both shotgun and metabarcoding techniques. By supporting incremental data feeding and providing means to combine filters on all imported fields, it allows for exhaustive content browsing, as well as rapid narrowing to find specific records. The application also features various interactive data visualization tools, ways to query contents by BLASTing external sequences, and an integrated pipeline to enrich assignments with phylogenetic placements. The project home page provides the URL of a live instance allowing users to test the system on public data. Conclusion: metaXplor allows efficient management and exploration of metagenomic data. Its availability as a set of Docker containers, making it easy to deploy on academic servers, on the cloud, or even on personal computers, will facilitate its adoption.

Nonomuraea montanisoli sp. nov., isolated from mountain forest soil

International Journal of Systematic and Evolutionary Microbiology ◽ 10.1099/ijsem.0.004695 ◽ 2021

Author(s): Suchart Chanama ◽ Chanwit Suriyachadkun ◽ Manee Chanama

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ Related Species ◽ Sequence Similarity ◽ Diaminopimelic Acid ◽ Mountain Forest ◽ rRNA Gene ◽ Content Type ◽ Link Type ◽ A Genome

A novel actinomycete, strain SMC 257T, was isolated from a soil sample collected from a mountain forest in Nan Province, Thailand. Strain SMC 257T formed tightly closed spiral spore chains on aerial mycelia. A polyphasic approach was used for the taxonomic study of this strain. Phylogenetic analysis based on 16S rRNA gene sequences indicated that strain SMC 257T belonged to the genus Nonomuraea, and the closest phylogenetically related species were Nonomuraea roseoviolacea subsp. carminata JCM 9946T (98.9 % 16S rRNA gene sequence similarity), Nonomuraea rhodomycinica TBRC 6557T (98.4 %), and Nonomuraea roseoviolacea subsp. roseoviolacea JCM 3145T (98.3 %). Genome sequencing revealed a genome size of 9.76 Mbp and a G+C content of 72.3 mol%. The genome average nucleotide identity (ANI) and digital DNA–DNA hybridization (dDDH) values that distinguished this novel strain from its closest related species were below the species boundaries of 95–96 % and 70 %, respectively. The cell wall peptidoglycan contained meso-diaminopimelic acid. The whole-cell sugars were glucose, ribose, madurose and mannose. The major menaquinone was MK-9(H4). The polar lipid profile consisted of phosphatidylethanolamine, hydroxyphosphatidylethanolamine, lysophosphatidylethanolamine, diphosphatidylglycerol, N-phosphatidylglycerol, phosphatidylinositol and phosphatidylinositol mannosides. The predominant cellular fatty acids were C17 : 0 10-methyl and iso-C16 : 0. Based on comparative analysis of phenotypic, chemotaxonomic and genotypic data, strain SMC 257T is considered to represent a novel species of the genus Nonomuraea, for which the name Nonomuraea montanisoli is proposed. The type strain is SMC 257T (=TBRC 13065T=NBRC 114772T).

Clinical and multi-omics cross-phenotyping of patients with autoimmune and autoinflammatory diseases: the observational TRANSIMMUNOM protocol

BMJ Open ◽ 10.1136/bmjopen-2017-021037 ◽ 2018 ◽ Vol 8 (8) ◽ pp. e021037 ◽ Cited By ~ 2

Author(s): Roberta Lorenzon ◽ Encarnita Mariotti-Ferrandiz ◽ Caroline Aheng ◽ Claire Ribet ◽ Ferial Toumi ◽ ...

Keyword(s): Systems Biology ◽ Good Clinical Practice ◽ Autoinflammatory Diseases ◽ Case Report Form ◽ Sample Collection ◽ Clinical Protocol ◽ Link Type ◽ Systems Immunology ◽ Hospital Ethics ◽ Set Up

Introduction: Autoimmune and autoinflammatory diseases (AIDs) represent a socioeconomic burden as the second cause of chronic illness in Western countries. In this context, the TRANSIMMUNOM clinical protocol is designed to revisit the nosology of AIDs by combining basic, clinical and information sciences. Based on classical and systems biology analyses, it aims to uncover important phenotypes that cut across diagnostic groups so as to discover biomarkers and identify novel therapeutic targets. Methods and analysis: TRANSIMMUNOM is an observational clinical protocol that aims to cross-phenotype a set of 19 AIDs, six related control diseases and healthy volunteers. We assembled a multidisciplinary cohort management team tasked with (1) selecting informative biological (routine and omics type) and clinical parameters to be captured, (2) standardising the sample collection and shipment circuit, (3) selecting omics technologies and benchmarking omics data providers, (4) designing and implementing a multidisease electronic case report form and an omics database and (5) implementing supervised and unsupervised data analyses. Ethics and dissemination: The study was approved by the institutional review board of Pitié-Salpêtrière Hospital (ethics committee Ile-De-France 48–15) and is conducted in accordance with the Declaration of Helsinki and good clinical practice. Written informed consent is obtained from all participants before enrolment in the study. TRANSIMMUNOM's project website provides information about the protocol (https://www.transimmunom.fr/en/), including experimental set-up and tool developments. Results will be disseminated during annual scientific committees appraising the project's progress and at national and international scientific conferences. Discussion: Systems biology approaches are increasingly implemented in human pathophysiology research. The TRANSIMMUNOM study applies such an approach to the pathophysiology of AIDs. We believe that this translational systems immunology approach has the potential to provide breakthrough discoveries for better understanding and treatment of AIDs. Trial registration number: NCT02466217; Pre-results.
