The bayberry database: a multiomic database for Myrica rubra, an important fruit tree with medicinal value

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Haiying Ren ◽  
Yuanhao He ◽  
Xingjiang Qi ◽  
Xiliang Zheng ◽  
Shuwen Zhang ◽  
...  

Abstract Background Chinese bayberry (Myrica rubra Sieb. & Zucc.) is an important fruit tree in China and has high medicinal value. The genome, transcriptome and germplasm resources of bayberry have now been reported. To make these data more convenient to use, the Bayberry Database was established. Results The Bayberry Database is a comprehensive and intuitive data platform for examining the diverse annotated genome and germplasm resources of this species. The database contains nine central functional domains for interacting with multiomic data: home, genome, germplasm, markers, tools, map, expression, reference, and contact. All domains provide pathways to a variety of data types, comprising a reference genome sequence, transcriptomic data, gene patterns, phenotypic data, fruit images of Myrica rubra varieties, gSSR data, gene maps with annotation and evolutionary analyses. The tools module includes BLAST search, keyword search, sequence fetch and enrichment analysis functions. Conclusions The database is available at http://www.bayberrybase.cn/. The Myrica rubra database is an intelligent, interactive, and user-friendly system that enables researchers, breeders and horticultural personnel to browse, search and retrieve relevant and useful information, and thus facilitates genomic research and breeding efforts concerning Myrica rubra. This database will be of great help to bayberry research and breeding in the future.

2021 ◽  
Vol 22 (3) ◽  
pp. 1399
Author(s):  
Salim Ghannoum ◽  
Waldir Leoncio Netto ◽  
Damiano Fantini ◽  
Benjamin Ragan-Kelley ◽  
Amirabbas Parizadeh ◽  
...  

The growing attention toward the benefits of single-cell RNA sequencing (scRNA-seq) is leading to a myriad of computational packages for the analysis of different aspects of scRNA-seq data. For researchers without advanced programming skills, it is very challenging to combine several packages in order to perform the desired analysis in a simple and reproducible way. Here we present DIscBIO, an open-source, multi-algorithmic pipeline for easy, efficient and reproducible analysis of cellular sub-populations at the transcriptomic level. The pipeline integrates multiple scRNA-seq packages and allows biomarker discovery with decision trees and gene enrichment analysis in a network context using single-cell sequencing read counts through clustering and differential analysis. DIscBIO is freely available as an R package. It can be run either in command-line mode or through a user-friendly computational pipeline using Jupyter notebooks. We showcase all pipeline features using two scRNA-seq datasets. The first dataset consists of circulating tumor cells from patients with breast cancer. The second one is a cell cycle regulation dataset in myxoid liposarcoma. All analyses are available as notebooks that integrate R code with explanatory text, output data and images in a sequential narrative. R users can use the notebooks to understand the different steps of the pipeline, and the notebooks will guide them in exploring their own scRNA-seq data. We also provide a cloud version using Binder that allows the execution of the pipeline without the need to download R, Jupyter or any of the packages used by the pipeline. The cloud version can serve as a tutorial for training purposes, especially for those who are not R users or have limited programming skills. However, in order to do meaningful scRNA-seq analyses, all users will need to understand the implemented methods and their possible options and limitations.


2019 ◽  
Author(s):  
Wenlong Jia ◽  
Hechen Li ◽  
Shiying Li ◽  
Shuaicheng Li

Abstract Summary Visualizing integrated-level data from genomic research remains a challenge, as it requires sufficient coding skills and experience. Here, we present LandScapeoviz, a web-based application for interactive and real-time visualization of summarized genetic information. LandScape utilizes a well-designed file format that is capable of handling various data types, and offers a series of built-in functions to customize the appearance, explore results, and export high-quality diagrams suitable for publication. Availability and implementation LandScape is deployed at bio.oviz.org/demo-project/analyses/landscape for online use. Documentation and demo data are freely available on this website and GitHub (github.com/Nobel-Justin/Oviz-Bio-demo).


Planta ◽  
2022 ◽  
Vol 255 (2) ◽  
Author(s):  
Nicholas Gladman ◽  
Andrew Olson ◽  
Sharon Wei ◽  
Kapeel Chougule ◽  
Zhenyuan Lu ◽  
...  

Abstract Main conclusion SorghumBase provides a community portal that integrates genetic, genomic, and breeding resources for sorghum germplasm improvement. Abstract Public research and development in agriculture rely on proper data and resource sharing within stakeholder communities. For plant breeders, agronomists, molecular biologists, geneticists, and bioinformaticians, centralizing desirable data into a user-friendly hub for crop systems is essential for successful collaborations and breakthroughs in germplasm development. Here, we present the SorghumBase web portal (https://www.sorghumbase.org), a resource for the sorghum research community. SorghumBase hosts a wide range of sorghum genomic information in a modular framework, built with open-source software, to provide a sustainable platform. This initial release of SorghumBase includes: (1) five sorghum reference genome assemblies in a pan-genome browser; (2) genetic variant information for natural diversity panels and ethyl methanesulfonate (EMS)-induced mutant populations; (3) search interface and integrated views of various data types; (4) links supporting interconnectivity with other repositories including genebank, QTL, and gene expression databases; and (5) a content management system to support access to community news and training materials. SorghumBase offers sorghum investigators improved data collation and access that will facilitate the growth of a robust research community to support genomics-assisted breeding.


2020 ◽  
Author(s):  
Kumari Sonal Choudhary ◽  
Eoin Fahy ◽  
Kevin Coakley ◽  
Manish Sud ◽  
Mano R Maurya ◽  
...  

Abstract With the advent of high-throughput mass spectrometric methods, metabolomics has emerged as an essential area of research in biomedicine with the potential to provide deep biological insights into normal and diseased functions in physiology. However, to achieve the potential offered by metabolomics measures, there is a need for biologist-friendly integrative analysis tools that can transform data into mechanisms that relate to phenotypes. Here, we describe MetENP, an R package, and a user-friendly web application deployed at the Metabolomics Workbench site, which extend metabolomics enrichment analysis to include species-specific pathway analysis, pathway enrichment scores, gene-enzyme information, and enzymatic activities of the significantly altered metabolites. MetENP provides a highly customizable workflow through various user-specified options and includes support for all metabolite species with available KEGG pathways. MetENPweb is a web application for calculating metabolite and pathway enrichment analysis. Availability and Implementation The MetENP package is freely available from the Metabolomics Workbench GitHub (https://github.com/metabolomicsworkbench/MetENP); the web application is freely available at https://www.metabolomicsworkbench.org/data/analyze.php.


Author(s):  
Peng Wang ◽  
Xin Li ◽  
Yue Gao ◽  
Qiuyan Guo ◽  
Shangwei Ning ◽  
...  

Abstract LnCeVar (http://www.bio-bigdata.net/LnCeVar/) is a comprehensive database that aims to provide genomic variations that disturb lncRNA-associated competing endogenous RNA (ceRNA) network regulation curated from the published literature and high-throughput data sets. LnCeVar curated 119 501 variation–ceRNA events from thousands of samples and cell lines, including: (i) more than 2000 experimentally supported circulating, drug-resistant and prognosis-related lncRNA biomarkers; (ii) 11 418 somatic mutation–ceRNA events from TCGA and COSMIC; (iii) 112 674 CNV–ceRNA events from TCGA; (iv) 67 066 SNP–ceRNA events from the 1000 Genomes Project. LnCeVar provides a user-friendly searching and browsing interface. In addition, as an important supplement to the database, several flexible tools have been developed to aid retrieval and analysis of the data. The LnCeVar–BLAST interface is a convenient way for users to search ceRNAs by sequences of interest. LnCeVar–Function is a tool for performing functional enrichment analysis. LnCeVar–Hallmark identifies dysregulated cancer hallmarks of variation–ceRNA events. LnCeVar–Survival performs COX regression analyses and produces survival curves for variation–ceRNA events. LnCeVar–Network identifies and creates a visualization of dysregulated variation–ceRNA networks. Collectively, LnCeVar will serve as an important resource for investigating the functions and mechanisms of personalized genomic variations that disturb ceRNA network regulation in human diseases.


Metabolites ◽  
2020 ◽  
Vol 10 (12) ◽  
pp. 479
Author(s):  
Gayatri R. Iyer ◽  
Janis Wigginton ◽  
William Duren ◽  
Jennifer L. LaBarre ◽  
Marci Brandenburg ◽  
...  

Modern analytical methods allow for the simultaneous detection of hundreds of metabolites, generating increasingly large and complex data sets. The analysis of metabolomics data is a multi-step process that involves data processing and normalization, followed by statistical analysis. One of the biggest challenges in metabolomics is linking alterations in metabolite levels to specific biological processes that are disrupted, contributing to the development of disease or reflecting the disease state. A common approach to accomplishing this goal involves pathway mapping and enrichment analysis, which assesses the relative importance of predefined metabolic pathways or other biological categories. However, traditional knowledge-based enrichment analysis has limitations when it comes to the analysis of metabolomics and lipidomics data. We present a Java-based, user-friendly bioinformatics tool named Filigree that provides a primarily data-driven alternative to the existing knowledge-based enrichment analysis methods. Filigree is based on our previously published differential network enrichment analysis (DNEA) methodology. To demonstrate the utility of the tool, we applied it to previously published studies analyzing the metabolome in the context of metabolic disorders (type 1 and 2 diabetes) and the maternal and infant lipidome during pregnancy.
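The knowledge-based enrichment analysis that the abstract above contrasts with Filigree can be illustrated with a small over-representation test. The following is a minimal sketch of the generic hypergeometric approach, not the DNEA method and not code from Filigree; the function names, pathway names and metabolite identifiers are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_pvalue(hits, selected, pathway_size, background):
    """Return P(X >= hits) for a hypergeometric draw.

    `selected` items are drawn without replacement from `background`
    items, of which `pathway_size` belong to the pathway of interest.
    """
    upper = min(selected, pathway_size)
    total = comb(background, selected)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def enrich(altered, pathways, background):
    """Rank predefined pathways by over-representation of altered metabolites."""
    background = set(background)
    altered = set(altered) & background
    results = []
    for name, members in pathways.items():
        members = set(members) & background
        hits = len(altered & members)
        p = hypergeom_pvalue(hits, len(altered), len(members), len(background))
        results.append((name, hits, p))
    # Smallest p-value first; no multiple-testing correction is applied here.
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy data: ten measured metabolites, two predefined pathways.
BACKGROUND = [f"m{i}" for i in range(1, 11)]
PATHWAYS = {"pathway_A": ["m1", "m2", "m3"], "pathway_B": ["m4", "m5", "m6", "m7"]}
ALTERED = ["m1", "m2", "m3"]
RANKED = enrich(ALTERED, PATHWAYS, BACKGROUND)
```

In practice the resulting p-values would be corrected for multiple testing, and the pathway definitions would come from a curated resource; both steps are omitted from this sketch.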


Author(s):  
Shyue-Liang Wang ◽  
Yu-Jane Tsai ◽  

We present a generalized approach for handling null queries that contain compound fuzzy attributes. Null queries elicit a null answer from the database. Compound fuzzy attributes are ambiguous attributes not defined in the original database schema but derived from multiple rigid attributes in a schema. Compound fuzzy attributes derived from simple numbers were studied by Nomura [11]. We extend compound fuzzy attributes so they can be derived from numbers, interval values, scalars, and sets of all these data types. Database management systems that handle this type of ambiguous attribute in null queries both reduce occurrences of null answers and provide an improved user-friendly query environment.
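The abstract does not spell out the mechanism, so the following is only a hedged illustration of the general idea: a compound fuzzy attribute is computed from several rigid (crisp) attributes, and a query that would return a null answer is relaxed until something matches. The attribute `good_value`, the membership breakpoints, the minimum as the combining operator, and the relaxation step are all assumptions made for this sketch and are not taken from the paper.

```python
def falling(x, full, zero):
    """Membership that is 1 up to `full` and falls linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def good_value(row):
    """Hypothetical compound fuzzy attribute derived from two rigid attributes.

    Combines "cheap" (from price) and "near" (from distance_km) with the
    minimum operator, a common choice for fuzzy conjunction.
    """
    cheap = falling(row["price"], 100, 200)
    near = falling(row["distance_km"], 1, 5)
    return min(cheap, near)


def query(rows, membership, threshold):
    """Return (degree, row) pairs whose degree meets the threshold, best first."""
    hits = [(membership(r), r) for r in rows]
    hits = [h for h in hits if h[0] >= threshold]
    hits.sort(key=lambda h: h[0], reverse=True)
    return hits


def query_with_relaxation(rows, membership, threshold, step=0.25):
    """Lower the threshold stepwise until the answer is no longer null.

    Gives up and returns an empty answer once the threshold reaches zero.
    """
    while threshold > 0:
        hits = query(rows, membership, threshold)
        if hits:
            return threshold, hits
        threshold -= step
    return 0.0, []


# Hypothetical table with rigid attributes only.
ROWS = [
    {"name": "A", "price": 150, "distance_km": 2},
    {"name": "B", "price": 250, "distance_km": 1},
    {"name": "C", "price": 120, "distance_km": 4},
]
USED_THRESHOLD, ANSWER = query_with_relaxation(ROWS, good_value, 1.0)
```

With this toy table, a strict query at threshold 1.0 is a null query; relaxing the threshold returns row A with its membership degree instead of an empty answer.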


2018 ◽  
Vol 1 (1) ◽  
pp. 263-274 ◽  
Author(s):  
Marylyn D. Ritchie

Biomedical data science has experienced an explosion of new data over the past decade. Abundant genetic and genomic data are increasingly available in large, diverse data sets due to the maturation of modern molecular technologies. Along with these molecular data, dense, rich phenotypic data are also available in comprehensive clinical data sets from health care provider organizations, clinical trials, population health registries, and epidemiologic studies. The methods and approaches for interrogating these large genetic/genomic and clinical data sets continue to evolve rapidly, as our understanding of the questions and challenges continues to emerge. In this review, the state-of-the-art methodologies for genetic/genomic analysis along with complex phenomics will be discussed. This field is changing and adapting to the novel data types made available, as well as technological advances in computation and machine learning. Thus, I will also discuss the future challenges in this exciting and innovative space. The promises of precision medicine rely heavily on the ability to marry complex genetic/genomic data with clinical phenotypes in meaningful ways.


2021 ◽  
Author(s):  
Hagai Levi ◽  
Nima Rahmanian ◽  
Ran Elkon ◽  
Ron Shamir

Active module identification (AMI) is an essential step in many omics analyses. Such algorithms receive a gene network and a gene activity profile as input and report subnetworks that show significant over-representation of accrued activity signal ("active modules"). Such modules can point out key molecular processes in the analyzed biological conditions. We recently introduced a novel AMI algorithm called DOMINO, and demonstrated that it detects active modules that capture biological signals with a markedly improved rate of empirical validation. Here, we provide an online server that executes DOMINO, making it more accessible and user-friendly. To help the interpretation of solutions, the server provides GO enrichment analysis, module visualizations, and accessible output formats for customized downstream analysis. It also enables running DOMINO with various gene identifiers of different organisms. The server is available at http://domino.cs.tau.ac.il. Its codebase is available at https://github.com/Shamir-Lab.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e11333
Author(s):  
Daniyar Karabayev ◽  
Askhat Molkenov ◽  
Kaiyrgali Yerulanuly ◽  
Ilyas Kabimoldayev ◽  
Asset Daniyarov ◽  
...  

Background High-throughput sequencing platforms generate a massive amount of high-dimensional genomic datasets that are available for analysis. Modern and user-friendly bioinformatics tools for the analysis and interpretation of genomics data become essential when working with sequencing data. Different standard data types and file formats have been developed to store and analyze sequence and genomics data. Variant Call Format (VCF) is the most widespread genomics file type and standard format containing genomic information and variants of sequenced samples. Results Existing tools for processing VCF files do not usually have an intuitive graphical interface, but instead have just a command-line interface that may be challenging to use for the broader biomedical community interested in genomics data analysis. re-Searcher solves this problem by pre-processing VCF files in chunks to avoid overloading the computer's RAM. The tool can be used as a standalone, user-friendly, multiplatform GUI application as well as a web application (https://nla-lbsb.nu.edu.kz). The software, including source code, tested VCF files and additional information, is publicly available in the GitHub repository (https://github.com/LabBandSB/re-Searcher).
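Chunked processing of a VCF file, as mentioned in the abstract above, can be sketched in a few lines. This is an illustrative example only and is not taken from re-Searcher; the function names, the chunk size and the demo records are hypothetical. It relies only on the standard VCF layout, in which header lines start with "#" and the first five tab-separated columns are CHROM, POS, ID, REF and ALT.

```python
import io
from itertools import islice


def parse_record(line):
    """Parse the first five fixed VCF columns of one data line."""
    chrom, pos, vid, ref, alt = line.rstrip("\n").split("\t")[:5]
    return {"chrom": chrom, "pos": int(pos), "id": vid, "ref": ref,
            "alt": alt.split(",")}


def iter_vcf_chunks(handle, chunk_size=1000):
    """Yield lists of at most `chunk_size` records, skipping header lines.

    Only one chunk is held in memory at a time, so the file never has to
    be loaded in full.
    """
    records = (
        parse_record(line)
        for line in handle
        if line.strip() and not line.startswith("#")
    )
    while True:
        chunk = list(islice(records, chunk_size))
        if not chunk:
            return
        yield chunk


def find_in_region(handle, chrom, start, end, chunk_size=1000):
    """Collect IDs of variants on `chrom` with start <= POS <= end."""
    found = []
    for chunk in iter_vcf_chunks(handle, chunk_size):
        found.extend(
            r["id"] for r in chunk
            if r["chrom"] == chrom and start <= r["pos"] <= end
        )
    return found


# Hypothetical in-memory VCF used only to exercise the sketch.
DEMO_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t100\trs1\tA\tG\t.\tPASS\t.\n"
    "1\t250\trs2\tC\tT,G\t.\tPASS\t.\n"
    "1\t900\trs3\tG\tA\t.\tPASS\t.\n"
    "2\t120\trs4\tT\tC\t.\tPASS\t.\n"
    "2\t300\trs5\tA\tT\t.\tPASS\t.\n"
)
CHUNK_SIZES = [len(c) for c in iter_vcf_chunks(io.StringIO(DEMO_VCF), 2)]
REGION_IDS = find_in_region(io.StringIO(DEMO_VCF), "1", 50, 300, chunk_size=2)
```

For a file on disk, the same functions can be given an open text handle, for example one returned by `open(path)` or, for compressed input, `gzip.open(path, "rt")`.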

