Differential Expression Gene Explorer (DrEdGE): A tool for generating interactive online data visualizations for exploration of quantitative transcript abundance datasets

2019 ◽  
Author(s):  
Sophia C. Tintori ◽  
Patrick Golden ◽  
Bob Goldstein

Abstract As the scientific community becomes increasingly interested in data sharing, there is a growing need for tools that facilitate the querying of public data. Mining of RNA-seq datasets, for example, has value to many biomedical researchers, yet is often effectively inaccessible to non-genomicist experts, even when the raw data are available. Here we present DrEdGE (dredge.bio.unc.edu), a free Web-based tool that facilitates data sharing between genomicists and their colleagues. The DrEdGE software guides genomicists through easily creating interactive online data visualizations, which colleagues can then explore and query according to their own conditions to discover genes, samples, or patterns of interest. We demonstrate DrEdGE’s features with three example websites we generated from publicly available datasets: human neuronal tissue, mouse embryonic tissue, and a C. elegans embryonic series. DrEdGE increases the utility of large genomics datasets by removing the technical obstacles that prevent interested parties from exploring the data independently.

2020 ◽  
Vol 36 (8) ◽  
pp. 2581-2583 ◽  
Author(s):  
Sophia C Tintori ◽  
Patrick Golden ◽  
Bob Goldstein

Abstract Summary Differential Expression Gene Explorer (DrEdGE) is a web-based tool that guides genomicists through easily creating interactive online data visualizations, which colleagues can query according to their own conditions to discover genes, samples or patterns of interest. We demonstrate DrEdGE’s features with three example websites generated from publicly available datasets—human neuronal tissue, mouse embryonic tissue and Caenorhabditis elegans whole embryos. DrEdGE increases the utility of large genomics datasets by removing technical obstacles to independent exploration. Availability and implementation Freely available at http://dredge.bio.unc.edu. Supplementary information Supplementary data are available at Bioinformatics online.


2015 ◽  
Vol 10 ◽  
pp. BMI.S25132 ◽  
Author(s):  
Jun-ichi Satoh ◽  
Yoshihiro Kino ◽  
Shumpei Niida

Background Alzheimer's disease (AD) is the most common cause of dementia with no curative therapy currently available. Establishment of sensitive and non-invasive biomarkers that promote an early diagnosis of AD is crucial for the effective administration of disease-modifying drugs. MicroRNAs (miRNAs) mediate posttranscriptional repression of numerous target genes. Aberrant regulation of miRNA expression is implicated in AD pathogenesis, and circulating miRNAs serve as potential biomarkers for AD. However, data analysis of numerous AD-specific miRNAs derived from small RNA-sequencing (RNA-Seq) is most often laborious. Methods To identify circulating miRNA biomarkers for AD, we reanalyzed a publicly available small RNA-Seq dataset, composed of blood samples derived from 48 AD patients and 22 normal control (NC) subjects, by a simple web-based miRNA data analysis pipeline that combines omiRas and DIANA miRPath. Results By using omiRas, we identified 27 miRNAs expressed differentially between both groups, including upregulation in AD of miR-26b-3p, miR-28-3p, miR-30c-5p, miR-30d-5p, miR-148b-5p, miR-151a-3p, miR-186-5p, miR-425-5p, miR-550a-5p, miR-1468, miR-4781-3p, miR-5001-3p, and miR-6513-3p and downregulation in AD of let-7a-5p, let-7e-5p, let-7f-5p, let-7g-5p, miR-15a-5p, miR-17-3p, miR-29b-3p, miR-98-5p, miR-144-5p, miR-148a-3p, miR-502-3p, miR-660-5p, miR-1294, and miR-3200-3p. DIANA miRPath indicated that miRNA-regulated pathways potentially downregulated in AD are linked with neuronal synaptic functions, while those upregulated in AD are implicated in cell survival and cellular communication. Conclusions The simple web-based miRNA data analysis pipeline helps us to effortlessly identify candidates for miRNA biomarkers and pathways of AD from the complex small RNA-Seq data.


2018 ◽  
Author(s):  
Denis Torre ◽  
Alexander Lachmann ◽  
Avi Ma’ayan

Abstract Interactive notebooks can make bioinformatics data analyses more transparent, accessible and reusable. However, creating notebooks requires computer programming expertise. Here we introduce BioJupies, a web server that enables automated creation, storage, and deployment of Jupyter Notebooks containing RNA-seq data analyses. Through an intuitive interface, novice users can rapidly generate tailored reports to analyze and visualize their own raw sequencing files, their gene expression tables, or fetch data from >5,500 published studies containing >250,000 preprocessed RNA-seq samples. Generated notebooks have executable code of the entire pipeline, rich narrative text, interactive data visualizations, and differential expression and enrichment analyses. The notebooks are permanently stored in the cloud and made available online through a persistent URL. The notebooks are downloadable, customizable, and can run within a Docker container. By providing an intuitive user interface for notebook generation for RNA-seq data analysis, starting from the raw reads, all the way to a complete interactive and reproducible report, BioJupies is a useful resource for experimental and computational biologists. BioJupies is freely available as a web-based application from http://biojupies.cloud and as a Chrome extension from the Chrome Web Store.


2016 ◽  
Author(s):  
Sophia C. Tintori ◽  
Erin Osborne Nishimura ◽  
Patrick Golden ◽  
Jason D. Lieb ◽  
Bob Goldstein

Highlights
- RNA-seq on each cell of the early C. elegans embryo complements the known lineage
- We measured the zygotic activation specific to each unique cell of the embryo
- We identified genes that are functionally redundant and critical for development
- We created an interactive online data visualization tool for exploring the data

eTOC Blurb C. elegans is a powerful model for development, with an invariant and completely described cell lineage. To enrich this resource, we performed single-cell RNA-seq on each cell of the embryo through the 16-cell stage. Zygotic genome activation is differential between cell types. We identified hundreds of candidates for partially redundant genes, and verified one such set as critical for development. We created an interactive online data visualization tool to invite others to explore our dataset.

Summary During embryonic development, cells must establish fates, morphologies and behaviors in coordination with one another to form a functional body. A prevalent hypothesis for how this coordination is achieved is that each cell’s fate and behavior is determined by a defined mixture of RNAs. Only recently has it become possible to measure the full suite of transcripts in a single cell. Here we quantify the abundance of every mRNA transcript in each cell of the C. elegans embryo up to the 16-cell stage. We describe spatially dynamic expression, quantify cell-specific differential activation of the zygotic genome, and identify critical developmental genes previously unappreciated because of their partial redundancy. We present an interactive data visualization tool that allows broad access to our dataset. This genome-wide single-cell map of mRNA abundance, alongside the well-studied life history and fates of each cell, describes at a cellular resolution the mRNA landscape that guides development.


2017 ◽  
Author(s):  
Yaoyu E. Wang ◽  
Lev Kuznetsov ◽  
Antony Partensky ◽  
Jalil Farid ◽  
John Quackenbush

Abstract Although large, complex genomic data sets are increasingly easy to generate, and the number of publicly available data sets in cancer and other diseases is rapidly growing, the lack of intuitive, easy-to-use analysis tools has remained a barrier to the effective use of such data. WebMeV (https://mev.tm4.org) is an open-source, web-based tool that gives users access to sophisticated tools for analysis of RNA-Seq and other data in an interface designed to democratize data access. WebMeV combines cloud-based technologies with a simple user interface to allow users to access large public data sets such as that from The Cancer Genome Atlas (TCGA) or to upload their own. The interface allows users to visualize data and to apply advanced data mining analysis methods to explore the data and draw biologically meaningful conclusions. We provide an overview of WebMeV and demonstrate two simple use cases that illustrate the value of putting data analysis in the hands of those looking to explore the underlying biology of the systems being studied.


GigaScience ◽  
2021 ◽  
Vol 10 (2) ◽  
Author(s):  
Guilhem Sempéré ◽  
Adrien Pétel ◽  
Magsen Abbé ◽  
Pierre Lefeuvre ◽  
Philippe Roumagnac ◽  
...  

Abstract Background Efficiently managing large, heterogeneous data in a structured yet flexible way is a challenge to research laboratories working with genomic data. Specifically regarding both shotgun- and metabarcoding-based metagenomics, while online reference databases and user-friendly tools exist for running various types of analyses (e.g., Qiime, Mothur, Megan, IMG/VR, Anvi'o, Qiita, MetaVir), scientists lack comprehensive software for easily building scalable, searchable, online data repositories on which they can rely during their ongoing research. Results metaXplor is a scalable, distributable, fully web-interfaced application for managing, sharing, and exploring metagenomic data. Being based on a flexible NoSQL data model, it has few constraints regarding dataset contents and thus proves useful for handling outputs from both shotgun and metabarcoding techniques. By supporting incremental data feeding and providing means to combine filters on all imported fields, it allows for exhaustive content browsing, as well as rapid narrowing to find specific records. The application also features various interactive data visualization tools, ways to query contents by BLASTing external sequences, and an integrated pipeline to enrich assignments with phylogenetic placements. The project home page provides the URL of a live instance allowing users to test the system on public data. Conclusion metaXplor allows efficient management and exploration of metagenomic data. Its availability as a set of Docker containers, making it easy to deploy on academic servers, on the cloud, or even on personal computers, will facilitate its adoption.


2021 ◽  
Vol 25 (4) ◽  
pp. 949-972
Author(s):  
Nannan Zhang ◽  
Xixi Yao ◽  
Chao Luo

Fuzzy cognitive maps (FCMs) have been widely applied for knowledge representation and reasoning. However, in real life, reasoning is always accompanied by hesitation, which derives from uncertainty and fuzziness. In particular, when processing online data, internal and external interference causes the distribution and characteristics of sequence data to change considerably over time, which further increases the difficulty of modeling. In this article, based on intuitionistic fuzzy set theory, a new dynamic intuitionistic fuzzy cognitive map (DIFCM) scheme is proposed for online data prediction. Combined with a novel concept drift detection algorithm, the structure of the DIFCM can be adaptively updated with the online learning scheme, which can effectively improve the representation of online information by capturing real-time changes in the sequence data. Moreover, to handle possible hesitancy in the modeling process, intuitionistic fuzzy sets are applied in the construction of the dynamic FCM, where the hesitation degree serves as a quantitative index that explicitly expresses the hesitancy. Finally, a series of experiments using public data sets verifies the effectiveness of the proposed method.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Shuhua Zhan ◽  
Cortland Griswold ◽  
Lewis Lukens

Abstract Background Genetic variation for gene expression is a source of phenotypic variation for natural and agricultural species. The common approach to map and to quantify gene expression from genetically distinct individuals is to assign their RNA-seq reads to a single reference genome. However, RNA-seq reads from alleles dissimilar to this reference genome may fail to map correctly, causing transcript levels to be underestimated. Presently, the extent of this mapping problem is not clear, particularly in highly diverse species. We investigated if mapping bias occurred and if chromosomal features associated with mapping bias. Zea mays presents a model species to assess these questions, given it has genotypically distinct and well-studied genetic lines. Results In Zea mays, the inbred B73 genome is the standard reference genome and template for RNA-seq read assignments. In the absence of mapping bias, B73 and a second inbred line, Mo17, would each have an approximately equal number of regulatory alleles that increase gene expression. Remarkably, Mo17 had 2–4 times fewer such positively acting alleles than did B73 when RNA-seq reads were aligned to the B73 reference genome. Reciprocally, over one-half of the B73 alleles that increased gene expression were not detected when reads were aligned to the Mo17 genome template. Genes at dissimilar chromosomal ends were strongly affected by mapping bias, and genes at more similar pericentromeric regions were less affected. Biased transcript estimates were higher in untranslated regions and lower in splice junctions. Bias occurred across software and alignment parameters. Conclusions Mapping bias very strongly affects gene transcript abundance estimates in maize, and bias varies across chromosomal features. Individual genome or transcriptome templates are likely necessary for accurate transcript estimation across genetically variable individuals in maize and other species.


2021 ◽  
Author(s):  
Victoria Leong ◽  
Kausar Raheel ◽  
Sim Jia Yi ◽  
Kriti Kacker ◽  
Vasilis M. Karlaftis ◽  
...  

Background. The global COVID-19 pandemic has triggered a fundamental reexamination of how human psychological research can be conducted both safely and robustly in a new era of digital working and physical distancing. Online web-based testing has risen to the fore as a promising solution for rapid mass collection of cognitive data without requiring human contact. However, a long-standing debate exists over the data quality and validity of web-based studies. Here, we examine the opportunities and challenges afforded by the societal shift toward web-based testing, highlight an urgent need to establish a standard data quality assurance framework for online studies, and develop and validate a new supervised online testing methodology, remote guided testing (RGT). Methods. A total of 85 healthy young adults were tested on 10 cognitive tasks assessing executive functioning (flexibility, memory and inhibition) and learning. Tasks were administered either face-to-face in the laboratory (N=41) or online using remote guided testing (N=44), delivered using identical web-based platforms (CANTAB, Inquisit and i-ABC). Data quality was assessed using detailed trial-level measures (missed trials, outlying and excluded responses, response times), as well as overall task performance measures. Results. The results indicated that, across all measures of data quality and performance, RGT data were statistically equivalent to data collected in person in the lab. Moreover, RGT participants outperformed the lab group on measured verbal intelligence, which could reflect test environment differences, including possible effects of mask-wearing on communication. Conclusions. These data suggest that the RGT methodology could help to ameliorate concerns regarding online data quality and, particularly for studies involving high-risk or rare cohorts, offer an alternative for collecting high-quality human cognitive data without requiring in-person physical attendance.

