Development of Applications for Interactive and Reproducible Research: a Case Study

2016 ◽  
Vol 3 (1) ◽  
pp. 39 ◽  
Author(s):  
Federico Marini ◽  
Harald Binder

For a proper understanding of the organization and regulation of gene expression, computational analysis is an essential component of the scientific workflow, and this is particularly true in the fields of biostatistics and bioinformatics. Interactivity and reproducibility are two highly relevant features to consider when adopting or designing a tool, and often they cannot be provided simultaneously. In this work, we address the issue of developing a framework that provides both interactive analysis, allowing experimentalists to fully exploit advanced software tools, and reproducibility as an internal validation of the analysis steps. Reproducibility is achieved by providing the underlying code and data in a way that enables the re-creation of the results and also constitutes a didactic tool for the life scientist. We illustrate this paradigm with the help of the R/Bioconductor package pcaExplorer, designed as a practical companion for interactive and reproducible exploratory data analysis of high-dimensional data (e.g. RNA-seq), and highlight some of the features provided in the software.

2021 ◽  
Author(s):  
Koen Van den Berge ◽  
Hsin-Jung Chou ◽  
Hector Roux de Bézieux ◽  
Kelly Street ◽  
Davide Risso ◽  
...  

Modern assays have enabled high-throughput studies of epigenetic regulation of gene expression using DNA sequencing. In particular, the assay for transposase-accessible chromatin using sequencing (ATAC-seq) allows the study of chromatin configuration for an entire genome. Despite the gain in popularity of the assay, there have been limited studies investigating the analytical challenges related to ATAC-seq data, and most studies leverage tools developed for bulk transcriptome sequencing (RNA-seq). Here, we show that GC-content effects are omnipresent in ATAC-seq datasets. Since the GC-content effects are sample-specific, they can bias downstream analyses such as clustering and differential accessibility analysis. We evaluate twelve different normalization procedures on eight public ATAC-seq datasets and show that no method uniformly outperforms all others. However, our work clearly shows that accounting for GC-content effects in the normalization is crucial for common downstream ATAC-seq data analyses, such as clustering and differential accessibility analysis, leading to improved accuracy and interpretability of the results. Using two case studies, we show that exploratory data analysis is essential to guide the choice of an appropriate normalization method for a given dataset.
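As a small illustration of the quantity discussed in this abstract, the sketch below computes the GC content of a few genomic regions. It is not one of the twelve normalization procedures evaluated in the paper; the function name `gc_content` and the peak sequences are hypothetical and chosen only for the example.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases among unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    total = sum(seq.count(base) for base in "ACGT")
    # Regions without any unambiguous base get a GC content of 0.0.
    return gc / total if total else 0.0


# Hypothetical peak sequences (illustrative only, not from any dataset).
peaks = {
    "peak_1": "GCGCGGCC",
    "peak_2": "ATATGCAT",
    "peak_3": "ATTTAAAN",  # the ambiguous base N is ignored
}

gc_by_peak = {name: gc_content(seq) for name, seq in peaks.items()}
```

A per-region value like this is the kind of covariate against which read counts could be plotted, sample by sample, during exploratory analysis.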


PLoS ONE ◽  
2021 ◽  
Vol 16 (5) ◽  
pp. e0244122
Author(s):  
Dario Righelli ◽  
Claudia Angelini

In recent years, “irreproducibility” has become a general problem in omics data analysis due to the use of sophisticated and poorly described computational procedures. To avoid misleading results, it is necessary to inspect and reproduce the entire data analysis as a unified product. Reproducible Research (RR) provides general guidelines for public access to the analytic data and related analysis code combined with natural language documentation, allowing third parties to reproduce the findings. We developed easyreporting, a novel R/Bioconductor package, to facilitate the implementation of an RR layer inside reports/tools. We describe the main functionalities and illustrate the organization of an analysis report using a typical case study concerning the analysis of RNA-seq data. Then, we show how to use easyreporting in other projects to trace R functions automatically. This latter feature helps developers to implement procedures that automatically keep track of the analysis steps. Easyreporting can be useful in supporting the reproducibility of any data analysis project and shows great advantages for the implementation of R packages and GUIs. It turns out to be very helpful in bioinformatics, where the complexity of the analyses makes it extremely difficult to trace all the steps and parameters used in the study.


2020 ◽  
Author(s):  
Dario Righelli ◽  
Claudia Angelini

In recent years, “irreproducibility” has become a general problem in omics data analysis due to the use of sophisticated and poorly described computational procedures. To avoid misleading results, it is necessary to inspect and reproduce the entire data analysis as a unified product. Reproducible Research (RR) provides general guidelines for public access to the analytic data and related analysis code combined with natural language documentation, allowing third parties to reproduce the findings. We developed easyreporting, a novel R/Bioconductor package, to facilitate the implementation of an RR layer inside reports/tools without requiring any knowledge of the R Markdown language. We describe the main functionalities and illustrate how to create an analysis report using a typical case study concerning the analysis of RNA-seq data. Then, we also show how to trace R functions automatically. Thanks to this latter feature, easyreporting helps developers to implement procedures that automatically keep track of the analysis steps within Graphical User Interfaces (GUIs). Easyreporting can be useful in supporting the reproducibility of any data analysis project and the implementation of GUIs. It turns out to be very helpful in bioinformatics, where the complexity of the analyses makes it extremely difficult to trace all the steps and parameters used in the study.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Ryosuke Nakamura ◽  
Shigeyuki Mukudai ◽  
Renjie Bing ◽  
Michael J. Garabedian ◽  
Ryan C. Branski

As with hypertrophic scars and keloids, the efficacy of glucocorticoids (GCs) for vocal fold injury is highly variable. We previously reported that dexamethasone enhanced the pro-fibrotic effects of transforming growth factor (TGF)-β as a potential mechanism for inconsistent clinical outcomes. In the current study, we sought to determine the mechanism(s) whereby GCs influence the fibrotic response, with an emphasis on TGF-β and nuclear receptor subfamily 4 group A member 1 (NR4A1) signaling. Human VF fibroblasts (HVOX) were treated with three commonly employed GCs +/- TGF-β1. Phosphorylation of the glucocorticoid receptor (GR: NR3C1) and activation of NR4A1 were analyzed by western blotting. Genes involved in the fibrotic response, including ACTA2, TGFBR1, and TGFBR2, were analyzed by qPCR. RNA-seq was performed to identify global changes in gene expression induced by dexamethasone. GCs enhanced phosphorylation of GR at Ser211 and TGF-β-induced ACTA2 expression. Dexamethasone upregulated TGFBR1 and TGFBR2 in the presence of TGF-β1 and increased active NR4A1. RNA-seq results confirmed numerous pathways, including TGF-β signaling, affected by dexamethasone. Synergistic pro-fibrotic effects of TGF-β were observed across GCs and appeared to be mediated, at least partially, via upregulation of TGF-β receptors. Dexamethasone exhibited diverse regulation of gene expression including NR4A1 upregulation consistent with the anti-fibrotic potential of GCs.


2012 ◽  
Vol 10 (01) ◽  
pp. 1240007 ◽  
Author(s):  
CHENGCHENG SHEN ◽  
YING LIU

Alteration of gene expression in response to regulatory molecules or mutations could lead to different diseases. MicroRNAs (miRNAs) have been discovered to be involved in regulation of gene expression and a wide variety of diseases. In a tripartite biological network of human miRNAs, their predicted target genes and the diseases caused by altered expression of these genes, valuable knowledge about the pathogenicity of miRNAs, involved genes and related disease classes can be revealed by co-clustering miRNAs, target genes and diseases simultaneously. By considering multi-type members, tripartite co-clustering can lead to more informative results than traditional co-clustering with only two kinds of members, and it passes the hidden relational information along the relation chain. Here we report a spectral co-clustering algorithm for k-partite graphs to find clusters with heterogeneous members. We use the method to explore the potential relationships among miRNAs, genes and diseases. The clusters obtained from the algorithm have significantly higher density than randomly selected clusters, which means members in the same cluster are more likely to have common connections. Results also show that miRNAs in the same family based on the hairpin sequences tend to belong to the same cluster. We also validate the clustering results by checking the correlation of enriched gene functions and disease classes in the same cluster. Finally, the widely studied miR-17-92 and its paralogs are analyzed as a case study to reveal that genes and diseases co-clustered with the miRNAs are in accordance with current research findings.


2017 ◽  
Vol 5 (1) ◽  
pp. 135-150
Author(s):  
Kusnan Kusnan

Arabic is the language of the Muslim holy book, and reading it is obligatory for Muslims. In Islamic education, Arabic is the language that should be mastered as a means of understanding the original texts of the source of Islamic law. One of the important things in learning Arabic is the method. Zam-Zam Muhammadiyyah Modern Islamic Boarding School implements a different model and method of learning Arabic compared to other Islamic boarding schools in the district of Cilongok. This is qualitative research, conducted as a case study using interview, observation and documentation techniques for collecting data and interactive analysis for analyzing data. The findings of this research are three models of Arabic learning in Pondok Pesantren. The first model is khiwar or muhadatsah, the second is mufrodat walls intended to make students familiar with Arabic vocabulary, and the third is Lughoh. The method and model of Arabic learning in the institution as described above is a combined method. There are at least three methods used, i.e. Communicative Problem-Based Learning Method, Audiolingual Method, and Grammar-Translation Method.


2021 ◽  
Author(s):  
Mirela T. Cazzolato ◽  
Lucas S. Rodrigues ◽  
Marcela X. Ribeiro ◽  
Marco A. Gutierrez ◽  
Caetano Traina Jr. ◽  
...  

With the COVID-19 pandemic, many hospitals have collected Electronic Health Records (EHRs) from patients and shared them publicly. EHRs include heterogeneous attribute types, such as image exams, numerical, textual, and categorical information. Simply posing similarity queries over EHRs can underestimate the semantics and potential information of particular attributes, so such queries would be best supported by exploratory data analysis methods. We therefore propose the Sketch method for comparing EHRs by similarity to provide a tool for a correlation-based exploratory analysis over different attributes. Sketch computes the overall data correlation considering the distance space of every attribute. Further, it employs both ANOVA and association rules with lift correlations to study the relationship between variables, allowing a deep data analysis. As a case study, we employed two open databases of COVID-19 cases, showing that specialists can benefit from the inference modules of Sketch to analyze EHRs. Sketch found strong correlations among tuples and attributes, with statistically significant results. The exploratory analysis was shown to complement the similarity search task, identifying and evaluating patterns discovered from heterogeneous attributes.
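The abstract mentions association rules scored with lift. As a minimal sketch of that standard measure, lift(A → B) = P(A and B) / (P(A) · P(B)), the code below computes it over a handful of records. This is not Sketch's implementation, which the abstract does not describe; the function name `lift` and the attribute values are hypothetical.

```python
def lift(records, antecedent, consequent):
    """Lift of the rule antecedent -> consequent over a list of item sets.

    Values above 1 suggest the two item sets co-occur more often than
    expected under independence; values below 1 suggest the opposite.
    """
    n = len(records)
    if n == 0:
        raise ValueError("records must not be empty")
    n_a = sum(1 for r in records if antecedent <= r)
    n_c = sum(1 for r in records if consequent <= r)
    n_both = sum(1 for r in records if (antecedent | consequent) <= r)
    if n_a == 0 or n_c == 0:
        return 0.0  # rule is undefined when either side never occurs
    # (n_both / n) / ((n_a / n) * (n_c / n)) simplifies to the line below.
    return n_both * n / (n_a * n_c)


# Hypothetical categorical attributes for four patient records.
records = [
    {"fever", "cough"},
    {"fever", "cough"},
    {"fever"},
    {"headache"},
]

fever_to_cough = lift(records, {"fever"}, {"cough"})
```

Here "fever" occurs in three records, "cough" in two, and both together in two, so the lift is 2·4 / (3·2) ≈ 1.33.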


2020 ◽  
Vol 43 (2) ◽  
pp. 73
Author(s):  
Selamet Eko Edy Saputro ◽  
Dwiningtyas Padmaningrum ◽  
Arip Wijianto

One form of traditional knowledge that still persists is the wiwitan tradition in Kedon Hamlet, Sumbermulyo Village, Bambanglipuro Subdistrict, Bantul Regency. The persistence of the wiwitan tradition in Kedon Hamlet, at a time when appreciation of cultural heritage across generations has started to fade, indicates that preservation efforts by the local community still exist. These efforts can be carried out through learning and dissemination of the wiwitan tradition by the local communities. Based on this idea, this research aimed to (1) determine how the communities of Kedon Hamlet, Sumbermulyo Village, Bambanglipuro Subdistrict, Bantul Regency disseminate the wiwitan tradition, and (2) determine how the communities of Kedon Hamlet, Sumbermulyo Village, Bambanglipuro Subdistrict, Bantul Regency learn the wiwitan tradition. This research used a single case study method that examines the phenomenon of the persistence of the wiwitan tradition. The data were collected through observation, documentation and in-depth interviews with informants. The informants were selected purposively. The informants in the research are the head of Kedon Hamlet, the chairman of the farmer groups, and members of the local communities. The data were analyzed using the interactive analysis model of Miles and Huberman. This research showed that (1) the communities of Kedon Hamlet disseminated the wiwitan tradition in the form of a cultural carnival, word of mouth and digital media, and (2) the communities of Kedon Hamlet learned the wiwitan tradition through social learning.


2015 ◽  
Vol 112 (27) ◽  
pp. E3545-E3554 ◽  
Author(s):  
Xu Wang ◽  
John H. Werren ◽  
Andrew G. Clark

There is extraordinary diversity in sexual dimorphism (SD) among animals, but little is known about its epigenetic basis. To study the epigenetic architecture of SD in a haplodiploid system, we performed RNA-seq and whole-genome bisulfite sequencing of adult females and males from two closely related parasitoid wasps, Nasonia vitripennis and Nasonia giraulti. More than 75% of expressed genes displayed significantly sex-biased expression. As a consequence, expression profiles are more similar between species within each sex than between sexes within each species. Furthermore, extremely male- and female-biased genes are enriched for totally different functional categories: male-biased genes for key enzymes in sex-pheromone synthesis and female-biased genes for genes involved in epigenetic regulation of gene expression. Remarkably, just 70 highly expressed, extremely male-biased genes account for 10% of all transcripts in adult males. Unlike expression profiles, DNA methylomes are highly similar between sexes within species, with no consistent sex differences in methylation found. Therefore, methylation changes cannot explain the extensive level of sex-biased gene expression observed. Female-biased genes have smaller sequence divergence between species, higher conservation to other hymenopterans, and a broader expression range across development. Overall, female-biased genes have been recruited from genes with more conserved and broadly expressing “house-keeping” functions, whereas male-biased genes are more recently evolved and are predominately testis specific. In summary, Nasonia accomplish a striking degree of sex-biased expression without sex chromosomes or epigenetic differences in methylation. We propose that methylation provides a general signal for constitutive gene expression, whereas other sex-specific signals cause sex-biased gene expression.

