iCOBRA: open, reproducible, standardized and live method benchmarking

Mapping Intimacies ◽

10.1101/033431 ◽

2015 ◽

Author(s):

Charlotte Soneson ◽

Mark D Robinson

Keyword(s):

Ground Truth ◽

R Package ◽

General Purpose ◽

Web Based ◽

Standard Format ◽

Standardized Method

We present iCOBRA, a flexible general-purpose web-based application and accompanying R package to evaluate, compare and visualize the performance of methods for estimation or classification when ground truth is available. iCOBRA is interactive, can be run locally or remotely and generates customizable, publication-ready graphics. To facilitate open, reproducible and standardized method comparisons, expanding as new innovations are made, we encourage the community to provide benchmark results in a standard format.

Download Full-text

Web-based HMI of Industrial Controllers for General Purpose

202020 3rd IEEE International Conference on Knowledge Innovation and Invention (ICKII) ◽

10.1109/ickii50300.2020.9318905 ◽

2020 ◽

Author(s):

Shyr-Long Jeng ◽

Wei-Hua Chieng

Keyword(s):

General Purpose ◽

Web Based

Download Full-text

MF2C3: Multi-Feature Fuzzy Clustering to Enhance Cell Colony Detection in Automated Clonogenic Assay Evaluation

Symmetry ◽

10.3390/sym12050773 ◽

2020 ◽

Vol 12 (5) ◽

pp. 773 ◽

Cited By ~ 3

Author(s):

Carmelo Militello ◽

Leonardo Rundo ◽

Luigi Minafra ◽

Francesco Paolo Cammarata ◽

Marco Calvaruso ◽

...

Keyword(s):

Clonogenic Assay ◽

Ground Truth ◽

General Purpose ◽

Computational Method ◽

Counting Procedure ◽

Proliferative Effect ◽

Biological Studies ◽

Biological Technique ◽

Fuzzy C Means Clustering ◽

Cell Colony

A clonogenic assay is a biological technique for calculating the Surviving Fraction (SF) that quantifies the anti-proliferative effect of treatments on cell cultures: this evaluation is often performed via manual counting of cell colony-forming units. Unfortunately, this procedure is error-prone and strongly affected by operator dependence. Besides, conventional assessment does not deal with the colony size, which is generally correlated with the delivered radiation dose or administered cytotoxic agent. Relying upon the direct proportional relationship between the Area Covered by Colony (ACC) and the colony count and size, along with the growth rate, we propose MF2C3, a novel computational method leveraging spatial Fuzzy C-Means clustering on multiple local features (i.e., entropy and standard deviation extracted from the input color images acquired by a general-purpose flat-bed scanner) for ACC-based SF quantification, by considering only the covering percentage. To evaluate the accuracy of the proposed fully automatic approach, we compared the SFs obtained by MF2C3 against the conventional counting procedure on four different cell lines. The achieved results revealed a high correlation with the ground-truth measurements based on colony counting, by outperforming our previously validated method using local thresholding on L*u*v* color well images. In conclusion, the proposed multi-feature approach, which inherently leverages the concept of symmetry in the pixel local distributions, might be reliably used in biological studies.

Download Full-text

TFEA.ChIP: a tool kit for transcription factor binding site enrichment analysis capitalizing on ChIP-seq datasets

Bioinformatics ◽

10.1093/bioinformatics/btz573 ◽

2019 ◽

Vol 35 (24) ◽

pp. 5339-5340 ◽

Cited By ~ 8

Author(s):

Laura Puente-Santamaria ◽

Wyeth W Wasserman ◽

Luis del Peso

Keyword(s):

Genomic Analysis ◽

Enrichment Analysis ◽

R Package ◽

Supplementary Information ◽

Web Based ◽

Factor Binding Site ◽

Gene Sets ◽

Transcription Regulators ◽

Computational Identification ◽

On Chip

Abstract Summary The computational identification of the transcription factors (TFs) [more generally, transcription regulators, (TR)] responsible for the co-regulation of a specific set of genes is a common problem found in genomic analysis. Herein, we describe TFEA.ChIP, a tool that makes use of ChIP-seq datasets to estimate and visualize TR enrichment in gene lists representing transcriptional profiles. We validated TFEA.ChIP using a wide variety of gene sets representing signatures of genetic and chemical perturbations as input and found that the relevant TR was correctly identified in 126 of a total of 174 analyzed. Comparison with other TR enrichment tools demonstrates that TFEA.ChIP is an highly customizable package with an outstanding performance. Availability and implementation TFEA.ChIP is implemented as an R package available at Bioconductor https://www.bioconductor.org/packages/devel/bioc/html/TFEA.ChIP.html and github https://github.com/LauraPS1/TFEA.ChIP_downloads. A web-based GUI to the package is also available at https://www.iib.uam.es/TFEA.ChIP/ Supplementary information Supplementary data are available at Bioinformatics online.

Download Full-text

modelBuildR: an R package for model building and feature selection with erroneous classifications

PeerJ ◽

10.7717/peerj.10849 ◽

2021 ◽

Vol 9 ◽

pp. e10849

Author(s):

Maximilian Knoll ◽

Jennifer Furkel ◽

Juergen Debus ◽

Amir Abdollahi

Keyword(s):

Feature Selection ◽

Cross Validation ◽

Model Building ◽

Linear Models ◽

Binary Classification ◽

Ground Truth ◽

R Package ◽

Methylation Array ◽

Survival Difference ◽

Error Probabilities

Background Model building is a crucial part of omics based biomedical research to transfer classifications and obtain insights into underlying mechanisms. Feature selection is often based on minimizing error between model predictions and given classification (maximizing accuracy). Human ratings/classifications, however, might be error prone, with discordance rates between experts of 5–15%. We therefore evaluate if a feature pre-filtering step might improve identification of features associated with true underlying groups. Methods Data was simulated for up to 100 samples and up to 10,000 features, 10% of which were associated with the ground truth comprising 2–10 normally distributed populations. Binary and semi-quantitative ratings with varying error probabilities were used as classification. For feature preselection standard cross-validation (V2) was compared to a novel heuristic (V1) applying univariate testing, multiplicity adjustment and cross-validation on switched dependent (classification) and independent (features) variables. Preselected features were used to train logistic regression/linear models (backward selection, AIC). Predictions were compared against the ground truth (ROC, multiclass-ROC). As use case, multiple feature selection/classification methods were benchmarked against the novel heuristic to identify prognostically different G-CIMP negative glioblastoma tumors from the TCGA-GBM 450 k methylation array data cohort, starting from a fuzzy umap based rough and erroneous separation. Results V1 yielded higher median AUC ranks for two true groups (ground truth), with smaller differences for true graduated differences (3–10 groups). Lower fractions of models were successfully fit with V1. Median AUCs for binary classification and two true groups were 0.91 (range: 0.54–1.00) for V1 (Benjamini-Hochberg) and 0.70 (0.28–1.00) for V2, 13% (n = 616) of V2 models showed AUCs < = 50% for 25 samples and 100 features. For larger numbers of features and samples, median AUCs were 0.75 (range 0.59–1.00) for V1 and 0.54 (range 0.32–0.75) for V2. In the TCGA-GBM data, modelBuildR allowed best prognostic separation of patients with highest median overall survival difference (7.51 months) followed a difference of 6.04 months for a random forest based method. Conclusions The proposed heuristic is beneficial for the retrieval of features associated with two true groups classified with errors. We provide the R package modelBuildR to simplify (comparative) evaluation/application of the proposed heuristic (http://github.com/mknoll/modelBuildR).

Download Full-text

PRISMA2020: an R package and Shiny app for producing PRISMA 2020-compliant flow diagrams, with interactivity for optimised digital transparency and Open Synthesis

10.1101/2021.07.14.21260492 ◽

2021 ◽

Author(s):

Neal R Haddaway ◽

Matthew J Page ◽

Christopher C Pritchard ◽

Luke A McGuinness

Keyword(s):

Systematic Review ◽

Systematic Reviews ◽

Science Communication ◽

Review Process ◽

Evidence Synthesis ◽

R Package ◽

Web Based ◽

Shiny App ◽

Data Files ◽

Flow Diagrams

Background Reporting standards, such as PRISMA aim to ensure that the methods and results of systematic reviews are described in sufficient detail to allow full transparency. Flow diagrams in evidence syntheses allow the reader to rapidly understand the core procedures used in a review and examine the attrition of irrelevant records throughout the review process. Recent research suggests that use of flow diagrams in systematic reviews is poor and of low quality and called for standardised templates to facilitate better reporting in flow diagrams. The increasing options for interactivity provided by the Internet gives us an opportunity to support easy-to-use evidence synthesis tools, and here we report on the development of tools for the production of PRISMA 2020-compliant systematic review flow diagrams. Methods and Findings We developed a free-to-use, Open Source R package and web-based Shiny app to allow users to design PRISMA flow diagrams for their own systematic reviews. Our tools allow users to produce standardised visualisations that transparently document the methods and results of a systematic review process in a variety of formats. In addition, we provide the opportunity to produce interactive, web-based flow diagrams (exported as HTML files), that allow readers to click on boxes of the diagram and navigate to further details on methods, results or data files. We provide an interactive example here; https://driscoll.ntu.ac.uk/prisma/. Conclusions We have developed a user-friendly suite of tools for producing PRISMA 2020-compliant flow diagrams for users with coding experience and, importantly, for users without prior experience in coding by making use of Shiny. These free-to-use tools will make it easier to produce clear and PRISMA 2020-compliant systematic review flow diagrams. Significantly, users can also produce interactive flow diagrams for the first time, allowing readers of their reviews to smoothly and swiftly explore and navigate to further details of the methods and results of a review. We believe these tools will increase use of PRISMA flow diagrams, improve the compliance and quality of flow diagrams, and facilitate strong science communication of the methods and results of systematic reviews by making use of interactivity. We encourage the systematic review community to make use of these tools, and provide feedback to streamline and improve their usability and efficiency.

Download Full-text

Design and Build Reporting Acceptance Test Procedure (ATP) for Web-Based BTS Installation

Jurnal Jartel Jurnal Jaringan Telekomunikasi ◽

10.33795/jartel.v3i2.219 ◽

2016 ◽

Vol 3 (2) ◽

pp. 40-46

Author(s):

Yanna Maharastri

Keyword(s):

System Design ◽

Test Procedure ◽

General Purpose ◽

Acceptance Test ◽

Work Time ◽

Web Based ◽

Service Testing ◽

Report System ◽

Report Data ◽

Using Data

The general purpose of making an ATP (Acceptance Test Procedure) report is to test whether the system that has been worked on is in accordance with the function specifications (validation). The ATP report used by an agency is infrastructure, equipment installation, etc. The manufacture of ATP for BTS installation on telecommunication subcons is manual which can reduce a lot of work time and data discrepancies. Therefore, a web-based ATP report system was designed to be able to perform reports with proper coordinate validation prior to installation and can also save work time. The web-based ATP report can match the important points of BTS installation. System planning starts from data collection and analysis to be used as a web-based report. Data transmission will be accommodated on the server and will be stored in the data base using My SQL. After the design is complete, it can be seen that with the analysis stage of the system design and system design using Data Flow Diagrams (DFD), input output design, responses from each each user with the function of each form content. for user engineers there are 97.2% agree from some menu functions, for user documentation there are 97.5% agree with the system, for user owners there are 91.65% agree and also QoS (Quality Of Service) testing is carried out for several cellular operators namely Im3 ooredoo , XL, Telkomsel and also the Polynema wifi network.

Download Full-text

Improving the Efficiency of Image Interpretation Using Ground Truth Terrestrial Photographs

Environmental Information Systems ◽

10.4018/978-1-5225-7033-2.ch049 ◽

2019 ◽

pp. 1098-1128

Author(s):

Gennady Gienko ◽

Michael Govorov

Keyword(s):

Data Collection ◽

Image Interpretation ◽

Ground Truth ◽

Remotely Sensed ◽

Natural Sciences ◽

Aerial Photographs ◽

Field Observations ◽

Web Based ◽

Remotely Sensed Imagery ◽

The Social

Researchers worldwide use remotely sensed imagery in their projects, in both the social and natural sciences. However, users often encounter difficulties working with satellite images and aerial photographs, as image interpretation requires specific experience and skills. The best way to acquire these skills is to go into the field, identify your location in an overhead image, observe the landscape, and find corresponding features in the overhead image. In many cases, personal observations could be substituted by using terrestrial photographs taken from the ground with conventional cameras. This chapter discusses the value of terrestrial photographs as a substitute for field observations, elaborates on issues of data collection, and presents results of experimental estimation of the effectiveness of the use of terrestrial ground truth photographs for interpretation of remotely sensed imagery. The chapter introduces the concept of GeoTruth – a web-based collaborative framework for collection, storing and distribution of ground truth terrestrial photographs and corresponding metadata.

Download Full-text

Entity deduplication in big data graphs for scholarly communication

Data Technologies and Applications ◽

10.1108/dta-09-2019-0163 ◽

2020 ◽

Vol 54 (4) ◽

pp. 409-435

Author(s):

Paolo Manghi ◽

Claudio Atzori ◽

Michele De Bonis ◽

Alessia Bardi

Keyword(s):

Big Data ◽

Ground Truth ◽

Open Science ◽

General Purpose ◽

High Level Architecture ◽

Multiple Sources ◽

Actual Problem ◽

Content Type ◽

Infrastructure System ◽

High Level

PurposeSeveral online services offer functionalities to access information from “big research graphs” (e.g. Google Scholar, OpenAIRE, Microsoft Academic Graph), which correlate scholarly/scientific communication entities such as publications, authors, datasets, organizations, projects, funders, etc. Depending on the target users, access can vary from search and browse content to the consumption of statistics for monitoring and provision of feedback. Such graphs are populated over time as aggregations of multiple sources and therefore suffer from major entity-duplication problems. Although deduplication of graphs is a known and actual problem, existing solutions are dedicated to specific scenarios, operate on flat collections, local topology-drive challenges and cannot therefore be re-used in other contexts.Design/methodology/approachThis work presents GDup, an integrated, scalable, general-purpose system that can be customized to address deduplication over arbitrary large information graphs. The paper presents its high-level architecture, its implementation as a service used within the OpenAIRE infrastructure system and reports numbers of real-case experiments.FindingsGDup provides the functionalities required to deliver a fully-fledged entity deduplication workflow over a generic input graph. The system offers out-of-the-box Ground Truth management, acquisition of feedback from data curators and algorithms for identifying and merging duplicates, to obtain an output disambiguated graph.Originality/valueTo our knowledge GDup is the only system in the literature that offers an integrated and general-purpose solution for the deduplication graphs, while targeting big data scalability issues. GDup is today one of the key modules of the OpenAIRE infrastructure production system, which monitors Open Science trends on behalf of the European Commission, National funders and institutions.

Download Full-text

www.wapip.com: A Special-Purpose Search Engine for Web Applications in Product Introduction Process

Volume 2: 26th Design Automation Conference ◽

10.1115/detc2000/dac-14517 ◽

2000 ◽

Author(s):

G. Q. Huang ◽

B. Shen ◽

K. L. Mak

Keyword(s):

Product Design ◽

Search Engine ◽

Web Applications ◽

General Purpose ◽

New Product Introduction ◽

Design Development ◽

Product Introduction ◽

Web Based ◽

The Right ◽

Design And Manufacture

Abstract There have been widespread interests in the development and application of World Wide Web (WWW) to support decision-making activities in product design and manufacture. An increasing number of web applications are emerging and a large number of practitioners are keen on trying these web-based decision support systems. In the meanwhile, it becomes increasingly difficult to surf for appropriate web applications on the Internet with general-purpose search engines. This paper describes a web site, WAPIP, that has been developed specifically to support new product introduction activities. It provides databases for software vendors and researchers to register their web applications with the “wapip” search engine. It also provides facilities to support practitioners in product design and manufacture to search rapidly for the right web applications suitable for solving their problems. This paper discusses the various issues regarding the design, development and operation of this “wapip” search engine.

Download Full-text

Biclique: An R package for Maximal Biclique Enumeration in Bipartite Graphs

10.21203/rs.2.16755/v2 ◽

2020 ◽

Author(s):

Yuping Lu ◽

Charles A. Phillips ◽

Michael A. Langston

Keyword(s):

State Of The Art ◽

Basic Research ◽

R Package ◽

Bipartite Graphs ◽

Heterogeneous Data ◽

General Purpose ◽

Public Repository ◽

Data Types ◽

Statistical Programming ◽

Reference Manual

Abstract Objective Bipartite graphs are widely used to model relationships between pairs of heterogeneous data types. Maximal bicliques are foundational structures in such graphs, and their enumeration is an important task in systems biology, epidemiology and many other problem domains. Thus, there is a need for an efficient, general purpose, publicly available tool to enumerate maximal bicliques in bipartite graphs. The statistical programming language R is a logical choice for such a tool, but until now no R package has existed for this purpose. Our objective is to provide such a package, so that the research community can more easily perform this computationally demanding task. Results Biclique is an R package that takes as input a bipartite graph and produces a listing of all maximal bicliques in this graph. Input and output formats are straightforward, with examples provided both in this paper and in the package documentation. Biclique employs a state-of-the-art algorithm previously developed for basic research in functional genomics. This package, along with its source code and reference manual, are freely available from the CRAN public repository at https://cran.r-project.org/web/packages/biclique/index.html .

Download Full-text