Automated Gene Data Integration with Databio

Boa: a link between worlds

10.7287/peerj.preprints.1947v1 ◽

2016 ◽

Author(s):

Stephen Romansky ◽

Sadegh Charmchi ◽

Abram Hindle

Keyword(s):

Business Models ◽

Source Code ◽

Data Sets ◽

Additional Insight ◽

Software Projects ◽

Software Developers ◽

Topic Analysis ◽

Platform As A Service ◽

Software Changes ◽

The Web

The business models of software/platform as a service have contributed to developers' dependence on the Internet. Developers can rapidly point each other and consumers to the newest software changes with the power of the hyperlink. However, developers are not limited to referencing software changes to one another through the web. Other shared hypermedia might include links to Stack Overflow, Twitter, and issue trackers. This work explores the software traceability of Uniform Resource Locators (URLs) which software developers leave in commit messages and software repositories. URLs are easily extracted from commit messages and source code. Therefore, it would be useful to researchers if URLs provided additional insight into project development. To assess traceability, manual topic labelling is evaluated against automated topic labelling on URL data sets. This work also shows differences between URL data collected from commit messages and URL data collected from source code. In addition, this work explores outlying software projects with many URLs in case these projects do not provide meaningful software relationship information. Results from manual topic labelling show promise under evaluation, while automated topic labelling did not yield precise topics. Further investigation of manual and automated topic analysis would be useful.
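The abstract notes that URLs are easily extracted from commit messages and source code. As a rough illustration only, and not the authors' pipeline, the following Python sketch pulls URLs out of free text with a regular expression and tallies them by host. The pattern and the function names `extract_urls` and `count_domains` are assumptions made for this example, and the sample messages are invented.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumed pattern: "http(s)://" followed by characters that rarely end a URL in prose.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(text):
    """Return the URLs found in text, with trailing punctuation trimmed."""
    return [match.rstrip(".,;:!?") for match in URL_PATTERN.findall(text)]


def count_domains(messages):
    """Count how often each host appears across a list of messages."""
    counts = Counter()
    for message in messages:
        for url in extract_urls(message):
            host = urlparse(url).netloc.lower()
            if host:
                counts[host] += 1
    return counts


if __name__ == "__main__":
    # Invented commit messages, for illustration only.
    sample = [
        "Fix crash, see https://stackoverflow.com/q/12345.",
        "Closes https://github.com/example/project/issues/7",
    ]
    print(count_domains(sample))
```

Grouping by host is only one possible first step before topic labelling; the same extraction could be run over source files to compare the two URL populations.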

Matchathon: A guide to student-faculty connections in PhD programs

10.1101/2020.11.06.371526 ◽

2020 ◽

Author(s):

Haley Amemiya ◽

Zena Lapp ◽

Cathy Smith ◽

Margaret Durdan ◽

Michelle DiMondo ◽

...

Keyword(s):

Open Source ◽

Source Code ◽

Faculty Members ◽

Retention Rates ◽

Link Type ◽

Shiny App ◽

R Shiny ◽

Web App ◽

Phd Programs ◽

The Web

Abstract: Relevant and impactful mentors are essential to a graduate student’s career. Finding mentors can be challenging in umbrella programs with hundreds of faculty members. To foster connections between potential mentors and students with similar research interests, we created a Matchathon event, which has successfully enabled students to find mentors. We developed an easy-to-use R Shiny app (https://github.com/UM-OGPS/matchathon/) to facilitate matching and organizing the event that can be used at any institution. It is our hope that this resource will improve the environment and retention rates for students in the academy. The open source app is publicly available on the web (app: https://UM-OGPS.shinyapps.io/matchathon/; source code: https://github.com/UM-OGPS/matchathon/).

MMseqs2 desktop and local web server app for fast, interactive sequence searches

10.1101/419895 ◽

2018 ◽

Author(s):

Milot Mirdita ◽

Martin Steinegger ◽

Johannes Söding

Keyword(s):

Protein Sequence ◽

Response Times ◽

Source Code ◽

Web Server ◽

Link Type ◽

Server Application ◽

The Web

Summary: The MMseqs2 desktop and web server app facilitates interactive sequence searches through custom protein sequence and profile databases on personal workstations. By eliminating MMseqs2’s runtime overhead, we reduced response times to a few seconds at sensitivities close to BLAST. Availability and implementation: The app is easy to install for non-experts. Source code, prebuilt desktop app packages for Windows, macOS and Linux, Docker images for the web server application, and a demo web server are available at https://[email protected] or [email protected]

AKT: Ancestry and Kinship Toolkit

10.1101/047829 ◽

2016 ◽

Author(s):

Rudy Arthur ◽

Ole Schulz-Trieglaff ◽

Anthony J. Cox ◽

Jared Michael O’Connell

Keyword(s):

Data Clustering ◽

State Of The Art ◽

Source Code ◽

Statistical Genetics ◽

Data Sets ◽

Whole Genome ◽

Link Type ◽

Art Methods ◽

Invaluable Tool

Abstract: Ancestry and Kinship Toolkit (AKT) is a statistical genetics tool for analysing large cohorts of whole-genome sequenced samples. It can rapidly detect related samples, characterise sample ancestry, calculate correlation between variants, check Mendel consistency and perform data clustering. AKT brings together the functionality of many state-of-the-art methods, with a focus on speed and a unified interface. We believe it will be an invaluable tool for the curation of large WGS data-sets. Availability: The source code is available at https://illumina.github.io/[email protected], [email protected]

Boa: a link between worlds

10.7287/peerj.preprints.1947 ◽

2016 ◽

Author(s):

Stephen Romansky ◽

Sadegh Charmchi ◽

Abram Hindle

Keyword(s):

Business Models ◽

Source Code ◽

Data Sets ◽

Additional Insight ◽

Software Projects ◽

Software Developers ◽

Topic Analysis ◽

Platform As A Service ◽

Software Changes ◽

The Web

The business models of software/platform as a service have contributed to developers' dependence on the Internet. Developers can rapidly point each other and consumers to the newest software changes with the power of the hyperlink. However, developers are not limited to referencing software changes to one another through the web. Other shared hypermedia might include links to Stack Overflow, Twitter, and issue trackers. This work explores the software traceability of Uniform Resource Locators (URLs) which software developers leave in commit messages and software repositories. URLs are easily extracted from commit messages and source code. Therefore, it would be useful to researchers if URLs provided additional insight into project development. To assess traceability, manual topic labelling is evaluated against automated topic labelling on URL data sets. This work also shows differences between URL data collected from commit messages and URL data collected from source code. In addition, this work explores outlying software projects with many URLs in case these projects do not provide meaningful software relationship information. Results from manual topic labelling show promise under evaluation, while automated topic labelling did not yield precise topics. Further investigation of manual and automated topic analysis would be useful.

A decoupled, modular and scriptable architecture for tools to curate data platforms

10.1101/2020.09.28.282699 ◽

2020 ◽

Author(s):

Moritz Langenstein ◽

Henning Hermjakob ◽

Manuel Bernal Llinares

Keyword(s):

Web Application ◽

Production Systems ◽

Source Code ◽

Black Box ◽

Command Line ◽

Web Interface ◽

Link Type ◽

Data Platform ◽

The Web

Motivation: Curation is essential for any data platform to maintain the quality of the data it provides. Existing databases, which require maintenance, and the amount of newly published information that needs to be surveyed, are growing rapidly. More efficient curation is often vital to keep up with this growth, requiring modern curation tools. However, curation interfaces are often complex and difficult to further develop. Furthermore, opportunities for experimentation with curation workflows may be lost due to a lack of development resources, or a reluctance to change sensitive production systems. Results: We propose a decoupled, modular and scriptable architecture to build curation tools on top of existing platforms. Instead of modifying the existing infrastructure, our architecture treats the existing platform as a black box and relies only on its public APIs and web application. As a decoupled program, the tool’s architecture gives more freedom to developers and curators. This added flexibility allows for quickly prototyping new curation workflows as well as adding all kinds of analysis around the data platform. The tool can also streamline and enhance the curator’s interaction with the web interface of the platform. We have implemented this design in cmd-iaso, a command-line curation tool for the identifiers.org registry. Availability: The cmd-iaso curation tool is implemented in Python 3.7+ and supports Linux, macOS and Windows. Its source code and documentation are freely available from https://github.com/identifiers-org/cmd-iaso. It is also published as a Docker container at https://hub.docker.com/r/identifiersorg/[email protected]
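To make the idea of a decoupled, modular and scriptable curation tool more concrete, here is a minimal Python sketch. It is not cmd-iaso's code or API. The names `Entry`, `Finding`, `register`, `curate`, the two example checks, and the `{id}` placeholder are all hypothetical. The platform is represented only by an injected `fetch_entries` callable, which stands in for a call to a public API, so the tool never touches the platform's internals.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Entry:
    """Hypothetical registry record as it might be returned by a public API."""
    prefix: str
    url_pattern: str


@dataclass(frozen=True)
class Finding:
    """One issue reported by one check for one entry."""
    prefix: str
    check: str
    message: str


Check = Callable[[Entry], List[str]]
CHECKS: Dict[str, Check] = {}


def register(name: str):
    """Decorator that plugs a new check into the tool without changing the core loop."""
    def decorator(func: Check) -> Check:
        CHECKS[name] = func
        return func
    return decorator


@register("https-only")
def check_https(entry: Entry) -> List[str]:
    if entry.url_pattern.startswith("https://"):
        return []
    return ["URL pattern does not use HTTPS"]


@register("has-placeholder")
def check_placeholder(entry: Entry) -> List[str]:
    # "{id}" is an assumed placeholder syntax for this sketch.
    if "{id}" in entry.url_pattern:
        return []
    return ["URL pattern lacks the {id} placeholder"]


def curate(fetch_entries: Callable[[], Iterable[Entry]]) -> List[Finding]:
    """Run every registered check over entries obtained from the black-box platform."""
    findings = []
    for entry in fetch_entries():
        for name, check in CHECKS.items():
            for message in check(entry):
                findings.append(Finding(entry.prefix, name, message))
    return findings


if __name__ == "__main__":
    def fake_api() -> List[Entry]:
        # Stand-in for a real HTTP call to the platform's public API.
        return [Entry("demo", "http://example.org/{id}")]

    for finding in curate(fake_api):
        print(finding)
```

Because the data source is injected and checks are registered by name, a new curation workflow can be prototyped by adding one function, without any change to the production platform.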

HiC-Spector: A matrix library for spectral and reproducibility analysis of Hi-C contact maps

10.1101/088922 ◽

2016 ◽

Cited By ~ 1

Author(s):

Koon-Kiu Yan ◽

Galip Gürkan Yardimci ◽

William S. Noble ◽

Mark Gerstein

Keyword(s):

Spectral Analysis ◽

Data Structures ◽

Spectral Decomposition ◽

Source Code ◽

Cell Types ◽

Proximity Ligation ◽

Contact Maps ◽

Link Type ◽

Genome Wide ◽

Different Cell Types

Summary: Genome-wide proximity ligation based assays like Hi-C have opened a window to the 3D organization of the genome. In so doing, they present data structures that are different from conventional 1D signal tracks. To exploit the 2D nature of Hi-C contact maps, matrix techniques like spectral analysis are particularly useful. Here, we present HiC-spector, a collection of matrix-related functions for analyzing Hi-C contact maps. In particular, we introduce a novel reproducibility metric for quantifying the similarity between contact maps based on spectral decomposition. The metric successfully separates contact maps mapped from Hi-C data coming from biological replicates, pseudo-replicates and different cell types. Availability: Source code in Julia and the documentation of HiC-spector can be freely obtained at https://github.com/gersteinlab/[email protected]

ComPath: An ecosystem for exploring, analyzing, and curating mappings across pathway databases

10.1101/353235 ◽

2018 ◽

Author(s):

Daniel Domingo-Fernández ◽

Charles Tapley Hoyt ◽

Carlos Bobis-Álvarez ◽

Josep Marín-Llaó ◽

Martin Hofmann-Apitius

Keyword(s):

Parkinson's Disease ◽

Web Application ◽

Biological Systems ◽

Source Code ◽

Major Pathway ◽

Link Type ◽

Pathway Databases ◽

The Web

Abstract: Although pathways are widely used for the analysis and representation of biological systems, their lack of clear boundaries, their dispersion across numerous databases, and the lack of interoperability impede the evaluation of the coverage, agreements, and discrepancies between them. Here, we present ComPath, an ecosystem that supports curation of pathway mappings between databases and fosters the exploration of pathway knowledge through several novel visualizations. We have curated mappings between three of the major pathway databases and present a case study focusing on Parkinson’s disease that illustrates how ComPath can generate new biological insights by identifying pathway modules, clusters, and cross-talks with these mappings. The ComPath source code and resources are available at https://github.com/ComPath and the web application can be accessed at http://compath.scai.fraunhofer.de/.

libsbmljs — Enabling Web–Based SBML Tools

10.1101/594804 ◽

2019 ◽

Author(s):

J Kyle Medley ◽

Joseph Hellerstein ◽

Herbert M Sauro

Keyword(s):

Systems Biology ◽

Model Building ◽

Source Code ◽

Stochastic Simulations ◽

Web Based ◽

Server Side ◽

Link Type ◽

Software Distribution ◽

Api Documentation ◽

The Web

The SBML standard is used in a number of online repositories for storing systems biology models, yet there is currently no Web–capable JavaScript library that can read and write the SBML format. This is a severe limitation since the Web has become a universal means of software distribution, and the graphical capabilities of modern web browsers offer a powerful means for building rich, interactive applications. Also, there is a growing developer population specialized in web technologies that is poised to take advantage of the universality of the web to build the next generation of tools in systems biology and other fields. However, current solutions require server–side processing in order to support existing standards in modeling. We present libsbmljs, a JavaScript / WebAssembly library for Node.js and the Web with full support for all SBML extensions. Our library is an enabling technology for online SBML editors, model–building tools, and web–based simulators, and runs entirely in the browser without the need for any dedicated server resources. We provide NPM packages, an extensive set of examples, JavaScript API documentation, and an online demo that allows users to read and validate the SBML content of any model in the BioModels and BiGG databases. We also provide instructions and scripts to allow users to build a copy of libsbmljs against any libSBML version. Although our library supports all existing SBML extensions, we cover how to add additional extensions to the wrapper, should any arise in the future. To demonstrate the utility of this implementation, we also provide a demo at https://libsbmljsdemo.github.io/ with a proof–of–concept SBML simulator that supports ODE and stochastic simulations for SBML core models. Our project is hosted at https://libsbmljs.github.io/, which contains links to examples, API documentation, and all source code files and build scripts used to create libsbmljs. Our source code is licensed under the Apache 2.0 open source license.

A Fast Method for Defogging of Outdoor Visual Images

Recent Patents on Computer Science ◽

10.2174/2213275912666190819105422 ◽

2019 ◽

Vol 12 ◽

Author(s):

Tannistha Pal

Keyword(s):

Intelligent Vehicles ◽

Visual Surveillance ◽

Qualitative Assessment ◽

Computational Time ◽

Data Sets ◽

Fast Method ◽

Time Data ◽

Image Defogging ◽

Clear Vision ◽

Dark Channel

Images captured in severe atmospheric conditions, especially fog, suffer from degraded quality and reduced visibility, which in turn affects several computer vision applications such as visual surveillance, intelligent vehicles, and remote sensing. Acquiring a clear view of the scene is therefore a prime requirement for such applications. In the last few years, many approaches have been proposed to solve this problem. In this article, a comparative analysis of existing image defogging algorithms is presented, and a defogging technique based on the dark channel prior is proposed. Experimental results show that the proposed method significantly improves the visual quality of images taken in foggy weather. The computational time of the existing techniques is also much higher, a drawback that the proposed method overcomes. Qualitative assessment is performed on both benchmark and real-time data sets to determine the efficacy of the technique. Finally, the work concludes with its relative advantages and shortcomings.
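For readers unfamiliar with the dark channel prior mentioned in the abstract, the sketch below shows the commonly used textbook formulation in plain Python. It is not the fast method proposed in the article, whose details the abstract does not give. The function names, the parameter defaults (`omega`, `radius`, `t_min`), and the simplified airlight estimate (the pixel with the largest dark channel value) are assumptions made for this example. Images are nested lists of `(r, g, b)` tuples with values in [0, 1].

```python
def dark_channel(image, radius=1):
    """Per-pixel minimum over the colour channels, then over a square patch."""
    h, w = len(image), len(image[0])
    min_rgb = [[min(px) for px in row] for row in image]
    dark = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            dark[y][x] = min(min_rgb[j][i] for j in ys for i in xs)
    return dark


def estimate_airlight(image, dark):
    """Simplified assumption: take the colour of the pixel with the largest dark channel."""
    _, y, x = max((dark[j][i], j, i)
                  for j in range(len(dark)) for i in range(len(dark[0])))
    return image[y][x]


def estimate_transmission(image, airlight, omega=0.95, radius=1):
    """t(x) = 1 - omega * dark_channel(I / A)."""
    safe = [max(a, 1e-6) for a in airlight]  # avoid division by zero
    normalized = [[tuple(c / a for c, a in zip(px, safe)) for px in row]
                  for row in image]
    return [[1.0 - omega * d for d in row]
            for row in dark_channel(normalized, radius)]


def recover(image, airlight, transmission, t_min=0.1):
    """J(x) = (I(x) - A) / max(t(x), t_min) + A, clipped to [0, 1]."""
    result = []
    for row, t_row in zip(image, transmission):
        out_row = []
        for px, t in zip(row, t_row):
            t = max(t, t_min)
            out_row.append(tuple(min(1.0, max(0.0, (c - a) / t + a))
                                 for c, a in zip(px, airlight)))
        result.append(out_row)
    return result


def dehaze(image, omega=0.95, radius=1, t_min=0.1):
    """Run the three steps in sequence on a small image."""
    airlight = estimate_airlight(image, dark_channel(image, radius))
    transmission = estimate_transmission(image, airlight, omega, radius)
    return recover(image, airlight, transmission, t_min)
```

The nested loops make the cost of the patch minimum explicit: it grows with both image size and patch area, which is one reason computational time is a concern for this family of methods.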