Raritas and RaritasVox: Programs for counting high diversity categorical data with highly unequal abundances

10.7287/peerj.preprints.26836 ◽

2018 ◽

Author(s):

David Lazarus ◽

Johan Renaudie ◽

Dorina Lenz ◽

Patrick Diver ◽

Jens Klump

Keyword(s):

Graphical User Interface ◽

Categorical Data ◽

Quantitative Data ◽

Source Code ◽

Flexible Structure ◽

Command Line ◽

File Format ◽

Occurrence Data ◽

Rare Category ◽

High Diversity

Data on the occurrences of many types of difficult-to-identify objects are often still acquired by human observation, e.g. in biodiversity and paleontologic research. Existing computer counting programs used to record such data have various limitations, including inflexibility and cost. We describe a pair of new open-source programs for this purpose, Raritas and RaritasVox, which share a similar graphical user interface for mouse-based counting and a common file output format. Raritas is written in Python and can be run as a standalone app on recent versions of either MacOS or Windows, or from the command line as easily customized source code. RaritasVox additionally supports voice-based counting but is written in Java and is more complex to install or modify. Both programs explicitly support a rare category count mode, which makes it easier to collect quantitative data on rare categories, e.g. rare species which are important in biodiversity surveys. Lastly, as to our knowledge no standards exist yet, we describe a new stratigraphic occurrence data (SOD) unitary file format, which combines extensive metadata and a flexible structure for recording occurrence data of species or other categories in a series of samples.

Raritas: a program for counting high diversity categorical data with highly unequal abundances

PeerJ ◽

10.7717/peerj.5453 ◽

2018 ◽

Vol 6 ◽

pp. e5453

Author(s):

David B. Lazarus ◽

Johan Renaudie ◽

Dorina Lenz ◽

Patrick Diver ◽

Jens Klump

Keyword(s):

Categorical Data ◽

Source Code ◽

Data File ◽

Command Line ◽

File Format ◽

Biodiversity Data ◽

Occurrence Data ◽

Source Program ◽

Rare Category ◽

High Diversity

Data on the occurrences of many types of difficult-to-identify objects are often still acquired by human observation, for example, in biodiversity and paleontologic research. Existing computer counting programs used to record such data have various limitations, including inflexibility and cost. We describe a new open-source program for this purpose: Raritas. Raritas is written in Python and can be run as a standalone app on recent versions of either MacOS or Windows, or from the command line as easily customized source code. The program explicitly supports a rare category count mode, which makes it easier to collect quantitative data on rare categories, for example, rare species which are important in biodiversity surveys. Lastly, we describe the file format used by Raritas and propose it as a standard for storing geologic biodiversity data. The ‘stratigraphic occurrence data’ file format combines extensive sample metadata and a flexible structure for recording occurrence data of species or other categories in a series of samples.
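
The abstract does not say how counts from a rare category count mode are turned into abundances, so the following is only a rough sketch of one plausible approach, not Raritas's documented method. It assumes a two-phase count: a normal phase in which every category is tallied, then a rare phase in which only designated rare categories are tallied over additional material. The function name, its parameters and the notion of "effort" (for instance, the amount of sample examined) are all hypothetical.

```python
def estimate_proportions(normal_counts, rare_counts, rare_categories,
                         normal_effort, rare_effort):
    """Combine a full count and a rare-only count into relative abundances.

    normal_counts:   category -> specimens tallied while ALL categories were counted
    rare_counts:     category -> specimens tallied while ONLY rare categories were counted
    rare_categories: set of categories that were still tallied in the rare phase
    *_effort:        amount of material examined in each phase (hypothetical unit;
                     an assumption of this sketch, not taken from the source)
    """
    if normal_effort <= 0 or rare_effort < 0:
        raise ValueError("normal_effort must be > 0 and rare_effort must be >= 0")

    density = {}
    for cat in set(normal_counts) | set(rare_categories):
        if cat in rare_categories:
            # Rare categories were tallied in both phases: use the combined effort.
            n = normal_counts.get(cat, 0) + rare_counts.get(cat, 0)
            density[cat] = n / (normal_effort + rare_effort)
        else:
            # Common categories were tallied only during the normal phase.
            density[cat] = normal_counts[cat] / normal_effort

    total = sum(density.values())
    if total == 0:
        raise ValueError("no specimens were counted")
    # Normalise per-effort densities so that they sum to 1.
    return {cat: d / total for cat, d in density.items()}


proportions = estimate_proportions(
    normal_counts={"A": 90, "B": 9, "C": 1},
    rare_counts={"C": 3},
    rare_categories={"C"},
    normal_effort=1,
    rare_effort=3,
)
```

In this toy input, category C is seen four times over four units of effort, so its estimated share rests on four observations rather than one, which is the practical appeal of counting rare categories separately.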

Alview: Portable Software for Viewing Sequence Reads in BAM Formatted Files

Cancer Informatics ◽

10.4137/cin.s26470 ◽

2015 ◽

Vol 14 ◽

pp. CIN.S26470 ◽

Cited By ~ 2

Author(s):

Richard P. Finney ◽

Qing-Rong Chen ◽

Cu V. Nguyen ◽

Chih Hao Hsu ◽

Chunhua Yan ◽

...

Keyword(s):

Graphical User Interface ◽

Reference Genome ◽

Source Code ◽

Software Tool ◽

Command Line ◽

Sequencing Data ◽

Genome Data ◽

Command Line Tool ◽

Portable Software ◽

Microsoft Windows

The name Alview is a contraction of the term Alignment Viewer. Alview is a software tool, compiled to native architecture, for visualizing the alignment of sequencing data. Inputs are files of short-read sequences aligned to a reference genome in the SAM/BAM format and files containing reference genome data. Outputs are visualizations of these aligned short reads. Alview is written in portable C with optional graphical user interface (GUI) code written in C, C++, and Objective-C. The application can run in three different ways: as a web server, as a command line tool, or as a native GUI program. Alview is compatible with Microsoft Windows, Linux, and Apple OS X. It is available as a web demo at https://cgwb.nci.nih.gov/cgi-bin/alview . The source code and Windows/Mac/Linux executables are available via https://github.com/NCIP/alview .

Audit logs to enforce document integrity in Skyline and Panorama

Bioinformatics ◽

10.1093/bioinformatics/btaa547 ◽

2020 ◽

Vol 36 (15) ◽

pp. 4366-4368

Author(s):

Tobias Rohde ◽

Rita Chupalov ◽

Nicholas Shulman ◽

Vagisha Sharma ◽

Josh Eckels ◽

...

Keyword(s):

User Interface ◽

Graphical User Interface ◽

Quantitative Data ◽

Supplementary Information ◽

Command Line ◽

Supplementary Data ◽

Spectrometry Method ◽

Quantitative Data Analysis ◽

User Actions ◽

Web Repository

Summary: Skyline is a Windows application for targeted mass spectrometry method creation and quantitative data analysis. Like most graphical user interface (GUI) tools, it has a complex user interface with many ways for users to edit their files, which makes the task of logging user actions challenging and is the reason why audit logging of every change is not common in GUI tools. We present an object comparison-based approach to audit logging for Skyline that is extensible to other GUI tools. The new audit logging system keeps track of all document modifications made through the GUI or the command line and displays them in an interactive grid. The audit log can also be uploaded and viewed in Panorama, a web repository for Skyline documents that can be configured to only accept documents with a valid audit log, based on embedded hashes to protect log integrity. This makes workflows involving Skyline and Panorama more reproducible. Availability and implementation: Skyline is freely available at https://skyline.ms. Supplementary information: Supplementary data are available at Bioinformatics online.
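
The two ideas in this abstract, logging by comparing document objects and protecting the log with embedded hashes, can be illustrated with a small sketch. This is not Skyline's or Panorama's implementation: the document is assumed to be a nested dict with string keys and JSON-serialisable values, and every function and field name below is hypothetical. Each log entry stores the hash of the previous entry, so altering any earlier entry invalidates the chain.

```python
import hashlib
import json


def diff(old, new, path=""):
    """Yield (path, old_value, new_value) for each leaf that differs between two nested dicts."""
    for key in sorted(set(old) | set(new)):
        here = f"{path}/{key}"
        a, b = old.get(key), new.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            yield from diff(a, b, here)  # recurse into sub-objects
        elif a != b:
            yield here, a, b


def _digest(entry):
    """SHA-256 over every field of the entry except the stored hash itself."""
    body = {k: v for k, v in entry.items() if k != "hash"}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(log, old_doc, new_doc, user):
    """Record the differences between two document states as one chained log entry."""
    entry = {
        "user": user,
        "changes": list(diff(old_doc, new_doc)),
        "prev": log[-1]["hash"] if log else "",  # link to the previous entry
    }
    entry["hash"] = _digest(entry)
    log.append(entry)


def verify(log):
    """Return True only if every entry's hash and back-link are intact."""
    prev = ""
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != _digest(entry):
            return False
        prev = entry["hash"]
    return True


doc_v1 = {"name": "method", "settings": {"tolerance": 0.5}}
doc_v2 = {"name": "method", "settings": {"tolerance": 0.7}}
audit_log = []
append_entry(audit_log, doc_v1, doc_v2, user="alice")
append_entry(audit_log, doc_v2, doc_v1, user="bob")
```

Comparing whole objects means the logger does not need a hook in every GUI control: whatever path the user took, the before and after states are diffed once.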

NanoPack: visualizing and processing long read sequencing data

10.1101/237180 ◽

2017 ◽

Cited By ~ 2

Author(s):

Wouter De Coster ◽

Svenn D’Hert ◽

Darrin T. Schultz ◽

Marc Cruts ◽

Christine Van Broeckhoven

Keyword(s):

Web Service ◽

Graphical User Interface ◽

Source Code ◽

Supplementary Information ◽

Command Line ◽

Sequencing Data ◽

Link Type ◽

Oxford Nanopore ◽

Long Read ◽

Oxford Nanopore Technologies

Summary: Here we describe NanoPack, a set of tools developed for visualization and processing of long read sequencing data from Oxford Nanopore Technologies and Pacific Biosciences. Availability and Implementation: The NanoPack tools are written in Python3 and released under the GNU GPL3.0 Licence. The source code can be found at https://github.com/wdecoster/nanopack, together with links to separate scripts and their documentation. The scripts are compatible with Linux, Mac OS and the MS Windows 10 subsystem for Linux and are available as a graphical user interface, a web service at http://nanoplot.bioinf.be and command line tools. Contact: [email protected]. Supplementary information: Supplementary tables and figures are available at Bioinformatics online.

Development of graphical user interface for open source VLSI digital synthesis tool Qflow

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i1.1.12649 ◽

2018 ◽

Vol 7 (2) ◽

pp. 710

Author(s):

K. Sripath Roy ◽

K. Abhiram ◽

M. Arun Sumanth ◽

Jaishree Jaishankar ◽

P. Abhishek ◽

...

Keyword(s):

User Interface ◽

Open Source ◽

Graphical User Interface ◽

Source Code ◽

Command Line ◽

Simulation Process ◽

Vlsi Technology ◽

Synthesis Tool ◽

Digital Synthesis ◽

User Friendly

Many tools are used for simulation in the domain of VLSI technology, but none of them is easily accessible. There is a need for free and open-source tools in this field so as to make them accessible to everyone. Efficient open-source VLSI tools already exist but are not widely used because of their command-line user interface. Hence, creating a user-friendly interface will help many developers and users to work easily. This paper addresses the issue by creating a graphical user interface for the open-source VLSI tool Qflow. Qflow is a tool used for synthesizing a VLSI circuit from Verilog source code. Multiple tools are integrated with it to support the simulation process. It combines many dependencies that are used for synthesis, placement, layout viewing and routing in a fabrication process. All the independent tools used for Verilog code simulation are integrated onto a single platform. Qt is used for creating the cross-platform application.

UROPA GUI: A web platform for genomic region annotation

10.1101/302091 ◽

2018 ◽

Author(s):

Hendrik Schultheis ◽

Jens Preussner ◽

Annika Fust ◽

Mette Bentsen ◽

Carsten Kuenne ◽

...

Keyword(s):

Graphical User Interface ◽

Bioinformatics Analysis ◽

Source Code ◽

Genomic Region ◽

Command Line ◽

Web Based ◽

Link Type ◽

R Shiny ◽

Considerable Impact ◽

Web Platform

The annotation of genomic ranges such as peaks resulting from ChIP-seq/ATAC-seq or other techniques represents a fundamental task of bioinformatics analysis with considerable impact on many downstream analyses. In our previous work, we introduced the Universal Robust Peak Annotator (UROPA), a flexible command line based tool which improves upon the functionality of existing annotation software. In order to reduce the complexity for biologists and clinicians, we have implemented an intuitive web-based graphical user interface (GUI) and fully functional service platform for UROPA. This extension will empower all users to generate annotations for regions of interest interactively. Availability and Implementation: The open source UROPA GUI server was implemented in R Shiny and Python and is available from http://loosolab.mpi-bn.mpg.de. The source code of our App can be downloaded at https://github.molgen.mpg.de/loosolab/UROPA_GUI under the MIT license.

Vargas: heuristic-free alignment for assessing linear and graph read aligners

10.1101/2019.12.20.884676 ◽

2019 ◽

Author(s):

Charlotte A. Darby ◽

Ravi Gaddipati ◽

Michael C. Schatz ◽

Ben Langmead

Keyword(s):

Gold Standard ◽

Source Code ◽

Alignment Accuracy ◽

Local Alignment ◽

Maximum Speed ◽

Command Line ◽

Scoring Functions ◽

Large Numbers ◽

Computationally Intensive ◽

Optimal Alignments

Read alignment is central to many aspects of modern genomics. Most aligners use heuristics to accelerate processing, but these heuristics can fail to find the optimal alignments of reads. Alignment accuracy is typically measured through simulated reads; however, the simulated location may not be the (only) location with the optimal alignment score. Vargas implements a heuristic-free algorithm guaranteed to find the highest-scoring alignment for real sequencing reads to a linear or graph genome. With semiglobal and local alignment modes and affine gap and quality-scaled mismatch penalties, it can implement the scoring functions of commonly used aligners to calculate optimal alignments. While this is computationally intensive, Vargas uses multi-core parallelization and vectorized (SIMD) instructions to make it practical to optimally align large numbers of reads, achieving a maximum speed of 456 billion cell updates per second. We demonstrate how these “gold standard” Vargas alignments can be used to improve heuristic alignment accuracy by optimizing command-line parameters in Bowtie 2, BWA-MEM, and vg to align more reads correctly. Source code implemented in C++ and compiled binary releases are available at https://github.com/langmead-lab/vargas under the MIT license.
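
To make "heuristic-free" and "cell updates" concrete, here is a minimal sketch of exhaustive local alignment by dynamic programming (Smith-Waterman). It is a deliberate simplification and not Vargas's code: it uses a linear rather than affine gap penalty, handles only a linear reference, has no SIMD or multi-core parallelism, and the scoring values are illustrative defaults chosen for this example. Each execution of the inner loop body is one cell update.

```python
def local_alignment_score(read, ref, match=2, mismatch=-3, gap=-4):
    """Return the best local alignment score of `read` against `ref`.

    Every cell of the len(read) x len(ref) matrix is filled, so the optimum
    under this scoring scheme cannot be missed (no seeding or pruning).
    """
    prev = [0] * (len(ref) + 1)  # previous matrix row
    best = 0
    for i in range(1, len(read) + 1):
        curr = [0] * (len(ref) + 1)
        for j in range(1, len(ref) + 1):
            # Score for aligning read[i-1] with ref[j-1].
            diag = prev[j - 1] + (match if read[i - 1] == ref[j - 1] else mismatch)
            # One cell update: best of restart, substitution, or a gap in either sequence.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


score_exact = local_alignment_score("ACGT", "TTACGTTT")
score_none = local_alignment_score("AAAA", "CCCC")
```

Only two rows are kept in memory because the score alone is wanted; recovering the alignment itself would require storing the full matrix or traceback pointers.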

idCOV: a pipeline for quick clade identification of SARS-CoV-2 isolates

10.1101/2020.10.08.330456 ◽

2020 ◽

Author(s):

Xun Zhu ◽

Ti-Cheng Chang ◽

Richard Webby ◽

Gang Wu

Keyword(s):

Personal Computer ◽

Source Code ◽

Command Line ◽

Sequencing Data ◽

Link Type ◽

Public Dataset ◽

Virus Isolates

idCOV is a phylogenetic pipeline for quickly identifying the clades of SARS-CoV-2 virus isolates from raw sequencing data based on a selected clade-defining marker list. Using a public dataset, we show that idCOV can make calls equivalent to those annotated by Nextstrain.org on all three common clade systems, using user-uploaded FastQ files directly. Web and equivalent command-line interfaces are available. It can be deployed on any Linux environment, including personal computers, HPC systems and the cloud. The source code is available at https://github.com/xz-stjude/idcov. Installation documentation can be found at https://github.com/xz-stjude/idcov/blob/master/README.md.

aCLImatise: automated generation of tool definitions for bioinformatics workflows

Bioinformatics ◽

10.1093/bioinformatics/btaa1033 ◽

2020 ◽

Author(s):

Michael Milton ◽

Natalie Thorne

Keyword(s):

Source Code ◽

Supplementary Information ◽

Command Line ◽

Supplementary Data ◽

Automated Generation ◽

Base Camp ◽

Python Package ◽

Bioinformatics Workflow ◽

Bioinformatics Workflows

Summary: aCLImatise is a utility for automatically generating tool definitions compatible with bioinformatics workflow languages, by parsing command-line help output. aCLImatise also has an associated database called the aCLImatise Base Camp, which provides thousands of pre-computed tool definitions. Availability and implementation: The latest aCLImatise source code is available within a GitHub organisation, under the GPL-3.0 license: https://github.com/aCLImatise. In particular, documentation for the aCLImatise Python package is available at https://aclimatise.github.io/CliHelpParser/, and the aCLImatise Base Camp is available at https://aclimatise.github.io/BaseCamp/. Supplementary information: Supplementary data are available at Bioinformatics online.
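
As a rough illustration of what parsing command-line help output involves, the sketch below extracts option names, short flags and help strings from a small help text using a single regular expression. It does not use or reproduce the aCLImatise package or its parser. The help text, the tool name `sorttool` and the function `parse_help` are invented for this example, and the pattern only covers one common layout (an optional short flag, a long flag, an optional upper-case value placeholder, then a description).

```python
import re

# One option per line: optional "-x, ", a "--long-name", an optional VALUE
# placeholder, at least two spaces, then the description.
FLAG_LINE = re.compile(
    r"^\s+(?:(?P<short>-\w),\s+)?(?P<long>--[\w-]+)"
    r"(?:[ =](?P<arg>[A-Z_]+))?\s{2,}(?P<help>.+)$"
)


def parse_help(text):
    """Return one dict per option line recognised in `text`."""
    options = []
    for line in text.splitlines():
        m = FLAG_LINE.match(line)
        if m:
            options.append({
                "name": m.group("long").lstrip("-"),
                "short": m.group("short"),  # None when there is no short flag
                "takes_value": m.group("arg") is not None,
                "doc": m.group("help").strip(),
            })
    return options


# Hypothetical help output, written for this example only.
EXAMPLE_HELP = """\
usage: sorttool [options] FILE

options:
  -t, --threads INT   number of worker threads
  -o, --output FILE   write result to FILE
      --verbose       print progress messages
"""

parsed = parse_help(EXAMPLE_HELP)
```

A structured list like `parsed` is the kind of intermediate form from which a workflow-language tool definition could then be generated. Real help output varies far more than this pattern allows, which is the problem a dedicated parser has to solve.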
