Biobtree: A tool to search, map and visualize bioinformatics identifiers and special keywords

Bioinformatics datasets are, by their nature, often closely related to each other. For this reason, searching, mapping and visualizing these relations is often done manually or programmatically via identifiers or special keywords such as gene symbols. Although various tools exist for these tasks, the growing volume of bioinformatics datasets and the emergence of new software tools and approaches motivate new solutions. To provide a new tool for these cases, I present the Biobtree bioinformatics tool. Biobtree fetches and indexes identifiers and special keywords, together with their related identifiers, from supported datasets and optionally from datasets pre-defined by the user. It provides a web interface, web services and a single uniform database output based directly on a B+ tree data structure. Biobtree can handle billions of identifiers and runs as a single executable file with no installation or dependencies required. It also aims to keep a relatively small codebase for easy maintenance, addition of new features and extension to larger datasets. Biobtree is available to download from GitHub.


Biobtree: A tool to search, map and visualize bioinformatics identifiers and special keywords

10.1101/520841 ◽

2019 ◽

Author(s):

Tamer Gur

Keyword(s):

Data Structure ◽

Web Services ◽

Software Tools ◽

Web Interface ◽

Bioinformatics Tool ◽

Link Type ◽

Executable File ◽

Tree Data ◽

Tree Data Structure ◽

Gene Symbols

Abstract: Bioinformatics datasets are, by their nature, often closely related to each other. For this reason, searching, mapping and visualizing these relations is often done manually or programmatically via identifiers or special keywords such as gene symbols. Although various tools exist for these tasks, the growing volume of bioinformatics datasets and the emergence of new software tools and approaches motivate new solutions. To provide a new tool for these cases, I present the Biobtree bioinformatics tool. Biobtree fetches and indexes identifiers and special keywords, together with their related identifiers, from supported datasets and optionally from datasets pre-defined by the user. It provides a web interface, web services and a single uniform database output based directly on a B+ tree data structure. Biobtree can handle billions of identifiers and runs as a single executable file with no installation or dependencies required. It also aims to keep a relatively small codebase for easy maintenance, addition of new features and extension to larger datasets. Biobtree is available to download at https://www.github.com/tamerh/biobtree.


Biobtree: A tool to search and map bioinformatics identifiers and special keywords

F1000Research ◽

10.12688/f1000research.17927.3 ◽

2020 ◽

Vol 8 ◽

pp. 145

Author(s):

Tamer Gur

Keyword(s):

Web Services ◽

Open Source ◽

Resource Usage ◽

Web Interface ◽

Bioinformatics Tool ◽

Link Type ◽

As Species ◽

Storage Resource ◽

Executable File ◽

And Storage

Biobtree is a bioinformatics tool for searching and mapping bioinformatics datasets via identifiers or special keywords such as species names. It processes large bioinformatics datasets using a specialized MapReduce-based solution that aims for optimal use of computational and storage resources. It provides a uniform, B+ tree-based database output, a web interface and web services, and it supports chain mapping queries between datasets. It can be used as a single executable file or through the additionally provided R and Python wrapper packages, which ease integration into existing pipelines. Biobtree is open source and available on GitHub.


Biobtree: A tool to search and map bioinformatics identifiers and special keywords

F1000Research ◽

10.12688/f1000research.17927.4 ◽

2020 ◽

Vol 8 ◽

pp. 145

Author(s):

Tamer Gur

Keyword(s):

Web Services ◽

Open Source ◽

Resource Usage ◽

Web Interface ◽

Bioinformatics Tool ◽

Link Type ◽

As Species ◽

Storage Resource ◽

Executable File ◽

And Storage


A Scalable Algorithm for Constructing Frequent Pattern Tree

International Journal of Intelligent Information Technologies ◽

10.4018/ijiit.2014010103 ◽

2014 ◽

Vol 10 (1) ◽

pp. 42-56 ◽

Cited By ~ 3

Author(s):

Zailani Abdullah ◽

Tutut Herawan ◽

A. Noraziah ◽

Mustafa Mat Deris

Keyword(s):

Data Structure ◽

Frequent Pattern ◽

Frequent Patterns ◽

Scalable Algorithm ◽

Tree Construction ◽

Frequent Pattern Tree ◽

Support Threshold ◽

Benchmark Datasets ◽

Tree Data ◽

Tree Data Structure

A Frequent Pattern Tree (FP-Tree) is a compact data structure for representing frequent itemsets. Constructing the FP-Tree is an essential step before frequent pattern mining. However, few efforts have focused specifically on constructing the FP-Tree data structure other than from its original database. Typical FP-Tree construction requires prior knowledge of the support threshold and two database scans: the first to build and sort the frequent patterns, and the second to build the prefix paths. Scanning the database twice is therefore a major limitation of FP-Tree construction. This paper proposes a scalable Trie Transformation Technique Algorithm (T3A) to convert our predefined tree data structure, the Disorder Support Trie Itemset (DOSTrieIT), into an FP-Tree. Experimental results on two UCI benchmark datasets show that the proposed T3A generates the FP-Tree up to three orders of magnitude faster than the benchmarked FP-Growth.
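For readers unfamiliar with the baseline, the conventional two-scan FP-Tree construction mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration of the standard approach only, not the T3A or DOSTrieIT method from the paper; the names `FPNode` and `build_fp_tree` and the alphabetical tie-breaking rule are assumptions made for this example.

```python
from collections import Counter


class FPNode:
    """One node of the tree: an item, its count, and its child nodes."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-Tree with the conventional two database scans.

    `transactions` is an iterable of item lists; `min_support` is an
    absolute count. Both names are hypothetical, chosen for this sketch.
    """
    transactions = [set(t) for t in transactions]

    # Scan 1: count the support of every item and keep the frequent ones.
    support = Counter(item for t in transactions for item in t)
    frequent = {i: c for i, c in support.items() if c >= min_support}

    # Scan 2: insert each transaction as a prefix path, with its frequent
    # items ordered by descending support (ties broken alphabetically).
    root = FPNode(None)
    for t in transactions:
        ordered = sorted(
            (i for i in t if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent
```

Both scans iterate over the full transaction list, which is the cost the abstract identifies as the main limitation of this construction.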
