A Systolic Array-Based FPGA Parallel Architecture for the BLAST Algorithm

We propose a systolic array-based Field Programmable Gate Array (FPGA) parallel architecture for the Basic Local Alignment Search Tool (BLAST) algorithm. BLAST is a heuristic biological sequence alignment algorithm used by bioinformatics experts. In contrast to other designs, which detect at most one hit per clock cycle, our design uses a Multiple Hits Detection Module, a pipelined systolic array, to find multiple hits in a single clock cycle. We also designed a Hits Combination Block, which combines overlapping hits from the systolic array into one hit. These modules implement the first and second steps of BLAST and achieve a significant speedup compared with previously published architectures.
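The two steps named in this abstract, hit detection and combination of overlapping hits, can be illustrated in software. The following is only a minimal sequential sketch and not the paper's hardware design: it assumes exact word matches of length `w` (real BLAST also scores neighborhood words for proteins), and the function names `find_hits` and `combine_hits` are hypothetical.

```python
from collections import defaultdict


def find_hits(query, subject, w=3):
    """Return (query_pos, subject_pos) pairs where a length-w word matches exactly."""
    # Index every word of the query by its start position.
    index = defaultdict(list)
    for i in range(len(query) - w + 1):
        index[query[i:i + w]].append(i)
    # Scan the subject and look each word up in the query index.
    hits = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], ()):
            hits.append((i, j))
    return hits


def combine_hits(hits, w=3):
    """Merge hits that overlap or touch on the same diagonal.

    Returns a sorted list of (query_start, subject_start, length) tuples.
    """
    # Hits on the same diagonal share the same subject_pos - query_pos.
    by_diag = defaultdict(list)
    for i, j in hits:
        by_diag[j - i].append(i)
    merged = []
    for diag, starts in by_diag.items():
        starts.sort()
        run_start, run_end = starts[0], starts[0] + w
        for i in starts[1:]:
            if i <= run_end:
                # Overlapping or adjacent word: extend the current run.
                run_end = max(run_end, i + w)
            else:
                merged.append((run_start, run_start + diag, run_end - run_start))
                run_start, run_end = i, i + w
        merged.append((run_start, run_start + diag, run_end - run_start))
    return sorted(merged)


if __name__ == "__main__":
    hits = find_hits("ACGTA", "TTACGTAGG")
    print(hits)
    print(combine_hits(hits))
```

In this sketch the subject is scanned one word at a time; the abstract's point is that a systolic array can report several such hits in the same clock cycle, which a sequential loop cannot show.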


High Performance Systolic Array Core Architecture Design for DNA Sequencer

MATEC Web of Conferences ◽

10.1051/matecconf/201815006009 ◽

2018 ◽

Vol 150 ◽

pp. 06009 ◽

Cited By ~ 1

Author(s):

Dayana Saiful Nurdin ◽

Mohd. Nazrin Md. Isa ◽

Rizalafande Che Ismail ◽

Muhammad Imran Ahmad

Keyword(s):

Systolic Array ◽

Dna Sequences ◽

High Performance ◽

Computation Time ◽

Local Alignment ◽

Architecture Design ◽

Alignment Algorithm ◽

The Core ◽

Field Programmable ◽

Dna Sequencer

This paper presents a high-performance systolic array (SA) core architecture design for a Deoxyribonucleic Acid (DNA) sequencer. The core implements the Smith-Waterman (SW) algorithm with affine gap penalty scoring. This time-consuming local alignment algorithm guarantees an optimal alignment between DNA sequences, but it requires quadratic computation time when run on standard desktop computers. The use of a linear SA decreases the time complexity from quadratic to linear. In addition, the SA architecture is used to overcome the timing issue raised by the exponential growth of DNA databases. In this work, the SW algorithm has been captured in the Verilog Hardware Description Language (HDL) and simulated using the Xilinx ISIM simulator. The proposed design has been implemented on a Xilinx Virtex-6 Field Programmable Gate Array (FPGA) and achieves a 90% reduction in core area.
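For readers unfamiliar with the recurrence, the following is a minimal software sketch of Smith-Waterman scoring with affine gaps (the Gotoh formulation). It is not the paper's Verilog core. The scoring parameters are assumed values chosen for illustration, and a gap of length k is assumed to cost `gap_open + (k - 1) * gap_extend`.

```python
def sw_affine(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Return the best local alignment score of a and b with affine gaps."""
    n, m = len(a), len(b)
    neg_inf = float("-inf")
    # H: best score of an alignment ending at (i, j).
    # E: best score ending with a gap in a (horizontal move).
    # F: best score ending with a gap in b (vertical move).
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either extend an existing gap or open a new one.
            E[i][j] = max(E[i][j - 1] + gap_extend, H[i][j - 1] + gap_open)
            F[i][j] = max(F[i - 1][j] + gap_extend, H[i - 1][j] + gap_open)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    print(sw_affine("ACGTACGT", "ACGTTACGT"))
```

The double loop visits all n × m cells, which is the quadratic cost the abstract mentions. Each cell depends only on its left, upper, and upper-left neighbors, so all cells on one anti-diagonal are independent; a linear array with one processing element per query character can therefore sweep the matrix in roughly n + m steps.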


Research and Implementation of Biological Sequence Alignment Algorithm

International Conference on Measurement and Control Engineering 2nd (ICMCE 2011) ◽

10.1115/1.859858.paper32 ◽

2011 ◽

pp. 221-228

Keyword(s):

Sequence Alignment ◽

Alignment Algorithm ◽

Biological Sequence ◽

Sequence Alignment Algorithm


Accelerating Biological Sequence Alignment Algorithm on GPU with CUDA

2011 International Conference on Computational and Information Sciences ◽

10.1109/iccis.2011.61 ◽

2011 ◽

Cited By ~ 3

Author(s):

Fang Zheng ◽

Xianbin Xu ◽

Yuanhua Yang ◽

Shuibing He ◽

Yuping Zhang

Keyword(s):

Sequence Alignment ◽

Alignment Algorithm ◽

Biological Sequence ◽

Sequence Alignment Algorithm


Research on Components Assembly Platform of Biological Sequences Alignment Algorithm

Frontiers in Genetics ◽

10.3389/fgene.2020.630923 ◽

2021 ◽

Vol 11 ◽

Author(s):

Haihe Shi ◽

Gang Wu ◽

Xuchu Zhang ◽

Jun Wang ◽

Haipeng Shi ◽

...

Keyword(s):

Sequence Alignment ◽

Alignment Algorithm ◽

Biological Sequence ◽

Sequence Alignment Algorithm ◽

Alignment Algorithms ◽

Abstract Level ◽

Assembly Technology ◽

Component Assembly ◽

Efficiency And Reliability ◽

Visual Interface

After years of development, biological sequence alignment algorithms have grown steadily more complex, and the lack of domain research at a high level of abstraction makes these algorithms difficult to develop and improve. Applying the idea of software components to the design and development of algorithms can effectively improve the development efficiency and reliability of biological sequence alignment algorithms. The component assembly platform applies related assembly technology, which simplifies component assembly and facilitates the maintenance and optimization of the algorithm. In addition, a friendly visual interface lets users assemble algorithm components intuitively and obtain an executable sequence alignment program that can directly carry out alignment computations.


Hardware Accelerated Sequence Alignment with Traceback

International Journal of Reconfigurable Computing ◽

10.1155/2009/762362 ◽

2009 ◽

Vol 2009 ◽

pp. 1-10 ◽

Cited By ~ 13

Author(s):

Scott Lloyd ◽

Quinn O. Snell

Keyword(s):

Sequence Alignment ◽

Biomedical Applications ◽

Sequence Length ◽

Alignment Algorithm ◽

Performance Gain ◽

Biological Sequence ◽

Timely Manner ◽

Desktop Computer ◽

Processing Elements ◽

Sequence Alignment Algorithm

Biological sequence alignment is an essential tool used in molecular biology and biomedical applications. The growing volume of genetic data and the complexity of sequence alignment present a challenge in obtaining alignment results in a timely manner. Known methods to accelerate alignment on reconfigurable hardware only address sequence comparison, limit the sequence length, or exhibit memory and I/O bottlenecks. A space-efficient, global sequence alignment algorithm and architecture is presented that accelerates the forward scan and traceback in hardware without memory and I/O limitations. With 256 processing elements in FPGA technology, a performance gain over 300 times that of a desktop computer is demonstrated on sequence lengths of 16000. For greater performance, the architecture is scalable to more processing elements.
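To make the terms "forward scan" and "traceback" concrete, here is a minimal sketch of textbook global alignment (Needleman-Wunsch) with a linear gap penalty. It stores the full score matrix, so it is not the space-efficient algorithm the abstract describes; the scoring values are assumptions chosen for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    n, m = len(a), len(b)
    # Forward scan: fill the score matrix row by row.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Traceback: walk from the bottom-right corner back to the origin.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

The full matrix costs memory proportional to the product of the sequence lengths, which is the kind of memory limitation the abstract says its architecture avoids.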


The Influence of Memory-Aware Computation on Distributed BLAST

Current Bioinformatics ◽

10.2174/1574893613666180601080811 ◽

2019 ◽

Vol 14 (2) ◽

pp. 157-163

Author(s):

Majid Hajibaba ◽

Mohsen Sharifi ◽

Saeid Gorgin

Keyword(s):

Search Time ◽

Genomic Research ◽

Local Alignment ◽

Negative Effects ◽

Sequencing Technologies ◽

Percent Improvement ◽

Fast Processing ◽

Search Tool ◽

Memory Awareness ◽

Generation Sequencing

Background: One of the pivotal challenges in today's genomic research is the fast processing of voluminous data, such as the data produced by high-throughput Next-Generation Sequencing technologies. However, BLAST (Basic Local Alignment Search Tool), a long-established and renowned tool in bioinformatics, has been shown to be very slow in this regard. Objective: To improve the performance of BLAST on voluminous data, we have applied a novel memory-aware technique to BLAST for faster parallel processing. Method: We have used a master-worker model alongside a memory-aware technique in which the master partitions the whole data into equal chunks, one chunk for each worker, and each worker then further splits and formats its allocated chunk according to the size of its memory. Each worker searches every split one by one against a list of queries. Results: We have chosen a list of queries with different lengths to run insensitive searches in a huge database called UniProtKB/TrEMBL. Our experiments show a 20 percent improvement in performance when workers used our proposed memory-aware technique compared to when they were not memory aware. Experiments show an even higher performance improvement, approximately 50 percent, when we applied our memory-aware technique to mpiBLAST. Conclusion: We have shown that memory awareness in formatting a bulky database when running BLAST can improve performance significantly, while preventing unexpected crashes in low-memory environments. Even though distributed computing attempts to mitigate search time by partitioning and distributing database portions, our memory-aware technique alleviates the negative effects of page faults on performance.
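The two-level partitioning in the Method can be sketched as follows. This is only an illustration under simplifying assumptions: the database is modeled as a byte range, each worker's memory budget is a single hypothetical number, and the function names `partition_equal` and `split_by_memory` are invented here. The abstract does not give the actual formatting and search calls, so they are omitted.

```python
def partition_equal(total_size, n_workers):
    """Master step: split [0, total_size) into n_workers near-equal (start, end) ranges."""
    base, extra = divmod(total_size, n_workers)
    ranges, start = [], 0
    for w in range(n_workers):
        # The first `extra` workers take one additional unit each.
        end = start + base + (1 if w < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def split_by_memory(chunk, mem_budget):
    """Worker step: split one chunk into pieces no larger than mem_budget."""
    if mem_budget <= 0:
        raise ValueError("mem_budget must be positive")
    start, end = chunk
    return [(s, min(s + mem_budget, end)) for s in range(start, end, mem_budget)]


if __name__ == "__main__":
    # Hypothetical numbers: a 10-unit database, 3 workers, 4 units of memory each.
    for chunk in partition_equal(10, 3):
        print(chunk, "->", split_by_memory(chunk, 4))
```

Each worker would then search its pieces one at a time, so that the piece being searched fits in memory and page faults are avoided.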


Fast DNA Sequence Alignment Algorithm Based on Quality Score Using Improved Dynamic Programming and Fuzzy Gap Cost Control

Current Bioinformatics ◽

10.2174/1574893609666140523000227 ◽

2014 ◽

Vol 9 (5) ◽

pp. 540-547

Author(s):

Kwang Kim ◽

Hyun Park ◽

Doo Song

Keyword(s):

Dynamic Programming ◽

Dna Sequence ◽

Sequence Alignment ◽

Cost Control ◽

Quality Score ◽

Alignment Algorithm ◽

Sequence Alignment Algorithm ◽

Dna Sequence Alignment ◽

Improved Dynamic Programming


Bloom Filter-Based Parallel Architecture for Accelerating Equi-Join Operation on FPGA

Electronics ◽

10.3390/electronics10151778 ◽

2021 ◽

Vol 10 (15) ◽

pp. 1778

Author(s):

Binhao He ◽

Meiting Xue ◽

Shubiao Liu ◽

Wei Luo

Keyword(s):

Relational Databases ◽

Parallel Architecture ◽

Operation Time ◽

Bloom Filter ◽

Search Tree ◽

Binary Search Tree ◽

Maximum Acceleration ◽

Data Intensive ◽

Match Rate ◽

Field Programmable

As one of the most important operations in relational databases, the join is data-intensive and time-consuming. Thus, offloading this operation using field-programmable gate arrays (FPGAs) has attracted much interest and has been broadly researched in recent years. However, the available SRAM-based join architectures are often resource-intensive, power-consuming, or low-throughput. Besides, a lower match rate does not lead to a shorter operation time. To address these issues, a Bloom filter (BF)-based parallel join architecture is presented in this paper. This architecture first leverages the BF to discard the tuples that are not in the join result and classifies the remaining tuples into different channels. Second, a binary search tree is used to reduce the number of comparisons. The proposed method was implemented on a Xilinx FPGA, and the experimental results show that under a match rate of 50%, our architecture achieved a high join throughput of 145.8 million tuples per second and a maximum acceleration factor of 2.3 compared to the existing SRAM-based join architectures.
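The filtering idea can be sketched in software as follows. This is a minimal illustration, not the paper's FPGA architecture: the filter size, the number of hashes, and the SHA-256 based hashing are assumptions, and a Python dictionary stands in for the binary search tree used for the exact comparison.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive n_hashes bit positions by salting the key with a seed.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p] = 1

    def __contains__(self, key):
        return all(self.bits[p] for p in self._positions(key))


def bloom_join(build, probe):
    """Equi-join two lists of (key, value) tuples, pre-filtering probes with a Bloom filter."""
    bf = BloomFilter()
    table = {}
    for key, value in build:
        bf.add(key)
        table.setdefault(key, []).append(value)
    result = []
    for key, value in probe:
        if key not in bf:
            continue  # definitely not in the join result: discard early
        # The filter may report false positives, so confirm with an exact lookup.
        for build_value in table.get(key, ()):
            result.append((key, build_value, value))
    return result


if __name__ == "__main__":
    print(bloom_join([(1, "a"), (2, "b")], [(2, "x"), (3, "y"), (1, "z")]))
```

Because the filter never produces false negatives, discarding a probe tuple early cannot change the join result; it only saves exact comparisons, which is why a lower match rate can translate into less work.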


Predicting bacteriophage hosts based on sequences of annotated receptor-binding proteins

Scientific Reports ◽

10.1038/s41598-021-81063-4 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Dimitri Boeckaerts ◽

Michiel Stock ◽

Bjorn Criel ◽

Hans Gerstmans ◽

Bernard De Baets ◽

...

Keyword(s):

Machine Learning ◽

Predictive Model ◽

Receptor Binding ◽

Bacterial Infections ◽

Sequence Data ◽

Sequence Similarity ◽

Area Under The Curve ◽

Local Alignment ◽

Search Tool ◽

Different Levels

Nowadays, bacteriophages are increasingly considered as an alternative treatment for a variety of bacterial infections in cases where classical antibiotics have become ineffective. However, characterizing the host specificity of phages remains a labor- and time-intensive process. In order to alleviate this burden, we have developed a new machine-learning-based pipeline to predict bacteriophage hosts based on annotated receptor-binding protein (RBP) sequence data. We focus on predicting bacterial hosts from the ESKAPE group, Escherichia coli, Salmonella enterica and Clostridium difficile. We compare the performance of our predictive model with that of the widely used Basic Local Alignment Search Tool (BLAST). Our best-performing predictive model reaches Precision-Recall Area Under the Curve (PR-AUC) scores between 73.6 and 93.8% for different levels of sequence similarity in the collected data. Our model reaches a performance comparable to that of BLASTp when sequence similarity in the data is high and starts outperforming BLASTp when sequence similarity drops below 75%. Therefore, our machine learning methods can be especially useful in settings in which sequence similarity to other known sequences is low. Predicting the hosts of novel metagenomic RBP sequences could extend our toolbox to tune the host spectrum of phages or phage tail-like bacteriocins by swapping RBPs.


Traffic Type Recognition Method for Unknown Protocol—Applying Fuzzy Inference

Electronics ◽

10.3390/electronics10010036 ◽

2020 ◽

Vol 10 (1) ◽

pp. 36

Author(s):

Sang-Won Kim ◽

Kee-Cheon Kim

Keyword(s):

Fuzzy Inference ◽

Static Field ◽

Theoretical Background ◽

Alignment Algorithm ◽

Traffic Classification ◽

Recognition Method ◽

Inference System ◽

Traffic Characteristics ◽

Sequence Alignment Algorithm ◽

Low Performance

In this paper, we propose a system that can recognize traffic types without prior knowledge of static features such as protocol header information, by combining protocol analysis based on an ecological sequence alignment algorithm from bioinformatics with a fuzzy inference system. In experiments using datasets containing various types of traffic, the proposed algorithm achieved performance of up to 91%, a level similar to that of several existing algorithms. In addition, it showed an excellent accuracy of 82.5% or more even under severe conditions that lowered the amount of data to a level of at least 40% or included only data from the middle of the traffic. This shows that the dependence on initial data that frequently occurs in existing machine learning and deep learning-based traffic classification algorithms does not appear in the proposed algorithm. Furthermore, because it extracts traffic characteristics directly without depending on static field values, and because it takes advantage of the flexibility of the membership function of the fuzzy inference engine, it can respond with a small amount of data. This confirms its applicability to low-power and low-performance environments such as IoT networks. We describe in detail the theoretical background for constructing such an algorithm, along with the relevant experiments and considerations for actual verification.
