A Systolic Array-Based FPGA Parallel Architecture for the BLAST Algorithm

2012 ◽  
Vol 2012 ◽  
pp. 1-11 ◽  
Author(s):  
Xinyu Guo ◽  
Hong Wang ◽  
Vijay Devabhaktuni

A systolic array-based Field Programmable Gate Array (FPGA) parallel architecture for the Basic Local Alignment Search Tool (BLAST) algorithm is proposed. BLAST is a heuristic biological sequence alignment algorithm that has been used by bioinformatics experts. In contrast to other designs that detect at most one hit per clock cycle, our design applies a Multiple Hits Detection Module, a pipelined systolic array, to search for multiple hits in a single clock cycle. Further, we designed a Hits Combination Block that combines overlapping hits from the systolic array into one hit. These implementations complete the first and second steps of the BLAST architecture and achieve significant speedup compared with previously published architectures.
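The two BLAST steps named in this abstract, word-hit detection and combination of overlapping hits, can be sketched in software. This is a minimal sequential illustration, not the authors' hardware design: the function names `find_hits` and `combine_hits`, the word length `w=3`, exact-match seeding, and the rule that hits are merged only when they lie on the same diagonal are all assumptions made for the example.

```python
def find_hits(query, subject, w=3):
    """Return (query_pos, subject_pos) pairs where a length-w word matches exactly."""
    # Index every length-w word of the query by its start position.
    index = {}
    for i in range(len(query) - w + 1):
        index.setdefault(query[i:i + w], []).append(i)
    hits = []
    # Slide over the subject and look each word up in the query index.
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], []):
            hits.append((i, j))
    return hits


def combine_hits(hits, w=3):
    """Merge overlapping or adjacent hits on the same diagonal.

    Returns (query_start, subject_start, length) triples.
    Assumption: two hits are combined only if they share a diagonal
    (subject_pos - query_pos) and their word spans touch or overlap.
    """
    by_diag = {}
    for i, j in hits:
        by_diag.setdefault(j - i, []).append(i)
    merged = []
    for diag, starts in sorted(by_diag.items()):
        starts.sort()
        run_start, run_end = starts[0], starts[0] + w
        for s in starts[1:]:
            if s <= run_end:                      # overlaps or abuts the current run
                run_end = max(run_end, s + w)
            else:                                 # gap on the diagonal: close the run
                merged.append((run_start, run_start + diag, run_end - run_start))
                run_start, run_end = s, s + w
        merged.append((run_start, run_start + diag, run_end - run_start))
    return merged
```

For example, `combine_hits(find_hits("ACGTAC", "TTACGTACGG"))` collapses the four overlapping word hits on one diagonal into a single hit of length 6, alongside two isolated hits on other diagonals. A hardware systolic array would evaluate many subject positions per cycle; the loop here only mirrors the logic.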

2018 ◽  
Vol 150 ◽  
pp. 06009 ◽  
Author(s):  
Dayana Saiful Nurdin ◽  
Mohd. Nazrin Md. Isa ◽  
Rizalafande Che Ismail ◽  
Muhammad Imran Ahmad

This paper presents a high-performance systolic array (SA) core architecture design for a Deoxyribonucleic Acid (DNA) sequencer. The core implements the Smith-Waterman (SW) algorithm with an affine gap penalty score. This time-consuming local alignment algorithm guarantees optimal alignment between DNA sequences, but it requires quadratic computation time when performed on standard desktop computers. The use of a linear SA decreases the time complexity from quadratic to linear. In addition, with the exponential growth of DNA databases, the SA architecture is used to overcome the timing issue. In this work, the SW algorithm has been captured using Verilog Hardware Description Language (HDL) and simulated using the Xilinx ISIM simulator. The proposed design has been implemented in a Xilinx Virtex-6 Field Programmable Gate Array (FPGA) and reduces the core area by 90%.
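The affine-gap Smith-Waterman recurrence that such a core evaluates can be written down as a short software sketch. The version below uses the standard three-matrix (Gotoh-style) formulation and keeps only one previous row in memory. The scoring values (`match=2`, `mismatch=-1`, `gap_open=-3`, `gap_extend=-1`) and the function name `sw_affine` are illustrative assumptions; the abstract does not state the parameters used in the paper.

```python
def sw_affine(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score with affine gaps.

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    H = best score ending at (i, j); E = ending in a gap along the row;
    F = ending in a gap along the column.
    """
    neg_inf = float("-inf")
    m = len(b)
    h_prev = [0] * (m + 1)
    f_prev = [neg_inf] * (m + 1)
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * (m + 1)
        f_cur = [neg_inf] * (m + 1)
        e = neg_inf                                   # E is only needed for the current row
        for j in range(1, m + 1):
            e = max(h_cur[j - 1] + gap_open, e + gap_extend)
            f_cur[j] = max(h_prev[j] + gap_open, f_prev[j] + gap_extend)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            h_cur[j] = max(0, h_prev[j - 1] + sub, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best
```

With these assumed scores, `sw_affine("ACGT", "ACGT")` gives 8 and two sequences with no shared bases give 0. This loop is still quadratic in time; the linear-time behaviour described in the abstract comes from computing many cells in parallel in hardware, which this sketch does not model.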


2021 ◽  
Vol 11 ◽  
Author(s):  
Haihe Shi ◽  
Gang Wu ◽  
Xuchu Zhang ◽  
Jun Wang ◽  
Haipeng Shi ◽  
...  

After years of development, biological sequence alignment algorithms have become increasingly complex, and the lack of domain research at a high level of abstraction complicates their development and improvement. By applying the idea of software components to the design and development of algorithms, the development efficiency and reliability of biological sequence alignment algorithms can be effectively improved. The component assembly platform applies related assembly technology, which simplifies component assembly and facilitates the maintenance and optimization of the algorithm. At the same time, a friendly visual interface is used to complete the assembly of algorithm components intuitively, yielding an executable sequence alignment program that can directly carry out alignment computing.


2009 ◽  
Vol 2009 ◽  
pp. 1-10 ◽  
Author(s):  
Scott Lloyd ◽  
Quinn O. Snell

Biological sequence alignment is an essential tool used in molecular biology and biomedical applications. The growing volume of genetic data and the complexity of sequence alignment present a challenge in obtaining alignment results in a timely manner. Known methods to accelerate alignment on reconfigurable hardware only address sequence comparison, limit the sequence length, or exhibit memory and I/O bottlenecks. A space-efficient, global sequence alignment algorithm and architecture is presented that accelerates the forward scan and traceback in hardware without memory and I/O limitations. With 256 processing elements in FPGA technology, a performance gain over 300 times that of a desktop computer is demonstrated on sequence lengths of 16000. For greater performance, the architecture is scalable to more processing elements.
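To illustrate why an array of processing elements suits the forward scan, the sketch below computes a global (Needleman-Wunsch) alignment score by sweeping anti-diagonals: every cell on one anti-diagonal depends only on the two previous anti-diagonals, so in hardware those cells could be computed simultaneously. This is a software illustration under stated assumptions, not the architecture from the paper: it uses a linear gap penalty, illustrative scores, a hypothetical function name, and omits traceback entirely.

```python
def nw_score_wavefront(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score computed one anti-diagonal at a time.

    Only three buffers of length len(a) + 1 are kept, so memory is linear.
    Buffer index i holds the cell (i, d - i) of anti-diagonal d.
    """
    n, m = len(a), len(b)
    prev2 = prev1 = cur = [None] * (n + 1)
    for d in range(n + m + 1):
        cur = [None] * (n + 1)
        # Each iteration of this inner loop is independent of the others,
        # which is what a row of processing elements would exploit.
        for i in range(max(0, d - m), min(n, d) + 1):
            j = d - i
            if i == 0:
                cur[i] = j * gap                  # first row: all gaps
            elif j == 0:
                cur[i] = i * gap                  # first column: all gaps
            else:
                sub = match if a[i - 1] == b[j - 1] else mismatch
                cur[i] = max(prev2[i - 1] + sub,  # diagonal neighbour (i-1, j-1)
                             prev1[i - 1] + gap,  # upper neighbour (i-1, j)
                             prev1[i] + gap)      # left neighbour (i, j-1)
        prev2, prev1 = prev1, cur
    return cur[n]
```

With the assumed scores, `nw_score_wavefront("ACGT", "AGT")` is 1 (three matches and one gap). Recovering the alignment itself in limited memory needs an additional technique, which the abstract says the authors handle in hardware.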


2019 ◽  
Vol 14 (2) ◽  
pp. 157-163
Author(s):  
Majid Hajibaba ◽  
Mohsen Sharifi ◽  
Saeid Gorgin

Background: One of the pivotal challenges in today's genomic research is the fast processing of voluminous data such as that generated by high-throughput Next-Generation Sequencing technologies. On the other hand, BLAST (Basic Local Alignment Search Tool), a long-established and renowned tool in bioinformatics, has been shown to be incredibly slow in this regard. Objective: To improve the performance of BLAST in the processing of voluminous data, we have applied a novel memory-aware technique to BLAST for faster parallel processing. Method: We have used a master-worker model alongside a memory-aware technique in which the master partitions the whole data into equal chunks, one chunk for each worker, and each worker then further splits and formats its allocated chunk according to the size of its memory. Each worker searches each split one by one against a list of queries. Results: We have chosen a list of queries with different lengths to run insensitive searches in a huge database called UniProtKB/TrEMBL. Our experiments show a 20 percent improvement in performance when workers used our proposed memory-aware technique compared to when they were not memory-aware. Comparatively, experiments show an even higher performance improvement, approximately 50 percent, when we applied our memory-aware technique to mpiBLAST. Conclusion: We have shown that memory-awareness in formatting bulky databases, when running BLAST, can improve performance significantly while preventing unexpected crashes in low-memory environments. Even though distributed computing attempts to mitigate search time by partitioning and distributing database portions, our memory-aware technique alleviates the negative effects of page faults on performance.
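The two-level partitioning described in the Method can be sketched as plain arithmetic on sizes. This is a hedged illustration only: the function names, the use of abstract size units, and the rule that each split must fit within a per-worker memory budget are assumptions, and the actual database formatting and BLAST invocation are not shown.

```python
def partition_equal(total_size, n_parts):
    """Split total_size into n_parts sizes that differ by at most one unit."""
    base, extra = divmod(total_size, n_parts)
    return [base + (1 if k < extra else 0) for k in range(n_parts)]


def split_for_memory(chunk_size, mem_budget):
    """Split a worker's chunk so that every piece fits within mem_budget."""
    if mem_budget <= 0:
        raise ValueError("mem_budget must be positive")
    n_parts = -(-chunk_size // mem_budget)        # ceiling division
    return partition_equal(chunk_size, n_parts) if n_parts else []


def plan(total_size, worker_memory):
    """Master step: equal chunks per worker; worker step: memory-sized splits."""
    chunks = partition_equal(total_size, len(worker_memory))
    return [split_for_memory(c, mem) for c, mem in zip(chunks, worker_memory)]
```

For instance, `plan(100, [50, 20, 10])` gives the three workers chunks of 34, 33 and 33 units, which they split into 1, 2 and 4 pieces respectively, so that no piece exceeds the memory of the worker that searches it.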


Electronics ◽  
2021 ◽  
Vol 10 (15) ◽  
pp. 1778
Author(s):  
Binhao He ◽  
Meiting Xue ◽  
Shubiao Liu ◽  
Wei Luo

As one of the most important operations in relational databases, the join is data-intensive and time-consuming. Thus, offloading this operation using field-programmable gate arrays (FPGAs) has attracted much interest and has been broadly researched in recent years. However, the available SRAM-based join architectures are often resource-intensive, power-consuming, or low-throughput. Besides, a lower match rate does not lead to a shorter operation time. To address these issues, a Bloom filter (BF)-based parallel join architecture is presented in this paper. This architecture first leverages the BF to discard the tuples that are not in the join result and classifies the remaining tuples into different channels. Second, a binary search tree is used to reduce the number of comparisons. The proposed method was implemented on a Xilinx FPGA, and the experimental results show that under a match rate of 50%, our architecture achieved a high join throughput of 145.8 million tuples per second and a maximum acceleration factor of 2.3 compared to the existing SRAM-based join architectures.
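The idea of discarding non-matching tuples with a Bloom filter before the exact comparison can be sketched in software. The example below is an assumption-laden illustration, not the FPGA design: the `BloomFilter` class, its sizes, the SHA-256-based hashing, and the use of binary search over a sorted key list (as a stand-in for the binary search tree mentioned in the abstract) are all choices made for this sketch, and the channel classification step is omitted.

```python
import hashlib
from bisect import bisect_left


class BloomFilter:
    """Tiny Bloom filter; one byte per bit for clarity rather than compactness."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits)

    def _positions(self, key):
        # Derive n_hashes positions by salting the key with a seed.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p] = 1

    def might_contain(self, key):
        # False means definitely absent; True may be a false positive.
        return all(self.bits[p] for p in self._positions(key))


def bloom_join(build, probe):
    """Equi-join two lists of (key, value) tuples on key."""
    bf = BloomFilter()
    for key, _ in build:
        bf.add(key)
    build_sorted = sorted(build, key=lambda kv: kv[0])
    keys = [k for k, _ in build_sorted]
    out = []
    for key, value in probe:
        if not bf.might_contain(key):             # cheap rejection of most non-matches
            continue
        idx = bisect_left(keys, key)              # exact check removes false positives
        while idx < len(keys) and keys[idx] == key:
            out.append((key, build_sorted[idx][1], value))
            idx += 1
    return out
```

Because the exact lookup follows the filter, false positives cost only a wasted search and never produce a wrong join result; the filter's benefit grows as the match rate falls.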


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Dimitri Boeckaerts ◽  
Michiel Stock ◽  
Bjorn Criel ◽  
Hans Gerstmans ◽  
Bernard De Baets ◽  
...  

Nowadays, bacteriophages are increasingly considered as an alternative treatment for a variety of bacterial infections in cases where classical antibiotics have become ineffective. However, characterizing the host specificity of phages remains a labor- and time-intensive process. In order to alleviate this burden, we have developed a new machine-learning-based pipeline to predict bacteriophage hosts based on annotated receptor-binding protein (RBP) sequence data. We focus on predicting bacterial hosts from the ESKAPE group, Escherichia coli, Salmonella enterica and Clostridium difficile. We compare the performance of our predictive model with that of the widely used Basic Local Alignment Search Tool (BLAST). Our best-performing predictive model reaches Precision-Recall Area Under the Curve (PR-AUC) scores between 73.6 and 93.8% for different levels of sequence similarity in the collected data. Our model reaches a performance comparable to that of BLASTp when sequence similarity in the data is high and starts outperforming BLASTp when sequence similarity drops below 75%. Therefore, our machine learning methods can be especially useful in settings in which sequence similarity to other known sequences is low. Predicting the hosts of novel metagenomic RBP sequences could extend our toolbox to tune the host spectrum of phages or phage tail-like bacteriocins by swapping RBPs.


Electronics ◽  
2020 ◽  
Vol 10 (1) ◽  
pp. 36
Author(s):  
Sang-Won Kim ◽  
Kee-Cheon Kim

In this paper, we propose a system that can recognize traffic types without prior knowledge of static features such as protocol header information, by combining protocol analysis based on an ecological sequence alignment algorithm from bioinformatics with a fuzzy inference system. In experiments using datasets containing various types of traffic, the proposed algorithm achieved up to 91% performance, a level similar to that of several existing algorithms. In addition, it showed an accuracy of 82.5% or more even under severe conditions that lowered the amount of data to a level of at least 40% or included only data from the middle of the traffic. This shows that the dependence on initial data that frequently occurs in existing machine learning and deep learning-based traffic classification algorithms does not appear in the proposed algorithm. Furthermore, because it can extract traffic characteristics directly without depending on static field values, it can respond with a small amount of data by taking advantage of the flexibility of the membership function of the fuzzy inference engine. This confirms its applicability to low-power and low-performance environments such as IoT networks. We describe in detail the theoretical background for constructing such an algorithm, along with the relevant experiments and considerations for actual verification.

