Two-Phase PFAC Algorithm for Multiple Patterns Matching on CUDA GPUs

Electronics ◽

10.3390/electronics8030270 ◽

2019 ◽

Vol 8 (3) ◽

pp. 270

Author(s):

Wei-Shen Lai ◽

Chao-Chin Wu ◽

Lien-Fu Lai ◽

Min-Chi Sie

Keyword(s):

Pattern Matching ◽

High Speed ◽

Finite State Machine ◽

State Machine ◽

Second Phase ◽

Processing Unit ◽

Two Phase ◽

High Speed Networks ◽

Network Intrusion ◽

Finite State

The rapid advancement of high-speed networks has significantly increased the number of network packets per second, so network intrusion detection systems (NIDSs) must accelerate packet content inspection to protect computer systems from attacks. On average, pattern matching consumes approximately 70% of a NIDS's overall processing time. The conventional Aho–Corasick (AC) algorithm, which uses a finite state machine to identify attack patterns in NIDSs, is too slow to meet the requirements of high-speed networks. In view of this, several studies have used the features of a graphics processing unit (GPU) to improve the core searching process of the AC algorithm. For instance, the parallel failureless Aho–Corasick (PFAC) algorithm improves pattern matching by removing the backward branches from the finite state machine created by the AC algorithm. Boundary detection can then be avoided entirely by allocating an individual thread to each byte of the input stream, with each thread identifying any pattern that starts at its own position. However, our analysis found that this algorithm suffers from a serious load imbalance problem. This paper therefore proposes a two-phase PFAC algorithm to address the problem. A predefined threshold divides execution into two phases, and the failureless finite state machine is decoupled into two parts accordingly. In the first phase, every thread identifies patterns by running the small part of the decoupled failureless finite state machine, which is stored in fast shared memory. In the second phase, all threads in the same block that require further searching are regrouped into a few warps to reduce branch divergence. According to experimental results, the proposed algorithm shows a performance improvement of 50% compared to the PFAC algorithm.
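
The failureless matching idea described in this abstract can be sketched on the CPU, with a plain loop standing in for the one-thread-per-byte launch. This is a minimal illustration only, not the authors' CUDA implementation: the function names (`build_failureless_trie`, `pfac_scan`, `second_phase_starts`) are hypothetical, and the threshold rule in `second_phase_starts` is an assumed reading of how threads might be selected for the second phase.

```python
from typing import Dict, List, Tuple


def build_failureless_trie(patterns: List[bytes]):
    """Build a trie with forward (goto) transitions only, no failure links."""
    goto: List[Dict[int, int]] = [{}]  # goto[state][byte] -> next state
    output: Dict[int, int] = {}        # final state -> pattern index
    for idx, pat in enumerate(patterns):
        state = 0
        for b in pat:
            if b not in goto[state]:
                goto.append({})
                goto[state][b] = len(goto) - 1
            state = goto[state][b]
        output[state] = idx
    return goto, output


def pfac_scan(text: bytes, patterns: List[bytes]):
    """Emulate one thread per input byte; each walk stops at the first
    missing transition, so no thread has to handle chunk boundaries."""
    goto, output = build_failureless_trie(patterns)
    matches: List[Tuple[int, int]] = []  # (start position, pattern index)
    steps: List[int] = []                # transitions taken by each "thread"
    for start in range(len(text)):       # each iteration models one GPU thread
        state, pos, taken = 0, start, 0
        while pos < len(text) and text[pos] in goto[state]:
            state = goto[state][text[pos]]
            pos += 1
            taken += 1
            if state in output:
                matches.append((start, output[state]))
        steps.append(taken)
    return matches, steps


def second_phase_starts(steps: List[int], threshold: int) -> List[int]:
    """Assumed rule: a thread needs the second phase if its walk went
    past `threshold` transitions."""
    return [start for start, n in enumerate(steps) if n > threshold]


matches, steps = pfac_scan(b"ushers", [b"he", b"she", b"hers"])
print(matches)                         # [(1, 1), (2, 0), (2, 2)]
print(steps)                           # [0, 3, 4, 0, 0, 1]
print(second_phase_starts(steps, 2))   # [1, 2]
```

The `steps` list makes the load imbalance visible: most walks end immediately, while a few run several transitions deep, and only those few would continue past the threshold.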

High-Speed Finite State Machine Design by State Splitting

Contemporary Complex Systems and Their Dependability - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-319-91446-6_7 ◽

2018 ◽

pp. 64-73

Author(s):

Damian Borecki ◽

Valery Salauyou ◽

Tomasz Grzes

Keyword(s):

High Speed ◽

Finite State Machine ◽

State Machine ◽

Machine Design ◽

State Splitting ◽

Finite State

Reconfigurable Self-Addressable Memory-Based FSM: A Scalable Intrusion Detection Engine

International Journal of Smart Sensor and Adhoc Network ◽

10.47893/ijssan.2012.1153 ◽

2012 ◽

pp. 144-151

Author(s):

B. Srilatha ◽

Krishna Kishore

Keyword(s):

High Speed ◽

Finite State Machine ◽

State Machine ◽

Network Attack ◽

Memory Efficiency ◽

Attack Pattern ◽

Current State ◽

Incoming Packet ◽

Finite State ◽

Attack Patterns

One way to detect and thwart a network attack is to compare each incoming packet with predefined patterns, also called an attack pattern database, and raise an alert upon detecting a match. This article presents a novel pattern-matching engine that exploits a memory-based, programmable state machine to achieve deterministic processing rates that are independent of packet and pattern characteristics. Our engine is a self-addressable memory-based finite state machine (SAM-FSM), whose current state coding exhibits all its possible next states. Moreover, it is fully reconfigurable in that new attack patterns can be updated easily. A methodology was developed to program the memory and logic. Specifically, we merge “non-equivalent” states by introducing “super characters” on their inputs to further enhance memory efficiency without adding labels. SAM-FSM offers high speed, is one of the most storage-efficient machines, and reduces the memory requirement by a factor of 60. Experimental results are presented to demonstrate the validity of SAM-FSM.
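
To make the notion of a deterministic, memory-based state machine concrete, the sketch below uses a dense next-state table so that every input byte costs exactly one table lookup, regardless of the packet or the patterns. It is not the engine described in this abstract: it is an ordinary Aho–Corasick automaton expanded into a full transition table, it does not implement self-addressable state coding or super characters, and the names `build_table` and `scan` are hypothetical.

```python
from collections import deque
from typing import List, Set, Tuple

ALPHABET = 256  # one table column per possible byte value


def build_table(patterns: List[bytes]):
    """Return (next_state, outputs) for a fully populated transition table."""
    nxt: List[List[int]] = [[-1] * ALPHABET]
    out: List[Set[int]] = [set()]
    for idx, pat in enumerate(patterns):      # insert patterns into a trie
        s = 0
        for b in pat:
            if nxt[s][b] == -1:
                nxt.append([-1] * ALPHABET)
                out.append(set())
                nxt[s][b] = len(nxt) - 1
            s = nxt[s][b]
        out[s].add(idx)
    fail = [0] * len(nxt)
    queue = deque()
    for b in range(ALPHABET):                 # root: missing edges loop back
        t = nxt[0][b]
        if t == -1:
            nxt[0][b] = 0
        else:
            queue.append(t)
    while queue:                              # breadth-first table completion
        s = queue.popleft()
        out[s] |= out[fail[s]]                # inherit matches ending here
        for b in range(ALPHABET):
            t = nxt[s][b]
            if t == -1:
                nxt[s][b] = nxt[fail[s]][b]   # precomputed fallback target
            else:
                fail[t] = nxt[fail[s]][b]
                queue.append(t)
    return nxt, out


def scan(data: bytes, nxt, out) -> List[Tuple[int, int]]:
    """Report (end position, pattern index) with one lookup per input byte."""
    state, hits = 0, []
    for pos, b in enumerate(data):
        state = nxt[state][b]
        hits.extend((pos, i) for i in sorted(out[state]))
    return hits


table, outputs = build_table([b"he", b"she", b"hers"])
print(scan(b"ushers", table, outputs))  # [(3, 0), (3, 1), (5, 2)]
```

Updating the pattern set in this sketch means rebuilding the table with `build_table`, which loosely mirrors the reconfigurability the abstract describes. The dense table trades memory for speed, which is the cost that state-merging techniques aim to reduce.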

P3FSM: Portable Predictive Pattern Matching Finite State Machine

10.1109/asap.2009.16 ◽

2009 ◽

Cited By ~ 7

Author(s):

Lucas Vespa ◽

Mini Mathew ◽

Ning Weng

Keyword(s):

Pattern Matching ◽

Finite State Machine ◽

State Machine ◽

Finite State

High-Speed and Area-Efficient Reconfigurable Multiplexer Bank for RAM-Based Finite State Machine Implementations

Journal of Circuits, Systems and Computers ◽

10.1142/s0218126615501017 ◽

2015 ◽

Vol 24 (07) ◽

pp. 1550101 ◽

Cited By ~ 4

Author(s):

Raouf Senhadji-Navarro ◽

Ignacio Garcia-Vargas

Keyword(s):

High Speed ◽

Finite State Machine ◽

State Of The Art ◽

Behavioral Model ◽

State Machine ◽

The State ◽

Experimental Results ◽

State Machines ◽

Finite State ◽

Area Efficient

This work focuses on the problem of designing efficient reconfigurable multiplexer banks for RAM-based implementations of reconfigurable state machines. We propose a new architecture (called combination-based reconfigurable multiplexer bank, CRMUX) that uses simpler multiplexers than those of the state-of-the-art architecture (called variation-based reconfigurable multiplexer bank, VRMUX). The performance of both architectures is compared in terms of speed, area and reconfiguration cost. Experimental results from MCNC finite state machine (FSM) benchmarks show that CRMUX is faster and more area-efficient than VRMUX. The reconfiguration cost of both multiplexer banks is studied using a behavioral model of a reconfigurable state machine. The results show that the reconfiguration cost of CRMUX is lower than that of VRMUX in most cases.

Reconfigurable finite-state machine based IP lookup engine for high-speed router

IEEE Journal on Selected Areas in Communications ◽

10.1109/jsac.2003.810498 ◽

2003 ◽

Vol 21 (4) ◽

pp. 501-512 ◽

Cited By ~ 9

Author(s):

M. Desai ◽

R. Gupta ◽

A. Karandikar ◽

K. Saxena ◽

V. Samant

Keyword(s):

High Speed ◽

Finite State Machine ◽

State Machine ◽

Ip Lookup ◽

Finite State

Optimal Differential Routing based on Finite State Machine Theory

VLSI Design ◽

10.1155/1999/83648 ◽

1999 ◽

Vol 9 (2) ◽

pp. 105-117 ◽

Cited By ~ 1

Author(s):

M. S. Krishnamoorthy ◽

James R. Loy ◽

John F. McDonald

Keyword(s):

High Speed ◽

Finite State Machine ◽

Linear Time ◽

State Machine ◽

Formal Proof ◽

Routing Problem ◽

Differential Signal ◽

Finite State ◽

Machine Theory ◽

Proof Of Correctness

Noise margins in high-speed digital systems continue to erode. Full differential signal routing provides a mechanism for deferring these effects. This paper proposes a three-stage routing process for solving the adjacent placement routing problem of differential signal pairs, and proves that it is optimal. The process views differential pairs as logical nets, routes the logical nets, and then bifurcates the result to achieve a physical realization. Finite state machine theory provides the critical theoretical underpinning and the formal proof of correctness necessary for linear-time bifurcation. Regular expressions map the theoretical solution to an appropriate implementation strategy that employs feature vectors for net recognition.

Deep learning-based feature extraction and optimizing pattern matching for intrusion detection using finite state machine

Computers & Electrical Engineering ◽

10.1016/j.compeleceng.2021.107094 ◽

2021 ◽

Vol 92 ◽

pp. 107094

Author(s):

Junaid Shabbir Abbasi ◽

Faisal Bashir ◽

Kashif Naseer Qureshi ◽

Muhammad Najam ul Islam ◽

Gwanggil Jeon

Keyword(s):

Feature Extraction ◽

Deep Learning ◽

Intrusion Detection ◽

Pattern Matching ◽

Finite State Machine ◽

State Machine ◽

Finite State

Realization of Parallelism in a Sequential Legacy ‘C’ Program

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b4158.079220 ◽

2020 ◽

Vol 9 (2) ◽

pp. 1179-1183

Keyword(s):

High Speed ◽

Finite State Machine ◽

Research Work ◽

State Machine ◽

Parallel Processors ◽

Software Systems ◽

Functional Modules ◽

Finite State ◽

C Program ◽

Computational Platform

In the present era of high-speed computation with multicore and other parallel processors, some organizations still rely on old software systems that were developed years ago and have since been subjected to continuous development by different developers. Even though this software relies on old and little-used technology, it still satisfies the operational demands of these organizations and has kept them going in a competitive industry. These systems, which have grown into legacy systems over time, embed the major business functionality of the organization and represent years of effort. Hence, a methodology is required to rebuild legacy systems so that they are suitable for execution on present-day computing systems. This paper discusses research on realizing points of latent parallelism in a sequentially executing legacy ‘C’ program, which is first restructured and has its design information abstracted. A technique using a finite state machine is proposed to identify tasks, events, processes and jobs in the program, which helps to locate functionally independent computational units. Further, slicing is performed to extract the lines of code defined by the slicing criteria; assembled together, these lines form a functionality that can be executed in parallel with other extracted functional modules or computational units on any parallel computational platform.

A 3 GHz Semi-Digital Delay Locked Loop with High Resolution

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.571-572.881 ◽

2014 ◽

Vol 571-572 ◽

pp. 881-884

Author(s):

Yi Ding ◽

Xi Duan ◽

Jun Liu

Keyword(s):

High Resolution ◽

High Speed ◽

Finite State Machine ◽

Delay Line ◽

State Machine ◽

Phase Detector ◽

Delay Locked Loop ◽

Finite State

A high-speed, high-resolution semi-digital DLL (Delay Locked Loop) circuit is presented. The circuit is composed of three blocks: a delay line, a phase detector and a digital finite-state machine (FSM). The delay line operates in two steps: coarse tuning by tapping, and fine delay using interpolation, which enables a resolution as high as 2 picoseconds. With the two-step approach and this delay line configuration, 3 GHz speed and picosecond-level resolution can be achieved.
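
The two-step delay setting can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the 32 ps coarse tap spacing and the 16 interpolation steps are assumed values, chosen only so that the fine step works out to the 2 ps resolution the abstract mentions, and the function names are hypothetical.

```python
COARSE_STEP_PS = 32   # assumed delay between adjacent taps
FINE_STEPS = 16       # assumed number of interpolation steps per tap
FINE_STEP_PS = COARSE_STEP_PS // FINE_STEPS  # 2 ps under these assumptions


def delay_codes(target_ps: int):
    """Pick a (coarse tap, fine code) pair closest to the target delay."""
    coarse, remainder = divmod(target_ps, COARSE_STEP_PS)
    fine = (remainder + FINE_STEP_PS // 2) // FINE_STEP_PS  # round to nearest
    if fine == FINE_STEPS:   # fine code overflowed, so carry into the next tap
        coarse, fine = coarse + 1, 0
    return coarse, fine


def realized_delay_ps(coarse: int, fine: int) -> int:
    """Delay produced by a given pair of codes."""
    return coarse * COARSE_STEP_PS + fine * FINE_STEP_PS


# One period of a 3 GHz clock is about 333 ps.
coarse, fine = delay_codes(333)
print(coarse, fine, realized_delay_ps(coarse, fine))  # 10 7 334
```

Under these assumptions the coarse taps cover the bulk of the delay and the interpolator closes the remaining gap to within half a fine step.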

A finite state machine algorithm for finding restriction sites and other pattern matching applications

Bioinformatics ◽

10.1093/bioinformatics/4.4.459 ◽

1988 ◽

Vol 4 (4) ◽

pp. 459-465 ◽

Cited By ~ 1

Author(s):

Roy Smith

Keyword(s):

Pattern Matching ◽

Finite State Machine ◽

State Machine ◽

Restriction Sites ◽

Finite State
