Comparison of high-performance packet processing frameworks on NUMA

Author(s):  
Haipeng Wang ◽  
Dazhong He ◽  
Huan Wang
2016 ◽  
Vol 51 (4) ◽  
pp. 67-81 ◽  
Author(s):  
Antoine Kaufmann ◽  
Simon Peter ◽  
Naveen Kr. Sharma ◽  
Thomas Anderson ◽  
Arvind Krishnamurthy


2021 ◽  
Vol 13 (3) ◽  
pp. 78
Author(s):  
Chuanhong Li ◽  
Lei Song ◽  
Xuewen Zeng

The continuous increase in network traffic has sharply increased the demand for high-performance packet processing systems. For a high-performance packet processing system based on multi-core processors, the packet scheduling algorithm is critical because of the significant role it plays in load distribution, which directly affects system throughput and has therefore attracted intensive research attention. However, designing such an algorithm is not easy: the canonical flow-level packet scheduling algorithm is vulnerable to traffic locality, while packet-level scheduling fails to maintain cache affinity. In this paper, we propose an adaptive throughput-first packet scheduling algorithm for DPDK-based packet processing systems. Exploiting DPDK's burst-oriented packet receiving and transmitting, we use the subflow as both the scheduling unit and the adjustment unit, so that the proposed algorithm retains the advantages of flow-level scheduling when no adjustment occurs, while avoiding packet loss as much as possible when the target core may be overloaded. Experimental results show that the proposed method outperforms Round-Robin, HRW (Highest Random Weight), and CRC32 in system throughput and packet loss rate.
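The scheduling idea in this abstract can be sketched in a few lines. The following is an illustrative toy model, not the paper's implementation: packets arrive in bursts, and each burst of a flow is treated as a "subflow". A flow normally sticks to one core (preserving cache affinity, as in flow-level scheduling), but when the target core's queue would overflow, the next subflow is redirected to the least-loaded core instead of dropping packets. All names, the CRC32-based default placement, and the queue limit are assumptions for illustration.

```python
# Toy sketch of subflow-granularity scheduling (illustrative assumptions only).
import zlib

NUM_CORES = 4
QUEUE_LIMIT = 8                     # assumed per-core queue capacity
queues = [[] for _ in range(NUM_CORES)]
flow_to_core = {}                   # current core assignment per flow

def schedule_burst(flow_id, burst):
    """Dispatch one burst (subflow) of flow_id; return the chosen core."""
    # default flow-level placement: CRC32 hash of the flow identifier
    core = flow_to_core.setdefault(
        flow_id, zlib.crc32(flow_id.encode()) % NUM_CORES)
    if len(queues[core]) + len(burst) > QUEUE_LIMIT:
        # adjustment: steer this subflow to the least-loaded core
        core = min(range(NUM_CORES), key=lambda c: len(queues[c]))
        flow_to_core[flow_id] = core
    queues[core].extend(burst)
    return core
```

Because adjustment happens only at burst (subflow) boundaries, packets within a burst stay together on one core, which is what lets the scheme keep flow-level affinity in the common case.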


Author(s):  
Salvatore Di Girolamo ◽  
Andreas Kurth ◽  
Alexandru Calotoiu ◽  
Thomas Benz ◽  
Timo Schneider ◽  
...  

Electronics ◽  
2019 ◽  
Vol 9 (1) ◽  
pp. 59 ◽  
Author(s):  
Junnan Li ◽  
Zhigang Sun ◽  
Jinli Yan ◽  
Xiangrui Yang ◽  
Yue Jiang ◽  
...  

In the public cloud, FPGA-based SmartNICs are widely deployed to accelerate network functions (NFs) for datacenter operators. We argue that, with the trend toward network as a service (NaaS) in the cloud, it is also meaningful to accelerate tenant NFs to meet performance requirements. However, in pursuit of high performance, existing work such as AccelNet is carefully designed to accelerate specific NFs for datacenter providers, which sacrifices the flexibility of rapidly deploying new NFs. For most tenants with limited hardware design ability, it is time-consuming to develop NFs from scratch due to the lack of a rapidly reconfigurable framework. In this paper, we present a reconfigurable network processing pipeline, i.e., DrawerPipe, which abstracts packet processing into multiple “drawers” connected by the same interface. NF developers can easily share existing modules with other NFs and simply load the core application logic into the appropriate “drawer” to implement new NFs. Furthermore, we propose a programmable module indexing mechanism, namely PMI, which can connect “drawers” in any logical order to perform distinct NFs for different tenants or flows. Finally, we implemented several highly reusable modules for low-level packet processing and extended four example NFs (firewall, stateful firewall, load balancer, IDS) based on DrawerPipe. Our evaluation shows that DrawerPipe can easily offload customized packet processing to FPGA with high performance (up to 100 Mpps) and ultra-low latency (<10 µs). Moreover, DrawerPipe enables modular development of NFs, which is suitable for rapid deployment of NFs. Compared with individual NF development, DrawerPipe reduces the lines of code (LoC) of the four NFs above by 68%.
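The “drawer” abstraction and PMI-style indexing described above can be illustrated with a small software analogy. This is not the DrawerPipe FPGA implementation: each drawer is modeled as a module with one common interface (packet in, packet-or-None out), and a per-tenant index table stands in for PMI, selecting which drawers run and in which order. All module names, the table, and the packet fields are hypothetical.

```python
# Software analogy of DrawerPipe's drawer/PMI idea (illustrative only).
def firewall(pkt):
    # drop packets to an assumed blocked port (telnet, port 23)
    return None if pkt.get("port") == 23 else pkt

def load_balancer(pkt):
    # toy 2-way balancing keyed on a source field
    pkt["backend"] = pkt["src"] % 2
    return pkt

# "drawers": interchangeable modules behind one common interface
DRAWERS = {"fw": firewall, "lb": load_balancer}

# hypothetical PMI-style index table: tenant -> ordered drawer chain
PMI = {"tenantA": ["fw", "lb"], "tenantB": ["lb"]}

def process(tenant, pkt):
    """Run the tenant's drawer chain; None means the packet was dropped."""
    for name in PMI[tenant]:
        pkt = DRAWERS[name](pkt)
        if pkt is None:
            return None
    return pkt
```

The point of the analogy is that a new NF is composed by editing the index table rather than rewiring modules, which is what lets different tenants or flows share the same set of drawers in different logical orders.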

