IP LOOKUP BY BINARY SEARCH ON PREFIX LENGTH

2002 ◽  
Vol 03 (03n04) ◽  
pp. 105-128 ◽  
Author(s):  
KUN SUK KIM ◽  
SARTAJ SAHNI

Waldvogel et al. [9] have proposed a collection of hash tables (CHT) organization for an IP router table. Each hash table in the CHT contains prefixes of the same length together with markers for longer-length prefixes. IP lookup can be done with O(log l_dist) hash-table searches, where l_dist is the number of distinct prefix lengths (also equal to the number of hash tables in the CHT). Srinivasan and Varghese [8] have proposed the use of controlled prefix expansion to reduce the value of l_dist. The details of their algorithm to reduce the number of lengths are given in [7]. The complexity of this algorithm is O(nW^2), where n is the number of prefixes and W is the length of the longest prefix. The algorithm of [7] does not minimize the storage required by the prefixes and markers for the resulting set of prefixes. We develop an algorithm that minimizes the storage requirement but takes O(nW^3 + kW^4) time, where k is the desired number of distinct lengths. We also propose improvements to the heuristic of [7].
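
As a concrete illustration of the CHT scheme, here is a minimal Python sketch of binary search on prefix lengths: one hash table per distinct prefix length, with markers planted at the lengths the search probes and each marker carrying its best matching prefix so the search never has to backtrack. The helper names and the toy routing table are illustrative, not from the paper.

```python
# Minimal sketch of binary search on prefix lengths over a CHT.
# Prefixes are bitstrings; markers store a precomputed "best matching
# prefix" (bmp), found here by linear scan, which is fine for a sketch.

def bmp(key, prefixes):
    """Next hop of the longest prefix in the set matching key, or None."""
    best_p, best_hop = None, None
    for p, hop in prefixes.items():
        if key.startswith(p) and (best_p is None or len(p) > len(best_p)):
            best_p, best_hop = p, hop
    return best_hop

def build_tables(prefixes):
    """prefixes: dict mapping a bitstring prefix -> next hop."""
    lengths = sorted({len(p) for p in prefixes})
    tables = {l: {} for l in lengths}
    for p, hop in prefixes.items():
        tables[len(p)][p] = hop
        # Plant markers at the shorter lengths the binary search would probe.
        lo, hi = 0, len(lengths) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            l = lengths[mid]
            if l < len(p):
                tables[l].setdefault(p[:l], bmp(p[:l], prefixes))
                lo = mid + 1
            elif l > len(p):
                hi = mid - 1
            else:
                break
    return lengths, tables

def lookup(addr, lengths, tables):
    """Longest-prefix match with O(log l_dist) hash-table probes."""
    best, lo, hi = None, 0, len(lengths) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        l = lengths[mid]
        if addr[:l] in tables[l]:
            hop = tables[l][addr[:l]]
            if hop is not None:
                best = hop          # real prefix, or marker carrying its bmp
            lo = mid + 1            # a longer match may still exist
        else:
            hi = mid - 1            # miss: only shorter lengths can match
    return best

routes = {"1": "A", "00": "B", "111": "C"}
lengths, tables = build_tables(routes)
print(lookup("110", lengths, tables))  # -> A (marker "11" carries bmp of "1")
```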

2014 ◽  
Vol 644-650 ◽  
pp. 3365-3370
Author(s):  
Zhen Hong Guo ◽  
Lin Li ◽  
Qing Wang ◽  
Meng Lin ◽  
Rui Pan

With the rapid development of the Internet, the number of firewall rules keeps increasing. This enormous quantity of rules challenges packet classification, which has already become a bottleneck in firewalls. This paper proposes a fast, multi-dimensional packet classification algorithm based on BSOL (Binary Search On Leaves), named FMPC (Fast Multi-dimensional Packet Classification). Unlike BSOL, FMPC cuts all dimensions at the same time to decompose rule spaces and stores leaf spaces in hash tables; FMPC constructs a Bloom filter for every hash table and stores them in embedded SRAM. When classifying a packet, FMPC performs parallel queries on the Bloom filters and decides how to visit the hash tables according to the results. Algorithm analysis and simulation results show that the average number of hash-table lookups of FMPC is 1 per classified packet, much smaller than that of BSOL; in the worst case, the number of hash-table lookups of FMPC is O(log(w_max + 1)), which is also smaller than that of BSOL in a multi-dimensional environment, where w_max is the length, in bits, of the longest dimension.
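
The following Python sketch illustrates the filter-before-table idea, assuming a toy Bloom filter and a dictionary per set of leaf spaces; the filter sizes, hash choices, and key encoding are illustrative, and the parallel SRAM probing is only simulated by a sequential loop.

```python
import hashlib

class BloomFilter:
    """Plain bit-array Bloom filter; sizes and hash counts are illustrative."""
    def __init__(self, m_bits=1024, k_hashes=4):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray(m_bits)

    def _positions(self, key):
        for salt in range(self.k):
            digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))

# One (hash table, Bloom filter) pair per group of leaf spaces; in FMPC the
# filters live in embedded SRAM and are probed in parallel in hardware.
pairs = [({}, BloomFilter()) for _ in range(4)]

def store(level, key, rule):
    table, bloom = pairs[level]
    table[key] = rule
    bloom.add(key)

def classify(key):
    # Query the filters first; visit a hash table only on a positive answer,
    # keeping the expected number of (slow) hash-table lookups near 1.
    for table, bloom in pairs:
        if bloom.might_contain(key) and key in table:
            return table[key]
    return None

store(2, "leaf:3:7", "rule-42")
print(classify("leaf:3:7"), classify("leaf:0:0"))  # -> rule-42 None
```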


2013 ◽  
Vol 22 (3) ◽  
pp. 455-476
Author(s):  
NICLAS PETERSSON

In this paper we study the maximum displacement for linear probing hashing. We use the standard probabilistic model together with the insertion policy known as First-Come-First-Served. The results are of an asymptotic nature and focus on dense hash tables; that is, the number of occupied cells n and the size of the hash table m tend to infinity with ratio n/m → 1. We present distributions and moments for the size of the maximum displacement, as well as for the number of items with displacement larger than some critical value. This is done via process convergence of the (appropriately normalized) length of the largest block of consecutive occupied cells as the total number of occupied cells n varies.
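
A small simulation (not from the paper) makes the object of study concrete: insert n keys into a linear-probing table of size m under the first-come-first-served policy and record the maximum displacement, which grows sharply as n/m approaches 1.

```python
import random

def max_displacement(m, n, seed=0):
    """Max distance any key lands from its home cell, FCFS linear probing."""
    rng = random.Random(seed)
    table = [None] * m
    worst = 0
    for key in range(n):
        home = rng.randrange(m)        # idealized uniform hash value
        pos = home
        while table[pos] is not None:  # FCFS: earlier keys keep their cells
            pos = (pos + 1) % m
        table[pos] = key
        worst = max(worst, (pos - home) % m)
    return worst

for load in (0.5, 0.9, 0.99):
    m = 10_000
    print(load, max_displacement(m, int(load * m)))
```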


Algorithms ◽  
2020 ◽  
Vol 13 (12) ◽  
pp. 338
Author(s):  
Ting Huang ◽  
Zhengping Weng ◽  
Gang Liu ◽  
Zhenwen He

To manage multidimensional point data more efficiently, this paper presents the HD-tree, an improvement of a previous indexing method called the D-tree. Both structures combine quadtree-like partitioning (using integer shift operations and storing only leaves, not internal nodes) with hash tables (for locating the stored nodes). The HD-tree, however, follows a brand-new decomposition strategy, called the half decomposition strategy. This improvement avoids generating nodes that contain only a small amount of data and avoids sequential search of the hash table, so it saves storage space while offering faster I/O and better time performance when building the tree and querying data. The results demonstrate convincingly that the time and space performance of the HD-tree is better than that of the D-tree for both uniform and uneven data, and that it is less affected by the data distribution.
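
The sketch below illustrates the leaf-only, hash-addressed partitioning that the D-tree and HD-tree share, assuming 32-bit coordinates and an illustrative split threshold. For simplicity it materializes all four children on a split (exactly the kind of sparse node the HD-tree's strategy tries to avoid) and locates leaves by level-by-level hash probing (the sequential search that the half decomposition strategy is designed to remove).

```python
LEAF_CAPACITY = 4  # illustrative split threshold

def cell_code(x, y, level):
    """Cell id of a 32-bit point at a given level, via integer shifts."""
    return (x >> (32 - level), y >> (32 - level))

class LeafQuadIndex:
    """Quadtree-like index storing only leaf cells in a hash table."""
    def __init__(self):
        self.leaves = {(0, (0, 0)): []}   # (level, cell code) -> points

    def _find_leaf(self, x, y):
        # No internal nodes exist: recompute codes level by level until the
        # hash table reports a stored leaf.
        for level in range(33):
            key = (level, cell_code(x, y, level))
            if key in self.leaves:
                return key
        raise KeyError("no leaf covers the point")

    def insert(self, x, y):
        key = self._find_leaf(x, y)
        self.leaves[key].append((x, y))
        if len(self.leaves[key]) > LEAF_CAPACITY:
            self._split(key)

    def _split(self, key):
        level, (cx, cy) = key
        points = self.leaves.pop(key)
        for dx in (0, 1):                 # materialize all four children so
            for dy in (0, 1):             # every point keeps a covering leaf
                self.leaves[(level + 1, (cx * 2 + dx, cy * 2 + dy))] = []
        for px, py in points:
            self.leaves[(level + 1, cell_code(px, py, level + 1))].append((px, py))

idx = LeafQuadIndex()
for x, y in [(1, 2), (3, 4), (2**31, 5), (7, 2**31), (6, 6), (8, 8)]:
    idx.insert(x, y)
print(len(idx.leaves))  # -> 4 leaf cells after one split
```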


Author(s):  
Xiaoxun Sun ◽  
Min Li

A number of organizations publish microdata for purposes such as public health and demographic research. Although attributes of microdata that clearly identify individuals, such as name, are generally removed, these databases can sometimes be joined with other public databases on attributes such as Zip code, Gender, and Age to re-identify individuals who were supposed to remain anonymous. These linking attacks are made easier by the availability of other complementary databases over the Internet. K-anonymity is a technique that prevents linking attacks by generalizing or suppressing portions of the released microdata so that no individual can be uniquely distinguished from a group of size k. In this chapter, we investigate a practical full-domain generalization model of k-anonymity and examine the issue of computing a minimal k-anonymous solution. We introduce the hash-based technique previously used in mining association rules and present an efficient and effective privacy hash table structure to derive the minimal solution. The experimental results show that the proposed hash-based technique is highly efficient compared with the binary search method.
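
A minimal sketch of the bucket-counting idea behind such a structure: hash records by their generalized quasi-identifier and test k-anonymity from the bucket sizes. The toy generalization here (truncating Zip digits, bucketing Age) stands in for the chapter's full-domain generalization lattice.

```python
from collections import defaultdict

def generalize(record, zip_digits, age_width):
    """Toy full-domain generalization of a (Zip, Gender, Age) record."""
    zipcode, gender, age = record
    return (zipcode[:zip_digits], gender, age // age_width)

def is_k_anonymous(records, k, zip_digits, age_width):
    # Hash each record's generalized quasi-identifier into a bucket;
    # the table is k-anonymous iff every bucket holds at least k records.
    buckets = defaultdict(int)
    for r in records:
        buckets[generalize(r, zip_digits, age_width)] += 1
    return all(count >= k for count in buckets.values())

data = [("47677", "F", 29), ("47678", "F", 22), ("47602", "M", 27),
        ("47678", "M", 27), ("47905", "M", 43), ("47906", "F", 47)]
print(is_k_anonymous(data, 2, zip_digits=3, age_width=10))   # False: buckets too small
print(is_k_anonymous(data, 2, zip_digits=0, age_width=100))  # True: only Gender remains
```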


Author(s):  
KIRAN SREE POKKULURI

Internet address lookup is a challenging problem because of increasing routing table sizes, increased traffic, higher-speed links, and the migration to 128-bit IPv6 addresses. Routing lookup involves computing the best matching prefix, for which existing solutions scale poorly when traffic in the router increases or when employed for IPv6 address lookup. Our paper describes a novel approach that employs multiple hashing on a reduced number of hash tables, over which ternary search on levels is applied in parallel. This scheme handles the large number of prefixes generated by controlled prefix expansion by reducing collisions and distributing load fairly among the hash buckets, thus providing faster worst-case and average-case lookups. The approach we describe is fast, simple, scalable, parallelizable, and flexible.
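
The sketch below shows the multiple-hashing ingredient in isolation, assuming two-choice insertion into a small bucket array; placing each prefix in the less loaded of its two candidate buckets is what keeps worst-case bucket lengths short. The ternary search on levels that runs on top of such tables is not reproduced here.

```python
import hashlib

NUM_BUCKETS = 8
buckets = [[] for _ in range(NUM_BUCKETS)]

def h(key, salt):
    """Independent hash functions derived by salting; illustrative choice."""
    digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS

def insert(prefix):
    b1, b2 = h(prefix, 1), h(prefix, 2)
    # Two-choice insertion: place the prefix in the shorter bucket.
    target = b1 if len(buckets[b1]) <= len(buckets[b2]) else b2
    buckets[target].append(prefix)

def lookup(prefix):
    b1, b2 = h(prefix, 1), h(prefix, 2)   # probe both candidate buckets
    return prefix in buckets[b1] or prefix in buckets[b2]

for p in ["10*", "1100*", "0111*", "001*", "1*"]:
    insert(p)
print(lookup("1100*"), lookup("0000*"))  # -> True False
```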


2020 ◽  
Vol 10 (6) ◽  
pp. 1915
Author(s):  
Tianqi Zheng ◽  
Zhibin Zhang ◽  
Xueqi Cheng

Hash tables are a fundamental data structure for analytical database workloads such as aggregation, joins, set filtering, and record deduplication. The performance characteristics of a hash table differ drastically depending on what kind of data is processed and on the mix of inserts, lookups, and deletes. In this paper, we address some common use cases of hash tables: aggregating and joining over arbitrary string data. We designed a new hash table, SAHA, which is tightly integrated with modern analytical databases and optimized for string data, with the following advantages: (1) it inlines short strings and saves hash values only for long strings; (2) it uses special memory-loading techniques for quick dispatching and hash computation; and (3) it utilizes vectorized processing to batch hashing operations. Our evaluation shows that SAHA outperforms state-of-the-art hash tables, including Google's SwissTable and Facebook's F14Table, by one to five times in analytical workloads. It has been merged into the ClickHouse database and shows promising results in production.
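
A minimal sketch of the short-string-inlining idea, under an assumed threshold and a plain linear-probing table: short keys are stored and compared inline, while long keys are kept by reference together with a saved hash, so most probes never touch the long string itself. SAHA's actual slot layout and vectorized dispatch are more elaborate.

```python
INLINE_MAX = 8  # illustrative inline-size threshold, in bytes

class Slot:
    __slots__ = ("inline", "ref", "saved_hash", "value")
    def __init__(self, key: bytes, value):
        if len(key) <= INLINE_MAX:
            self.inline, self.ref, self.saved_hash = key, None, None
        else:
            self.inline, self.ref, self.saved_hash = None, key, hash(key)
        self.value = value

    def matches(self, key: bytes, key_hash: int) -> bool:
        if self.inline is not None:
            return self.inline == key   # short keys: direct compare
        # Long keys: cheap saved-hash check before the full comparison.
        return self.saved_hash == key_hash and self.ref == key

TABLE = [None] * 64

def probe(key: bytes):
    h = hash(key)
    i = h % len(TABLE)
    while TABLE[i] is not None and not TABLE[i].matches(key, h):
        i = (i + 1) % len(TABLE)        # linear probing
    return i

def insert(key: bytes, value):
    TABLE[probe(key)] = Slot(key, value)

def get(key: bytes):
    slot = TABLE[probe(key)]
    return slot.value if slot else None

insert(b"ad_id", 1)
insert(b"very-long-campaign-name", 2)
print(get(b"ad_id"), get(b"very-long-campaign-name"))  # -> 1 2
```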


1992 ◽  
Vol 03 (01) ◽  
pp. 55-63
Author(s):  
FABRIZIO LUCCIO ◽  
ANDREA PIETRACAPRINA ◽  
GEPPINO PUCCI

The performance of hash tables is analyzed in a parallel context. Assuming that a hash table of fixed size is allocated in the shared memory of a PRAM with n processors, a Ph-step is defined as a PRAM computation in which each processor searches for or inserts a key in the table. It is shown that the maximum number of table probes needed for a single key in a Ph-step is Ω(log_{1/α} n) and O(log_{1/α′} n) with high probability, where α and α′ are the load factors before and after the execution of the Ph-step. However, a clever implementation of a Ph-step is proposed that runs in time O((log_{1/α′} n)^{1/2}) with high probability. The algorithm exploits the fact that operations on different keys have different durations; hence the processors in charge of shorter operations, once finished, are used to perform part of the longer ones.
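
A small sequential simulation (not the paper's PRAM algorithm) of the quantity being bounded: n keys are inserted into a shared table with idealized random probing, and the maximum number of probes any single key needs is recorded; this maximum is what the Ω(log_{1/α} n) and O(log_{1/α′} n) bounds describe.

```python
import random

def ph_step(table_size, keys, seed=0):
    """Insert all keys; return the worst per-key probe count."""
    rng = random.Random(seed)
    table = [None] * table_size
    worst = 0
    for key in keys:
        probes, i = 1, rng.randrange(table_size)  # idealized random probing
        while table[i] is not None:
            probes += 1
            i = rng.randrange(table_size)
        table[i] = key
        worst = max(worst, probes)
    return worst

# Final load factor α′ = 0.5, so the worst probe count should be
# on the order of log_2 n.
print(ph_step(table_size=1 << 16, keys=range(1 << 15)))
```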


2014 ◽  
Vol 1046 ◽  
pp. 504-507
Author(s):  
Kai Song ◽  
Hai Sheng Li

In this paper, a new scheduling algorithm, the priority bitmap and hash table (PBHT) algorithm, is put forward. Its key components, the priority-bitmap scheduling algorithm and the hash tables, are analyzed, and the workflow and time complexity of the scheduling algorithm are described in detail. A series of experiments was designed and completed. The feasibility, rationality, and completeness of the scheduling algorithm are verified by the experimental results.
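
A minimal sketch of how a priority bitmap and a hash table can cooperate in a scheduler, assuming lower numbers mean higher priority: one bit per priority level marks a non-empty ready queue, and picking the next task reduces to finding the lowest set bit. The structure is illustrative and does not reproduce the paper's PBHT details.

```python
class PBHTScheduler:
    def __init__(self):
        self.bitmap = 0    # bit p set <=> some task is ready at priority p
        self.queues = {}   # hash table: priority -> FIFO list of tasks

    def add(self, task, priority):
        self.queues.setdefault(priority, []).append(task)
        self.bitmap |= 1 << priority

    def next_task(self):
        if self.bitmap == 0:
            return None
        # Lowest set bit = highest priority; O(1) with the usual bit trick.
        p = (self.bitmap & -self.bitmap).bit_length() - 1
        queue = self.queues[p]
        task = queue.pop(0)
        if not queue:                    # queue drained: clear its bit
            del self.queues[p]
            self.bitmap &= ~(1 << p)
        return task

s = PBHTScheduler()
s.add("log-flush", 5)
s.add("irq-handler", 0)
print(s.next_task(), s.next_task())  # -> irq-handler log-flush
```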


2021 ◽  
Vol 8 (2) ◽  
pp. 1-17
Author(s):  
Oded Green

In this article, we introduce HashGraph, a new scalable approach for building hash tables that uses concepts taken from sparse graph representations; hence the name HashGraph. HashGraph introduces a new way to deal with hash collisions that uses neither open addressing nor separate chaining, yet has the benefits of both approaches. HashGraph currently works for static inputs; recent progress with dynamic graph data structures suggests that it might be extendable to dynamic inputs as well. We show that HashGraph can deal with a large number of hash values per entry without loss of performance. Last, we show a new querying algorithm for value lookups. We experimentally compare HashGraph to several state-of-the-art implementations and find that it outperforms them by 2× on average when the inputs are unique and by as much as 40× when the input contains duplicates. The implementation of HashGraph in this article is for NVIDIA GPUs. HashGraph can build a hash table at a rate of 2.5 billion keys per second on an NVIDIA GV100 GPU and can query at nearly the same rate.
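
A sequential Python sketch of the CSR-style construction that the sparse-graph analogy suggests: histogram the keys by hash bin, prefix-sum the counts into offsets, then scatter the keys, so collisions and duplicates simply become longer bins rather than probe chains. HashGraph does this massively in parallel on a GPU; this sketch only conveys the layout.

```python
def build_hashgraph(keys, num_bins):
    counts = [0] * num_bins
    for k in keys:                        # pass 1: histogram of hash bins
        counts[hash(k) % num_bins] += 1
    offsets = [0] * (num_bins + 1)
    for b in range(num_bins):             # prefix sum -> CSR-style offsets
        offsets[b + 1] = offsets[b] + counts[b]
    slots = [None] * len(keys)
    cursor = offsets[:-1].copy()
    for k in keys:                        # pass 2: scatter keys into bins
        b = hash(k) % num_bins
        slots[cursor[b]] = k
        cursor[b] += 1
    return offsets, slots

def query(key, offsets, slots, num_bins):
    # A bin is the contiguous slice slots[offsets[b]:offsets[b+1]].
    b = hash(key) % num_bins
    return any(slots[i] == key for i in range(offsets[b], offsets[b + 1]))

offsets, slots = build_hashgraph(["a", "b", "a", "c"], num_bins=8)
print(query("a", offsets, slots, 8), query("z", offsets, slots, 8))  # -> True False
```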

