IP LOOKUP BY BINARY SEARCH ON PREFIX LENGTH

2002 ◽  
Vol 03 (03n04) ◽  
pp. 105-128 ◽  
Author(s):  
KUN SUK KIM ◽  
SARTAJ SAHNI

Waldvogel et al. [9] have proposed a collection of hash tables (CHT) organization for an IP router table. Each hash table in the CHT contains prefixes of the same length together with markers for longer-length prefixes. IP lookup can be done with O(log l_dist) hash-table searches, where l_dist is the number of distinct prefix lengths (also equal to the number of hash tables in the CHT). Srinivasan and Varghese [8] have proposed the use of controlled prefix expansion to reduce the value of l_dist. The details of their algorithm to reduce the number of lengths are given in [7]. The complexity of this algorithm is O(nW^2), where n is the number of prefixes and W is the length of the longest prefix. The algorithm of [7] does not minimize the storage required by the prefixes and markers for the resulting set of prefixes. We develop an algorithm that minimizes the storage requirement but takes O(nW^3 + kW^4) time, where k is the desired number of distinct lengths. We also propose improvements to the heuristic of [7].
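
As a concrete illustration of the CHT scheme, here is a minimal Python sketch of binary search on prefix lengths: one hash table per distinct prefix length, with markers planted at the lengths the search probes and each marker carrying its best matching prefix so the search never has to backtrack. The helper names and the toy routing table are illustrative, not from the paper.

```python
# Minimal sketch of binary search on prefix lengths over a CHT.
# Prefixes are bitstrings; markers store a precomputed "best matching
# prefix" (bmp), found here by linear scan, which is fine for a sketch.

def bmp(key, prefixes):
    """Next hop of the longest prefix in the set matching key, or None."""
    best_p, best_hop = None, None
    for p, hop in prefixes.items():
        if key.startswith(p) and (best_p is None or len(p) > len(best_p)):
            best_p, best_hop = p, hop
    return best_hop

def build_tables(prefixes):
    """prefixes: dict mapping a bitstring prefix -> next hop."""
    lengths = sorted({len(p) for p in prefixes})
    tables = {l: {} for l in lengths}
    for p, hop in prefixes.items():
        tables[len(p)][p] = hop
        # Plant markers at the shorter lengths the binary search would probe.
        lo, hi = 0, len(lengths) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            l = lengths[mid]
            if l < len(p):
                tables[l].setdefault(p[:l], bmp(p[:l], prefixes))
                lo = mid + 1
            elif l > len(p):
                hi = mid - 1
            else:
                break
    return lengths, tables

def lookup(addr, lengths, tables):
    """Longest-prefix match with O(log l_dist) hash-table probes."""
    best, lo, hi = None, 0, len(lengths) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        l = lengths[mid]
        if addr[:l] in tables[l]:
            hop = tables[l][addr[:l]]
            if hop is not None:
                best = hop          # real prefix, or marker carrying its bmp
            lo = mid + 1            # a longer match may still exist
        else:
            hi = mid - 1            # miss: only shorter lengths can match
    return best

routes = {"1": "A", "00": "B", "111": "C"}
lengths, tables = build_tables(routes)
print(lookup("110", lengths, tables))  # -> A (marker "11" carries bmp of "1")
```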

2014 ◽  
Vol 644-650 ◽  
pp. 3365-3370
Author(s):  
Zhen Hong Guo ◽  
Lin Li ◽  
Qing Wang ◽  
Meng Lin ◽  
Rui Pan

With the rapid development of the Internet, the number of firewall rules keeps increasing. This enormous quantity of rules challenges packet classification, which has already become a bottleneck in firewalls. This paper proposes a fast, multi-dimensional packet classification algorithm based on BSOL (Binary Search On Leaves), named FMPC (Fast Multi-dimensional Packet Classification). Unlike BSOL, FMPC cuts all dimensions at the same time to decompose rule spaces and stores leaf spaces in hash tables; FMPC constructs a Bloom filter for every hash table and stores them in embedded SRAM. When classifying a packet, FMPC performs parallel queries on the Bloom filters and decides how to visit the hash tables according to the results. Algorithm analysis and simulation results show that the average number of hash-table lookups of FMPC is 1 per classified packet, much smaller than that of BSOL; in the worst case, the number of hash-table lookups of FMPC is O(log(w_max + 1)), which is also smaller than that of BSOL in a multi-dimensional environment, where w_max is the length, in bits, of the longest dimension.
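
The following Python sketch illustrates the filter-before-table idea, assuming a toy Bloom filter and a dictionary per set of leaf spaces; the filter sizes, hash choices, and key encoding are illustrative, and the parallel SRAM probing is only simulated by a sequential loop.

```python
import hashlib

class BloomFilter:
    """Plain bit-array Bloom filter; sizes and hash counts are illustrative."""
    def __init__(self, m_bits=1024, k_hashes=4):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray(m_bits)

    def _positions(self, key):
        for salt in range(self.k):
            digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))

# One (hash table, Bloom filter) pair per group of leaf spaces; in FMPC the
# filters live in embedded SRAM and are probed in parallel in hardware.
pairs = [({}, BloomFilter()) for _ in range(4)]

def store(level, key, rule):
    table, bloom = pairs[level]
    table[key] = rule
    bloom.add(key)

def classify(key):
    # Query the filters first; visit a hash table only on a positive answer,
    # keeping the expected number of (slow) hash-table lookups near 1.
    for table, bloom in pairs:
        if bloom.might_contain(key) and key in table:
            return table[key]
    return None

store(2, "leaf:3:7", "rule-42")
print(classify("leaf:3:7"), classify("leaf:0:0"))  # -> rule-42 None
```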


2013 ◽  
Vol 22 (3) ◽  
pp. 455-476
Author(s):  
NICLAS PETERSSON

In this paper we study the maximum displacement for linear probing hashing. We use the standard probabilistic model together with the insertion policy known as First-Come-First-Served. The results are of an asymptotic nature and focus on dense hash tables; that is, the number of occupied cells n and the size of the hash table m tend to infinity with ratio n/m → 1. We present distributions and moments for the size of the maximum displacement, as well as for the number of items with displacement larger than some critical value. This is done via process convergence of the (appropriately normalized) length of the largest block of consecutive occupied cells as the total number of occupied cells n varies.
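
A small simulation (not from the paper) makes the object of study concrete: insert n keys into a linear-probing table of size m under the first-come-first-served policy and record the maximum displacement, which grows sharply as n/m approaches 1.

```python
import random

def max_displacement(m, n, seed=0):
    """Max distance any key lands from its home cell, FCFS linear probing."""
    rng = random.Random(seed)
    table = [None] * m
    worst = 0
    for key in range(n):
        home = rng.randrange(m)        # idealized uniform hash value
        pos = home
        while table[pos] is not None:  # FCFS: earlier keys keep their cells
            pos = (pos + 1) % m
        table[pos] = key
        worst = max(worst, (pos - home) % m)
    return worst

for load in (0.5, 0.9, 0.99):
    m = 10_000
    print(load, max_displacement(m, int(load * m)))
```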


Algorithms ◽  
2020 ◽  
Vol 13 (12) ◽  
pp. 338
Author(s):  
Ting Huang ◽  
Zhengping Weng ◽  
Gang Liu ◽  
Zhenwen He

To manage multidimensional point data more efficiently, this paper presents the HD-tree, an improvement of a previous indexing method called the D-tree. Both structures combine quadtree-like partitioning (using integer shift operations and storing only leaves, not internal nodes) with hash tables (for locating the stored nodes). The HD-tree, however, follows a brand-new decomposition strategy, called the half decomposition strategy. This improvement avoids generating nodes that contain only a small amount of data and avoids sequential search of the hash table, so it saves storage space while offering faster I/O and better time performance when building the tree and querying data. The results demonstrate convincingly that the time and space performance of the HD-tree is better than that of the D-tree for both uniform and uneven data, and that it is less affected by the data distribution.
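
The sketch below illustrates the leaf-only, hash-addressed partitioning that the D-tree and HD-tree share, assuming 32-bit coordinates and an illustrative split threshold. For simplicity it materializes all four children on a split (exactly the kind of sparse node the HD-tree's strategy tries to avoid) and locates leaves by level-by-level hash probing (the sequential search that the half decomposition strategy is designed to remove).

```python
LEAF_CAPACITY = 4  # illustrative split threshold

def cell_code(x, y, level):
    """Cell id of a 32-bit point at a given level, via integer shifts."""
    return (x >> (32 - level), y >> (32 - level))

class LeafQuadIndex:
    """Quadtree-like index storing only leaf cells in a hash table."""
    def __init__(self):
        self.leaves = {(0, (0, 0)): []}   # (level, cell code) -> points

    def _find_leaf(self, x, y):
        # No internal nodes exist: recompute codes level by level until the
        # hash table reports a stored leaf.
        for level in range(33):
            key = (level, cell_code(x, y, level))
            if key in self.leaves:
                return key
        raise KeyError("no leaf covers the point")

    def insert(self, x, y):
        key = self._find_leaf(x, y)
        self.leaves[key].append((x, y))
        if len(self.leaves[key]) > LEAF_CAPACITY:
            self._split(key)

    def _split(self, key):
        level, (cx, cy) = key
        points = self.leaves.pop(key)
        for dx in (0, 1):                 # materialize all four children so
            for dy in (0, 1):             # every point keeps a covering leaf
                self.leaves[(level + 1, (cx * 2 + dx, cy * 2 + dy))] = []
        for px, py in points:
            self.leaves[(level + 1, cell_code(px, py, level + 1))].append((px, py))

idx = LeafQuadIndex()
for x, y in [(1, 2), (3, 4), (2**31, 5), (7, 2**31), (6, 6), (8, 8)]:
    idx.insert(x, y)
print(len(idx.leaves))  # -> 4 leaf cells after one split
```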


Author(s):  
Xiaoxun Sun ◽  
Min Li

A number of organizations publish microdata for purposes such as public health and demographic research. Although attributes of microdata that clearly identify individuals, such as name, are generally removed, these databases can sometimes be joined with other public databases on attributes such as Zip code, Gender, and Age to re-identify individuals who were supposed to remain anonymous. These linking attacks are made easier by the availability of other complementary databases over the Internet. K-anonymity is a technique that prevents linking attacks by generalizing or suppressing portions of the released microdata so that no individual can be uniquely distinguished from a group of size k. In this chapter, we investigate a practical full-domain generalization model of k-anonymity and examine the issue of computing a minimal k-anonymous solution. We introduce the hash-based technique previously used in mining association rules and present an efficient and effective privacy hash table structure to derive the minimal solution. The experimental results show that the proposed hash-based technique is highly efficient compared with the binary search method.
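
A minimal sketch of the bucket-counting idea behind such a structure: hash records by their generalized quasi-identifier and test k-anonymity from the bucket sizes. The toy generalization here (truncating Zip digits, bucketing Age) stands in for the chapter's full-domain generalization lattice.

```python
from collections import defaultdict

def generalize(record, zip_digits, age_width):
    """Toy full-domain generalization of a (Zip, Gender, Age) record."""
    zipcode, gender, age = record
    return (zipcode[:zip_digits], gender, age // age_width)

def is_k_anonymous(records, k, zip_digits, age_width):
    # Hash each record's generalized quasi-identifier into a bucket;
    # the table is k-anonymous iff every bucket holds at least k records.
    buckets = defaultdict(int)
    for r in records:
        buckets[generalize(r, zip_digits, age_width)] += 1
    return all(count >= k for count in buckets.values())

data = [("47677", "F", 29), ("47678", "F", 22), ("47602", "M", 27),
        ("47678", "M", 27), ("47905", "M", 43), ("47906", "F", 47)]
print(is_k_anonymous(data, 2, zip_digits=3, age_width=10))   # False: buckets too small
print(is_k_anonymous(data, 2, zip_digits=0, age_width=100))  # True: only Gender remains
```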


Author(s):  
KIRAN SREE POKKULURI

Internet address lookup is a challenging problem because of increasing routing table sizes, increased traffic, higher-speed links, and the migration to 128-bit IPv6 addresses. Routing lookup involves computing the best matching prefix, for which existing solutions scale poorly when traffic in the router increases or when employed for IPv6 address lookup. Our paper describes a novel approach that employs multiple hashing on a reduced number of hash tables, over which ternary search on levels is applied in parallel. This scheme handles the large number of prefixes generated by controlled prefix expansion by reducing collisions and distributing load fairly among the hash buckets, thus providing faster worst-case and average-case lookups. The approach we describe is fast, simple, scalable, parallelizable, and flexible.
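
The sketch below shows the multiple-hashing ingredient in isolation, assuming two-choice insertion into a small bucket array; placing each prefix in the less loaded of its two candidate buckets is what keeps worst-case bucket lengths short. The ternary search on levels that runs on top of such tables is not reproduced here.

```python
import hashlib

NUM_BUCKETS = 8
buckets = [[] for _ in range(NUM_BUCKETS)]

def h(key, salt):
    """Independent hash functions derived by salting; illustrative choice."""
    digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS

def insert(prefix):
    b1, b2 = h(prefix, 1), h(prefix, 2)
    # Two-choice insertion: place the prefix in the shorter bucket.
    target = b1 if len(buckets[b1]) <= len(buckets[b2]) else b2
    buckets[target].append(prefix)

def lookup(prefix):
    b1, b2 = h(prefix, 1), h(prefix, 2)   # probe both candidate buckets
    return prefix in buckets[b1] or prefix in buckets[b2]

for p in ["10*", "1100*", "0111*", "001*", "1*"]:
    insert(p)
print(lookup("1100*"), lookup("0000*"))  # -> True False
```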


2020 ◽  
Vol 10 (6) ◽  
pp. 1915
Author(s):  
Tianqi Zheng ◽  
Zhibin Zhang ◽  
Xueqi Cheng

Hash tables are a fundamental data structure for analytical database workloads such as aggregation, joins, set filtering, and record deduplication. The performance characteristics of a hash table differ drastically depending on what kind of data is processed and on the mix of inserts, lookups, and deletes. In this paper, we address some common use cases of hash tables: aggregating and joining over arbitrary string data. We designed a new hash table, SAHA, which is tightly integrated with modern analytical databases and optimized for string data, with the following advantages: (1) it inlines short strings and saves hash values only for long strings; (2) it uses special memory-loading techniques for quick dispatching and hash computation; and (3) it utilizes vectorized processing to batch hashing operations. Our evaluation shows that SAHA outperforms state-of-the-art hash tables, including Google's SwissTable and Facebook's F14Table, by one to five times in analytical workloads. It has been merged into the ClickHouse database and shows promising results in production.
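
A minimal sketch of the short-string-inlining idea, under an assumed threshold and a plain linear-probing table: short keys are stored and compared inline, while long keys are kept by reference together with a saved hash, so most probes never touch the long string itself. SAHA's actual slot layout and vectorized dispatch are more elaborate.

```python
INLINE_MAX = 8  # illustrative inline-size threshold, in bytes

class Slot:
    __slots__ = ("inline", "ref", "saved_hash", "value")
    def __init__(self, key: bytes, value):
        if len(key) <= INLINE_MAX:
            self.inline, self.ref, self.saved_hash = key, None, None
        else:
            self.inline, self.ref, self.saved_hash = None, key, hash(key)
        self.value = value

    def matches(self, key: bytes, key_hash: int) -> bool:
        if self.inline is not None:
            return self.inline == key   # short keys: direct compare
        # Long keys: cheap saved-hash check before the full comparison.
        return self.saved_hash == key_hash and self.ref == key

TABLE = [None] * 64

def probe(key: bytes):
    h = hash(key)
    i = h % len(TABLE)
    while TABLE[i] is not None and not TABLE[i].matches(key, h):
        i = (i + 1) % len(TABLE)        # linear probing
    return i

def insert(key: bytes, value):
    TABLE[probe(key)] = Slot(key, value)

def get(key: bytes):
    slot = TABLE[probe(key)]
    return slot.value if slot else None

insert(b"ad_id", 1)
insert(b"very-long-campaign-name", 2)
print(get(b"ad_id"), get(b"very-long-campaign-name"))  # -> 1 2
```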


1992 ◽  
Vol 03 (01) ◽  
pp. 55-63
Author(s):  
FABRIZIO LUCCIO ◽  
ANDREA PIETRACAPRINA ◽  
GEPPINO PUCCI

The performance of hash tables is analyzed in a parallel context. Assuming that a hash table of fixed size is allocated in the shared memory of a PRAM with n processors, a Ph-step is defined as a PRAM computation in which each processor searches for or inserts a key in the table. It is shown that the maximum number of table probes needed for a single key in a Ph-step is Ω(log_{1/α} n) and O(log_{1/α′} n) with high probability, where α and α′ are the load factors before and after the execution of the Ph-step. However, a clever implementation of a Ph-step is proposed that runs in time O((log_{1/α′} n)^{1/2}) with high probability. The algorithm exploits the fact that operations on different keys have different durations; hence the processors in charge of shorter operations, once finished, are used to perform part of the longer ones.
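
A small sequential simulation (not the paper's PRAM algorithm) of the quantity being bounded: n keys are inserted into a shared table with idealized random probing, and the maximum number of probes any single key needs is recorded; this maximum is what the Ω(log_{1/α} n) and O(log_{1/α′} n) bounds describe.

```python
import random

def ph_step(table_size, keys, seed=0):
    """Insert all keys; return the worst per-key probe count."""
    rng = random.Random(seed)
    table = [None] * table_size
    worst = 0
    for key in keys:
        probes, i = 1, rng.randrange(table_size)  # idealized random probing
        while table[i] is not None:
            probes += 1
            i = rng.randrange(table_size)
        table[i] = key
        worst = max(worst, probes)
    return worst

# Final load factor α′ = 0.5, so the worst probe count should be
# on the order of log_2 n.
print(ph_step(table_size=1 << 16, keys=range(1 << 15)))
```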


2014 ◽  
Vol 1046 ◽  
pp. 504-507
Author(s):  
Kai Song ◽  
Hai Sheng Li

In this paper, a new scheduling algorithm, the priority bitmap and hash table (PBHT) algorithm, is put forward. Its key components, the priority-bitmap scheduling algorithm and the hash tables, are analyzed, and the workflow and time complexity of the scheduling algorithm are described in detail. A series of experiments was designed and completed. The feasibility, rationality, and completeness of the scheduling algorithm are verified by the experimental results.
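
A minimal sketch of how a priority bitmap and a hash table can cooperate in a scheduler, assuming lower numbers mean higher priority: one bit per priority level marks a non-empty ready queue, and picking the next task reduces to finding the lowest set bit. The structure is illustrative and does not reproduce the paper's PBHT details.

```python
class PBHTScheduler:
    def __init__(self):
        self.bitmap = 0    # bit p set <=> some task is ready at priority p
        self.queues = {}   # hash table: priority -> FIFO list of tasks

    def add(self, task, priority):
        self.queues.setdefault(priority, []).append(task)
        self.bitmap |= 1 << priority

    def next_task(self):
        if self.bitmap == 0:
            return None
        # Lowest set bit = highest priority; O(1) with the usual bit trick.
        p = (self.bitmap & -self.bitmap).bit_length() - 1
        queue = self.queues[p]
        task = queue.pop(0)
        if not queue:                    # queue drained: clear its bit
            del self.queues[p]
            self.bitmap &= ~(1 << p)
        return task

s = PBHTScheduler()
s.add("log-flush", 5)
s.add("irq-handler", 0)
print(s.next_task(), s.next_task())  # -> irq-handler log-flush
```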


2021 ◽  
Vol 8 (2) ◽  
pp. 1-17
Author(s):  
Oded Green

In this article, we introduce HashGraph, a new scalable approach for building hash tables that uses concepts taken from sparse graph representations; hence the name HashGraph. HashGraph introduces a new way to deal with hash collisions that uses neither open addressing nor separate chaining, yet has the benefits of both approaches. HashGraph currently works for static inputs; recent progress with dynamic graph data structures suggests that it might be extendable to dynamic inputs as well. We show that HashGraph can deal with a large number of hash values per entry without loss of performance. Last, we show a new querying algorithm for value lookups. We experimentally compare HashGraph to several state-of-the-art implementations and find that it outperforms them by 2× on average when the inputs are unique and by as much as 40× when the input contains duplicates. The implementation of HashGraph in this article is for NVIDIA GPUs. HashGraph can build a hash table at a rate of 2.5 billion keys per second on an NVIDIA GV100 GPU and can query at nearly the same rate.
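
A sequential Python sketch of the CSR-style construction that the sparse-graph analogy suggests: histogram the keys by hash bin, prefix-sum the counts into offsets, then scatter the keys, so collisions and duplicates simply become longer bins rather than probe chains. HashGraph does this massively in parallel on a GPU; this sketch only conveys the layout.

```python
def build_hashgraph(keys, num_bins):
    counts = [0] * num_bins
    for k in keys:                        # pass 1: histogram of hash bins
        counts[hash(k) % num_bins] += 1
    offsets = [0] * (num_bins + 1)
    for b in range(num_bins):             # prefix sum -> CSR-style offsets
        offsets[b + 1] = offsets[b] + counts[b]
    slots = [None] * len(keys)
    cursor = offsets[:-1].copy()
    for k in keys:                        # pass 2: scatter keys into bins
        b = hash(k) % num_bins
        slots[cursor[b]] = k
        cursor[b] += 1
    return offsets, slots

def query(key, offsets, slots, num_bins):
    # A bin is the contiguous slice slots[offsets[b]:offsets[b+1]].
    b = hash(key) % num_bins
    return any(slots[i] == key for i in range(offsets[b], offsets[b + 1]))

offsets, slots = build_hashgraph(["a", "b", "a", "c"], num_bins=8)
print(query("a", offsets, slots, 8), query("z", offsets, slots, 8))  # -> True False
```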

