HD-Tree: An Efficient High-Dimensional Virtual Index Structure Using a Half Decomposition Strategy

Algorithms ◽  
2020 ◽  
Vol 13 (12) ◽  
pp. 338
Author(s):  
Ting Huang ◽  
Zhengping Weng ◽  
Gang Liu ◽  
Zhenwen He

To manage multidimensional point data more efficiently, this paper presents the HD-tree, an improvement on an earlier indexing method called the D-tree. Both structures combine quadtree-like partitioning (using integer shift operations and storing only leaves, not internal nodes) with hash tables (for looking up the stored nodes). The HD-tree, however, follows a new decomposition strategy, called the half decomposition strategy. This strategy avoids generating nodes that contain only a small amount of data and avoids sequential search of the hash table, so it saves storage space while delivering faster I/O and better time performance when building the tree and querying data. The results show that the HD-tree outperforms the D-tree in both time and space for uniform and non-uniform data alike, and that its performance is less affected by data distribution.
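
The idea of shift-based partitioning with leaves kept in a hash table can be illustrated with a minimal Python sketch. It is not the HD-tree or D-tree algorithm: it uses a single fixed subdivision level, two dimensions, and no splitting, and it omits the half decomposition strategy entirely. The names `cell_key`, `LeafOnlyGrid`, `max_level`, and `level` are assumptions made for this example.

```python
from collections import defaultdict


def cell_key(x, y, level, max_level=16):
    """Key of the quadtree cell at `level` containing integer point (x, y).

    Coordinates are assumed to lie in [0, 2**max_level). Dropping the low
    bits with a right shift yields the cell indices, so no internal
    nodes are needed to locate a leaf.
    """
    shift = max_level - level
    return (level, x >> shift, y >> shift)


class LeafOnlyGrid:
    """Hypothetical leaf-only structure: a hash table from cell key to points."""

    def __init__(self, level=4, max_level=16):
        self.level = level
        self.max_level = max_level
        self.leaves = defaultdict(list)  # hash table of leaf cells

    def insert(self, x, y):
        key = cell_key(x, y, self.level, self.max_level)
        self.leaves[key].append((x, y))

    def query_cell(self, x, y):
        """Return all stored points that share a cell with (x, y)."""
        key = cell_key(x, y, self.level, self.max_level)
        return self.leaves.get(key, [])


grid = LeafOnlyGrid()
for point in [(0, 0), (1, 1), (40000, 40000)]:
    grid.insert(*point)
print(grid.query_cell(2, 2))
```

Because a leaf is found by computing its key directly, a lookup costs one hash probe rather than a walk from the root.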

2001 ◽  
Author(s):  
Daoguo Dong ◽  
Xiangyang Xue ◽  
Hangzai Luo ◽  
Yingqiang Lin

2016 ◽  
Vol 6 (4) ◽  
pp. 16-29 ◽  
Author(s):  
Arun Kumar Yadav ◽  
Divakar Yadav ◽  
Rajesh Prasad

Web search is one of the fastest-growing fields today. The large amount of information available on the World Wide Web motivates the need for efficient text indexing methods that support fast text retrieval. Two main indexing techniques have been proposed in the past: signature files and inverted files. Signature files require much more space to store the index and are more expensive to construct and update than inverted files. Inverted files have been implemented efficiently using structures such as sorted arrays and B-trees. Sorted arrays are very expensive to update when a new keyword is appended, and the B-tree method breaks down if many words share the same prefix. This paper presents a modified index structure for text retrieval that aims to reduce both the space needed to store the index and the time needed to search documents. The proposed index is designed using the Wavelet Tree (WT), which was originally designed as wavelet transform for images. Experimental results show that as the query length increases, the WT-based index performs better than the others.
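
As a rough illustration of the data structure named in the abstract, the following Python sketch builds a pointer-based wavelet tree over a sequence of integer word identifiers and answers rank queries (how many times a word occurs in a prefix of the text). It is a textbook-style sketch, not the index proposed in the paper; it stores plain prefix-count lists instead of compressed bit vectors, and the word-to-identifier mapping is an assumption made for the example.

```python
class WaveletTree:
    """Minimal wavelet tree over integers in the range [lo, hi]."""

    def __init__(self, seq, lo=None, hi=None):
        if lo is None:
            lo, hi = min(seq), max(seq)
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        if lo == hi or not seq:
            return  # leaf, or empty subtree
        mid = (lo + hi) // 2
        # prefix[i] = how many of the first i symbols go to the left child
        self.prefix = [0]
        for s in seq:
            self.prefix.append(self.prefix[-1] + (s <= mid))
        self.left = WaveletTree([s for s in seq if s <= mid], lo, mid)
        self.right = WaveletTree([s for s in seq if s > mid], mid + 1, hi)

    def rank(self, c, i):
        """Number of occurrences of symbol c among the first i symbols."""
        if i == 0 or c < self.lo or c > self.hi:
            return 0
        if self.lo == self.hi:
            return i  # every symbol that reaches this leaf equals c
        mid = (self.lo + self.hi) // 2
        to_left = self.prefix[i]
        if c <= mid:
            return self.left.rank(c, to_left)
        return self.right.rank(c, i - to_left)


words = "to be or not to be".split()
vocab = {w: k for k, w in enumerate(sorted(set(words)))}  # word -> integer id
tree = WaveletTree([vocab[w] for w in words])
print(tree.rank(vocab["to"], len(words)))
```

Each rank query descends one level per halving of the vocabulary range, so its cost grows with the logarithm of the vocabulary size rather than with the length of the text.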


2021 ◽  
Author(s):  
Shengnan Ke ◽  
Jun Gong ◽  
Songnian Li ◽  
Qing Zhu ◽  
Xintao Liu ◽  
...  

In recent years, indoor and outdoor positioning sensors have grown tremendously in number and continuously produce huge volumes of trajectory data, which are used in many fields such as location-based services and location intelligence. Trajectory data are growing massively and are semantically complex, which poses a great challenge for spatio-temporal data indexing. This paper proposes a spatio-temporal data indexing method, named HBSTR-tree, which is a hybrid index structure comprising a spatio-temporal R-tree, a B*-tree, and a hash table. To improve index generation efficiency, rather than inserting trajectory points directly, we group consecutive trajectory points into nodes according to their spatio-temporal semantics and then insert them into the spatio-temporal R-tree as leaf nodes. The hash table manages the latest leaf nodes to reduce the frequency of insertion. A new spatio-temporal interval criterion and a new node-choosing sub-algorithm are also proposed to optimize the spatio-temporal R-tree structure. In addition, a B*-tree sub-index of leaf nodes is built to query the trajectories of targeted objects efficiently. Furthermore, a database storage scheme based on a NoSQL-type DBMS is proposed for cloud storage. Experimental results show that HBSTR-tree outperforms TB*-tree in aspects such as generation efficiency, query performance, and supported query types.
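
The grouping step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the HBSTR-tree algorithm: the closing rule (a time gap or a point-count limit), the parameters `max_gap` and `max_points`, and the names `Leaf` and `group_points` are hypothetical, and the R-tree and B*-tree themselves are left out. A dictionary plays the role of the hash table that holds the latest open leaf for each object.

```python
from dataclasses import dataclass, field


@dataclass
class Leaf:
    object_id: str
    points: list = field(default_factory=list)  # (t, x, y) tuples


def group_points(stream, max_gap=60.0, max_points=4):
    """Group consecutive points of each object into leaf nodes.

    `stream` yields (object_id, t, x, y) tuples in time order. A leaf is
    closed when the time gap to the next point exceeds `max_gap` or the
    leaf already holds `max_points` points (both rules are assumptions).
    """
    latest = {}  # hash table: object_id -> leaf still being filled
    closed = []  # leaves ready to be inserted into a spatial index
    for obj, t, x, y in stream:
        leaf = latest.get(obj)
        if leaf is not None:
            gap = t - leaf.points[-1][0]
            if gap > max_gap or len(leaf.points) >= max_points:
                closed.append(leaf)
                leaf = None
        if leaf is None:
            leaf = latest[obj] = Leaf(obj)
        leaf.points.append((t, x, y))
    closed.extend(latest.values())  # flush the leaves that are still open
    return closed


stream = [("a", 0, 0, 0), ("a", 10, 1, 1), ("b", 5, 9, 9), ("a", 200, 5, 5)]
leaves = group_points(stream)
print([(leaf.object_id, len(leaf.points)) for leaf in leaves])
```

In this sketch the index would receive one insertion per closed leaf instead of one per point, which is the effect the abstract attributes to grouping.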


2016 ◽  
Vol 25 (03) ◽  
pp. 1650013
Author(s):  
Shuyin Xia ◽  
Guoyin Wang ◽  
Hong Yu ◽  
Qun Liu ◽  
Jin Wang

Outlier detection is a difficult problem because its time complexity is quadratic or cubic in most cases, which makes acceleration algorithms necessary. Because the main acceleration algorithms rely on an index structure (e.g., the R-tree), they deteriorate as dimensionality increases. In this paper, an approach named VBOD (vibration-based outlier detection) is proposed, in which the main variants assess the vibration. Since the basic model and the approximation algorithm FASTVBOD do not need to compute an index structure, their performance is less sensitive to increasing dimensionality than that of traditional approaches. The basic model of this approach has only quadratic time complexity. Furthermore, accelerated algorithms decrease the time complexity to [Formula: see text]. Another advantage is that this approach does not rely on any parameter selection. FASTVBOD was compared with other state-of-the-art algorithms, and it performed much better than the other methods, especially on high-dimensional data.


Author(s):  
George H. Cheng ◽  
Adel Younis ◽  
Kambiz Haji Hajikolaei ◽  
G. Gary Wang

Mode Pursuing Sampling (MPS) was developed as a global optimization algorithm for problems involving expensive black-box functions. MPS has been found to be effective and efficient for problems of low dimensionality, i.e., those with fewer than ten design variables. A previous conference publication integrated the concept of trust regions into the MPS framework to create a new algorithm, TRMPS, which dramatically improved performance and efficiency for high-dimensional problems. However, although TRMPS performed better than MPS, it had not been tested against other established algorithms such as GA. This paper introduces an improved algorithm, TRMPS2, which incorporates guided sampling and a low function value criterion to further improve performance for high-dimensional problems. TRMPS2 is benchmarked against MPS and GA using a suite of test problems. The results show that, on average, TRMPS2 performs better than MPS and GA on high-dimensional, expensive, black-box (HEB) problems.

