An Indexing Method to Construct Unbalanced Layers for High-Dimensional Data in Mobile Environments

2017 · Vol 2017 · pp. 1-13
Author(s):  
Sun-Young Ihm ◽  
Jae-Hee Hur ◽  
Young-Ho Park

Top-k query processing is widely used in many applications and mobile environments. An index is used for efficient query processing, and layer-based indexing methods are representative approaches for performing top-k query processing efficiently. However, the existing methods suffer from high index building times on multidimensional, large data, which makes them difficult to use. In this paper, we propose a new concept for constructing a layer-based index, called the unbalanced layer (UB-Layer). The existing methods construct each layer as a balanced layer from the outermost data, wrapping the rest of the input data. In contrast, UB-Layer constructs each layer as an unbalanced layer that does not wrap the rest of the data. To construct a UB-Layer, we first divide the dimensions of the input data into divided-dimensional data and compute the convex hull of each divided-dimensional dataset. We then combine the divided convex hulls to build the UB-Layer. We also propose the UB-SelectAttribute algorithm, which divides the dimensions by major attributes. We demonstrate the superiority of the proposed methods through performance experiments.
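The construction described in the abstract can be sketched roughly as follows: split the dimensions into low-dimensional groups, compute a convex hull per group, and take the union of hull points as the first layer. This is a minimal illustrative sketch, not the authors' implementation; the helper names are hypothetical, and a 2-D monotone-chain hull is used for simplicity (the paper's dimension-division strategy, UB-SelectAttribute, is not reproduced here).

```python
def convex_hull_2d(points):
    """Andrew's monotone chain; returns hull vertices in order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def ub_first_layer(data, dim_groups):
    """data: list of d-dim tuples; dim_groups: e.g. [(0, 1), (2, 3)].
    Returns indices of tuples whose projection lies on the hull of
    at least one dimension group -- the first unbalanced layer."""
    layer = set()
    for group in dim_groups:
        proj = [tuple(row[i] for i in group) for row in data]
        hull = set(convex_hull_2d(proj))
        layer.update(i for i, p in enumerate(proj) if p in hull)
    return sorted(layer)
```

Because each group's hull covers only its own projection, the combined layer need not enclose the remaining points in the full space, which is what makes the layer "unbalanced."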

Computing · 2021
Author(s):  
Sun-Young Ihm ◽  
So-Hyun Park ◽  
Young-Ho Park

Cloud computing, in which data is distributed, stored, and managed remotely, is drawing attention as data generation and storage volumes increase. In addition, green computing, which increases energy efficiency, is also widely studied. An index is constructed to retrieve huge datasets efficiently, and layer-based indexing methods are widely used for efficient query processing. These methods construct a list of layers, so that only one layer is required for information retrieval instead of the entire dataset. The existing layer-based methods construct the layers using a convex hull algorithm. However, the execution time of this method is very high, especially on large, high-dimensional datasets. Furthermore, if the total number of layers increases, the query processing time also increases, resulting in efficient but slow query processing. In this paper, we propose an unbalanced-hierarchical layer method, which hierarchically divides the dimensions of the input data to increase the total number of layers and reduce the index building time. Through various experiments, we demonstrate that the proposed procedure significantly increases the total number of layers and reduces the index building time compared to existing methods.
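The "list of layers" that the existing convex-hull-based methods build can be sketched as classic convex-layer peeling: take the hull of the remaining points as one layer, remove it, and repeat. This is a minimal 2-D sketch of that baseline (not the proposed hierarchical method); the function names are illustrative.

```python
def convex_hull_2d(points):
    """Andrew's monotone chain; returns hull vertices in order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def peel_layers(points):
    """Convex-layer (onion) peeling: layer i is the hull of what
    remains after removing layers 0..i-1."""
    remaining = list(points)
    layers = []
    while remaining:
        hull = convex_hull_2d(remaining)
        layers.append(hull)
        hull_set = set(hull)
        remaining = [p for p in remaining if p not in hull_set]
    return layers
```

Each peeling step runs a full hull computation over the surviving points, which is why the build time grows quickly on large, high-dimensional inputs; the proposed method attacks exactly this cost by dividing the dimensions hierarchically.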


Author(s):  
N. Puviarasan ◽  
R. Bhavani

In content-based image retrieval (CBIR) applications, the idea of indexing is to map the descriptors extracted from images into a high-dimensional space. In this paper, visual features such as color, texture, and shape are considered. The color features are extracted using the color coherence vector (CCV), and the texture features are obtained from segmentation-based fractal texture analysis (SFTA). The shape features of an image are extracted using Fourier descriptors (FD), a contour-based feature extraction method. The color, texture, and shape features are then combined using appropriate weights, and a quadtree is used to index the images. It is experimentally found that the proposed quadtree-based indexing method gives better performance than other existing indexing methods.
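The indexing step can be illustrated with a minimal point-region quadtree: items are inserted under a 2-D key and retrieved with a range query. This is a sketch under stated assumptions, not the paper's implementation; in particular, reducing the combined color/texture/shape descriptor to a 2-D key is an illustrative simplification, and all names are hypothetical.

```python
class QuadTree:
    """Point-region quadtree: a node stores points until it exceeds
    `capacity`, then splits into four child quadrants."""

    def __init__(self, x0, y0, x1, y1, capacity=4):
        self.bounds = (x0, y0, x1, y1)
        self.capacity = capacity
        self.points = []          # (x, y, payload)
        self.children = None      # four sub-quadrants once split

    def insert(self, x, y, payload=None):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            return False          # point outside this quadrant
        if self.children is None:
            if len(self.points) < self.capacity:
                self.points.append((x, y, payload))
                return True
            self._split()
        return any(c.insert(x, y, payload) for c in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        self.children = [QuadTree(x0, y0, mx, my, self.capacity),
                         QuadTree(mx, y0, x1, my, self.capacity),
                         QuadTree(x0, my, mx, y1, self.capacity),
                         QuadTree(mx, my, x1, y1, self.capacity)]
        for px, py, pl in self.points:   # push stored points down
            any(c.insert(px, py, pl) for c in self.children)
        self.points = []

    def query(self, qx0, qy0, qx1, qy1):
        """Return all (x, y, payload) inside the query rectangle."""
        x0, y0, x1, y1 = self.bounds
        if qx1 < x0 or qx0 > x1 or qy1 < y0 or qy0 > y1:
            return []             # no overlap: prune this subtree
        hits = [(x, y, p) for x, y, p in self.points
                if qx0 <= x <= qx1 and qy0 <= y <= qy1]
        if self.children:
            for c in self.children:
                hits.extend(c.query(qx0, qy0, qx1, qy1))
        return hits
```

The pruning test in `query` is what makes the quadtree attractive for retrieval: subtrees whose bounds miss the query rectangle are skipped entirely.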


Author(s):  
Lixin Fu

In high-dimensional data sets, both the number of dimensions and the cardinalities of the dimensions are large, and the data is often very sparse; that is, most cube cells are empty. For such large data sets, efficiently computing the aggregation of a measure over arbitrary combinations of dimensions is a well-known, challenging problem. However, in real-world applications, users are usually not interested in all the sparse cubes, most of which are empty or contain only one or a few tuples. Instead, they focus more on the "big picture": the highly aggregated data, where the "where clauses" of the SQL queries involve only a few dimensions. Although the input data set is sparse, this aggregate data is dense. The existing multi-pass, full-cube computation algorithms are prohibitively slow for this type of application involving very large input data sets. We propose a new dynamic data structure called the Restricted Sparse Statistics Tree (RSST) and a novel cube evaluation algorithm, which are especially well suited for efficiently computing dense sub-cubes embedded in high-dimensional sparse data sets. RSST only computes the aggregations of non-empty cube cells where the number of non-star coordinates (i.e., the number of group-by attributes) is restricted to be no more than a user-specified threshold. Our algorithms are scalable and I/O-efficient. RSST is incrementally maintainable, which makes it suitable for data warehousing and the analysis of streaming data. We have compared our algorithms with state-of-the-art cube computation algorithms such as Dwarf and QCT in construction times, query response times, and data compression. Experiments demonstrate the excellent performance and good scalability of our approach.
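The restriction RSST applies can be illustrated in a few lines: aggregate only those cube cells whose number of non-star (group-by) coordinates is at most a threshold. A plain dictionary stands in for the tree in this sketch; the function name and row layout are illustrative assumptions, not the paper's data structure.

```python
from itertools import combinations

def restricted_cube(rows, n_dims, max_groupby):
    """rows: iterable of (dims_tuple, measure).
    Returns {(groupby_dims, groupby_values): aggregated_measure} for
    every cell with at most `max_groupby` non-star coordinates."""
    cube = {}
    for dims, measure in rows:
        # enumerate all group-by subsets of size 0..max_groupby
        for k in range(max_groupby + 1):
            for combo in combinations(range(n_dims), k):
                key = (combo, tuple(dims[i] for i in combo))
                cube[key] = cube.get(key, 0) + measure
    return cube
```

Since only cells with few group-by attributes are materialized, the result stays dense and small even when the full cube over a sparse, high-cardinality input would be astronomically large; updates are incremental because each incoming row simply adds into its affected cells.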


2015 · Vol 2015 · pp. 1-13
Author(s):  
Jiuwen Cao ◽  
Zhiping Lin

Extreme learning machine (ELM) has been developed for single-hidden-layer feedforward neural networks (SLFNs). In the ELM algorithm, the connections between the input layer and the hidden neurons are randomly assigned and remain unchanged during the learning process. The output connections are then tuned by minimizing the cost function through a linear system. The computational burden of ELM is thus significantly reduced, as the only cost is solving a linear system. This low computational complexity has attracted a great deal of attention from the research community, especially for high-dimensional and large-data applications. This paper provides an up-to-date survey of recent developments in ELM and its applications to high-dimensional and large data. Comprehensive reviews of image processing, video processing, medical signal processing, and other popular large-data applications with ELM are presented in the paper.
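The training procedure described above fits in a few lines of NumPy: the input-to-hidden weights are drawn at random and never updated, and the output weights are obtained in closed form by solving a linear least-squares problem via the pseudo-inverse. This is a minimal sketch of the standard ELM recipe, with tanh as an assumed activation; it is not any specific implementation surveyed in the paper.

```python
import numpy as np

def elm_train(X, Y, n_hidden, rng=None):
    """X: (n, d) inputs, Y: (n,) or (n, m) targets.
    Returns fixed random weights (W, b) and solved output weights."""
    rng = rng or np.random.default_rng(0)
    d = X.shape[1]
    W = rng.standard_normal((d, n_hidden))   # random, never updated
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                   # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ Y             # least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```

The only expensive step is the pseudo-inverse of the n-by-n_hidden matrix H, which is what gives ELM its low training cost compared to iterative backpropagation.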

