LevelFilesSet: An Efficient Data Structure for Scalable Web Tiled Map Management Systems

2019 ◽  
Vol 2 ◽  
pp. 1-10
Author(s):  
Menelaos Kotsollaris ◽  
William Liu ◽  
Emmanuel Stefanakis ◽  
Yun Zhang

Modern map visualizations are built using data structures for storing tile images, with the main concerns being efficiency and usability. The core functionality of a web tiled map management system is to serve tile images to the end user; several tiles combined compose the web map. To achieve this, several data structures are showcased and analyzed. Specifically, this paper focuses on the SimpleFormat, which stores the tiles directly on the file system; the ImageBlock, which divides each tile folder (a folder where the tile images are stored) into subfolders that contain multiple tiles prior to storing the tiles on the file system; the LevelFilesSet, a data structure that creates dedicated random-access files in which the tile dataset is first stored and then parsed to retrieve the tile images; and, finally, the LevelFilesBlock, a hybrid data structure that combines the ImageBlock and LevelFilesSet data structures. This work marks the first time this hybrid approach has been implemented and applied in a web tiled map context. The JDBC API was used for integrating with the PostgreSQL database, which was then used to conduct cross-testing amongst the data structures. Subsequently, several benchmark tests on local and cloud environments were developed and assessed under different system configurations to compare the data structures and provide a thorough analysis of their efficiency. These benchmarks showcased the efficiency of LevelFilesSet, which retrieved tiles up to 3.3 times faster than the other data structures. Peripheral features and principles of implementing scalable web tiled map management systems across different software architectures and system configurations are analyzed and discussed.
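As a rough illustration of the layouts compared above, the sketch below contrasts SimpleFormat and ImageBlock path schemes with a LevelFilesSet-style random-access file. The (zoom, x, y) addressing, block size, and header layout are illustrative assumptions, not the paper's actual implementation.

```python
import io
import os
import struct

def simpleformat_path(root, zoom, x, y):
    # SimpleFormat: one image file per tile, stored directly on the file system.
    return os.path.join(root, str(zoom), str(x), f"{y}.png")

def imageblock_path(root, zoom, x, y, block=128):
    # ImageBlock: tiles grouped into block-by-block sub-folders, which keeps
    # per-directory file counts bounded.
    return os.path.join(root, str(zoom), f"{x // block}_{y // block}", f"{x}_{y}.png")

class LevelFile:
    """LevelFilesSet-style random-access file: a header maps (x, y) to an
    (offset, length) pair, so a tile is served with one seek and one read
    instead of a per-tile file-system lookup."""

    ENTRY = struct.Struct("<IIQI")  # x, y, offset, length

    def __init__(self, tiles):  # tiles: {(x, y): image bytes}
        self._file = io.BytesIO()
        self._file.write(struct.pack("<I", len(tiles)))
        offset = 4 + self.ENTRY.size * len(tiles)  # payload starts after header
        for (x, y), data in sorted(tiles.items()):
            self._file.write(self.ENTRY.pack(x, y, offset, len(data)))
            offset += len(data)
        for _, data in sorted(tiles.items()):
            self._file.write(data)

    def read_tile(self, x, y):
        # Scan the header for the entry, then read the payload bytes.
        self._file.seek(0)
        (count,) = struct.unpack("<I", self._file.read(4))
        for _ in range(count):
            tx, ty, off, length = self.ENTRY.unpack(self._file.read(self.ENTRY.size))
            if (tx, ty) == (x, y):
                self._file.seek(off)
                return self._file.read(length)
        return None

level = LevelFile({(0, 0): b"PNG0", (1, 0): b"PNG1"})
assert level.read_tile(1, 0) == b"PNG1"
```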

2013 ◽  
Vol 756-759 ◽  
pp. 1387-1391
Author(s):  
Xiao Dong Wang ◽  
Jun Tian

Building an efficient data structure for range selection problems is considered. While there are several theoretical solutions to the problem, only a few have been tried in practice, and there is little indication of how the others would perform. The computation model used in this paper is the RAM model with word size w. Our data structure is a practical linear-space structure that supports range selection queries efficiently after a lightweight preprocessing phase.
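For concreteness, here is a brute-force reference for the query such a structure accelerates (a sketch for checking correctness on small inputs, not the paper's method):

```python
import random

def range_select(a, i, j, k):
    """k-th smallest (1-indexed) element of a[i..j], inclusive: the
    O((j - i) log(j - i)) baseline that a precomputed linear-space
    structure answers without sorting the raw subarray per query."""
    return sorted(a[i:j + 1])[k - 1]

a = [random.randrange(100) for _ in range(32)]
assert range_select(a, 5, 20, 1) == min(a[5:21])    # k = 1 is the minimum
assert range_select(a, 5, 20, 16) == max(a[5:21])   # k = range length is the maximum
```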


2016 ◽  
Vol 8 (6) ◽  
Author(s):  
Yuanxi Sun ◽  
Wenjie Ge ◽  
Jia Zheng ◽  
Dianbiao Dong

This paper presents a systematic solution of the kinematics of planar mechanisms from the perspective of Assur groups. When a planar mechanism is decomposed into Assur groups, the order in which the groups must be solved is not known in advance. To solve this problem, first, the decomposed Assur groups are classified into three types according to their calculability, which lays the foundation for an automatic solving algorithm for decomposed Assur groups. Second, a data structure for the Assur group is presented, which enables the automatic solving algorithm to work with the input and output parameters of each Assur group. All decomposed Assur groups are stored in a component stack, and their parameters are stored in parameter stacks. The automatic algorithm detects the identification flags of each Assur group in the component stack and their corresponding parameters in the parameter stacks to decide which Assur group is calculable and which one can be solved afterward. The proposed systematic solution generates an automatic solving order for all Assur groups in the planar mechanism and allows the adding, modifying, and removing of Assur groups at any time. Two planar mechanisms are given as examples to show the detailed process of the proposed systematic solution.
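A minimal sketch of that detection loop, assuming each group is reduced to an identification flag plus input/output parameter sets; the group names and the two-group example are hypothetical, not the paper's data:

```python
from dataclasses import dataclass

@dataclass
class AssurGroup:
    flag: str        # identification flag, e.g. a group type plus an index
    needs: set       # input parameters this group requires
    outputs: set     # parameters it contributes once solved

def solving_order(groups, known):
    """Repeatedly scan the component stack and solve any group whose
    inputs are all present in the parameter stacks."""
    order, pending = [], list(groups)
    while pending:
        ready = next((g for g in pending if g.needs <= known), None)
        if ready is None:
            raise ValueError("no calculable group: inputs under-specified")
        known |= ready.outputs        # its outputs become available inputs
        order.append(ready.flag)
        pending.remove(ready)
    return order

groups = [AssurGroup("RRP-2", {"C", "D"}, {"E"}),
          AssurGroup("RRR-1", {"A", "B"}, {"C"})]
print(solving_order(groups, known={"A", "B", "D"}))  # ['RRR-1', 'RRP-2']
```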


Algorithms ◽  
2020 ◽  
Vol 13 (11) ◽  
pp. 276
Author(s):  
Paniz Abedin ◽  
Arnab Ganguly ◽  
Solon P. Pissis ◽  
Sharma V. Thankachan

Let T[1,n] be a string of length n and T[i,j] be the substring of T starting at position i and ending at position j. A substring T[i,j] of T is a repeat if it occurs more than once in T; otherwise, it is a unique substring of T. Repeats and unique substrings are of great interest in computational biology and information retrieval. Given string T as input, the Shortest Unique Substring problem is to find a shortest substring of T that does not occur elsewhere in T. In this paper, we introduce the range variant of this problem, which we call the Range Shortest Unique Substring problem. The task is to construct a data structure over T answering the following type of online queries efficiently. Given a range [α,β], return a shortest substring T[i,j] of T with exactly one occurrence in [α,β]. We present an O(n log n)-word data structure with O(log_w n) query time, where w = Ω(log n) is the word size. Our construction is based on a non-trivial reduction allowing us to apply a recently introduced optimal geometric data structure [Chan et al., ICALP 2018]. Additionally, we present an O(n)-word data structure with O(√n log^ε n) query time, where ε > 0 is an arbitrarily small constant. The latter data structure relies heavily on another geometric data structure [Nekrich and Navarro, SWAT 2012].
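A brute-force reference implementation of the query, useful for checking a fast structure on small inputs (0-indexed here; cubic in the range length, whereas the structures above answer it in polylogarithmic or sublinear time):

```python
def occurrences(text, s):
    # Count overlapping occurrences of s in text.
    count, start = 0, text.find(s)
    while start != -1:
        count += 1
        start = text.find(s, start + 1)
    return count

def range_sus(T, alpha, beta):
    """Shortest substring of T[alpha..beta] (0-indexed, inclusive) occurring
    exactly once inside that range, found by exhaustive search."""
    window = T[alpha:beta + 1]
    for length in range(1, len(window) + 1):       # shortest first
        for i in range(len(window) - length + 1):
            if occurrences(window, window[i:i + length]) == 1:
                return alpha + i, alpha + i + length - 1
    return None

print(range_sus("abaababa", 0, 4))  # (1, 2): "ba" occurs once in "abaab"
```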


2019 ◽  
pp. 58-66
Author(s):  
Máté Nagy ◽  
János Tapolcai ◽  
Gábor Rétvári

Opportunistic data structures are used extensively in big data practice to break down the massive storage space requirements of processing large volumes of information. A data structure is called (singly) opportunistic if it takes advantage of the redundancy in the input in order to store it in information-theoretically minimum space. Yet, efficient data processing requires a separate index alongside the data, whose size often substantially exceeds that of the compressed information. In this paper, we introduce doubly opportunistic data structures that attain the best possible compression not only on the input data but also on the index. We present R3D3, which encodes a bitvector of length n and Shannon entropy H0 to nH0 bits and the accompanying index to nH0(1/2 + O(log C/C)) bits, thus attaining provably minimum space (up to small error terms) on both the data and the index, and supports a rich set of queries to arbitrary positions in the compressed bitvector in O(C) time when C = o(log n). Our R3D3 prototype attains a several-fold space reduction beyond known compression techniques on a wide range of synthetic and real data sets, while supporting operations on the compressed data at comparable speed.
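A toy version of the underlying block scheme may help: the bitvector is cut into blocks of size C, and a small index of running popcounts supports rank in O(C) time. Here the blocks are stored verbatim rather than entropy-coded, so the sketch illustrates only the query mechanics, not R3D3's compression of the data or the index:

```python
class ToyBlockBitvector:
    def __init__(self, bits, C=8):
        self.C = C
        # In R3D3/RRR the blocks would be entropy-coded; stored raw here.
        self.blocks = [bits[i:i + C] for i in range(0, len(bits), C)]
        # Index: running popcount before each block (the part whose size
        # R3D3 also drives down to compressed space).
        self.prefix = [0]
        for b in self.blocks:
            self.prefix.append(self.prefix[-1] + sum(b))

    def rank1(self, i):
        """Number of 1s in bits[0..i] inclusive, answered in O(C) time:
        one index lookup plus a scan inside a single block."""
        q, r = divmod(i, self.C)
        return self.prefix[q] + sum(self.blocks[q][:r + 1])

bv = ToyBlockBitvector([1, 0, 1, 1, 0, 0, 1, 0, 1, 1], C=4)
assert bv.rank1(8) == 5
```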


Algorithmica ◽  
2020 ◽  
Vol 82 (12) ◽  
pp. 3707-3743
Author(s):  
Amihood Amir ◽  
Panagiotis Charalampopoulos ◽  
Solon P. Pissis ◽  
Jakub Radoszewski

Given two strings S and T, each of length at most n, the longest common substring (LCS) problem is to find a longest substring common to S and T. This is a classical problem in computer science with an O(n)-time solution. In the fully dynamic setting, edit operations are allowed in either of the two strings, and the problem is to find an LCS after each edit. We present the first solution to the fully dynamic LCS problem requiring sublinear time in n per edit operation. In particular, we show how to find an LCS after each edit operation in Õ(n^{2/3}) time, after Õ(n)-time and space preprocessing. This line of research was recently initiated in a somewhat restricted dynamic variant by Amir et al. [SPIRE 2017]. More specifically, the authors presented an Õ(n)-sized data structure that returns an LCS of the two strings after a single edit operation (that is reverted afterwards) in Õ(1) time. At CPM 2018, three papers (Abedin et al., Funakoshi et al., and Urabe et al.) studied analogously restricted dynamic variants of problems on strings; specifically, computing the longest palindrome and the Lyndon factorization of a string after a single edit operation. We develop dynamic sublinear-time algorithms for both of these problems as well. We also consider internal LCS queries, that is, queries in which we are to return an LCS of a pair of substrings of S and T. We show that answering such queries is hard in general and propose efficient data structures for several restricted cases.
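As a baseline for the dynamic setting, the sketch below recomputes an LCS from scratch after an edit with the classical O(|S||T|) dynamic program; this naive per-edit recomputation is what the Õ(n^{2/3}) structure improves on (a linear-time static solution would use a suffix tree instead):

```python
def lcs(S, T):
    """Longest common substring by dynamic programming: cur[j] is the
    length of the longest common suffix of S[:i] and T[:j]."""
    best, best_end = 0, 0
    prev = [0] * (len(T) + 1)
    for i in range(1, len(S) + 1):
        cur = [0] * (len(T) + 1)
        for j in range(1, len(T) + 1):
            if S[i - 1] == T[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best:
                    best, best_end = cur[j], i
        prev = cur
    return S[best_end - best:best_end]

S, T = list("baabab"), list("ababaa")
S[0] = "a"  # one edit operation in the fully dynamic setting
print("".join(lcs(S, T)))  # "abab"
```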


Author(s):  
L. Andrade ◽  
T. Taylor

High volume products in manufacturing require fast yield learning, root cause identification, and verification that process or tool problems are fixed. Yield losses of 1% correspond to very large dollar losses. Therefore, it is important to have sophisticated data analysis tools that handle large volumes of data to drive higher yields. This paper presents our methodology for defining yields, assessing wafer yield signatures, and using data analysis tools to determine which tools or processes drive yield loss. A SAS-based data analysis tool is shown which can identify tool- or process-related problems causing abnormalities in parametrics and impacting yield. Case studies illustrating the usefulness of the tool are shown for a Synchronous Dynamic Random Access Memory (SDRAM) product from our wafer fab. In the final analysis, it is clear that an efficient data analysis approach utilizes resources most effectively and pinpoints yield problems with minimal cycle time.
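A toy sketch of the commonality-analysis idea (not the SAS tool itself): group wafer yields by the tool they passed through and flag tools whose mean lags the overall mean. The data and the flag threshold are invented for illustration:

```python
from collections import defaultdict
from statistics import mean

wafers = [  # (tool_id, yield_fraction) -- illustrative data
    ("etch_A", 0.97), ("etch_A", 0.96), ("etch_B", 0.90),
    ("etch_B", 0.89), ("etch_C", 0.965), ("etch_C", 0.97),
]

by_tool = defaultdict(list)
for tool, y in wafers:
    by_tool[tool].append(y)

overall = mean(y for _, y in wafers)
for tool, ys in sorted(by_tool.items()):
    delta = mean(ys) - overall
    flag = "  <-- investigate" if delta < -0.01 else ""  # hypothetical threshold
    print(f"{tool}: mean yield {mean(ys):.3f} ({delta:+.3f} vs overall){flag}")
```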


2001 ◽  
Vol 11 (5) ◽  
pp. 525-556 ◽  
Author(s):  
GRAEME E. MOSS ◽  
COLIN RUNCIMAN

Every designer of a new data structure wants to know how well it performs in comparison with others. But finding, coding and testing applications as benchmarks can be tedious and time-consuming. Besides, how a benchmark uses a data structure may considerably affect its apparent efficiency, so the choice of applications may bias the results. We address these problems by developing a tool for inductive benchmarking. This tool, Auburn, can generate benchmarks across a wide distribution of uses. We precisely define ‘the use of a data structure’, upon which we build the core algorithms of Auburn: how to generate a benchmark from a description of use, and how to extract a description of use from an application. We then apply inductive classification techniques to obtain decision trees for the choice between competing data structures. We test Auburn by benchmarking several implementations of three common data structures: queues, random-access lists and heaps. These and other results show Auburn to be a useful and accurate tool, but they also reveal some limitations of the approach.
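In Auburn's spirit, a "description of use" can be reduced to operation frequencies from which benchmarks are generated; the sketch below (an illustration of the idea, not Auburn's algorithm) times two queue implementations under the same randomized usage profile, showing how the usage mix drives apparent efficiency:

```python
import random
import time
from collections import deque

def run_benchmark(make, pop, ops=200_000, p_push=0.6, seed=42):
    # "Description of use" reduced to a mix: with probability p_push
    # append an element, otherwise dequeue from the front.
    rng = random.Random(seed)
    q = make()
    t0 = time.perf_counter()
    for _ in range(ops):
        if rng.random() < p_push or not q:
            q.append(rng.random())
        else:
            pop(q)
    return time.perf_counter() - t0

for name, make, pop in [("list-as-queue", list, lambda q: q.pop(0)),
                        ("deque", deque, lambda q: q.popleft())]:
    print(f"{name}: {run_benchmark(make, pop):.3f}s")
```

With a pop-heavy profile the O(n) front-removal of the list dominates; a push-heavy profile hides it, which is precisely how the choice of benchmark can bias the apparent winner.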


1991 ◽  
Vol 01 (03) ◽  
pp. 207-226 ◽  
Author(s):  
SESHAGIRI RAO ALA

In this paper we propose a universal data structure (UDS) that aids in the design of optimal boundary data structures. We then show, with the aid of some recently published data structures, that any data structure can be expressed as a special case of the UDS. We demonstrate how applying the optimality concepts of the UDS can lead to the discovery of data structures more efficient than popular ones. We also discuss two approaches to optimization, and we show that a globally optimal data structure is better than a special-purpose optimal data structure.


2021 ◽  
Vol 25 (2) ◽  
pp. 283-303
Author(s):  
Na Liu ◽  
Fei Xie ◽  
Xindong Wu

Approximate multi-pattern matching is an important and frequently encountered problem when the patterns contain variable-length wildcards. In this paper, two suffix-array-based algorithms are proposed to solve this problem. The suffix array is an efficient data structure for exact string matching in existing studies, as well as for approximate pattern matching and multi-pattern matching. An algorithm called MMSA-S handles the short exact characters in a pattern by dynamic programming, while another algorithm called MMSA-L deals with the long exact characters by an edit distance method. Experimental results on the Pizza & Chili corpus demonstrate that these two newly proposed algorithms are, in most cases, more time-efficient than the state-of-the-art comparison algorithms.
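Both algorithms build on standard suffix-array machinery; the sketch below shows only that shared foundation, exact matching by binary search over sorted suffixes, not the MMSA-S/MMSA-L wildcard handling:

```python
def suffix_array(text):
    # Simple O(n^2 log n) construction by sorting suffixes; fine for a sketch.
    return sorted(range(len(text)), key=lambda i: text[i:])

def find_all(text, sa, p):
    """All start positions of exact occurrences of p, via two binary
    searches for the suffix-array interval whose suffixes start with p."""
    lo, hi = 0, len(sa)
    while lo < hi:  # leftmost suffix whose length-|p| prefix is >= p
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + len(p)] < p:
            lo = mid + 1
        else:
            hi = mid
    start, hi = lo, len(sa)
    while lo < hi:  # leftmost suffix whose length-|p| prefix is > p
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + len(p)] <= p:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])

text = "abracadabra"
print(find_all(text, suffix_array(text), "abra"))  # [0, 7]
```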

