Designing Data Marts from XML and Relational Data Sources


2011 ◽  
pp. 427-451
Author(s):  
Yasser Hachaichi ◽  
Jamel Feki ◽  
Hanene Ben-Abdallah

Due to international economic competition, enterprises are constantly looking for efficient methods to build data marts/warehouses to analyze the large data volumes involved in their decision-making processes. At the same time, even though the relational model remains the most commonly used data model, any data mart/warehouse construction method must now deal with other data types, in particular XML documents, which represent the dominant type of data exchanged between partners and retrieved from the Web. This chapter presents a data mart design method that starts from both a relational database source and XML documents compliant with a given DTD. Besides handling these two types of data structures, the originality of the method lies in its decision-maker-centered approach, its automatic extraction of loadable data mart schemas, and its genericity.
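As an illustration of the kind of schema extraction the chapter automates, the sketch below shows one common heuristic for spotting candidate fact and dimension tables from foreign-key structure in a relational source; the table names, type list, and the heuristic itself are assumptions for illustration, not the chapter's actual method.

```python
# Illustrative sketch only: a common heuristic for spotting candidate facts and
# dimensions in a relational source, NOT the chapter's actual extraction algorithm.
from dataclasses import dataclass, field

@dataclass
class Table:
    name: str
    columns: dict[str, str]          # column name -> SQL type
    foreign_keys: dict[str, str] = field(default_factory=dict)  # column -> referenced table

def candidate_facts(tables: list[Table]) -> list[tuple[str, list[str], list[str]]]:
    """Return (fact table, measures, dimension tables) candidates.

    Heuristic: a table with several foreign keys and numeric non-key columns
    is a likely fact; the referenced tables are its candidate dimensions.
    """
    numeric = {"INT", "INTEGER", "DECIMAL", "FLOAT", "REAL", "NUMERIC"}
    results = []
    for t in tables:
        measures = [c for c, typ in t.columns.items()
                    if typ.upper() in numeric and c not in t.foreign_keys]
        if len(t.foreign_keys) >= 2 and measures:
            results.append((t.name, measures, sorted(set(t.foreign_keys.values()))))
    return results

# Hypothetical example source schema
orders = Table("ORDER_LINE",
               {"qty": "INT", "amount": "DECIMAL", "product_id": "INT", "store_id": "INT"},
               {"product_id": "PRODUCT", "store_id": "STORE"})
print(candidate_facts([orders]))
# [('ORDER_LINE', ['qty', 'amount'], ['PRODUCT', 'STORE'])]
```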


2021 ◽  
Vol 17 (2) ◽  
pp. 1-45
Author(s):  
Cheng Pan ◽  
Xiaolin Wang ◽  
Yingwei Luo ◽  
Zhenlin Wang

Due to the large data volumes and low-latency requirements of modern web services, an in-memory key-value (KV) cache such as Redis or Memcached is often an inevitable choice. The in-memory cache holds hot data, reduces request latency, and alleviates the load on background databases. Inheriting from traditional hardware cache design, many existing KV cache systems still use recency-based replacement algorithms, e.g., least recently used (LRU) or its approximations. However, the diversity of miss penalties distinguishes a KV cache from a hardware cache: inadequate consideration of penalty can substantially compromise space utilization and request service time. KV accesses also demonstrate locality, which needs to be coordinated with miss penalty to guide cache management. In this article, we first discuss how to enhance an existing cache model, the Average Eviction Time (AET) model, so that it can model a KV cache. We then apply the model to Redis and propose pRedis (Penalty- and Locality-aware Memory Allocation in Redis), which synthesizes data locality and miss penalty, in a quantitative manner, to guide memory allocation and replacement in Redis. We also explore the diurnal behavior of a KV store and exploit long-term reuse, replacing the original passive eviction mechanism with an automatic dump/load mechanism to smooth the transition between access peaks and valleys. Our evaluation shows that pRedis effectively reduces average and tail access latency with minimal time and space overhead. For both real-world and synthetic workloads, our approach delivers an average of 14.0%∼52.3% latency reduction over a state-of-the-art penalty-aware cache management scheme, Hyperbolic Caching (HC), and offers more quantitative predictability of performance. Moreover, average latency can be lowered by a further 1.1%∼5.5% by dynamically switching policies between pRedis and HC.
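A minimal sketch of how miss penalty and locality might be combined into a single eviction score, in the spirit of penalty-aware caching; the Entry fields and the scoring formula below are illustrative assumptions, not the actual pRedis policy or the AET model it builds on.

```python
# Illustrative sketch: a penalty- and locality-aware eviction score in the spirit
# of the paper, NOT the actual pRedis policy or the AET model it builds on.
import time
from dataclasses import dataclass

@dataclass
class Entry:
    value: bytes
    hits: int            # access frequency (locality signal)
    last_access: float   # timestamp of the most recent access
    miss_penalty: float  # estimated cost (e.g., backend latency in ms) to refetch
    size: int            # bytes occupied in the cache

def eviction_score(e: Entry, now: float) -> float:
    """Lower score = better eviction victim.

    Combines recency/frequency (locality) with miss penalty and space cost:
    keeping an item is worth roughly 'expected hits * penalty avoided per byte'.
    """
    age = max(now - e.last_access, 1e-6)
    return (e.hits / age) * e.miss_penalty / e.size

def pick_victim(cache: dict[str, Entry]) -> str:
    """Return the key whose eviction is expected to cost the least."""
    now = time.time()
    return min(cache, key=lambda k: eviction_score(cache[k], now))
```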


2018 ◽  
Vol 4 (12) ◽  
pp. 142 ◽  
Author(s):  
Hongda Shen ◽  
Zhuocheng Jiang ◽  
W. Pan

Hyperspectral imaging (HSI) technology has been used for various remote sensing applications due to its excellent capability for monitoring regions of interest over a period of time. However, the large data volume of four-dimensional multitemporal hyperspectral imagery demands effective compression techniques. While conventional 3D hyperspectral data compression methods exploit only spatial and spectral correlations, we propose a simple yet effective predictive lossless compression algorithm that achieves significant gains in compression efficiency by also taking into account the temporal correlations inherent in multitemporal data. We present an information-theoretic analysis to estimate the potential compression performance gain with varying configurations of context vectors. Extensive simulation results demonstrate the effectiveness of the proposed algorithm. We also provide in-depth discussions of how to construct the context vectors in the prediction model for both multitemporal HSI and conventional 3D HSI data.
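To make the prediction idea concrete, a minimal sketch follows: each sample is predicted from spatial, spectral, and temporal neighbours, and only the residual would be entropy-coded. The context layout and the simple averaging predictor are assumptions for illustration, not the paper's prediction model or coder.

```python
# Illustrative sketch: predict each sample from spatial, spectral and temporal
# neighbours and keep only the residual; the averaging predictor here is an
# assumption, not the paper's prediction model or entropy coder.
import numpy as np

def residuals(cube: np.ndarray) -> np.ndarray:
    """cube has shape (time, band, row, col) and integer samples.

    Prediction context for each sample: the co-located sample in the previous
    acquisition, the previous band, and the upper/left spatial neighbours.
    np.roll wraps at the borders; a real codec would use a causal context and
    handle boundaries explicitly before entropy-coding the residuals.
    """
    c = cube.astype(np.int64)
    ctx = np.stack([
        np.roll(c, 1, axis=0),   # temporal neighbour (previous acquisition)
        np.roll(c, 1, axis=1),   # spectral neighbour (previous band)
        np.roll(c, 1, axis=2),   # spatial neighbour above
        np.roll(c, 1, axis=3),   # spatial neighbour to the left
    ])
    pred = np.rint(ctx.mean(axis=0)).astype(np.int64)
    return c - pred              # residuals are what an entropy coder would store
```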


2021 ◽  
Author(s):  
Rens Hofman ◽  
Joern Kummerow ◽  
Simone Cesca ◽  
Joachim Wassermann ◽  
Thomas Plenefisch ◽  
...  

The AlpArray seismological experiment is an international and interdisciplinary project to advance our understanding of geophysical processes in the greater Alpine region. The heart of the project is a large seismological array that covers the mountain range and its surrounding areas. To understand how the Alps and their neighbouring mountain belts evolved through time, we can only study their current structure and active processes. The Eastern Alps are of prime interest since they currently exhibit the highest crustal deformation rates. A key question is how these surface processes are linked to deeper structures. The Swath-D network is an array of temporary seismological stations, complementary to the AlpArray network, located in the Eastern Alps. This creates a unique opportunity to investigate high-resolution seismicity at a local scale.

In this study, a combination of waveform-based detection methods was used to find small earthquakes in the large data volume of the Swath-D network. Methods were developed to locate the seismic events using semi-automatic picks and to estimate event magnitudes. We present an overview of the methods and workflow, as well as a preliminary overview of the seismicity in the Eastern Alps.
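The abstract does not specify which detectors were used; as an illustration only, the sketch below shows one common waveform-based approach, matched filtering by normalized cross-correlation of a template event against continuous data. The function names and threshold are assumptions.

```python
# Illustrative sketch only: matched-filter (template cross-correlation) detection,
# one common waveform-based approach; not necessarily the detectors used in the study.
import numpy as np

def normalized_xcorr(trace: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Sliding normalized cross-correlation of a template against a continuous trace."""
    n = len(template)
    t = template - template.mean()
    t_norm = np.sqrt(np.sum(t ** 2))
    cc = np.empty(len(trace) - n + 1)
    for i in range(len(cc)):
        w = trace[i:i + n]
        w = w - w.mean()
        denom = np.sqrt(np.sum(w ** 2)) * t_norm
        cc[i] = np.dot(w, t) / denom if denom > 0 else 0.0
    return cc

def detect(trace: np.ndarray, template: np.ndarray, threshold: float = 0.7) -> np.ndarray:
    """Return sample indices where the correlation exceeds the detection threshold."""
    cc = normalized_xcorr(trace, template)
    return np.flatnonzero(cc >= threshold)
```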

