Spatial Data in Multidimensional Conceptual Models

Author(s):  
Elzbieta Malinowski

Data warehouses (DWs) are used for storing and analyzing high volumes of historical data. The structure of a DW is usually represented as a star schema consisting of fact and dimension tables. A fact table contains numeric data called measures (e.g., quantity). Dimensions are used for exploring measures from different analysis perspectives (e.g., according to products). They usually contain hierarchies, which online analytical processing (OLAP) systems require in order to dynamically manipulate DW data. While traversing a hierarchy, two operations can be executed: the roll-up operation, which transforms detailed measures into aggregated data (e.g., daily into monthly sales), and the drill-down operation, which does the opposite.
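
The roll-up and drill-down operations described above can be illustrated with a minimal Python sketch. The sample rows and the function names `roll_up_to_month` and `drill_down` are hypothetical and chosen only for illustration; a real OLAP system would run these aggregations over the fact and dimension tables of the warehouse.

```python
from collections import defaultdict
from datetime import date

# Hypothetical fact rows: (sale date, product, quantity sold).
daily_sales = [
    (date(2024, 1, 5), "pen", 10),
    (date(2024, 1, 20), "pen", 5),
    (date(2024, 2, 3), "pen", 7),
    (date(2024, 2, 3), "ink", 2),
]


def roll_up_to_month(rows):
    """Roll-up: aggregate the daily measure to the month level of the time hierarchy."""
    totals = defaultdict(int)
    for day, product, quantity in rows:
        totals[(day.year, day.month, product)] += quantity
    return dict(totals)


def drill_down(rows, year, month, product):
    """Drill-down: return the daily rows behind one monthly aggregate."""
    return [
        (day, quantity)
        for day, prod, quantity in rows
        if day.year == year and day.month == month and prod == product
    ]


monthly = roll_up_to_month(daily_sales)
january_pens = drill_down(daily_sales, 2024, 1, "pen")
```

Here `monthly` holds one total per (year, month, product), and `january_pens` lists the two daily rows that make up the January total for pens.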

Author(s):  
Elzbieta Malinowski ◽  
Esteban Zimányi

Data warehouses keep large amounts of historical data in order to help users at different management levels make more effective decisions. Conventional data warehouses are designed based on a multidimensional view of data. They are usually represented as star or snowflake schemas that contain relational tables called fact and dimension tables. A fact table expresses the focus of analysis (e.g., analysis of sales) and contains numeric data called measures (e.g., quantity). Measures can be analyzed according to different analysis criteria or dimensions (e.g., by product). Dimensions include attributes that can form hierarchies (e.g., product-category). Data in a data warehouse can be dynamically manipulated using online analytical processing (OLAP) systems. In particular, these systems allow automatic measure aggregations while traversing hierarchies. For example, the roll-up operation transforms detailed measures into aggregated data (e.g., daily into monthly sales) while the drill-down operation does the opposite. Data warehouses typically include a location dimension, e.g., store or client address. This dimension is usually represented in an alphanumeric format. However, the advantages of using spatial data in the analysis process are well known, since visualizing data in space allows users to reveal patterns that are difficult to discover otherwise. Spatial databases have been used for several decades for storing and managing spatial data. This kind of data typically represents geographical objects, i.e., objects located on the Earth’s surface (such as mountains, cities) or geographic phenomena (such as temperature, altitude). Due to technological advances, the amount of available spatial data is growing considerably, e.g., satellite images and location data from remote sensing systems, such as Global Positioning Systems (GPS).
Spatial databases are typically used for daily business manipulations, e.g., to find a specific place from the current position given by a GPS. However, spatial databases are not well suited for supporting the decision-making process (Bédard, Rivest, & Proulx, 2007), e.g., to find the best location for a new store. Therefore, the field of spatial data warehouses emerged as a response to the necessity of analyzing high volumes of spatial data. Since applications including spatial data are usually complex, they should be modeled at a conceptual level taking into account users’ requirements and leaving out complex implementation details. The advantages of using conceptual models for database design are well known. In conventional data warehouses, a multidimensional model is commonly used for expressing users’ requirements and for facilitating the subsequent implementation; however, in spatial data warehouses this model is seldom used. Further, existing conceptual models for spatial databases are not adequate for multidimensional modeling since they do not include the concepts of dimensions, hierarchies, and measures.


Author(s):  
Elzbieta Malinowski

In database design, the advantages of using conceptual models for representing users’ requirements are well known. Nevertheless, even though data warehouses (DWs) are databases that store historical data for analytical purposes, they are usually represented at the logical level using star and snowflake schemas. These schemas facilitate delivery of data for online analytical processing (OLAP) systems. In particular, hierarchies are important since, by traversing them, OLAP tools perform automatic aggregations of data using the roll-up and drill-down operations. The former operation transforms detailed data into aggregated data (e.g., daily into monthly sales) while the latter does the opposite.


2017 ◽  
Vol 10 (04) ◽  
pp. 745-754
Author(s):  
Mudasir M Kirmani

Data Warehouse design requires a radical rebuilding of tremendous measures of information, frequently of questionable or conflicting quality, drawn from various heterogeneous sources. Data Warehouse configuration assimilates business learning and innovation know-how. The outline of the Data Warehouse requires a profound comprehension of the business forms in detail. The principal aim of this research paper is to study and investigate the transformation model for converting E-R diagrams to a Star Schema when developing Data Warehouses. Dimensional modelling is a logical design technique used for data warehouses. This research paper addresses various potential differences between the two techniques and highlights both the advantages and the disadvantages of using dimensional modelling. Dimensional Modelling is one of the popular techniques for databases that are designed with end-user queries in a data warehouse in mind. In this paper the focus has been on the Star Schema, which basically comprises a Fact table and Dimension tables. Each fact table further comprises foreign keys of various dimensions, measures, and degenerate dimensions, if any. We also discuss the possibilities of deployment and acceptance of the Conversion Model (CM) to provide the details of the fact table and dimension tables according to local needs. The paper also highlights why dimensional modelling is preferred over E-R modelling when creating a data warehouse.
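
The star schema layout described above (a fact table holding foreign keys to dimension tables, measures, and an optional degenerate dimension) can be sketched with Python's built-in `sqlite3` module. The table names, column names, and sample rows below are assumptions made for illustration; they do not come from the paper or from its Conversion Model.

```python
import sqlite3

# In-memory database holding a tiny, hypothetical star schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),  -- foreign key to a dimension
    store_id   INTEGER REFERENCES dim_store(store_id),      -- foreign key to a dimension
    invoice_no TEXT,                                         -- degenerate dimension
    quantity   INTEGER                                       -- measure
);
""")

conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "pen", "stationery"), (2, "ink", "stationery"), (3, "lamp", "lighting")],
)
conn.executemany(
    "INSERT INTO dim_store VALUES (?, ?)",
    [(1, "Brussels"), (2, "San Jose")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(1, 1, "INV-1", 10), (2, 1, "INV-1", 2), (3, 2, "INV-2", 1), (1, 2, "INV-3", 4)],
)

# Analyze the measure along the product dimension, aggregated by category.
quantity_by_category = conn.execute("""
    SELECT p.category, SUM(f.quantity)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```

The invoice number lives directly in the fact table because it has no descriptive attributes of its own, which is what makes it a degenerate dimension in this sketch.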


2018 ◽  
Vol 11 (2) ◽  
pp. 205979911878774 ◽  
Author(s):  
Mark Finnane ◽  
Andy Kaladelfos ◽  
Alana Piper

Historical data pose a variety of problems to those who seek statistically based understandings of the past. Quantitative historical analysis has been limited by researchers’ reliance on rigid statistics collected by individuals or agencies, or else by researcher access to small samples of raw data. Even digital technologies by themselves have not been enough to overcome the challenges of working with manuscript sources and aligning disaggregated data. However, by coupling the facilities enabled by the web with the enthusiasm of the public for explorations of the past, history has started to make the same strides towards big data evident in other fields. While the use of citizens to crowdsource research data was pioneered within the sciences, a number of projects have similarly begun to draw on the help of citizen historians. This article explores the particular example of the Prosecution Project, which since 2014 has been using crowdsourced volunteers on a research collaboration to build a large-scale relational database of criminal prosecutions throughout Australia from the early 1800s to the 1960s. The article outlines the opportunities and challenges faced by projects seeking to use web technologies to access, store and re-use historical data in an environment that increasingly enables creative collaborations between researchers and other users of social and historical data.


Author(s):  
Khaled Dehdouh

In the context of big data warehouses, a column-oriented NoSQL database system is considered a storage model that is highly adapted to data warehouses and online analysis. Indeed, NoSQL models allow data to scale easily, and the columnar store is suitable for storing and managing massive data, especially for decisional queries. However, column-oriented NoSQL DBMSs do not offer online analytical processing (OLAP) operators. To build OLAP cubes corresponding to the analysis contexts, the most common way is to integrate other software such as HIVE or Kylin, which has a CUBE operator to build data cubes. With that approach, the cube is built according to the row-oriented approach and does not fully obtain the benefits of a column-oriented approach. In this chapter, the main contribution is to define a cube operator called MC-CUBE (MapReduce Columnar CUBE), which allows building columnar NoSQL cubes according to the columnar approach by taking into account the non-relational and distributed aspects when data warehouses are stored.
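
To make concrete what a CUBE operator computes, the following is a minimal in-memory Python sketch of generic CUBE semantics: the measure is aggregated for every subset of the dimensions. It is not MC-CUBE and uses neither MapReduce nor a columnar store; the function name `cube`, the `"ALL"` placeholder, and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cube(rows, dims, measure):
    """Sum `measure` for every subset of `dims` (plain CUBE semantics).

    Dimensions left out of a grouping are reported as "ALL". This sketch
    assumes that no real dimension value is the string "ALL".
    """
    cells = defaultdict(int)
    for size in range(len(dims) + 1):
        for group in combinations(dims, size):
            for row in rows:
                key = tuple(row[d] if d in group else "ALL" for d in dims)
                cells[key] += row[measure]
    return dict(cells)


# Hypothetical fact rows with two dimensions and one measure.
sales = [
    {"product": "pen", "city": "Brussels", "quantity": 10},
    {"product": "pen", "city": "Liege", "quantity": 5},
    {"product": "ink", "city": "Brussels", "quantity": 2},
]

cube_cells = cube(sales, dims=("product", "city"), measure="quantity")
```

With two dimensions the sketch produces four groupings: the grand total, totals per product, totals per city, and totals per (product, city) pair.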


2008 ◽  
pp. 2364-2370
Author(s):  
Janet Delve

Data Warehousing is now a well-established part of the business and scientific worlds. However, up until recently, data warehouses were restricted to modeling essentially numerical data, examples being sales figures in the business arena (e.g. Wal-Mart’s data warehouse) and astronomical data (e.g. SKICAT) in scientific research, with textual data playing a descriptive rather than a central role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for non-numeric data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model, and manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.


2010 ◽  
Vol 1 (1) ◽  
pp. 38-42 ◽  
Author(s):  
Rex R. Johnson ◽  
Diane A. Granfors ◽  
Neal D. Niemuth ◽  
Michael E. Estey ◽  
Ronald E. Reynolds

Conservation of birds is increasingly focused on the importance of landscape characteristics to sustain populations. Implementing conservation on a landscape scale requires reliable spatial models that provide biological context for conservation actions. Before species-specific models relating grassland birds to their habitat at landscape scales existed, we created a conceptual model and applied it to spatial data to identify priority grassland habitats for the protection and restoration of populations of area-sensitive grassland birds in the Prairie Pothole Region. Since that time, these Grassland Bird Conservation Areas have been widely used to guide conservation, and variations of these models have been adopted in other regions; however, the process used to delineate them (i.e., the conceptual models) is poorly understood by many users. We describe that process here and offer perspectives on the utility and limitations of conceptual models, especially on the value of making the assumptions that commonly underlie management decisions explicit, thereby making the assumptions testable and, we hope, increasing management transparency, credibility, and efficiency.

