Web Technologies and Data Warehousing Synergies

Author(s):  
John M. Artz

Data warehousing is an emerging technology that greatly extends the capabilities of relational databases, specifically in the analysis of very large sets of time-oriented data. The emergence of data warehousing has been somewhat eclipsed over the past decade by the simultaneous emergence of Web technologies. However, Web technologies and data warehousing have some natural synergies that are not immediately obvious. First, Web technologies make data warehouse data more easily available to a much wider variety of users. Second, data warehouse technologies can be used to analyze traffic to a Web site in order to gain a much better understanding of the visitors to the Web site. It is this second synergy that is the focus of this article.

2008 ◽  
pp. 3411-3415
Author(s):  
John M. Artz

Data warehousing is an emerging technology that greatly extends the capabilities of relational databases, specifically in the analysis of very large sets of time-oriented data. The emergence of data warehousing has been somewhat eclipsed over the past decade by the simultaneous emergence of Web technologies. However, Web technologies and data warehousing have some natural synergies that are not immediately obvious. First, Web technologies make data warehouse data more easily available to a much wider variety of users. Second, data warehouse technologies can be used to analyze traffic to a Web site in order to gain a much better understanding of the visitors to the Web site. It is this second synergy that is the focus of this article.


Author(s):  
John M. Artz

Data warehousing is an emerging technology that greatly extends the capabilities of relational databases, specifically in the analysis of very large sets of time-oriented data. The emergence of data warehousing has been somewhat eclipsed by the simultaneous emergence of Web technologies. However, Web technologies and data warehousing have some natural synergies that are just now being recognized. First, Web technologies make data warehouse data more easily available to a much wider variety of users, both internally and externally. Since the value of data is directly related to its availability for exploitation, the Internet and intranets help increase the value of the data in the warehouse. Second, data warehouse technologies can be used to analyze traffic to a Web site in a wide variety of ways in order to make the Web site more effective. This chapter focuses on the latter of these synergies and shows, through an evolving example, how a simple data set from the Web log can be enhanced, in a stepwise fashion, into a full-fledged market research data warehouse.
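As a rough illustration of the first step such an example might take, the sketch below turns raw Web log lines into rows keyed by date, hour, and page, which is the kind of time-oriented structure a warehouse fact table would hold. It assumes the server writes the Common Log Format; the function names, the choice of dimensions, and the rule of counting only status 200 responses are hypothetical and are not taken from the chapter.

```python
import re
from datetime import datetime

# Common Log Format: host ident user [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_log_line(line):
    """Return one log entry as a dict, or None if the line does not match."""
    m = CLF_PATTERN.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    size = m.group("size")
    return {
        "host": m.group("host"),
        "date": ts.date().isoformat(),   # time dimension: day
        "hour": ts.hour,                 # time dimension: hour of day
        "path": m.group("path"),         # page dimension
        "status": int(m.group("status")),
        "bytes": 0 if size == "-" else int(size),
    }

def page_views_by_hour(lines):
    """Count successful requests per (date, hour, path)."""
    counts = {}
    for line in lines:
        row = parse_log_line(line)
        if row is None or row["status"] != 200:
            continue  # hypothetical rule: skip unparsable lines and non-200 responses
        key = (row["date"], row["hour"], row["path"])
        counts[key] = counts.get(key, 0) + 1
    return counts
```

Further steps, such as resolving hosts to visitors or joining pages to product categories, would add dimensions to the same key.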


2008 ◽  
pp. 2364-2370
Author(s):  
Janet Delve

Data Warehousing is now a well-established part of the business and scientific worlds. However, until recently, data warehouses were restricted to modeling essentially numerical data – examples being sales figures in the business arena (e.g. Wal-Mart’s data warehouse) and astronomical data (e.g. SKICAT) in scientific research, with textual data providing a descriptive rather than a central role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for non-numeric data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model, and manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.


2013 ◽  
Vol 4 (1) ◽  
pp. 190-197
Author(s):  
Payal Pahwa ◽  
Rashmi Chhabra

Data warehousing is an emerging technology that has proved to be very important for organizations. Today every business organization needs a large amount of accurate information to make proper decisions, and the data behind those decisions must be of good quality. Data cleansing is needed to improve data quality; it is fundamental to warehouse data reliability and to data warehousing success. There are various methods for data cleansing. This paper addresses issues related to data cleansing, focusing on the detection of duplicate records. An efficient algorithm for data cleansing is also proposed. A review of data cleansing methods and a comparison between them are presented.
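To make the duplicate-detection task concrete, here is a minimal sketch of one common baseline: normalize each record to a string key, then flag pairs whose keys are nearly identical. This is not the algorithm proposed in the paper, which the abstract does not describe; the function names, the normalization rules, and the 0.9 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

def normalize(record):
    """Build a comparison key: lowercase, drop punctuation, collapse whitespace."""
    parts = []
    for value in record:
        cleaned = "".join(
            ch for ch in str(value).lower() if ch.isalnum() or ch.isspace()
        )
        parts.append(" ".join(cleaned.split()))
    return " ".join(parts)

def find_duplicates(records, threshold=0.9):
    """Return index pairs (i, j), i < j, whose keys have similarity >= threshold."""
    keys = [normalize(r) for r in records]
    pairs = []
    # Pairwise comparison is quadratic in the number of records; this is only
    # practical for small inputs and is used here for clarity.
    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            if SequenceMatcher(None, keys[i], keys[j]).ratio() >= threshold:
                pairs.append((i, j))
    return pairs
```

For warehouse-scale data, the pairwise loop would normally be replaced by sorting or blocking on the key so that only nearby records are compared.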


Author(s):  
Janet Delve

Data Warehousing is now a well-established part of the business and scientific worlds. However, until recently, data warehouses were restricted to modeling essentially numerical data – examples being sales figures in the business arena (for example, in Wal-Mart’s data warehouse (Westerman, 2000)) and astronomical data (for example, SKICAT) in scientific research, with textual data providing a descriptive rather than a central analytic role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for ‘non-numeric’ data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model, and manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.


Author(s):  
Wilfred Ng ◽  
Mark Levene

Data warehousing is a corporate strategy that needs to integrate information from several sources of separately developed Database Management Systems (DBMSs). A future DBMS of a data warehouse should provide adequate facilities to manage the wide range of information arising from such integration. We propose that the capabilities of database languages should be enhanced to manipulate user-defined data orderings, since business queries in an enterprise usually involve order. We extend the relational model to incorporate partial orderings into data domains and describe the ordered relational model. We have already defined and implemented a minimal extension of SQL, called OSQL, which allows querying over ordered relational databases. One of the important facilities provided by OSQL is that it allows users to capture the underlying semantics of the ordering of the data for a given application. Herein we demonstrate that OSQL, aided by a package discipline, can be an effective means to manage the inter-related operations and the underlying data domains of a wide range of advanced applications that are vital in data warehousing, such as temporal, incomplete, and fuzzy information. We present the details of the generic operations arising from these applications in the form of three OSQL packages called OSQL_TIME, OSQL_INCOMP and OSQL_FUZZY.


2002 ◽  
Vol 185 ◽  
pp. 166-167 ◽  
Author(s):  
R. Boninsegna ◽  
J. Vandenbroere ◽  
J.F. Le Borgne ◽  

The GEOS, Groupe Européen d’Observation Stellaires, is composed of observers living in France, Italy, Spain, Belgium and Switzerland. Its main purpose is to give amateur astronomers the opportunity to carry out scientific analyses in specific fields. Further details can be found at the web site http://www.upv.es/geos/. In past years, GEOS has approached the analysis of visual estimates of variable stars (Ralincourt et al., 1987) in an original manner, obtaining accurate light curves of red semiregulars. Moreover, collaboration with teams of professional researchers allowed the group to obtain interesting results on the double-mode Cepheid EW Sct (Figer et al., 1991), on the Be star OT Gem (Arellano Ferro et al., 1998) and on the eclipsing binary V753 Cyg (Beltraminelli et al., 2000). More recently, several GEOS members have started collecting a large sample of times of maximum brightness of RR Lyr stars in order to build up a database that is as extensive as possible. This database is described below, together with the results of a campaign on RR Lyr itself.



