From Web Log to Data Warehouse

Web Technologies and Data Warehousing Synergies

Encyclopedia of Information Science and Technology, First Edition ◽

10.4018/978-1-59140-553-5.ch545 ◽

2005 ◽

pp. 3065-3067

Author(s):

John M. Artz

Keyword(s):

Data Warehouse ◽

Web Site ◽

Relational Databases ◽

Data Warehousing ◽

Emerging Technology ◽

Web Technologies ◽

The Past ◽

Large Sets ◽

The Web

Data warehousing is an emerging technology that greatly extends the capabilities of relational databases specifically in the analysis of very large sets of time-oriented data. The emergence of data warehousing has been somewhat eclipsed over the past decade by the simultaneous emergence of Web technologies. However, Web technologies and data warehousing have some natural synergies that are not immediately obvious. First, Web technologies make data warehouse data more easily available to a much wider variety of users. Second, data warehouse technologies can be used to analyze traffic to a Web site in order to gain a much better understanding of the visitors to the Web site. It is this second synergy that is the focus of this article.


Web Technologies and Data Warehousing Synergies

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch214 ◽

2008 ◽

pp. 3411-3415

Author(s):

John M. Artz

Keyword(s):

Data Warehouse ◽

Web Site ◽

Relational Databases ◽

Data Warehousing ◽

Emerging Technology ◽

Web Technologies ◽

The Past ◽

Large Sets ◽

The Web

Data warehousing is an emerging technology that greatly extends the capabilities of relational databases specifically in the analysis of very large sets of time-oriented data. The emergence of data warehousing has been somewhat eclipsed over the past decade by the simultaneous emergence of Web technologies. However, Web technologies and data warehousing have some natural synergies that are not immediately obvious. First, Web technologies make data warehouse data more easily available to a much wider variety of users. Second, data warehouse technologies can be used to analyze traffic to a Web site in order to gain a much better understanding of the visitors to the Web site. It is this second synergy that is the focus of this article.


Humanities Data Warehousing

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch141 ◽

2008 ◽

pp. 2364-2370

Author(s):

Janet Delve

Keyword(s):

Data Warehouse ◽

Relational Databases ◽

Data Warehousing ◽

Numerical Data ◽

Complex Nature ◽

Data Warehouses ◽

Textual Data ◽

Numeric Data ◽

First Time ◽

And Linguistics

Data Warehousing is now a well-established part of the business and scientific worlds. However, up until recently, data warehouses were restricted to modeling essentially numerical data – examples being sales figures in the business arena (e.g. Wal-Mart’s data warehouse) and astronomical data (e.g. SKICAT) in scientific research, with textual data providing a descriptive rather than a central role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for non-numeric data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model, and manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.


Humanities Data Warehousing

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch153 ◽

2011 ◽

pp. 987-992

Author(s):

Janet Delve

Keyword(s):

Data Warehouse ◽

Relational Databases ◽

Data Warehousing ◽

Numerical Data ◽

Complex Nature ◽

Data Warehouses ◽

Textual Data ◽

Numeric Data ◽

First Time ◽

And Linguistics

Data Warehousing is now a well-established part of the business and scientific worlds. However, up until recently, data warehouses were restricted to modeling essentially numerical data – examples being sales figures in the business arena (in, say, Wal-Mart’s data warehouse (Westerman, 2000)) and astronomical data (for example SKICAT) in scientific research, with textual data providing a descriptive rather than a central analytic role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for ‘non-numeric’ data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model, and manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.


The Development of Ordered SQL Packages to Support Data Warehousing

Data Warehousing and Web Engineering ◽

10.4018/978-1-931777-02-5.ch018 ◽

2011 ◽

pp. 285-311

Author(s):

Wilfred Ng ◽

Mark Levene

Keyword(s):

Data Warehouse ◽

Relational Databases ◽

Corporate Strategy ◽

Data Warehousing ◽

Effective Means ◽

Relational Model ◽

Minimal Extension ◽

Wide Range ◽

Partial Orderings ◽

Advanced Applications

Data warehousing is a corporate strategy that needs to integrate information from several sources of separately developed Database Management Systems (DBMSs). A future DBMS of a data warehouse should provide adequate facilities to manage a wide range of information arising from such integration. We propose that the capabilities of database languages should be enhanced to manipulate user-defined data orderings, since business queries in an enterprise usually involve order. We extend the relational model to incorporate partial orderings into data domains and describe the ordered relational model. We have already defined and implemented a minimal extension of SQL, called OSQL, which allows querying over ordered relational databases. One of the important facilities provided by OSQL is that it allows users to capture the underlying semantics of the ordering of the data for a given application. Herein we demonstrate that OSQL aided with a package discipline can be an effective means to manage the inter-related operations and the underlying data domains of a wide range of advanced applications that are vital in data warehousing, such as temporal, incomplete and fuzzy information. We present the details of the generic operations arising from these applications in the form of three OSQL packages called: OSQL_TIME, OSQL_INCOMP and OSQL_FUZZY.


MANAGING ACTIVITY DYNAMICS OF WEB BASED COLLABORATIVE APPLICATIONS

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213099000142 ◽

1999 ◽

Vol 08 (02) ◽

pp. 207-227 ◽

Cited By ~ 1

Author(s):

CHEN-CHUNG LIU ◽

GWO-DONG CHEN ◽

KUO-LIANG OU ◽

BAW-JHIUNE LIU ◽

JORNG-TZONG HORNG

Keyword(s):

Web Site ◽

Dynamic Structure ◽

Communication Behavior ◽

Static Structure ◽

Web Based ◽

Web Log ◽

Web Access ◽

Collaborative Activities ◽

The Web ◽

Activity Dynamics

The World Wide Web has been widely accepted as a viable communication infrastructure to support collaborative activities on computer networks. While cooperating objects of different roles can easily and freely communicate knowledge on the web, the web site managers/developers must write programs to manage the communication behavior in collaborative activities. However, the current hypertext model for the web concentrates on the static structure of hypertext. Few conceptual specifications are capable of effectively integrating the hypertext model with activity dynamics to clarify the dynamic interaction and constraints of desired collaborative activities on the web. Furthermore, decision-makers must observe communication behavior on the web to adapt collaborative activities. Although web servers register each web access in a web log, up to now, only a few query or report mechanisms have been available to obtain required information from the web log. This study presents a specification to capture the static and dynamic structure of intended collaborative activities, and a query mechanism to obtain required information from the web log. The specification and query mechanism make it possible to construct a web site that will provide group activity space and flexibly interpret roles, encourage individuals to commit to responsibilities, and enable activities to be observed.


Mining Frequent Generalized Patterns for Web Personalization in the Presence of Taxonomies

Exploring Advances in Interdisciplinary Data Mining and Analytics ◽

10.4018/978-1-61350-474-1.ch004 ◽

2011 ◽

pp. 52-68

Author(s):

Panagiotis Giannikopoulos ◽

Iraklis Varlamis ◽

Magdalini Eirinaki

Keyword(s):

Association Rules ◽

Web Site ◽

Aggregate Level ◽

Web Page ◽

Web Log ◽

Log Files ◽

Generalized Association Rules ◽

Generalized Patterns ◽

Navigation Patterns ◽

The Web

The Web is a continuously evolving environment, since its content is updated on a regular basis. As a result, the traditional usage-based approach to generate recommendations that takes as input the navigation paths recorded on the Web page level, is not as effective. Moreover, most of the content available online is either explicitly or implicitly characterized by a set of categories organized in a taxonomy, allowing the page-level navigation patterns to be generalized to a higher, aggregate level. In this direction, the authors present the Frequent Generalized Pattern (FGP) algorithm. FGP takes as input the transaction data and a hierarchy of categories and produces generalized association rules that contain transaction items and/or item categories. The results can be used to generate association rules and subsequently recommendations for the users. The algorithm can be applied to the log files of a typical Web site; however, it can be more helpful in a Web 2.0 application, such as a feed aggregator or a digital library mediator, where content is semantically annotated and the taxonomic nature is more complex, requiring us to extend FGP in a version called FGP+. The authors experimentally evaluate both algorithms using Web log data collected from a newspaper Web site.
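The idea of lifting page-level navigation patterns to category level can be illustrated with a minimal Python sketch. This is not the authors' FGP or FGP+ algorithm; it is a simplified stand-in that extends each session with the ancestors of its pages and then counts co-occurring pairs. The taxonomy, page paths, session data, and the names `ancestors`, `generalize`, and `frequent_pairs` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical taxonomy: each page or category maps to its parent category.
TAXONOMY = {
    "/sports/football.html": "sports",
    "/sports/tennis.html": "sports",
    "/politics/election.html": "politics",
    "sports": "news",
    "politics": "news",
}

def ancestors(item, taxonomy):
    """Return every category above an item in the taxonomy."""
    found = []
    while item in taxonomy:
        item = taxonomy[item]
        found.append(item)
    return found

def generalize(transaction, taxonomy):
    """Extend a transaction with the ancestors of each of its items."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item, taxonomy))
    return extended

def frequent_pairs(transactions, taxonomy, min_support):
    """Count item pairs over generalized transactions and keep the frequent ones."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(generalize(transaction, taxonomy))
        for a, b in combinations(items, 2):
            # A pair made of an item and its own ancestor carries no information.
            if a in ancestors(b, taxonomy) or b in ancestors(a, taxonomy):
                continue
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Hypothetical sessions reconstructed from a Web log.
sessions = [
    ["/sports/football.html", "/politics/election.html"],
    ["/sports/tennis.html", "/politics/election.html"],
    ["/sports/football.html"],
]
pairs = frequent_pairs(sessions, TAXONOMY, min_support=2)
print(pairs)
```

In this toy data no pair of individual pages reaches the support threshold, but the pair of the election page with the "sports" category does, as does the pair of the "politics" and "sports" categories. That is the kind of pattern that only becomes visible once the taxonomy is taken into account.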


Humanities Data Warehousing

Encyclopedia of Data Warehousing and Mining ◽

10.4018/978-1-59140-557-3.ch108 ◽

2011 ◽

pp. 570-574

Author(s):

Janet Delve

Keyword(s):

Data Warehouse ◽

Relational Databases ◽

Data Warehousing ◽

Numerical Data ◽

Complex Nature ◽

Data Warehouses ◽

Textual Data ◽

Numeric Data ◽

First Time ◽

And Linguistics

Data Warehousing is now a well-established part of the business and scientific worlds. However, up until recently, data warehouses were restricted to modeling essentially numerical data – examples being sales figures in the business arena (e.g. Wal-Mart’s data warehouse) and astronomical data (e.g. SKICAT) in scientific research, with textual data providing a descriptive rather than a central role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for non-numeric data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model, and manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.


Implementasi Sentimen Analysis Pengolahan Kata Berbasis Algoritma Map Reduce Menggunakan Hadoop

Systemic Information System and Informatics Journal ◽

10.29080/systemic.v4i1.337 ◽

2018 ◽

Vol 4 (1) ◽

pp. 11-16

Author(s):

Fawaid Badri

Keyword(s):

Sentiment Analysis ◽

Input Data ◽

Research Data ◽

Distributed File System ◽

Data Sets ◽

Text Documents ◽

Data Set ◽

Map Algorithm ◽

Hadoop Distributed File System ◽

The Web

Sentiment analysis is a field of research based on text and information. The text documents in this study come from the web and concern social issues. The method uses the MapReduce algorithm to count the words that are then used to find meaning in the context of public opinion. The map step takes a data set and converts it into another data set in which individual elements are broken down into tuples. The MapReduce stages read input data in the form of text stored in HDFS (Hadoop Distributed File System), which is then processed by key, with the values converted into tuple form. The next steps are the shuffle and reduce processes, which produce the result from the processed data set. Furthermore, the research applies sentiment analysis to its data using the MapReduce algorithm, with results reported as very good.
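The map, shuffle, and reduce stages mentioned in this abstract can be sketched in plain Python as a word count. This is only an in-memory illustration of the pattern under stated assumptions: it does not use Hadoop or HDFS, it is not the author's implementation, and the function names and sample lines are hypothetical.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) tuple for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    """Group the emitted values by key, as the shuffle step would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values for each key to obtain per-word counts."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle_phase(pairs))

# Hypothetical input standing in for text that would be read from HDFS.
sample_lines = ["good service good price", "bad service"]
counts = word_count(sample_lines)
print(counts)
```

In a real Hadoop job the map and reduce functions would run on separate nodes and the framework would perform the shuffle; here all three run in one process so that the data flow is easy to follow.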


Woolf, creativity, and madness: From Freud to fMRI---The web site

PsycEXTRA Dataset ◽

10.1037/e524362008-001 ◽

2008 ◽

Author(s):

Michele T. Wick ◽

Karen Kukil

Keyword(s):

Web Site ◽

The Web
