Data warehouse clustering on the web

Data warehouses have established themselves as necessary components of an effective Information Technology (IT) strategy for large businesses. In addition to utilizing operational databases, data warehouses must also integrate increasing amounts of external data to assist in decision support. An important source of such external data is the Web. In an effort to ensure the availability and quality of Web data for the data warehouse, we propose an intermediate data-staging layer called the Meta-Data Engine (M-DE). A major challenge, however, is the conversion of data that originates on the Web, and is brought in by robust search engines, into data in the data warehouse. The authors therefore also propose a framework, the Semantic Web Application (SEMWAP) framework, which facilitates semi-automatic matching of instance data from opaque web databases using ontology terms. Their framework combines Information Retrieval (IR), Information Extraction (IE), Natural Language Processing (NLP), and ontology techniques to produce a matching and thus provide a viable building block for Semantic Web (SW) applications.

Semantic Web and Digital Libraries

Teaching in the Knowledge Society ◽

10.4018/978-1-59140-953-3.ch018 ◽

2011 ◽

pp. 271-285

Author(s):

Giorgio Poletti

Keyword(s):

Semantic Web ◽

Data Warehouse ◽

Digital Library ◽

Digital Libraries ◽

Knowledge Society ◽

Specialized Software ◽

Close Relationship ◽

Definition Of ◽

Access To Documents ◽

The Web

An analysis of the reality surrounding us clearly reveals the great amount of information available in different forms and through different media. Volumes of information available in real time and via the Web are concepts perceived as closely related. This perception is supported by the observation that the objective of the Web was the definition and construction of a universal archive, a virtual site in which access to documents was possible with no limits of time or space. In this digital library, documents have to be equipped with logical connections that make it possible for each user to define a reading map that expands according to the demand for knowledge gradually built up. This perspective now points in the direction of the Semantic Web, a network that satisfies our requests while understanding them, not by some magic telepathic communication between browser and navigator, but rather by acting as a data warehouse in which documents are matched to meta-data, letting specialized software distinguish fields, importance, and correlation between documents. Semantic Web and library terms have an ever closer relationship, fundamental for progress and didactic efficiency in the knowledge society.

Data warehouse clustering on the Web

Proceedings. 13th International Workshop on Database and Expert Systems Applications ◽

10.1109/dexa.2002.1045995 ◽

2004 ◽

Author(s):

A. Triantafillakis ◽

P. Kanellis ◽

D. Martakos

Keyword(s):

Data Warehouse ◽

The Web

Aligning the Warehouse and the Web

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch004 ◽

2011 ◽

pp. 18-24 ◽

Cited By ~ 1

Author(s):

Hadrian Peter

Keyword(s):

Semantic Web ◽

Data Warehouse ◽

Business Environment ◽

Global Business ◽

Meta Data ◽

External Data ◽

Extensible Markup ◽

Modern Business ◽

The Impact ◽

The Web

Data warehouses have established themselves as necessary components of an effective IT strategy for large businesses. To augment the streams of data being siphoned from transactional/operational databases, warehouses must also integrate increasing amounts of external data to assist in decision support. Modern warehouses can be expected to handle up to 100 terabytes or more of data (Berson and Smith, 1997; Devlin, 1998; Inmon, 2002; Imhoff et al, 2003; Schwartz, 2003; Day, 2004; Peter and Greenidge, 2005; Winter and Burns, 2006; Ladley, 2007). The arrival of newer generations of tools and database vendor support has smoothed the way for current warehouses to meet the needs of the challenging global business environment (Kimball and Ross, 2002; Imhoff et al, 2003; Ross, 2006). We cannot ignore the role of the Internet in modern business and its impact on data warehouse strategies. The Web represents the richest source of external data known to man (Zhenyu et al, 2002; Chakrabarti, 2002; Laender et al, 2002), but we must be able to couple raw text or poorly structured data on the Web with descriptions, annotations and other forms of summary meta-data (Crescenzi et al, 2001). In recent years the Semantic Web initiative has focussed on the production of “smarter data”. The basic idea is that instead of making programs with near-human intelligence, we carefully add meta-data to existing stores so that the data becomes “marked up” with all the information necessary to allow not-so-intelligent software to perform analysis with minimal human intervention (Kalfoglou et al, 2004). The Semantic Web builds on established building block technologies such as Unicode, URIs (Uniform Resource Identifiers) and XML (Extensible Markup Language) (Dumbill, 2000; Daconta et al, 2003; Decker et al, 2000). The modern data warehouse must embrace these emerging web initiatives.
In this paper we propose a model which provides mechanisms for sourcing external data resources for analysts in the warehouse.

Web Page Extension of Data Warehouses

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch320 ◽

2011 ◽

pp. 2090-2095

Author(s):

Anthony Scime

Keyword(s):

Decision Making ◽

Data Warehouse ◽

Decision Maker ◽

Search Engine ◽

External Information ◽

Data Warehouses ◽

Strategic Decisions ◽

Web Searches ◽

To Come ◽

The Web

Data warehouses are constructed to provide valuable and current information for decision-making. Typically this information is derived from the organization’s functional databases. The data warehouse then provides a consolidated, convenient source of data for the decision-maker. However, the available organizational information may not be sufficient to come to a decision. Information external to the organization is also often necessary for management to arrive at strategic decisions. Such external information may be available on the World Wide Web and, when added to the data warehouse, extends decision-making power. The Web can be considered a large repository of data. This data is on the whole unstructured and must be gathered and extracted to be made into something valuable for the organizational decision maker. Gathering this data and placing it into the organization’s data warehouse requires an understanding of the data warehouse metadata and the use of Web mining techniques (Laware, 2005). Typically when conducting a search on the Web, a user initiates the search by using a search engine to find documents that refer to the desired subject. This requires the user to define the domain of interest as a keyword or a collection of keywords that can be processed by the search engine. The searcher may not know how to break the domain down, thus limiting the search to the domain name. However, even given the ability to break down the domain and conduct a search, the search results have two significant problems. First, Web searches return information about a very large number of documents. Second, much of the returned information may be marginally relevant or completely irrelevant to the domain. The decision maker may not have time to sift through results to find the meaningful information.
A data warehouse that has already found domain-relevant Web pages can relieve the decision maker from having to decide on search keywords and having to determine the relevant documents from those found in a search. Such a data warehouse requires previously conducted searches to add Web information.

Enhanced Knowledge Warehouse

Encyclopedia of Information Science and Technology, First Edition ◽

10.4018/978-1-59140-553-5.ch187 ◽

2005 ◽

pp. 1057-1062

Author(s):

Krzysztof Wecel ◽

Witold Abramowicz ◽

Pawel Jan Kalczynski

Keyword(s):

Web Services ◽

Data Warehouse ◽

Distributed Applications ◽

Software Components ◽

Automatic Retrieval ◽

External Software ◽

Automatic Filtering ◽

The Creation ◽

The Web

Enhanced knowledge warehouse (eKW) is an extension of the enhanced data warehouse (eDW) system (Abramowicz, 2002). eKW is a Web services-based system that allows the automatic filtering of information from the Web to the data warehouse and automatic retrieval through the data warehouse. Web services technology extends eKW beyond the organization. It makes the system open and allows utilization of external software components, thus enabling the creation of distributed applications.

Complementing the Data Warehouse with Information Filtered from the Web

Data Warehousing and Web Engineering ◽

10.4018/978-1-931777-02-5.ch011 ◽

2011 ◽

pp. 206-218

Author(s):

Witold Abramowicz ◽

Pawel Jan Kalczynski ◽

Krzysztof Wecel

Keyword(s):

Data Warehouse ◽

Claim Data ◽

Textual Information ◽

User Models ◽

Transactional Data ◽

The Web

The data warehouse is considered to be the best way to organize transactional data. However, as many researchers claim, the data warehouse should be augmented with external textual information. The objective of this chapter is to examine the requirements for profiling in the data warehouse environment. Profiles created in the data warehouse are then utilized to filter information. The goal of the sketched system is to support users in their situated actions. We explore many issues concerning personalization, such as information overflow, user models, and situatedness. We also analyze the factors that influence the filtering process. Finally, we draw some conclusions that should be considered during extension of the evaluated system.

Data Reliability Assessment in a Data Warehouse Opened on the Web

Flexible Query Answering Systems - Lecture Notes in Computer Science ◽

10.1007/978-3-642-24764-4_16 ◽

2011 ◽

pp. 174-185 ◽

Cited By ~ 1

Author(s):

Sébastien Destercke ◽

Patrice Buche ◽

Brigitte Charnomordic

Keyword(s):

Data Warehouse ◽

Reliability Assessment ◽

Data Reliability ◽

The Web

Innovative Approaches for Efficiently Warehousing Complex Data from the Web

Data Mining ◽

10.4018/978-1-4666-2455-9.ch074 ◽

2013 ◽

pp. 1422-1448

Author(s):

Fadila Bentayeb ◽

Nora Maïz ◽

Hadj Mahboubi ◽

Cécile Favre ◽

Sabine Loudcher ◽

...

Keyword(s):

Data Mining ◽

Decision Support ◽

Data Warehouse ◽

Design Management ◽

Complex Data ◽

Data Warehouses ◽

Process Data ◽

Access Methods ◽

Olap Analysis ◽

The Web

Research in data warehousing and OLAP has produced important technologies for the design, management, and use of Information Systems for decision support. With the development of the Internet, the availability of various types of data has increased. Thus, users require applications to help them obtain knowledge from the Web. One possible solution to facilitate this task is to extract information from the Web, then transform and load it into a Web Warehouse, which provides uniform access methods for automatic processing of the data. In this chapter, we present three innovative research contributions recently introduced to extend the capabilities of decision support systems, namely (1) the use of XML as a logical and physical model for complex data warehouses, (2) the association of data mining with OLAP to allow elaborate analysis tasks for complex data, and (3) schema evolution in complex data warehouses for personalized analyses. Our contributions cover the main phases of the data warehouse design process: data integration and modeling, and user-driven OLAP analysis.

COMPUTER NETWORKS ASSIGNMENT 1 (YUDA FAHROZI 175100011)

10.31219/osf.io/y4kcd ◽

2019 ◽

Author(s):

yuda fahrozi

Keyword(s):

Data Base ◽

Real Time ◽

Data Warehouse ◽

Distributed Database ◽

Sql Server ◽

End User ◽

External Database ◽

Database Server ◽

The Web

A database server is a computer program that provides database management services and serves computers or database application programs that use the client/server model. The term also refers to a computer (generally a server) dedicated to running such a program. Database management systems (DBMSs) generally provide database server functions, and some DBMSs (such as MySQL or Microsoft SQL Server) rely heavily on the client-server model to access their databases. Origin of the term "database": The term "database" originated in computer science. Although its meaning later broadened to include things outside the field of electronics, this article concerns computer databases. Records resembling databases actually existed before the industrial revolution, in the form of ledgers, receipts, and collections of business-related data. Types of database: There are 12 types of database, including the operational database, analytical database, data warehouse, distributed database, end-user database, external database, hypermedia databases on the web, navigational database, in-memory databases, document-oriented databases, real-time databases, and relational database. Keywords: Server Capacity and Database
