Web Data Management Practices

Clustering Web Information Services

Web Data Management Practices
10.4018/978-1-59904-228-2.ch002
2007
pp. 34-55
Cited By ~ 1

Author(s): Athena Vakali, George Pallis, Lefteris Angelis

Keyword(s): Data Mining, Data Management, Information Sources, Theoretical Background, Information Services, Information Availability, Web Information, Information Circulation, The Web, Management Issues

The explosive growth of the Web has drastically increased the rates at which information circulates and is disseminated. As the number of both Web users and Web sources grows significantly every day, crucial data management issues, such as clustering on the Web, need to be addressed and analyzed. Clustering has been proposed as a way to improve both information availability and the personalization offered to Web users. Clusters on the Web consist of either user sessions or Web information sources, and they are managed in a variety of applications and implementation testbeds. This chapter focuses on clustering information over the Web and surveys the theoretical background and the adopted practices of the most popular emerging and challenging clustering research efforts. It gives an up-to-date survey of existing clustering schemes, intended for both researchers and practitioners interested in Web data mining.

Data Clustering

Web Data Management Practices
10.4018/978-1-59904-228-2.ch001
2007
pp. 1-33
Cited By ~ 4

Author(s): Dušan Husek, Jaroslav Pokorny, Hana Rezankova, Václav Snasel

Keyword(s): Information Retrieval, Data Clustering, Important Task, Clustering Methods, Web Documents, Web Communities

Document and information retrieval (IR) is an important task for Web communities. In this chapter, we introduce some clustering methods and focus on their use for the clustering, classification, and retrieval of Web documents.

Web Services

Web Data Management Practices
10.4018/978-1-59904-228-2.ch011
2007
pp. 244-267

Author(s): Bernd Aman, Salima Benbernou, Benjamin Nguyen

Keyword(s): Web Services, Web Service, Service Composition, Web Applications, Loosely Coupled, Composition Problem, Service Oriented, Service Paradigm, Current Standards, The Web

Unlike traditional applications, which depend upon a tight interconnection of all program elements, Web service applications are composed of loosely coupled, autonomous, and independent services published on the Web. In this chapter, we first introduce the concept of service-oriented computing (SOC) on the Web and the current standards enabling the definition and publication of Web services. This technology’s next evolution is to facilitate the creation and maintenance of Web applications. This can be achieved by exploiting the self-descriptive nature of Web services combined with more powerful models and languages for composing Web services. A second objective of this chapter is to illustrate the complexity of the Web service composition problem and to provide a representative overview of the existing approaches. The chapter concludes with a short presentation of two research projects exploiting and extending the Web service paradigm.

Information-Theoretic Methods for Prediction in the Wireless and Wired Web

Web Data Management Practices
10.4018/978-1-59904-228-2.ch007
2007
pp. 159-178
Cited By ~ 1

Author(s): Dimitrios Katsaros

Keyword(s): Information Theoretic, Mobile Location, Sequence Prediction, Sequence Modeling, Discrete Sequence, Modeling And Prediction, Critical Issues, Mobility Tracking, Caching Mechanism, Data Request

Discrete sequence modeling and prediction is an important goal and a challenge for Web environments, both wired and wireless. Forecasting Web clients’ data requests and tracking mobile location in wireless cellular networks are characteristic application areas of sequence prediction in such environments. Accurate data request prediction results in effective data prefetching, which, combined with a caching mechanism, can reduce user-perceived latencies as well as server and network loads. Likewise, effective solutions to the mobility tracking/prediction problem can reduce update and paging costs, freeing the network from excessive signaling traffic. Sequence prediction is therefore a very important area of study and development. This article presents information-theoretic techniques for discrete sequence prediction. It surveys, classifies, and compares the state-of-the-art solutions, and it suggests routes for further research by discussing the critical issues and challenges of prediction in wired and wireless networks.

Caching on the Web

Web Data Management Practices
10.4018/978-1-59904-228-2.ch006
2007
pp. 124-158

Author(s): Mehregan Mahdavi, Boualem Bentallah

Keyword(s): Response Time, World Wide, Web Applications, Fast Response, Web Caching, Web Portals, Dynamic Data, Dynamic Content, The World, The Web

The World Wide Web provides a means for sharing data and applications among users. However, its performance, and in particular the provision of fast response times, is still an issue. Caching is a key technique that addresses some of the performance issues in today’s Web-enabled applications. Deploying dynamic data, especially in an emerging class of Web applications called Web portals, makes caching even more interesting. In this chapter, we study Web caching techniques with a focus on dynamic content. We also discuss the limitations of caching in Web portals and study a solution that addresses these limitations. The solution is based on collaboration between the portal and its providers.

Integrating Heterogeneous Data Sources in the Web

Web Data Management Practices
10.4018/978-1-59904-228-2.ch009
2007
pp. 199-219

Author(s): Angelo Brayner, Macelo Meireles, José de Aguiar Moraes Filho

Keyword(s): Query Language, Heterogeneous Data, Data Sources, Distributed Data, Local Data, Multidatabase System, Integration Strategy, Heterogeneous Data Sources, Integration Problems, The Web

Integrating data sources published on the Web requires an integration strategy that guarantees the autonomy of local data sources. The multidatabase system (MDBS) has been consolidated as an approach for integrating multiple heterogeneous and distributed data sources in flexible and dynamic environments such as the Web. A key property of MDBSs is that they guarantee a higher degree of local autonomy. In order to adopt the MDBS strategy, it is necessary to use a query language, called a multidatabase language (MDL), which provides the necessary constructs for jointly manipulating and accessing data in heterogeneous data sources. In other words, the MDL is responsible for solving integration conflicts. This chapter describes an extension to the XQuery language, called MXQuery, which supports queries over several data sources and solves integration problems such as semantic heterogeneity and incomplete information.

Designing and Mining Web Applications

Web Data Management Practices
10.4018/978-1-59904-228-2.ch008
2007
pp. 179-198

Author(s): Rosa Meo, Maristella Matera

Keyword(s): Data Mining, Web Applications, Conceptual Modeling, Modeling Language, Frequent Patterns, Application Management, Modeling Methods, Web Logs, Dynamic Web

In this chapter, we present the use of a modeling language, WebML, for the design and management of dynamic Web applications. WebML also makes it easier to analyze how users use the application contents, even when applications are dynamic. It does so by making use of special-purpose logs, called conceptual logs, generated by the application runtime engine. We report on a case study in which conceptual logs are analyzed to demonstrate the effectiveness of WebML and of its conceptual modeling methods. The methodology for analyzing Web logs is based on the data mining paradigm of itemsets and frequent patterns and makes full use of constraints on the content of the conceptual logs. As a consequence, we could obtain many patterns of interest for application management, such as recurrent navigation paths, the most frequently visited page contents, and anomalies.

Mining Association Rules from XML Documents

Web Data Management Practices
10.4018/978-1-59904-228-2.ch004
2007
pp. 79-103
Cited By ~ 4

Author(s): Laura Irina Rusu, Wenny Rahayu, David Taniar

Keyword(s): Knowledge Discovery, Association Rules, Large Volume, Web Application, Markup Language, Xml Documents, Rapid Changes, Extensible Markup, Hidden Knowledge, The Web

This chapter presents some of the existing mining techniques for extracting association rules from XML documents, in the context of rapid changes in the area of Web knowledge discovery. The study was motivated by the fast emergence of XML (eXtensible Markup Language) as a standard language for representing semi-structured data and as a new standard for exchanging information between different applications. The data exchanged as XML documents becomes richer every day, so it has become obvious that this large volume of XML data needs not only to be stored for later use but also to be mined to discover interesting information. The hidden knowledge can be used in various ways, for example to decide on a business issue or to make predictions about future e-customer behaviour in a Web application. One type of knowledge that can be discovered in a collection of XML documents relates to association rules between parts of the document, and this chapter presents some of the top techniques for extracting them.

An Overview of Similarity Measures for Clustering XML Documents

Web Data Management Practices
10.4018/978-1-59904-228-2.ch003
2007
pp. 56-78
Cited By ~ 13

Author(s): Giovanna Guerrini, Marco Mesiti, Ismael Sanz

Keyword(s): Similarity Measures, Content Delivery, Heterogeneous Data, Heterogeneous Data Integration, Clustering Techniques, Xml Documents, Xml Document, Definition Of, Content Similarity, The Web

The large number and heterogeneity of XML documents on the Web require the development of clustering techniques to group together similar documents. Documents can be grouped together according to their content, their structure, and the links inside and among documents. For instance, grouping together documents with similar structures has interesting applications in information extraction, heterogeneous data integration, personalized content delivery, access control definition, Web site structural analysis, and the comparison of RNA secondary structures. Many approaches have been proposed for evaluating the structural and content similarity between tree-based and vector-based representations of XML documents. Link-based similarity approaches developed for Web data clustering have been adapted for XML documents. This chapter discusses and compares the most relevant similarity measures and their use in XML document clustering.

Dynamically Generated Web Content

Web Data Management Practices
10.4018/978-1-59904-228-2.ch005
2007
pp. 104-123

Author(s): Stavros Papastavrou, George Samaras, Paraskevas Evripidou, Panos K. Chrysanthis

Keyword(s): Web Content, Historical Aspects, Low Level, Content Caching, Dynamic Content, Acceleration Techniques, Dynamic Web, The Web

This chapter takes a tutorial approach to presenting the Web-related technologies and content middlewares that attempt to accelerate the generation and optimize the delivery of dynamic content. It covers the historical aspects of dynamic content and presents the reasoning behind its introduction while discussing early content middlewares such as CGI and FastCGI. It then presents the evolution of content middlewares along the lines of conducted research. The discussion focuses on popular techniques, chiefly content caching and content fragmentation. It also discusses a variety of other research efforts such as hardware and low-level acceleration techniques, active caching, and delta encoding. Finally, the authors hope that this chapter will serve as an introductory tutorial for students and researchers in the field of dynamic Web content technology.

Web Data Management Practices
Published By IGI Global