Mining Association Rules from XML Documents

Web Data Management Practices ◽

10.4018/978-1-59904-228-2.ch004 ◽

2007 ◽

pp. 79-103 ◽

Cited By ~ 4

Author(s):

Laura Irina Rusu ◽

Wenny Rahayu ◽

David Taniar

Keyword(s):

Knowledge Discovery ◽

Association Rules ◽

Large Volume ◽

Web Application ◽

Markup Language ◽

Xml Documents ◽

Rapid Changes ◽

Extensible Markup ◽

Hidden Knowledge ◽

The Web

This chapter presents some of the existing mining techniques for extracting association rules from XML documents, in the context of rapid changes in the Web knowledge discovery area. The study was motivated by the fast emergence of XML (eXtensible Markup Language) as a standard language for representing semi-structured data and as a new standard for exchanging information between different applications. The data exchanged as XML documents become richer every day, so the need not only to store these large volumes of XML data for later use, but also to mine them to discover interesting information, has become obvious. The hidden knowledge can be used in various ways, for example to decide on a business issue or to make predictions about future e-customer behaviour in a Web application. One type of knowledge that can be discovered in a collection of XML documents relates to association rules between parts of the document, and this chapter presents some of the top techniques for extracting them.
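
To make the notion of an association rule between parts of an XML document concrete, the following is a minimal sketch and not one of the techniques presented in the chapter. It assumes hypothetical `<order>` documents whose `<item>` children form one transaction each, and it computes support and confidence for single-item rules only. The element names, sample data, and thresholds are all illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each <order> document is one transaction.
DOCS = [
    "<order><item>bread</item><item>milk</item></order>",
    "<order><item>bread</item><item>butter</item></order>",
    "<order><item>bread</item><item>milk</item><item>butter</item></order>",
    "<order><item>milk</item></order>",
]


def transactions(docs):
    """Parse each XML document into the set of its <item> texts."""
    return [frozenset(e.text for e in ET.fromstring(d).iter("item")) for d in docs]


def rules(docs, min_support=0.5, min_confidence=0.6):
    """Return sorted (antecedent, consequent, support, confidence) tuples."""
    txs = transactions(docs)
    n = len(txs)
    # Count how many transactions contain each item and each item pair.
    item_count = Counter(i for t in txs for i in t)
    pair_count = Counter(p for t in txs for p in combinations(sorted(t), 2))
    found = []
    for (a, b), count in pair_count.items():
        support = count / n  # fraction of documents containing both items
        if support < min_support:
            continue
        # Test the rule in both directions: x -> y.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_count[x]  # estimate of P(y | x)
            if confidence >= min_confidence:
                found.append((x, y, support, confidence))
    return sorted(found)


for antecedent, consequent, support, confidence in rules(DOCS):
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

Calling `rules(DOCS)` lists each rule whose support and confidence meet the chosen thresholds. Real XML mining has to decide what counts as a transaction and an item in a nested document, which this sketch sidesteps by fixing both in advance.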

XML and Semantics

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v5i5.pp1174-1179 ◽

2015 ◽

Vol 5 (5) ◽

pp. 1174

Author(s):

Mohammad Moradi ◽

MohammadReza Keyvanpour

Keyword(s):

Semantic Web ◽

Building Blocks ◽

The State ◽

Extensible Markup Language ◽

Markup Language ◽

Xml Documents ◽

Extensible Markup ◽

Presentation Methods ◽

The Web

Since its introduction, eXtensible Markup Language (XML) has, owing to its expressive power and flexibility, become the de facto standard for representing, storing, and interchanging data on the Web. Such features have made XML one of the building blocks of the Semantic Web. Moreover, because XML documents can be considered from content, structural, and semantic aspects, leveraging their semantics is useful in many domains. However, XML itself provides no built-in mechanism for governing semantics. For this reason, many studies have addressed the representation of semantics within and from XML documents. This paper gives an overview of these aspects, with an emphasis on the state of semantics in XML and its presentation methods.

OWL

Encyclopedia of Information Science and Technology, Second Edition ◽

10.4018/978-1-60566-026-4.ch481 ◽

2011 ◽

pp. 3009-3017

Author(s):

Adélia Gouveia ◽

Jorge Cardoso

Keyword(s):

World Wide Web ◽

Semantic Web ◽

Particle Physics ◽

World Wide ◽

Visual Presentation ◽

Web Pages ◽

Markup Language ◽

Extensible Markup ◽

Description Framework ◽

The Web

The World Wide Web (WWW) emerged in 1989, developed by Tim Berners-Lee, who proposed to build a system for sharing information among physicists at CERN (Conseil Européen pour la Recherche Nucléaire), the world’s largest particle physics laboratory. Currently, the WWW is primarily composed of documents written in HTML (HyperText Markup Language), a language that is useful for visual presentation (Cardoso & Sheth, 2005). HTML is a set of “markup” symbols contained in a Web page intended for display in a Web browser. Most of the information on the Web is designed only for human consumption. Humans can read Web pages and understand them, but their inherent meaning is not expressed in a way that allows computers to interpret it (Cardoso & Sheth, 2006). Since the visual Web does not allow computers to understand the meaning of Web pages (Cardoso, 2007), the W3C (World Wide Web Consortium) started to work on the concept of the Semantic Web, with the objective of developing approaches and solutions for data integration and interoperability. The goal was to develop ways to allow computers to understand Web information. The aim of this chapter is to present the Web Ontology Language (OWL), which can be used to develop Semantic Web applications that understand information and data on the Web. This language was proposed by the W3C and was designed for publishing and sharing data and for automating the understanding of data by computers using ontologies. To fully comprehend OWL, we first need to study its origin and the basic building blocks of the language. Therefore, we will start by briefly introducing XML (Extensible Markup Language), RDF (Resource Description Framework), and RDF Schema (RDFS). These concepts are important since OWL is written in XML and is an extension of RDF and RDFS.

Non-Extensible Markup Language

Proceedings of the Symposium on HTML5 and XML ◽

10.4242/balisagevol14.denicola01 ◽

2014 ◽

Cited By ~ 2

Author(s):

Domenic Denicola

Keyword(s):

Extensible Markup Language ◽

Markup Language ◽

Domain Specific ◽

Extensible Markup ◽

The Web

XML's steady descent into obscurity has become more and more apparent over the last few years. Developers, tool vendors, and browser implementers have all embraced HTML as the web's markup language, built on a substrate of JavaScript. Nothing epitomizes this shift more than the recent rise of web components: instead of standards committees dreaming up domain-specific XML vocabularies and hoping one day browsers would incorporate them, web components and the extensible web principles they embody allow authors to empower HTML with the same abilities XML once promised. The HTML of today is a truly extensible markup language. Where XML failed in this mission, both historically and practically, the web ecosystem routed around the damage of XML's influence by making HTML better suited for extensibility than ever before.

A Narrative Review of Storing and Querying XML Documents Using Relational Database

Journal of Information & Knowledge Management ◽

10.1142/s0219649219500485 ◽

2019 ◽

Vol 18 (04) ◽

pp. 1950048

Author(s):

Amjad Qtaish ◽

Mohammad T. Alshammari

Keyword(s):

Relational Database ◽

File Systems ◽

Data Representation ◽

Xml Data ◽

Xml Documents ◽

Data Interchange ◽

Extensible Markup ◽

Data Environment ◽

Object Oriented Database ◽

The Web

Extensible Markup Language (XML) has become a common language for data interchange and data representation on the Web. The evolution of the big data environment and the large volume of data represented in XML on the Web increase the challenges of managing such data effectively in terms of storing and querying. Numerous solutions have been introduced to store and query XML data, including file systems, Object-Oriented Databases (OODB), Native XML Databases (NXD), and Relational Databases (RDB). Previous research indicates that RDB is the most powerful technology for managing XML data to date. Because XML and RDB differ in structure, XML data must be mapped to an RDB schema. This need has prompted numerous researchers and database vendors to propose different approaches for mapping XML documents to an RDB, translating different types of XPath queries to SQL queries, and returning the results in XML format. This paper aims to comprehensively review, in a narrative manner, the most cited and the latest mapping approaches and the database vendors that use an RDB solution to store and query XML documents. The advantages and drawbacks of each approach are discussed, particularly in terms of storing and querying. The paper also provides some insight into managing XML documents using an RDB solution and contributes to the XML community.
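
As a rough illustration of what mapping XML to an RDB and translating an XPath query to SQL can look like, the sketch below stores each element as one row holding a pointer to its parent (a simple edge-style mapping) and answers the path `/library/book/title` with self-joins. It is an assumption-laden example, not one of the approaches reviewed in the paper: the table name `node`, its columns, and the sample document are hypothetical, and attributes and mixed content are ignored.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(xml_text, conn):
    """Store every element as one row: (id, parent id, tag, text)."""
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT, text TEXT)"
    )
    counter = 0

    def walk(elem, parent_id):
        nonlocal counter
        counter += 1  # ids follow document order
        node_id = counter
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?)",
            (node_id, parent_id, elem.tag, (elem.text or "").strip()),
        )
        for child in elem:
            walk(child, node_id)

    walk(ET.fromstring(xml_text), None)


def titles_of_books(conn):
    """SQL counterpart of the XPath query /library/book/title."""
    rows = conn.execute(
        """
        SELECT t.text
        FROM node AS l
        JOIN node AS b ON b.parent = l.id AND b.tag = 'book'
        JOIN node AS t ON t.parent = b.id AND t.tag = 'title'
        WHERE l.parent IS NULL AND l.tag = 'library'
        ORDER BY t.id
        """
    )
    return [row[0] for row in rows]


# Hypothetical sample document.
SAMPLE = "<library><book><title>A</title></book><book><title>B</title></book></library>"
conn = sqlite3.connect(":memory:")
shred(SAMPLE, conn)
print(titles_of_books(conn))
```

Each step of the path costs one self-join in this layout, which hints at why the choice of mapping affects query performance.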

On an Enhancement of XML Applied for Mobile E-Commerce

Journal of Electronic Commerce in Organizations ◽

10.4018/jeco.2012070102 ◽

2012 ◽

Vol 10 (3) ◽

pp. 13-26

Author(s):

Xiaomin Zhu ◽

Zhongxiang He ◽

Shengbo Shi

Keyword(s):

Data Processing ◽

Mobile Computing ◽

Web Service ◽

Size Effects ◽

Processing Time ◽

The Internet ◽

Markup Language ◽

Mobile Web ◽

Xml Documents ◽

Extensible Markup

Extensible Markup Language (XML) is a textual markup language that is becoming increasingly important in Internet web services. However, XML has some distinct disadvantages, such as its inherent redundancy, which consumes limited network bandwidth, especially in mobile computing. In mobile commerce, handset memory capacity and data processing time are two further obstacles to applying XML. This paper studies an enhancement of XML for mobile e-commerce called SXML (Simple XML), intended to improve the XML used in mobile web services. It helps XML producers minimize the size effects of XML, e.g., the size overhead and slow implementation speed. Comprehensive simulations show that SXML can reduce both the size of XML documents and the implementation time, and consequently uses bandwidth more effectively.

An Approach to Extracting Knowledge From Legacy Documents

Volume 4: 24th Computers and Information in Engineering Conference ◽

10.1115/detc2004-57677 ◽

2004 ◽

Author(s):

Richard Crowder ◽

Yee-Wie Sim

Keyword(s):

Engineering Design ◽

Human Resource ◽

Extensible Markup Language ◽

Markup Language ◽

Design Environment ◽

Text Search ◽

Xml Documents ◽

Extensible Markup ◽

Access To Data ◽

Efficiency And Reliability

Organisations are increasingly information intensive; hence providing access to data that is trapped in various proprietary forms, including catalogues, databases, human resource systems and internally generated documents, is becoming a significant and challenging task. The authors have undertaken research into approaches to capture relevant knowledge from legacy documents. This is achieved by converting the legacy documents to XML (eXtensible Markup Language) documents in which the output is semantically tagged. Once in XML form, the data can be easily transformed. This paper describes the development of tools to automate the process of converting legacy documents to XML documents. The purpose of this work is to improve the efficiency and reliability of an Expertise Finder suitable for use within an engineering design environment. We also show that querying the resultant XML versions of legacy documents provides better results than a basic text search over the identical documents when used within an Expertise Finder.

Storing XML Documents in Databases

Encyclopedia of Database Technologies and Applications ◽

10.4018/978-1-59140-560-3.ch108 ◽

2005 ◽

pp. 658-664

Author(s):

Albrecht Schmidt ◽

Stefan Manegold ◽

Martin Kersten

Keyword(s):

Data Management ◽

Relational Database ◽

Query Language ◽

Transaction Management ◽

Markup Language ◽

Xml Documents ◽

Management Technology ◽

Extensible Markup ◽

Database Technology ◽

Exchange Data

Ever since the Extensible Markup Language (XML) (W3C, 1998b) began to be used to exchange data between diverse sources, interest has grown in deploying data management technology to store and query XML documents. A number of approaches propose to adapt relational database technology to store and maintain XML documents (Deutsch, Fernandez & Suciu, 1999; Florescu & Kossmann, 1999; Klettke & Meyer, 2000; Shanmugasundaram et al., 1999; Tatarinov et al., 2002; O’Neil et al., 2004). The advantage is that the XML repository inherits all the power of mature relational technology like indexes and transaction management. For XML-enabled querying, a declarative query language (Chamberlin et al., 2001) is available.

Introduction

Advances in Data Mining and Database Management - Design, Performance, and Analysis of Innovative Information Retrieval ◽

10.4018/978-1-4666-1975-3.ch007 ◽

2013 ◽

pp. 91-95

Author(s):

Badya Al-Hamadani ◽

Joan Lu

Keyword(s):

World Wide Web ◽

World Wide ◽

Extensible Markup Language ◽

Markup Language ◽

Xml Documents ◽

Extensible Markup

The eXtensible Markup Language (XML) is a World Wide Web Consortium (W3C) recommendation that has been widely used in both commerce and research. As the importance of XML documents increases, so does the need to process them. This chapter describes the methodology used throughout the research, discussing each of its parts and how it was adopted.
