Introduction

Author(s):  
Badya Al-Hamadani ◽  
Joan Lu

The eXtensible Markup Language (XML) is a World Wide Web Consortium (W3C) recommendation that has been widely used in both commerce and research. As the importance of XML documents increases, so does the need to process them. This chapter describes the methodology used throughout the research, discussing each of its parts and how they were adopted.

Author(s):  
Adélia Gouveia ◽  
Jorge Cardoso

The World Wide Web (WWW) emerged in 1989, developed by Tim Berners-Lee, who proposed to build a system for sharing information among physicists at CERN (Conseil Européen pour la Recherche Nucléaire), the world’s largest particle physics laboratory. Currently, the WWW is primarily composed of documents written in HTML (HyperText Markup Language), a language that is useful for visual presentation (Cardoso & Sheth, 2005). HTML is a set of “markup” symbols contained in a Web page intended for display in a Web browser. Most of the information on the Web is designed only for human consumption. Humans can read Web pages and understand them, but the inherent meaning of those pages is not expressed in a way that computers can interpret (Cardoso & Sheth, 2006). Since the visual Web does not allow computers to understand the meaning of Web pages (Cardoso, 2007), the W3C (World Wide Web Consortium) started to work on the concept of the Semantic Web, with the objective of developing approaches and solutions for data integration and interoperability. The goal was to develop ways to allow computers to understand Web information. The aim of this chapter is to present the Web Ontology Language (OWL), which can be used to develop Semantic Web applications that understand information and data on the Web. This language was proposed by the W3C and was designed for publishing and sharing data, and for automating the understanding of data by computers, using ontologies. To fully comprehend OWL, we first need to study its origin and the basic building blocks of the language. Therefore, we will start by briefly introducing XML (eXtensible Markup Language), RDF (Resource Description Framework), and RDF Schema (RDFS). These concepts are important since OWL is written in XML and is an extension of RDF and RDFS.


Author(s):  
Richard Crowder ◽  
Yee-Wie Sim

Organisations are increasingly information intensive; hence providing access to data that is trapped in various proprietary forms, including catalogues, databases, human resource systems and internally generated documents, is becoming a significant and challenging task. The authors have undertaken research into approaches to capture relevant knowledge from legacy documents. This is achieved by converting the legacy documents to eXtensible Markup Language (XML) documents in which the output is semantically tagged. Once in XML form, the data can be easily transformed. This paper describes the development of tools to automate the process of converting legacy documents to XML documents. The purpose of this work is to improve the efficiency and reliability of an Expertise Finder suitable for use within an engineering design environment. We will also show that querying the resultant XML versions of legacy documents provides better results than a basic text search over the identical documents when used within an Expertise Finder.
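
The contrast between a basic text search and a query over semantically tagged XML can be illustrated with a minimal sketch. The documents, the tag names (`report`, `author`, `topic`, `body`) and the two helper functions below are hypothetical and are not taken from the authors' tools; the sketch only assumes that a converted document marks up who wrote it and what it is about.

```python
import xml.etree.ElementTree as ET

# Hypothetical semantically tagged reports; the tag names are assumptions,
# not the schema used by the authors.
REPORTS = {
    "report-1": "<report><author>A. Smith</author>"
                "<topic>turbine blade fatigue</topic>"
                "<body>Reviewed by B. Jones.</body></report>",
    "report-2": "<report><author>B. Jones</author>"
                "<topic>gearbox lubrication</topic>"
                "<body>Builds on fatigue results from A. Smith.</body></report>",
}


def text_search(term):
    """Plain-text search: match the term anywhere in the document text."""
    hits = []
    for doc_id, xml_text in REPORTS.items():
        full_text = "".join(ET.fromstring(xml_text).itertext())
        if term in full_text:
            hits.append(doc_id)
    return sorted(hits)


def experts_on(term):
    """Structured query: match the term only inside <topic>, return the <author>."""
    experts = []
    for xml_text in REPORTS.values():
        root = ET.fromstring(xml_text)
        if term in root.findtext("topic", default=""):
            experts.append(root.findtext("author"))
    return sorted(experts)
```

In this toy data set, `text_search("fatigue")` matches both reports because the word also appears in a passing remark, whereas `experts_on("fatigue")` returns only the author whose report is tagged with that topic.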


Author(s):  
Michael Lang

Although its conceptual origins can be traced back a few decades (Bush, 1945), it is only recently that hypermedia has become popularized, principally through its ubiquitous incarnation as the World Wide Web (WWW). In its earlier forms, the Web could only properly be regarded as a primitive, constrained hypermedia implementation (Bieber & Vitali, 1997). Through the emergence in recent years of standards such as eXtensible Markup Language (XML), XLink, Document Object Model (DOM), Synchronized Multimedia Integration Language (SMIL) and WebDAV, as well as additional functionality provided by the Common Gateway Interface (CGI), Java, plug-ins and middleware applications, the Web is now moving closer to an idealized hypermedia environment. Of course, not all hypermedia systems are Web based, nor can all Web-based systems be classified as hypermedia (see Figure 1). See the terms and definitions at the end of this article for clarification of intended meanings. The focus here shall be on hypermedia systems that are delivered and used via the platform of the WWW; that is, Web-based hypermedia systems.


Author(s):  
Mohammad Moradi ◽  
MohammadReza Keyvanpour

Since its introduction, eXtensible Markup Language (XML) has, owing to its expressive capabilities and flexibility, become the de facto standard for representing, storing, and interchanging data on the Web. These features have made XML one of the building blocks of the Semantic Web. Moreover, since XML documents can be considered from content, structural, and semantic perspectives, leveraging their semantics is useful and applicable in different domains. However, XML does not by itself provide any built-in mechanism for governing semantics. For this reason, many studies have been conducted on representing semantics within XML documents and extracting semantics from them. This paper surveys and discusses different aspects of this topic, with an emphasis on the state of semantics in XML and the methods for representing it.


Author(s):  
Yangjun Chen

With the rapid advance of the Internet, management of structured documents such as XML documents has become more and more important (Marchiori, 1998). As a simplified version of SGML, XML is recommended by W3C (World Wide Web Consortium, 1998a; World Wide Web Consortium, 1998b) as a document description meta-language to exchange and manipulate data and documents on the WWW. It has been used to code various types of data in a wide range of application domains, including a Chemical Markup Language for exchanging data about molecules, the Open Financial Exchange for swapping financial data between banks and between banks and their customers, as well as a Geographical Markup Language for searching geographical information (Bosak, 1997; Zhang & Gruenwald, 2001). Also, a growing number of legacy systems are being adapted to output data in the form of XML documents.


2012 ◽  
Author(s):  
Ren Hui Gong ◽  
Ziv Yaniv

The Insight Segmentation and Registration Toolkit (ITK) previously provided a framework for parsing Extensible Markup Language (XML) documents using the Simple API for XML (SAX) framework. While this programming model is memory efficient, it places most of the implementation burden on the user. We provide an implementation of the Document Object Model (DOM) framework for parsing XML documents. Using this model, user code is greatly simplified, shifting most of the implementation burden from the user to the framework. The provided implementation consists of two tiers. The lower level tier provides functionality for parsing XML documents and loading the tree structure into memory. It then allows the user to query and retrieve specific entries. The upper tier uses this functionality to provide an interface for mimicking a serialization and de-serialization mechanism for ITK objects. The implementation described in this document was incorporated into ITK as part of release 4.2.
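
The difference between the two programming models can be sketched with Python's standard library rather than with ITK itself. This is only an illustration of SAX versus DOM in general: the sample document, its element and attribute names, and the helper functions are hypothetical and do not reflect the ITK classes described above.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document; element and attribute names are illustrative only.
XML_TEXT = """<transforms>
  <transform name="rigid" dimension="3"/>
  <transform name="affine" dimension="2"/>
</transforms>"""


class TransformNameHandler(xml.sax.ContentHandler):
    """SAX model: the user writes callbacks and accumulates state by hand."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called once per opening tag while the document is streamed.
        if name == "transform":
            self.names.append(attrs.getValue("name"))


def names_via_sax(text):
    handler = TransformNameHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.names


def names_via_dom(text):
    """DOM model: the parser loads the whole tree, which is then queried."""
    document = minidom.parseString(text)
    elements = document.getElementsByTagName("transform")
    return [element.getAttribute("name") for element in elements]
```

Both functions return the same list of names. The SAX version never holds the whole document in memory but needs a handler class, while the DOM version trades memory for a short query over the loaded tree.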

