XML Native Storage and Query Processing

Author(s):  
Ning Zhang ◽  
Tamer M. Özsu

As XML has evolved as a data model for semi-structured data and the de facto standard for data exchange (e.g., Atom, RSS, and XBRL), XML data management has been the subject of extensive research and development in both academia and industry. Among XML data management issues, storage and query processing are the most critical with respect to system performance. Each storage scheme has its own pros and cons: some are more amenable to fast navigation, while others perform better in fragment extraction and document reconstruction. Different systems therefore adopt different storage schemes, based on their own requirements, trading off one set of features against another. In this chapter, the authors review native storage formats and query processing techniques that have been developed in both academia and industry. Various XML indexing techniques are also presented, since they can be treated as specialized storage and query processing tools.

Author(s):  
Zhen Hua Liu ◽  
Anguel Novoselsky ◽  
Vikas Arora

Since the advent of XML, there has been significant research into integrating XML data management with Relational DBMS and Object Relational DBMS (ORDBMS). This chapter describes the XML data management capabilities in ORDBMS, the design approaches and implementation techniques that support these capabilities, and the pros and cons of each approach. Key topics such as XML storage, XML indexing, and XQuery and SQL/XML processing are discussed in depth, presenting both academic and industrial research work in these areas.


Author(s):  
Pasquale De Meo ◽  
Antonino Nocera ◽  
Domenico Ursino

Handling interoperability issues across multiple, heterogeneous XML sources is central to XML data management and mining. In this chapter, we present a framework for the intensional integration and exploration of XML sources. Specifically, we propose a three-layer framework aimed at extracting interschema knowledge from the available sources, constructing a hierarchy based on the extracted knowledge to represent the sources at different abstraction levels, and finally organizing and exploring the sources through the constructed hierarchy. We also describe possible implementations of each of the three layers, focusing on the extraction of intensional interschema properties, the intensional integration of XML sources, and the clustering of XML schemas. To better handle the complexity of its activities, the proposed framework has been designed using the Layers architecture pattern and the component-based development paradigm.


Author(s):  
Alessandro Campi

This chapter describes a visual framework, called XQBE, that covers the most important aspects of XML data management, spanning the visualization of XML documents, the formulation of queries, the representation and specification of document schemata, the definition of integrity constraints, the formulation of updates, and the expression of reactive behaviors in response to data modifications. All these features are strongly unified by a common visual abstraction and a few recurrent paradigms, so as to provide a homogeneous and comprehensive environment that allows even users without advanced programming skills to deal with nontrivial XML data management and transformation tasks. The ambiguity inherent in any visual representation of richly expressive languages required a considerable effort to formalize the semantics of XQBE, which eventually led to a solution with major advantages in terms of intuitiveness. In other words, the unique (and unambiguous) effect of a statement is the one the user would expect.


Author(s):  
Albrecht Schmidt ◽  
Florian Waas ◽  
Martin Kersten ◽  
Michael J. Carey ◽  
Ioana Manolescu ◽  
...  

Author(s):  
Chin-Wan Chung ◽  
Myung-Jae Park ◽  
Jihyun Lee

To effectively reduce the redundancy and verbosity of XML data, various studies on XML compression have been conducted. In particular, XML data management systems and applications require support for direct query processing and updates on compressed XML data, stream-based compression and decompression, and a reduction in the size of the compressed data. To fully support these aspects of XML compression, existing XML compression techniques should be carefully examined and additional requirements for such techniques should be considered. In this chapter, the authors first classify existing representative XML compression techniques according to their characteristics. Second, they explain the details of XML-specific compression techniques. Third, they summarize the performance of the compression techniques in terms of compression ratio and compression and decompression time. Lastly, they present some future research directions.

