My document object model can do more than yours

Author(s):  
Alain Couthures

Document object models, specifically the browser DOM, were designed to represent HTML and XML documents. Languages such as XPath were designed to access and traverse the DOM of HTML and XML documents. But suppose we wanted to bring the power and convenience of XML technologies like XPath to new data types. Could we extend the DOM to support CSV files? JSON? ZIP files? Yes we can! This paper explores a number of ways in which the DOM can be made to do more. We can loosen restrictions, describe new sequence types, and even define new XPath axes to make the DOM better and more useful.
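The idea of querying non-XML data through a tree model can be illustrated with a minimal sketch. It is not the approach developed in the paper: it simply assumes a CSV file is mapped to one `row` element per record and one child element per column, using Python's standard library. The function name `csv_to_tree`, the element names `csv` and `row`, and the sample data are all hypothetical, and the column headers are assumed to be valid XML names.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_tree(text):
    """Build an element tree from CSV text: one <row> per record, one child per column.

    Assumes every column header is usable as an XML element name.
    """
    root = ET.Element("csv")
    for record in csv.DictReader(io.StringIO(text)):
        row = ET.SubElement(root, "row")
        for column, value in record.items():
            ET.SubElement(row, column).text = value
    return root


# Hypothetical sample data, for illustration only.
SAMPLE = "name,city\nAda,London\nLinus,Helsinki\n"

tree = csv_to_tree(SAMPLE)

# ElementTree implements only a subset of XPath; a child-text predicate is within it.
names = [row.findtext("name") for row in tree.findall("./row[city='London']")]
```

Once the data sits in a tree, the same path expression style used for XML applies to the CSV records.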

Author(s):  
Martin Probst

As the adoption of XML reaches more and more application domains, data sizes increase and efficient XML handling becomes ever more important. Many applications face scalability problems due to the overhead of XML parsing, the difficulty of effectively finding particular XML nodes, or the sheer size of XML documents, which nowadays can easily exceed gigabytes of data. In particular, the last of these issues can make certain tasks seemingly impossible to handle, as many applications depend on parsing XML documents completely into a Document Object Model (DOM) memory structure. Parsing XML into a DOM typically requires nearly as much memory as the serialized XML would consume, or even more, making it prohibitively expensive to handle XML documents in the gigabyte range. Recent research and development suggest that it is possible to modify these applications to run a wide range of tasks in a streaming fashion, thus limiting the memory consumption of individual applications. However, this requires changes not only in the underlying tools but often also in user code, such as XSLT stylesheets. These required changes can often be unintuitive and complicate user code. A different approach is to run applications against an efficient, persistent, hard-disk-backed DOM implementation that does not require entire documents to be in memory at one time. This talk will discuss such a DOM implementation, EMC's xDB, showing how to use binary XML and efficient backend structures to provide a standards-compliant, non-memory-backed, transactional DOM implementation with little overhead compared to regular memory-based DOMs. It will also give performance comparisons and show how to run existing applications transparently against xDB's DOM implementation, using XSLT stylesheets as an example.
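The contrast between full in-memory parsing and streaming can be sketched as follows. This example uses Python's standard `xml.etree.ElementTree` and has nothing to do with xDB's API; the function names, the `items`/`item` element names, and the sample document are assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET


def count_items_dom(source, tag):
    """Parse the whole document into memory, then count elements named `tag`."""
    root = ET.parse(source).getroot()
    return sum(1 for _ in root.iter(tag))


def count_items_streaming(source, tag):
    """Count elements named `tag` while discarding element content as parsing proceeds."""
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
        # Drop the element's text, attributes and children once it has been seen.
        # The emptied element stays attached to its parent, so memory use is
        # reduced rather than strictly constant.
        elem.clear()
    return count


# Hypothetical sample document, for illustration only.
SAMPLE = b"<items><item>a</item><item>b</item><other/></items>"

dom_count = count_items_dom(io.BytesIO(SAMPLE), "item")
stream_count = count_items_streaming(io.BytesIO(SAMPLE), "item")
```

Both functions return the same answer, but the streaming version had to be written differently, which is the kind of user-code change the abstract describes as a cost of the streaming approach.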


2012
Author(s):  
Ren Hui Gong, Ziv Yaniv

The Insight Segmentation and Registration Toolkit (ITK) previously provided a framework for parsing Extensible Markup Language (XML) documents using the Simple API for XML (SAX) framework. While this programming model is memory efficient, it places most of the implementation burden on the user. We provide an implementation of the Document Object Model (DOM) framework for parsing XML documents. Using this model, user code is greatly simplified, shifting most of the implementation burden from the user to the framework. The provided implementation consists of two tiers. The lower tier provides functionality for parsing XML documents and loading the tree structure into memory, and then allows the user to query and retrieve specific entries. The upper tier uses this functionality to provide an interface that mimics a serialization and de-serialization mechanism for ITK objects. The implementation described in this document was incorporated into ITK as part of release 4.2.
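The difference in implementation burden between the two programming models can be seen in a small sketch. It uses Python's standard `xml.sax` and `xml.dom.minidom` modules, not ITK's C++ classes, and the `params`/`param` document layout, the handler class, and the function names are hypothetical.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical parameter file, for illustration only.
DOC = b"<params><param name='sigma'>2.5</param><param name='iterations'>10</param></params>"


class ParamHandler(xml.sax.ContentHandler):
    """SAX style: the user tracks parser state across callbacks."""

    def __init__(self):
        super().__init__()
        self.params = {}
        self._name = None
        self._chunks = []

    def startElement(self, name, attrs):
        if name == "param":
            self._name = attrs.get("name")
            self._chunks = []

    def characters(self, content):
        # Text may arrive in several pieces, so it is accumulated.
        if self._name is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "param" and self._name is not None:
            self.params[self._name] = "".join(self._chunks)
            self._name = None


def read_params_sax(data):
    handler = ParamHandler()
    xml.sax.parseString(data, handler)
    return handler.params


def read_params_dom(data):
    """DOM style: the tree is built by the framework and then queried directly."""
    dom = minidom.parseString(data)
    return {
        node.getAttribute("name"): node.firstChild.data
        for node in dom.getElementsByTagName("param")
    }


sax_params = read_params_sax(DOC)
dom_params = read_params_dom(DOC)
```

The SAX version needs a stateful handler class, while the DOM version is a single query over the loaded tree, at the price of holding the whole document in memory.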


Author(s):  
Yasser Hachaichi, Jamel Feki, Hanene Ben-Abdallah

Due to international economic competition, enterprises are constantly looking for efficient methods to build data marts/warehouses to analyze the large volumes of data involved in their decision-making process. On the other hand, even though the relational data model is the most commonly used model, any data mart/warehouse construction method must now deal with other data types, in particular XML documents, which represent the dominant type of data exchanged between partners and retrieved from the Web. This chapter presents a data mart design method that starts from both a relational database source and XML documents compliant with a given DTD. Besides considering these two types of data structures, the originality of our method lies in its being decision-maker centered, in its automatic extraction of loadable data mart schemas, and in its genericity.


Author(s):  
Abad Shah, Jacob Adeniyi, Tariq Al Tuwairqi

The Web and XML have influenced all walks of life for those who transact business over the Internet. People like to carry out their transactions from home to save time and money. For example, customers like to pay their utility bills and perform other banking transactions from home through the Internet. Most companies, including banks, maintain their records using relational database technology. But traditional relational database technology is unable to provide all these new facilities to customers. To make traditional relational database technology cope with the Web and XML technologies, we need a transformation between XML technology and relational database technology as middleware. In this chapter, we present a new and simpler algorithm for this purpose. This algorithm transforms the schema of an XML document into a relational database schema, taking into consideration the requirements of relational database technology.

