Reverse Engineering from an XML Document into an Extended DTD Graph

2009 ◽  
pp. 1313-1333
Author(s):  
Herbert Shiu ◽  
Joseph Fong

The extensible markup language (XML) has become a standard for persistent storage and data interchange via the Internet due to its openness, self-descriptiveness, and flexibility. This article proposes a systematic approach to reverse engineer arbitrary XML documents into their conceptual schema, extended DTD graphs, which are DTD graphs with data semantics. The proposed approach not only determines the structure of the XML document, but also derives candidate data semantics from the XML element instances by treating each XML element instance as a record in a table of a relational database. One application of the determined data semantics is to verify the linkages among elements. Implicit and explicit referential linkages among XML elements are modeled by the parent-children structure and ID/IDREF(S), respectively. As a result, an arbitrary XML document can be reverse engineered into its conceptual schema in an extended DTD graph format.
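The idea of treating each element instance as a record, and then checking explicit linkages, can be illustrated with a minimal sketch. It is not the authors' algorithm: the sample document, the attribute names `id` and `ref`, and the helper names `elements_as_records` and `dangling_references` are all assumptions made for illustration, since a document without a DTD does not declare which attributes are ID or IDREF(S).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document; element and attribute names are
# illustrative assumptions, not taken from the paper.
SAMPLE = """
<school>
  <course id="c1" title="Databases"/>
  <course id="c2" title="Networks"/>
  <student id="s1" name="Ann"><enrol ref="c1"/></student>
  <student id="s2" name="Bob"><enrol ref="c9"/></student>
</school>
"""

def elements_as_records(root):
    """Group element instances by tag, one attribute dict per instance."""
    tables = defaultdict(list)
    for elem in root.iter():
        tables[elem.tag].append(dict(elem.attrib))
    return tables

def dangling_references(tables, id_attr="id", ref_attr="ref"):
    """Return reference values that match no identifier in the document."""
    ids = {rec[id_attr] for recs in tables.values()
           for rec in recs if id_attr in rec}
    refs = [rec[ref_attr] for recs in tables.values()
            for rec in recs if ref_attr in rec]
    return [r for r in refs if r not in ids]

root = ET.fromstring(SAMPLE)
tables = elements_as_records(root)
print(sorted(tables))               # tags found, each acting as a "table"
print(dangling_references(tables))  # references with no matching identifier
```

Each tag plays the role of a table and each element instance the role of a row, so a linkage check reduces to a lookup of reference values against the set of identifier values.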

Author(s):  
Joseph Fong ◽  
Herbert Shiu

Extensible Markup Language (XML) has become a standard for persistent storage and data interchange via the Internet due to its openness, self-descriptiveness and flexibility. This chapter proposes a systematic approach to reverse engineer arbitrary XML documents into their conceptual schema, Extended DTD Graphs, which are DTD Graphs with data semantics. The proposed approach not only determines the structure of the XML document, but also derives candidate data semantics from the XML element instances by treating each XML element instance as a record in a table of a relational database. One application of the determined data semantics is to verify the linkages among elements. Implicit and explicit referential linkages among XML elements are modeled by the parent-children structure and ID/IDREF(S), respectively. As a result, an arbitrary XML document can be reverse engineered into its conceptual schema in an Extended DTD Graph format.


2009 ◽  
pp. 2489-2509 ◽  
Author(s):  
Herbert Shiu ◽  
Joseph Fong

The extensible markup language (XML) has become a standard for persistent storage and data interchange via the Internet due to its openness, self-descriptiveness, and flexibility. This article proposes a systematic approach to reverse engineer arbitrary XML documents into their conceptual schema, extended DTD graphs, which are DTD graphs with data semantics. The proposed approach not only determines the structure of the XML document, but also derives candidate data semantics from the XML element instances by treating each XML element instance as a record in a table of a relational database. One application of the determined data semantics is to verify the linkages among elements. Implicit and explicit referential linkages among XML elements are modeled by the parent-children structure and ID/IDREF(S), respectively. As a result, an arbitrary XML document can be reverse engineered into its conceptual schema in an extended DTD graph format.


Author(s):  
Herbert Shiu ◽  
Joseph Fong

Extensible Markup Language (XML) has become a standard for persistent storage and data interchange via the Internet due to its openness, self-descriptiveness and flexibility. This paper proposes a systematic approach to reverse engineer arbitrary XML documents into their conceptual schema, Extended DTD Graphs, which are DTD Graphs with data semantics. The proposed approach not only determines the structure of the XML document, but also derives candidate data semantics from the XML element instances by treating each XML element instance as a record in a table of a relational database. One application of the determined data semantics is to verify the linkages among elements. Implicit and explicit referential linkages among XML elements are modeled by the parent-children structure and ID/IDREF(S), respectively. As a result, an arbitrary XML document can be reverse engineered into its conceptual schema in an Extended DTD Graph format.


Author(s):  
JOSEPH FONG ◽  
ANTHONY FONG ◽  
H. K. WONG ◽  
PHILIP YU

With XML adopted as the technology trend on the Internet, and with investment in the current relational database systems, companies must convert their relational data into XML documents for data transmission on the Internet. In the process, to preserve the users' relational data constraints in the converted XML documents, we must define a meaningful root element for each XML document. The construction of an XML document is based on the root element and its relevant elements. The root element can be selected from a relational entity table in the existing relational database, depending on the business requirements the document is meant to present. The relevant elements are mapped from the related entities, based on the navigability of the chosen entity. The derived root and relevant elements can form a Document Type Definition Graph (DTD-graph) of an XML conceptual schema diagram, which can be mapped into a Document Type Definition (DTD) of an XML schema. The result is a translated XML schema with semantic constraints transferred from a relational conceptual schema of an Extended Entity Relationship (EER) model. The data conversion from relational data to the XML documents can be done after the schema translation. The relational data are loaded into XML documents according to the translated DTD.
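A minimal sketch of the root-element idea follows, assuming a hypothetical two-table schema (`department`, `employee`) in which the department entity is chosen as the root and its employees become the relevant child elements. The table names, column names, and the function `department_to_xml` are illustrative assumptions; the sketch does not reproduce the paper's EER-to-DTD translation rules.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical relational data: one department row owns many employee rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name TEXT,
        dept_id INTEGER REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (1, 'Sales');
    INSERT INTO employee VALUES (10, 'Ann', 1), (11, 'Bob', 1);
""")

def department_to_xml(conn, dept_id):
    """Build an XML tree rooted at one department, nesting its employees."""
    row = conn.execute(
        "SELECT dept_id, name FROM department WHERE dept_id = ?", (dept_id,)
    ).fetchone()
    # The chosen entity row becomes the root element of the document.
    root = ET.Element("department", {"dept_id": str(row[0]), "name": row[1]})
    employees = conn.execute(
        "SELECT emp_id, name FROM employee WHERE dept_id = ? ORDER BY emp_id",
        (dept_id,),
    )
    # Rows reachable through the foreign key become child elements.
    for emp_id, name in employees:
        ET.SubElement(root, "employee", {"emp_id": str(emp_id), "name": name})
    return root

doc = department_to_xml(conn, 1)
print(ET.tostring(doc, encoding="unicode"))
```

Here the one-to-many foreign key is expressed as parent-child nesting, which is one simple way a relational constraint can be carried into the document structure.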


Author(s):  
Abad Shah ◽  
Jacob Adeniyi ◽  
Tariq Al Tuwairqi

The Web and XML have influenced all walks of life for those who transact business over the Internet. People like to do their transactions from their homes to save time and money. For example, customers like to pay their utility bills and carry out other banking transactions from their homes through the Internet. Most companies, including banks, maintain their records using relational database technology. But the traditional relational database technology is unable to provide all these new facilities to the customers. To make the traditional relational database technology cope with the Web and XML technologies, we need a transformation between the XML technology and the relational database technology as middleware. In this chapter, we present a new and simpler algorithm for this purpose. This algorithm transforms the schema of an XML document into a relational database schema, taking into consideration the requirements of relational database technology.


2019 ◽  
Vol 18 (04) ◽  
pp. 1950048
Author(s):  
Amjad Qtaish ◽  
Mohammad T. Alshammari

Extensible Markup Language (XML) has become a common language for data interchange and data representation on the Web. The evolution of the big data environment and the large volume of data represented by XML on the Web increase the challenges of effectively managing such data in terms of storing and querying. Numerous solutions have been introduced to store and query XML data, including file systems, Object-Oriented Databases (OODB), Native XML Databases (NXD), and Relational Databases (RDB). Previous research attempts indicate that RDB is the most powerful technology for managing XML data to date. Because of the structural differences between XML and RDB, the need to map XML data to an RDB schema has increased. This need has prompted numerous researchers and database vendors to propose different approaches to map XML documents to an RDB, translating different types of XPath queries to SQL queries and returning the results in an XML format. This paper aims to comprehensively review, in a narrative manner, the most cited and latest mapping approaches and the database vendors that use an RDB solution to store and query XML documents. The advantages and drawbacks of each approach are discussed, particularly in terms of storing and querying. The paper also provides some insight into managing XML documents using an RDB solution in terms of storing and querying and contributes to the XML community.
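To make the mapping idea concrete, here is a minimal sketch of one generic layout, a single node table with parent pointers, together with a hand-written SQL query that answers the path `/library/book/title`. It is not any specific approach reviewed in the paper, and it is not an automatic XPath translator; the table name `node`, its columns, the sample document, and the function `shred` are assumptions for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document.
SAMPLE = ("<library><book><title>XML</title></book>"
          "<book><title>SQL</title></book></library>")

def shred(conn, root):
    """Store every element as a row: (node id, parent id, tag, text)."""
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, "
        "tag TEXT, text TEXT)"
    )
    counter = 0
    stack = [(root, None)]
    while stack:
        elem, parent_id = stack.pop()
        counter += 1
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?)",
            (counter, parent_id, elem.tag, (elem.text or "").strip()),
        )
        # Push children in reverse so ids follow document order.
        for child in reversed(list(elem)):
            stack.append((child, counter))

conn = sqlite3.connect(":memory:")
shred(conn, ET.fromstring(SAMPLE))

# Hand-written SQL equivalent of the path /library/book/title:
# one self-join per step of the path.
PATH_SQL = """
    SELECT t.text
    FROM node AS l
    JOIN node AS b ON b.parent = l.id
    JOIN node AS t ON t.parent = b.id
    WHERE l.parent IS NULL AND l.tag = 'library'
      AND b.tag = 'book' AND t.tag = 'title'
    ORDER BY t.id
"""
titles = [row[0] for row in conn.execute(PATH_SQL)]
print(titles)
```

Each step of the path costs one self-join in this layout, which hints at why storage layout and query translation are usually assessed together.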


Author(s):  
Ibrahim Dweib ◽  
Joan Lu

Extensible Markup Language (XML) is nowadays one of the most important standard media used for exchanging and representing data through the Internet. Storing, updating, and retrieving the huge amount of web services data such as XML is an attractive area of research for researchers and database vendors. In this chapter, the authors propose and develop a new mapping model, called MAXDOR, for storing, rebuilding, updating, and querying XML documents using a relational database without making use of any XML schemas in the mapping process. The model addresses the problem of bridging the structural gap between ordered hierarchical XML and the unordered tabular relational database, so that relational database systems can be used for storing, updating, and querying XML data. A multiple link list is used to maintain XML document structure, manage the process of updating document contents, and retrieve document contents efficiently. Experiments are done to evaluate the MAXDOR model. MAXDOR is compared with other well-known models available in the literature (Tatarinov et al., 2002) and (Torsten et al., 2004) using the total expected value of the execution time for rebuilding an XML document and for inserting a token.


2012 ◽  
Vol 10 (3) ◽  
pp. 13-26
Author(s):  
Xiaomin Zhu ◽  
Zhongxiang He ◽  
Shengbo Shi

Extensible Markup Language (XML) is a textual markup language that is becoming more and more important in Internet web services. However, XML has some distinct disadvantages, such as its redundancy, which consumes a great deal of the limited network bandwidth, especially in mobile computing. Given the characteristics of mobile commerce, the handsets' memory capacity and data processing time are two obstacles to applying XML. This paper studies an enhancement of XML for application in mobile e-commerce, called SXML (Simple XML), intended to improve the XML used in mobile web services. It helps XML producers minimize the size effects of XML, e.g., the size overhead and slow implementation speed. Comprehensive simulations show that SXML can reduce the size of XML documents and the time of implementation, and consequently utilize the bandwidth effectively.


2008 ◽  
Vol 8 (3) ◽  
pp. 323-361 ◽  
Author(s):  
J. M. ALMENDROS-JIMÉNEZ ◽  
A. BECERRA-TERÓN ◽  
F. J. ENCISO-BAÑOS

Extensible Markup Language (XML) is a simple, very flexible text format derived from SGML. Originally designed to meet the challenges of large-scale electronic publishing, XML is also playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere. The XPath language is the result of an effort to provide a means of addressing parts of an XML document. In support of this primary purpose, it becomes a query language against an XML document. In this paper we present a proposal for the implementation of the XPath language in logic programming. With this aim we will describe the representation of XML documents by means of a logic program. Rules and facts can be used for representing the document schema and the XML document itself. In particular, we will present how to index XML documents in logic programs: rules are supposed to be stored in main memory, whereas facts are stored in secondary memory by using two kinds of indexes: one for each XML tag, and another for each group of terminal items. In addition, we will study how to query by means of the XPath language against a logic program representing an XML document. It involves the specialization of the logic program with regard to the XPath expression. Finally, we will also explain how to combine the indexing and the top-down evaluation of the logic program.


Author(s):  
Albrecht Schmidt ◽  
Stefan Manegold ◽  
Martin Kersten

Ever since the Extensible Markup Language (XML) (W3C, 1998b) began to be used to exchange data between diverse sources, interest has grown in deploying data management technology to store and query XML documents. A number of approaches propose to adapt relational database technology to store and maintain XML documents (Deutsch, Fernandez & Suciu, 1999; Florescu & Kossmann, 1999; Klettke & Meyer, 2000; Shanmugasundaram et al., 1999; Tatarinov et al., 2002; O’Neil et al., 2004). The advantage is that the XML repository inherits all the power of mature relational technology like indexes and transaction management. For XML-enabled querying, a declarative query language (Chamberlin et al., 2001) is available.

