XML Dataset and Benchmarks for Performance Testing of the CLS Labelling Scheme

Extensible Markup Language (XML) has become a significant technology for transferring data over the Internet. XML labelling schemes are an essential technique for handling XML data effectively: labelling assigns a label to every node in an XML document. The CLS labelling scheme is a hybrid scheme that was developed to address some limitations of indexing XML data. Labelling schemes are tested on XML datasets and benchmarks, many of which are now available; some are real-life datasets and others are artificial. This paper discusses these datasets and benchmarks and their specifications in order to determine the most appropriate one for testing the CLS labelling scheme. It finds that the XMark benchmark is the most appropriate choice for testing the performance of the CLS labelling scheme.


Analisis Empiris Atas ORDPATH Encoding Untuk Kinerja Insert Node Pada Ordered XML Tree (Empirical Analysis of ORDPATH Encoding for Node Insertion Performance on Ordered XML Trees)

Jurnal Buana Informatika ◽

10.24002/jbi.v6i4.457 ◽

2015 ◽

Vol 6 (4) ◽

Author(s):

Irvanizam Zamanhuri

Keyword(s):

Data Structure ◽

Data Exchange ◽

Interval Number ◽

Extensible Markup Language ◽

Markup Language ◽

Xml Data ◽

Xml Document ◽

Extensible Markup ◽

Ordered Tree

Abstract. The eXtensible Markup Language (XML) has quickly become the de facto standard for data exchange via the web. An XML document can be viewed as an ordered tree that has at least one node. Each node must be labeled by a labeling scheme to describe the XML data structure. There are two well-known existing encodings, namely the Dewey and Interval encodings. In this paper, ORDPATH encoding, which is based on Dewey, is empirically demonstrated together with the two other encodings on the dblp, nasa, and treebank datasets. The results show that when a new node is inserted into the tree, Dewey has to relabel the inserted node's siblings and Interval has to modify the interval numbers of the sibling nodes. ORDPATH, by contrast, eliminates this problem by using an even number as a caret for the newly inserted node. Keywords: Ordered Tree, XML, ORDPATH.
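The relabeling contrast described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: it models only the last (sibling) component of each label, the function names are hypothetical, and the ORDPATH helper covers only the simplest case of inserting between two adjacent odd ordinals.

```python
def dewey_insert(ordinals, index):
    """Insert a new sibling at 0-based position `index` among Dewey ordinals.

    Dewey ordinals are consecutive integers, so every following sibling
    has to be shifted by one. Returns (new_ordinals, number_relabelled).
    """
    shifted = [o + 1 for o in ordinals[index:]]
    new_ordinal = ordinals[index - 1] + 1 if index > 0 else 1
    return ordinals[:index] + [new_ordinal] + shifted, len(shifted)


def ordpath_initial(n):
    """Initial ORDPATH sibling ordinals use odd numbers only: 1, 3, 5, ..."""
    return [(2 * i + 1,) for i in range(n)]


def ordpath_insert_between(left, right):
    """Label for a node inserted between two adjacent initial siblings.

    Simplified assumption: `left` and `right` are single odd ordinals that
    differ by 2. The even number in between acts as a caret and is followed
    by an odd ordinal, so no existing label changes.
    """
    if len(left) != 1 or len(right) != 1 or right[0] - left[0] != 2:
        raise ValueError("sketch only handles adjacent initial ordinals")
    return (left[0] + 1, 1)


# Dewey: inserting after the first of three siblings relabels two nodes.
dewey_labels, relabelled = dewey_insert([1, 2, 3], 1)

# ORDPATH: the same insertion leaves the existing labels untouched.
ordpath_labels = ordpath_initial(3)
new_label = ordpath_insert_between(ordpath_labels[0], ordpath_labels[1])
in_document_order = sorted(ordpath_labels + [new_label])
```

Comparing the tuples lexicographically keeps the new label `(2, 1)` between `(1,)` and `(3,)`, which is why the existing siblings do not need new labels in this sketch.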


Discovering XML Conditional Dependencies for Data Quality Issues

European Journal of Electrical Engineering and Computer Science ◽

10.24018/ejece.2020.4.1.156 ◽

2020 ◽

Vol 4 (1) ◽

Author(s):

Mohammed Ragheb Hakawati ◽

Yasmin Yacob ◽

Amiza Amir ◽

Jabiry M. Mohammed ◽

Khalid Jamal Jadaa

Keyword(s):

Data Quality ◽

Primary Standard ◽

Markup Language ◽

Document Type ◽

Data Dependencies ◽

Master Data ◽

Xml Document ◽

Extensible Markup ◽

Quality Issues ◽

Mining Algorithms

Extensible Markup Language (XML) is emerging as the primary standard for representing and exchanging data; with more than 60% of the total, XML is considered the most dominant document type on the web. Nevertheless, the quality of XML data is not as expected. XML integrity constraints, especially XFDs, play an important role in keeping an XML dataset as consistent as possible, but their ability to solve data quality issues is still intangible. The main reason is that old-fashioned data dependencies were introduced to maintain the consistency of the schema rather than that of the data. The purpose of this study is to introduce a method for discovering pattern tableaus for XML conditional dependencies, to be used for enhancing XML document consistency as part of the data quality improvement phases. Conditional dependencies, as new rules, are designed mainly to improve data instances; they extend traditional XML dependencies by enforcing pattern tableaus of semantically related constants. A set of minimal approximate conditional dependencies (XCFD, XCIND) is then discovered and learned from the XML tree using a set of mining algorithms. The discovered patterns can be used as master data to detect inconsistencies that do not respect the majority of the dataset.


The Globalization of Crystallographic Knowledge

Acta Crystallographica Section D Biological Crystallography ◽

10.1107/s0907444998009366 ◽

1998 ◽

Vol 54 (6) ◽

pp. 1065-1070 ◽

Cited By ~ 4

Author(s):

Peter Murray-Rust

Keyword(s):

World Wide ◽

Distributed Databases ◽

Quality Data ◽

Markup Language ◽

High Quality ◽

High Quality Data ◽

The World ◽

Extensible Markup ◽

Structured Documents ◽

New Generation

The rapid growth of the World Wide Web provides major new opportunities for distributed databases, especially in macromolecular science. A new generation of technology, based on structured documents (SD), is being developed which will integrate documents and data in a seamless manner. This offers experimentalists the chance to publish and archive high-quality data from any discipline. Data and documents from different disciplines can be combined and searched using technology such as eXtensible Markup Language (XML) and its associated support for hypermedia (XLL), metadata (RDF) and stylesheets (XSL). Opportunities in crystallography and related disciplines are described.


Extranets and XML: The Next Internal Control Challenge

Review of Business Information Systems (RBIS) ◽

10.19030/rbis.v4i1.5388 ◽

2000 ◽

Vol 4 (1) ◽

pp. 47-50

Author(s):

Rick Elam ◽

Zabihollah Rezaee

Keyword(s):

Internal Control ◽

World Wide ◽

Electronic Data Interchange ◽

Markup Language ◽

Data Interchange ◽

The World ◽

Extensible Markup ◽

Flexible Programming ◽

Business To Business ◽

Control Challenge

The purpose of this article is to describe the shift of business-to-business trading from Electronic Data Interchange (EDI) to extranets and to discuss some of the internal control challenges created by extranets and the eXtensible Markup Language (XML). This technology raises internal control issues because extranets use the World Wide Web to communicate and because XML is such a powerful and flexible programming language.


Transforming data-centric eXtensible markup language into relational databases using hybrid approach

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v10i6.2865 ◽

2021 ◽

Vol 10 (6) ◽

pp. 3256-3264

Author(s):

Su-Cheng Haw ◽

Emyliana Song

Keyword(s):

Relational Databases ◽

Hybrid Approach ◽

Data Representation ◽

Extensible Markup Language ◽

Markup Language ◽

Seamless Integration ◽

Structural Relationships ◽

Extensible Markup ◽

Core Framework ◽

Labelling Scheme

The eXtensible Markup Language (XML) has emerged internationally as the format for data representation over the web. Yet most organizations still use relational databases as their database solutions. As such, it is crucial to provide seamless integration via effective transformation between these database infrastructures. In this paper, we propose XML-REG to bridge these two technologies based on node-based and path-based approaches. The node-based approach is well suited to annotating each positional node uniquely, while the path-based approach provides summarised path information to join the nodes. In addition, a new range labelling is proposed that annotates nodes uniquely while ensuring that the structural relationships between nodes are maintained. If a new node is added to the document, re-labelling is not required, as the new label is assigned to the node via the proposed labelling scheme. Experimental evaluations indicated that XML-REG outperformed XMap, XRecursive, XAncestor and Mini-XML in storing time, query retrieval time and scalability. This research produces a core framework for mapping XML to relational databases (RDB), which could be adopted in various industries.
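The abstract does not spell out the proposed range labelling, so the following Python sketch shows only the generic idea behind range (interval) labels: each node gets a (start, end) pair from a depth-first walk, and an ancestor-descendant relationship reduces to interval containment. The function names and the sample document are assumptions for illustration and are not taken from XML-REG; unlike the scheme in the abstract, this plain version would need relabelling when nodes are inserted.

```python
import xml.etree.ElementTree as ET


def range_label(root):
    """Assign a (start, end) pair to every element by a depth-first walk."""
    labels = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        start = counter          # number handed out on entering the node
        for child in node:
            visit(child)
        counter += 1             # number handed out on leaving the node
        labels[node] = (start, counter)

    visit(root)
    return labels


def is_ancestor(a, d):
    """True if the node labelled `a` strictly contains the node labelled `d`."""
    return a[0] < d[0] and d[1] < a[1]


# Hypothetical sample document; tags are unique so they can serve as keys.
doc = ET.fromstring("<lib><book><title/><author/></book><journal/></lib>")
by_tag = {el.tag: label for el, label in range_label(doc).items()}
```

With these labels, `is_ancestor(by_tag["book"], by_tag["title"])` holds while `is_ancestor(by_tag["journal"], by_tag["title"])` does not, without touching the tree itself.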


Application of XML in the Remote Temperature Monitoring System

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.433-440.6509 ◽

2012 ◽

Vol 433-440 ◽

pp. 6509-6513

Author(s):

Li Min Cai

Keyword(s):

Monitoring System ◽

Temperature Sensor ◽

Temperature Monitoring ◽

Markup Language ◽

The Core ◽

Xml Document ◽

Embedded Linux System ◽

Extensible Markup ◽

Digital Temperature Sensor ◽

System Program

This paper introduces XML (Extensible Markup Language) and describes the Remote Temperature Monitoring System program. A Samsung S3C2440 microprocessor forms the core of the system, to which an embedded Linux system and a web server are ported. On-site temperature is collected by the DS18B20 digital temperature sensor, the acquired data is saved in an XML document, and the on-site real-time temperature can be displayed in a browser at a remote end. The results of actual runs show the effectiveness of the system.
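As a rough illustration of saving acquired readings in an XML document, the Python sketch below appends one reading to an in-memory log and serialises it. The element and attribute names (`temperatureLog`, `reading`, `sensor`, `time`), the sensor identifier and the sample values are all assumptions; the paper's actual document layout and its sensor-access code are not given in the abstract.

```python
import xml.etree.ElementTree as ET


def append_reading(root, sensor_id, celsius, timestamp):
    """Append one temperature reading as a child element of `root`."""
    reading = ET.SubElement(root, "reading", sensor=sensor_id, time=timestamp)
    reading.text = f"{celsius:.2f}"   # store the value with two decimals
    return reading


# Hypothetical log with a single placeholder reading.
log = ET.Element("temperatureLog")
append_reading(log, "DS18B20-1", 23.5, "2012-01-01T12:00:00")

# Serialised form that a web server could hand to a remote browser.
xml_text = ET.tostring(log, encoding="unicode")

# Reading the document back recovers the stored value.
parsed = ET.fromstring(xml_text)
latest = float(parsed.findall("reading")[-1].text)
```

Keeping each reading as its own element means new values can be appended without rewriting earlier ones, and a page on the remote end can pick out the last `reading` element to show the current temperature.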


An Extensible Approach for Modeling Ontologies in RDF(S)

Knowledge Media in Healthcare ◽

10.4018/978-1-930708-13-6.ch013 ◽

2011 ◽

pp. 234-253 ◽

Cited By ~ 5

Author(s):

Steffen Staab ◽

Michael Erdmann ◽

Alexander Maedche ◽

Stefan Decker

Keyword(s):

Food Chain ◽

World Wide ◽

Markup Language ◽

Information Presentation ◽

General Methodology ◽

The World ◽

Information Contents ◽

Extensible Markup ◽

Hypertext Markup Language ◽

Technical Platform

The development of the World Wide Web is about to mature from a technical platform that allows for the transportation of information from sources to humans (albeit in many syntactic formats) to the communication of knowledge from Web sources to machines. The knowledge food chain has started with technical protocols and preliminary formats for information presentation (HTML, HyperText Markup Language), moved on to a general methodology for separating information contents from layout (XML, eXtensible Markup Language; XSL, eXtensible Stylesheet Language), and now reaches the realms of knowledge provisioning by means of RDF and RDFS.


XML Warehousing and OLAP

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch323 ◽

2011 ◽

pp. 2109-2116 ◽

Cited By ~ 7

Author(s):

Hadj Mahboubi

Keyword(s):

Data Warehouse ◽

Data Warehousing ◽

Markup Language ◽

Future Trends ◽

Xml Data ◽

Easy Task ◽

Extensible Markup ◽

On Line ◽

Analytical Processing ◽

Business Data

With the eXtensible Markup Language (XML) becoming a standard for representing business data (Beyer et al., 2005), a new trend toward XML data warehousing has been emerging for a couple of years, as well as efforts for extending the XQuery language with near On-Line Analytical Processing (OLAP) capabilities (grouping, aggregation, etc.). Though this is not an easy task, these new approaches, techniques and architectures aim at taking specificities of XML into account (e.g., heterogeneous number and order of dimensions or complex measures in facts, ragged dimension hierarchies…) that would be intricate to handle in a relational environment. The aim of this article is to present an overview of the major XML warehousing approaches from the literature, as well as the existing approaches for performing OLAP analyses over XML data (which is termed XML-OLAP or XOLAP; Wang et al., 2005). We also discuss the issues and future trends in this area and illustrate this topic by presenting the design of a unified, XML data warehouse architecture and a set of XOLAP operators expressed in an XML algebra.
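The grouping and aggregation mentioned above can be pictured with a small Python sketch of a roll-up over XML facts. It is not one of the XOLAP operators or the XML algebra from the article: the `sales`/`sale` layout, the dimension names and the figures are invented for illustration, and one fact deliberately lacks a dimension to mimic the heterogeneity the abstract refers to.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical fact document; the last fact has no "region" dimension.
FACTS = """
<sales>
  <sale region="north" year="2010">120</sale>
  <sale region="north" year="2011">80</sale>
  <sale region="south" year="2010">50</sale>
  <sale year="2011">10</sale>
</sales>
"""


def roll_up(xml_text, dimension):
    """Sum the measure of every <sale> fact, grouped by one dimension."""
    totals = defaultdict(float)
    for sale in ET.fromstring(xml_text).iter("sale"):
        key = sale.get(dimension, "unknown")   # tolerate missing dimensions
        totals[key] += float(sale.text)
    return dict(totals)


by_region = roll_up(FACTS, "region")
by_year = roll_up(FACTS, "year")
```

Facts with a missing dimension are collected under an `unknown` key here; a real warehouse design would need an explicit policy for such cases.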


Requirements for the Testable Specification and Test Case Derivation in Conformance Testing

Verification, Validation and Testing in Software Engineering ◽

10.4018/978-1-59140-851-2.ch006 ◽

2007 ◽

pp. 136-156

Author(s):

Tanja Toroi ◽

Anne Eerola

Keyword(s):

Software Industry ◽

Conformance Testing ◽

Software Systems ◽

Test Case ◽

Test Cases ◽

Markup Language ◽

Testing Environment ◽

Clinical Document ◽

The World ◽

Extensible Markup

Interoperability of software systems is a critical, ever-increasing requirement in the software industry. Conformance testing is needed to assure conformance of software and interfaces to standards and other specifications. In this chapter we briefly review what has been done in conformance testing around the world and in Finland. We also propose testability requirements for the specifications used in conformance testing and examine test-case derivation from different kinds of specifications. Furthermore, we present a conformance-testing environment for the healthcare domain, developed in the OpenTE project, consisting of different service-specific and shared testing services. In our testing environment, testing is performed against open interfaces, and test cases can, for example, be in XML (extensible markup language) or CDA R2 (clinical document architecture, Release 2) form.


Designing Web-Based Hypermedia Systems

Encyclopedia of Multimedia Technology and Networking ◽

10.4018/978-1-59140-561-0.ch026 ◽

2011 ◽

pp. 173-179 ◽

Cited By ~ 1

Author(s):

Michael Lang

Keyword(s):

World Wide Web ◽

World Wide ◽

Common Gateway Interface ◽

Markup Language ◽

Document Object Model ◽

Web Based ◽

The World ◽

Extensible Markup ◽

The Common ◽

The Web

Although its conceptual origins can be traced back a few decades (Bush, 1945), it is only recently that hypermedia has become popularized, principally through its ubiquitous incarnation as the World Wide Web (WWW). In its earlier forms, the Web could only properly be regarded as a primitive, constrained hypermedia implementation (Bieber & Vitali, 1997). Through the emergence in recent years of standards such as eXtensible Markup Language (XML), XLink, Document Object Model (DOM), Synchronized Multimedia Integration Language (SMIL) and WebDAV, as well as additional functionality provided by the Common Gateway Interface (CGI), Java, plug-ins and middleware applications, the Web is now moving closer to an idealized hypermedia environment. Of course, not all hypermedia systems are Web based, nor can all Web-based systems be classified as hypermedia (see Figure 1). See the terms and definitions at the end of this article for clarification of intended meanings. The focus here is on hypermedia systems that are delivered and used via the platform of the WWW; that is, Web-based hypermedia systems.
