A XML Document Coding Schema Based on Binary

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.496-500.1877 ◽

2014 ◽

Vol 496-500 ◽

pp. 1877-1880

Author(s):

Dong Juan Gu ◽

Li Yong Wan

Keyword(s):

Improved Method ◽

Xml Data ◽

Dynamic Update ◽

Data Query ◽

Binary Encoding ◽

Xml Document ◽

Encoding Strategy ◽

Dynamic Updates

To address inefficient XML data queries and to support dynamic updates, this paper proposes an improved method for encoding XML document nodes. Building on region encoding and prefix encoding, it introduces a binary-based XML document coding schema (CSBB). CSBB uses a binary encoding strategy and inserts bit strings in order. The bit string insertion algorithm generates ordered bit strings that reserve space for newly inserted nodes without affecting the codes of existing nodes. Experiments show that CSBB effectively avoids re-encoding nodes and supports dynamic node updates.
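The abstract does not give the CSBB insertion algorithm itself, so the following is only a minimal sketch of the general idea it describes: generating a new bit string that sorts strictly between two existing ones, so that no existing code has to change. The function names `between` and `insert_at` are hypothetical, and the rule used here (append "1" to the left neighbor, or replace the right neighbor's trailing "1" with "01") is a generic lexicographic technique, not necessarily the authors' method. It assumes every code ends in "1" and that codes are compared as plain strings.

```python
def between(left: str, right: str) -> str:
    """Return a bit string that sorts strictly between left and right.

    Assumptions (not from the paper): codes are compared lexicographically,
    every non-empty code ends in '1', and an empty string means
    "no neighbor on that side".
    """
    if not left and not right:
        return "1"                      # first code in an empty sibling list
    if not right:
        return left + "1"               # append after the last sibling
    if not left or len(left) < len(right):
        return right[:-1] + "01"        # squeeze in just before right
    return left + "1"                   # squeeze in just after left


def insert_at(codes: list, index: int) -> str:
    """Insert a new code at position index; existing codes are untouched."""
    left = codes[index - 1] if index > 0 else ""
    right = codes[index] if index < len(codes) else ""
    new_code = between(left, right)
    codes.insert(index, new_code)
    return new_code


if __name__ == "__main__":
    siblings = ["01", "1"]
    insert_at(siblings, 1)              # new node between the two siblings
    print(siblings, siblings == sorted(siblings))
```

Because the new code is derived only from its two neighbors, an insertion costs one string operation regardless of how many siblings follow, which is the property the abstract attributes to ordered bit strings.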

The Research of XML Keyword Retrieval Algorithms Based on MapReduce

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.556-562.3347 ◽

2014 ◽

Vol 556-562 ◽

pp. 3347-3349

Author(s):

Yao Wen Xia ◽

Ji Li Xie

Keyword(s):

Retrieval Algorithm ◽

Keyword Query ◽

Data Set ◽

Xml Data ◽

Design And Implementation ◽

Data Query ◽

Xml Document ◽

Retrieval Algorithms ◽

Retrieval Problem ◽

Query System

From the perspective of XML data management, this paper first stores large volumes of XML data in HDFS and rewrites the traditional MapReduce processing framework for XML data queries. It then designs a keyword retrieval algorithm for large XML data sets, consisting of four parts: XML data classification, coding, indexing, and search, to solve the keyword retrieval problem for large XML documents. Finally, it presents the design and implementation of a MapReduce-based keyword query system for large-scale XML data.
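The abstract names coding, indexing, and search steps but gives no details, so the sketch below only illustrates how such a pipeline could be shaped. It is an in-process simulation in plain Python, not Hadoop code, and everything in it is an assumption: the function names `map_phase`, `reduce_phase`, and `search`, the use of Dewey-style tuples as node codes, and the choice to index only each element's own text.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def map_phase(doc_id, xml_text):
    """Emit (keyword, (doc_id, node_label)) pairs for one XML document.

    Nodes are coded with Dewey-style tuples: the root is (1,), its
    first child is (1, 1), and so on (an assumed coding scheme).
    """
    root = ET.fromstring(xml_text)
    stack = [(root, (1,))]
    while stack:
        elem, label = stack.pop()
        for word in (elem.text or "").lower().split():
            yield word, (doc_id, label)
        for i, child in enumerate(elem, start=1):
            stack.append((child, label + (i,)))


def reduce_phase(pairs):
    """Group postings by keyword to form an inverted index."""
    index = defaultdict(set)
    for word, posting in pairs:
        index[word].add(posting)
    return {word: sorted(postings) for word, postings in index.items()}


def search(index, keyword):
    """Look up the nodes whose text contains the keyword."""
    return index.get(keyword.lower(), [])


if __name__ == "__main__":
    docs = {
        "d1": "<lib><book><title>XML query</title></book></lib>",
        "d2": "<lib><book><title>XML schema</title></book></lib>",
    }
    pairs = [p for doc_id, text in docs.items() for p in map_phase(doc_id, text)]
    print(search(reduce_phase(pairs), "xml"))
```

In a real deployment the map and reduce functions would run on separate workers over HDFS splits; here they are chained in one process only to show the data flow.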

Keyword Search over Probabilistic XML Documents Based on Node Classification

Mathematical Problems in Engineering ◽

10.1155/2015/210961 ◽

2015 ◽

Vol 2015 ◽

pp. 1-11 ◽

Cited By ~ 1

Author(s):

Yue Zhao ◽

Ye Yuan ◽

Guoren Wang

Keyword(s):

Keyword Search ◽

Possible World ◽

Xml Data ◽

Fast Learning ◽

Probabilistic Xml ◽

Learning Speed ◽

Xml Document ◽

Probability Threshold ◽

Node Classification ◽

Learning Machine

This paper describes a keyword search method for probabilistic XML data based on ELM (extreme learning machine). Keyword search over a probabilistic XML document differs from keyword search over a traditional XML document in that possible-world semantics must be considered. A probabilistic XML document can be seen as a set of nodes consisting of ordinary nodes and distributional nodes. ELM performs well in text classification applications. XML is typical semistructured data, and its labels are self-describing, so the label and context of a node can be treated as the text data of that node. ELM offers significant advantages such as fast learning speed, ease of implementation, and effective node classification. Set intersection can quickly compute the SLCA (smallest lowest common ancestor) over the node sets classified by ELM. In this paper, we adopt ELM to classify nodes and compute probabilities. We propose two algorithms based on ELM and a probability threshold to improve overall performance. The experimental results verify the benefits of our methods according to various evaluation metrics.
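To make the set-intersection step concrete, here is a minimal sketch of computing SLCA nodes on an ordinary (non-probabilistic) XML tree. It leaves out the ELM classification and the probability computation entirely, and the names `ancestors_or_self` and `slca`, as well as the use of Dewey-style tuples as node labels, are assumptions for illustration rather than the authors' algorithms.

```python
def ancestors_or_self(label):
    """All prefixes of a Dewey label, i.e. the node and its ancestors."""
    return {label[:i] for i in range(1, len(label) + 1)}


def slca(keyword_matches):
    """Return SLCA nodes for a keyword query.

    keyword_matches holds one list of Dewey labels (tuples) per keyword,
    naming the nodes that directly contain that keyword.
    """
    ancestor_sets = []
    for matches in keyword_matches:
        covered = set()
        for label in matches:
            covered |= ancestors_or_self(label)
        ancestor_sets.append(covered)
    if not ancestor_sets:
        return []
    # Nodes whose subtree contains every keyword.
    common = set.intersection(*ancestor_sets)
    # The common set is closed under "parent of", so a node has a common
    # descendant exactly when it is the parent of some common node.
    parents_of_common = {label[:-1] for label in common if len(label) > 1}
    return sorted(common - parents_of_common)


if __name__ == "__main__":
    xml_hits = [(1, 1, 1), (1, 2, 1)]
    query_hits = [(1, 1, 2), (1, 3)]
    print(slca([xml_hits, query_hits]))
```

The intersection yields every node whose subtree covers all keywords; discarding nodes that have a covering child leaves only the smallest such subtrees.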

Mapping Bitemporal XML Data Model to XML Document

Lecture Notes in Computer Science - Computer Supported Cooperative Work in Design IV ◽

10.1007/978-3-540-92719-8_31 ◽

2008 ◽

pp. 342-352 ◽

Cited By ~ 1

Author(s):

Na Tang ◽

Yong Tang

Keyword(s):

Data Model ◽

Xml Data ◽

Xml Document

An XML Data Query Method Based on Structure-Encoded

Web Information Systems and Mining - Lecture Notes in Computer Science ◽

10.1007/978-3-642-33469-6_87 ◽

2012 ◽

pp. 706-713

Author(s):

Zhaohui Xu ◽

Jie Qin ◽

Fuliang Yan

Keyword(s):

Xml Data ◽

Data Query

Hepatitis C virus contact map prediction based on binary encoding strategy

Computational Biology and Chemistry ◽

10.1016/j.compbiolchem.2007.03.009 ◽

2007 ◽

Vol 31 (3) ◽

pp. 233-238 ◽

Cited By ~ 10

Author(s):

Guang-Zheng Zhang ◽

Kyungsook Han

Keyword(s):

Hepatitis C Virus ◽

Hepatitis C ◽

Contact Map ◽

Binary Encoding ◽

Encoding Strategy

XML Object Identification

Advances in Data Mining and Database Management - Innovative Techniques and Applications of Entity Resolution ◽

10.4018/978-1-4666-5198-2.ch007 ◽

2014 ◽

pp. 140-170

Keyword(s):

Entity Resolution ◽

Data Representation ◽

Object Identification ◽

Document Management ◽

Future Research ◽

Xml Data ◽

Xml Document ◽

Peer To Peer Systems ◽

Structured Information ◽

Algorithm Base

Because it can represent data from a wide variety of sources, XML is rapidly emerging as the new standard for data representation and exchange on the Web and in e-government. To use XML data effectively in practice, entity resolution, which has proven extremely useful in data fusion, inconsistency detection, and data repairing, must be in place to improve the quality of the XML data. In this chapter, the authors deal specifically with object identification on XML data, the applications of which include XML document management in highly dynamic settings such as the Web and peer-to-peer systems, detection of duplicate elements in nested XML data, and finding similar identities among objects from multiple Web sources. The authors survey techniques for pairwise and groupwise entity resolution on XML data. These techniques use structured information to describe the similarity or distance between XML data items, such as XML documents and the XML elements within a document, and then either find the matching pairs that describe the same object or classify the items into separate groups, each group corresponding to the same real-world object. There are many ways to describe XML structure and content, such as a tree, a Bayesian network, or a set. The authors introduce some well-known algorithms based on these structures for solving XML data matching problems. Finally, the authors discuss directions for future research.

A Method of XML to RDB Mapping Based on XML Schema

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.1044-1045.995 ◽

2014 ◽

Vol 1044-1045 ◽

pp. 995-998

Author(s):

Jia Ying

Keyword(s):

Xml Schema ◽

Improved Method ◽

Xml Document ◽

New Type

The article analyzes the shortcomings of P_schema and proposes an improved method, P_schema++. Nesting structures, multi_citing elements, and alternative elements are extracted to form a new type, which is then mapped to a relational table. P_schema++ provides a method for storing complex XML documents in an RDB.

Document transformation system from papers to XML data based on pivot XML document method

Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings. ◽

10.1109/icdar.2003.1227668 ◽

2004 ◽

Cited By ~ 11

Author(s):

Y. Ishitani

Keyword(s):

Transformation System ◽

Xml Data ◽

Xml Document ◽

Document Transformation

Analisis Empiris Atas ORDPATH Encoding Untuk Kinerja Insert Node Pada Ordered XML Tree [Empirical Analysis of ORDPATH Encoding for Node Insertion Performance on Ordered XML Trees]

Jurnal Buana Informatika ◽

10.24002/jbi.v6i4.457 ◽

2015 ◽

Vol 6 (4) ◽

Author(s):

Irvanizam Zamanhuri

Keyword(s):

Data Structure ◽

Data Exchange ◽

Interval Number ◽

Extensible Markup Language ◽

Markup Language ◽

Xml Data ◽

Xml Document ◽

Extensible Markup ◽

Ordered Tree

The eXtensible Markup Language (XML) has quickly become the de facto standard for data exchange via the web. An XML document can be viewed as an ordered tree that has at least one node. Each node must be labeled using a labeling scheme to describe the XML data structure. There are two well-known existing encodings, namely the Dewey and Interval encodings. In this paper, the ORDPATH encoding, which is based on Dewey, is empirically compared with these two encodings on the dblp, nasa, and treebank datasets. The results show that when a new node is inserted into the tree, Dewey has to relabel the inserted node's siblings and Interval has to modify the interval numbers of the sibling nodes. ORDPATH eliminates this problem by adding an even number that serves as a caret for the newly inserted node.

Keywords: Ordered Tree, XML, ORDPATH.
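The caret idea can be illustrated with a short sketch. It assumes the usual ORDPATH convention that initial labels use only odd ordinals, and it covers just the simple case of inserting between two siblings that share a parent prefix. The function name `ordpath_between` and the tuple representation are hypothetical, and the paper's actual experimental code is not shown in the abstract.

```python
def ordpath_between(left, right):
    """Return a label that sorts between two sibling labels.

    Labels are tuples of integers compared component by component.
    Assumes left and right share the same prefix and end in odd
    ordinals, with left sorting before right.
    """
    if len(left) != len(right) or left[:-1] != right[:-1]:
        raise ValueError("labels must be siblings with a common prefix")
    lo, hi = left[-1], right[-1]
    if lo % 2 == 0 or hi % 2 == 0 or lo >= hi:
        raise ValueError("expected odd ordinals with left < right")
    prefix = left[:-1]
    if lo + 2 < hi:
        # A free odd ordinal exists between the two siblings.
        return prefix + (lo + 2,)
    # No odd ordinal is free: use the even value as a caret and
    # start a fresh odd ordinal beneath it.
    return prefix + (lo + 1, 1)


if __name__ == "__main__":
    new_label = ordpath_between((1, 3), (1, 5))
    print(new_label, (1, 3) < new_label < (1, 5))
```

With plain Dewey labels, inserting between 1.3 and 1.4 would force 1.4 and every later sibling to be renumbered, whereas here the neighbors keep their labels and only the new node receives a longer one.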

XML Data Binding for C++ Using Metadata

Innovations, Standards and Practices of Web Services ◽

10.4018/978-1-61350-104-7.ch011 ◽

2013 ◽

pp. 232-249

Author(s):

Szabolcs Payrits ◽

Péter Dornbach ◽

István Zólyomi

Keyword(s):

Web Service ◽

Programming Languages ◽

Extended Version ◽

Code Size ◽

Xml Data ◽

C Programming Language ◽

Data Binding ◽

C Programming ◽

Xml Document ◽

And Performance

Mapping XML document schemas and Web Service interfaces to programming languages has an important role in effective creation of quality Web Service implementations. The authors present a novel way to map XML data to the C++ programming language. The proposed solution offers more flexibility and more compact code that makes it ideal for embedded environments. The article describes the concept and the architecture of the solution and compares it with existing solutions. This article is an extended version of the paper from ICWS 2006. The authors include a broader comparison with existing tools on Symbian and Linux platforms and evaluate the code size and performance.
