Algorithms for Generating XML Documents from Probabilistic XML

Uncertain relational data management has been investigated for several years, but few works address uncertain XML. Its natural, highly flexible structure makes XML more appropriate for representing uncertain information. Based on possible-world semantics and a probabilistic model with independent-distribution and mutually exclusive distribution nodes, this work studies how to generate an instance from a probabilistic XML document and calculate its probability, which is one of the key problems of uncertain XML management. An algorithm with linear time complexity is proposed for generating an XML document from a probabilistic XML document and calculating its probability. Finally, experimental results demonstrate the correctness and efficiency of the algorithm.
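The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea under stated assumptions: a probabilistic XML tree whose distributional nodes are either independent (`ind`, each child kept independently with its probability) or mutually exclusive (`mux`, at most one child kept). The names `PNode` and `sample` are hypothetical and do not come from the paper. The sketch visits each node at most once and returns the probability of the random choices it made; that value equals the probability of the generated document only when different choices cannot produce the same document.

```python
import random
from dataclasses import dataclass, field

@dataclass
class PNode:
    kind: str                      # "elem" (ordinary), "ind", or "mux"
    label: str = ""                # only meaningful for ordinary nodes
    children: list = field(default_factory=list)  # (child, probability) pairs

def sample(node, rng):
    """Return (forest, prob): the ordinary subtrees kept under `node`, and the
    probability of the random choices made while generating them."""
    prob = 1.0
    if node.kind == "elem":
        kept = [child for child, _ in node.children]   # ordinary children always stay
    elif node.kind == "ind":
        kept = []
        for child, p in node.children:                 # each child decided independently
            if rng.random() < p:
                kept.append(child)
                prob *= p
            else:
                prob *= 1.0 - p
    elif node.kind == "mux":
        kept = []
        prob = 1.0 - sum(p for _, p in node.children)  # probability of choosing no child
        r, acc = rng.random(), 0.0
        for child, p in node.children:                 # at most one child is chosen
            acc += p
            if r < acc:
                kept, prob = [child], p
                break
    else:
        raise ValueError(f"unknown node kind: {node.kind}")

    forest = []
    for child in kept:
        subtrees, q = sample(child, rng)
        forest.extend(subtrees)
        prob *= q
    if node.kind == "elem":
        return [(node.label, forest)], prob
    return forest, prob   # distributional nodes vanish; survivors attach to the parent
```

Calling `sample(root, random.Random(0))` on an ordinary root node returns a one-element forest holding the generated document as nested `(label, children)` tuples, together with the probability of the choices made.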


Keyword Search over Probabilistic XML Documents Based on Node Classification

Mathematical Problems in Engineering ◽

10.1155/2015/210961 ◽

2015 ◽

Vol 2015 ◽

pp. 1-11 ◽

Cited By ~ 1

Author(s):

Yue Zhao ◽

Ye Yuan ◽

Guoren Wang

Keyword(s):

Keyword Search ◽

Possible World ◽

Xml Data ◽

Fast Learning ◽

Probabilistic Xml ◽

Learning Speed ◽

Xml Document ◽

Probability Threshold ◽

Node Classification ◽

Learning Machine

This paper describes a keyword search method for probabilistic XML data based on ELM (extreme learning machine). Keyword search over a probabilistic XML document differs from that over a traditional XML document in that it must take possible-world semantics into account. A probabilistic XML document can be seen as a set of nodes consisting of ordinary nodes and distributional nodes. ELM performs well in text classification applications. XML is typical semistructured data whose labels are self-describing, so the label and context of a node can be treated as the text data of that node. ELM offers significant advantages such as fast learning speed, ease of implementation, and effective node classification. Set intersection can quickly compute the SLCA (smallest lowest common ancestor) over the node sets classified by ELM. In this paper, we adopt ELM to classify nodes and compute probabilities. We propose two algorithms based on ELM and a probability threshold to improve overall performance. The experimental results verify the benefits of our methods according to various evaluation metrics.


Efficiently Subtree Matching between XML and Probabilistic XML Documents

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.571-572.575 ◽

2014 ◽

Vol 571-572 ◽

pp. 575-579

Author(s):

Hai Tao Ma ◽

Chang Yong Yu ◽

Chang Ming Xu ◽

Miao Fang

Keyword(s):

Edit Distance ◽

Distribution Model ◽

Tree Edit Distance ◽

Matching Problem ◽

Distance Computation ◽

Xml Documents ◽

Probabilistic Xml ◽

Xml Document ◽

Query Tree ◽

Minimum Solution

We explored the subtree matching problem for probabilistic XML documents: finding the matches of an XML query tree over a probabilistic XML document, using the canonical tree edit distance as a similarity measure between subtrees. Probabilistic XML is a probability distribution model that captures uncertainty in both value and structure. Querying probabilistic XML documents is difficult: a naive algorithm has exponential complexity because it directly computes the tree edit distance between the query tree and each certain XML tree represented by the probabilistic XML document. Based on the method of tree edit distance computation over certain XML subtrees, we defined a minimum solution to the edit distance computation, that is, the minimum cost of transforming the query tree into the probabilistic XML tree. Furthermore, we developed an algorithm, ASM (Algorithm of Subtree Matching), to compute the minimum solution. Finally, we proved that the complexity of ASM is linear in the size of the probabilistic XML document.


Linear time complexity GF(256) RaptorQ implementation on GPU

2017 International Conference on Information and Communication Technology Convergence (ICTC) ◽

10.1109/ictc.2017.8190987 ◽

2017 ◽

Cited By ~ 1

Author(s):

Sunwoong Joo

Keyword(s):

Time Complexity ◽

Linear Time


Adaptive Multiresolution and Dedicated Elastic Matching in Linear Time Complexity for Time Series Data Mining

Sixth International Conference on Intelligent Systems Design and Applications ◽

10.1109/isda.2006.84 ◽

2006 ◽

Cited By ~ 3

Author(s):

Pierre-francois Marteau ◽

Gildas Menier

Keyword(s):

Data Mining ◽

Time Series ◽

Time Complexity ◽

Time Series Data ◽

Linear Time ◽

Series Data ◽

Time Series Data Mining


An Area Preserving Transformation Algorithm for Press Forming Blank Development

19th Design Automation Conference: Volume 1 — Mechanical System Dynamics; Concurrent and Robust Design; Design for Assembly and Manufacture; Genetic Algorithms in Design and Structural Optimization ◽

10.1115/detc1993-0304 ◽

1993 ◽

Author(s):

Nirmal K. Nair ◽

James H. Oliver

Keyword(s):

Time Complexity ◽

Linear Time ◽

Geometric Interpretation ◽

Surface Geometry ◽

Forming Process ◽

Surface Design ◽

Design Assessment ◽

Press Forming ◽

Blank Shape ◽

Area Preserving

An efficient algorithm is presented to determine the blank shape necessary to manufacture a surface by press forming. The technique is independent of material properties and instead uses surface geometry and an area conservation constraint to generate a geometrically feasible blank shape. The algorithm is formulated as an approximate geometric interpretation of the reversal of the forming process. The primary applications for this technique are in preliminary surface design, assessment of manufacturability, and location of binder wrap. Since the algorithm exhibits linear time complexity, it is amenable to implementation as an interactive design aid. The algorithm is applied to two example surfaces and the results are discussed.


Entropy-Penalized Semidefinite Programming

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/157 ◽

2019 ◽

Cited By ~ 2

Author(s):

Mikhail Krechetov ◽

Jakub Marecek ◽

Yury Maximov ◽

Martin Takac

Keyword(s):

Machine Learning ◽

Time Complexity ◽

Optimization Problems ◽

Linear Time ◽

Broad Class ◽

Low Rank ◽

Learning Problems ◽

Unified Framework ◽

Gradient Computation ◽

Machine Learning Applications

Low-rank methods for semi-definite programming (SDP) have gained a lot of interest recently, especially in machine learning applications. Their analysis often involves determinant-based or Schatten-norm penalties, which are difficult to implement in practice due to high computational efforts. In this paper, we propose Entropy-Penalized Semi-Definite Programming (EP-SDP), which provides a unified framework for a broad class of penalty functions used in practice to promote a low-rank solution. We show that EP-SDP problems admit an efficient numerical algorithm, having (almost) linear time complexity of the gradient computation; this makes it useful for many machine learning and optimization problems. We illustrate the practical efficiency of our approach on several combinatorial optimization and machine learning problems.


Predictive Uncertainty Estimation for Tractable Deep Probabilistic Models

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/745 ◽

2020 ◽

Author(s):

Julissa Villanueva Llerena

Keyword(s):

Probabilistic Models ◽

Linear Time ◽

Generative Models ◽

Uncertainty Estimation ◽

Image Completion ◽

Predictive Uncertainty ◽

Learning Tasks ◽

Challenging Tasks ◽

Statistical Support ◽

Marginal Inference

Tractable Deep Probabilistic Models (TPMs) are generative models based on arithmetic circuits that allow for exact marginal inference in linear time. These models have obtained promising results in several machine learning tasks. Like many other models, TPMs can produce over-confident incorrect inferences, especially on regions with small statistical support. In this work, we will develop efficient estimators of the predictive uncertainty that are robust to data scarcity and outliers. We investigate two approaches. The first approach measures the variability of the output to perturbations of the model weights. The second approach captures the variability of the prediction to changes in the model architecture. We will evaluate the approaches on challenging tasks such as image completion and multilabel classification.
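As a rough illustration of what exact marginal inference on an arithmetic circuit looks like, here is a toy sum-product circuit over binary variables. It is not the models or estimators from the paper; the classes `Leaf`, `Product`, and `Sum` and the helpers `bernoulli` and `evaluate` are hypothetical names introduced for this sketch. A variable is marginalized out by letting all of its indicator leaves evaluate to 1, so a single bottom-up pass yields the marginal. The recursion below is linear in circuit size only when the circuit is a tree; shared subcircuits would need cached results.

```python
from dataclasses import dataclass

@dataclass
class Leaf:
    var: str      # variable name
    value: int    # value this indicator leaf tests for

@dataclass
class Product:
    children: list

@dataclass
class Sum:
    weighted: list  # (weight, child) pairs; weights are assumed to sum to 1

def bernoulli(var, p):
    """Distribution over one binary variable with P(var = 1) = p."""
    return Sum([(p, Leaf(var, 1)), (1.0 - p, Leaf(var, 0))])

def evaluate(node, evidence):
    """Probability of `evidence` (a dict var -> value); absent variables are summed out."""
    if isinstance(node, Leaf):
        if node.var not in evidence:
            return 1.0                      # unobserved: indicator set to 1
        return 1.0 if evidence[node.var] == node.value else 0.0
    if isinstance(node, Product):
        result = 1.0
        for child in node.children:
            result *= evaluate(child, evidence)
        return result
    return sum(w * evaluate(child, evidence) for w, child in node.weighted)
```

For example, a mixture such as `Sum([(0.6, Product([bernoulli("X", 0.9), bernoulli("Y", 0.5)])), (0.4, Product([bernoulli("X", 0.2), bernoulli("Y", 0.7)]))])` can be queried with `evaluate(root, {"X": 1})` to obtain the marginal of `X = 1` without enumerating the values of `Y`.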


XML Documents Normalization Using GN-DTD

Information Retrieval Methods for Multidisciplinary Applications ◽

10.4018/978-1-4666-3898-3.ch005 ◽

2013 ◽

pp. 54-77

Author(s):

Zurinahni Zainol ◽

Bing Wang

Keyword(s):

Normal Forms ◽

Document Type Definition ◽

Document Type ◽

Graphical Notation ◽

Xml Documents ◽

Type Definition ◽

Xml Document ◽

Multiple Levels ◽

Normalization Process

Designing a well-structured XML document is important for readability and maintainability and, more importantly, for avoiding both data redundancies and update anomalies. This paper proposes to improve and simplify XML structural design using a normalization process. To achieve this, Graphical Notation for Document Type Definition (GN-DTD) is used to describe the structure of an XML document at the schema level. Multiple levels of normal forms for GN-DTD are proposed, along with the corresponding normalization rules for transforming poorly designed XML documents into well-designed ones. A case study is presented to show the application of these normal forms and the normalization algorithm.


Probabilistic Models for the Semantic Web

The Semantic Web for Knowledge and Data Management ◽

10.4018/978-1-60566-028-8.ch005 ◽

2010 ◽

pp. 74-105 ◽

Cited By ~ 5

Author(s):

Livia Predoiu

Keyword(s):

Semantic Web ◽

Knowledge Representation ◽

Classical Logic ◽

Probabilistic Models ◽

Expressive Power ◽

Knowledge Representation And Reasoning ◽

Uncertain Information ◽

New Approaches ◽

The Web

Recently, there has been an increasing interest in formalisms for representing uncertain information on the Semantic Web. This interest is triggered by the observation that knowledge on the web is not always crisp and we have to be able to deal with incomplete, inconsistent and vague information. The treatment of this kind of information requires new approaches for knowledge representation and reasoning on the web as existing Semantic Web languages are based on classical logic which is known to be inadequate for representing uncertainty in many cases. While different general approaches for extending Semantic Web languages with the ability to represent uncertainty are explored, we focus our attention on probabilistic approaches. We survey existing proposals for extending semantic web languages or formalisms underlying Semantic Web languages in terms of their expressive power, reasoning capabilities as well as their suitability for supporting typical tasks associated with the Semantic Web.


Further Study on Reverse 1-Center Problem on Trees

Asia Pacific Journal of Operational Research ◽

10.1142/s0217595920500347 ◽

2020 ◽

Vol 37 (06) ◽

pp. 2050034

Author(s):

Ali Reza Sepasian ◽

Javad Tayyebi

Keyword(s):

Dynamic Programming ◽

Time Complexity ◽

Numerical Experiments ◽

Linear Time ◽

Minimum Cost ◽

Fixed Number ◽

The Other ◽

Center Problem ◽

Objective Value ◽

Linear Cost

This paper studies two types of reverse 1-center problems under a uniform linear cost function, where edge lengths are allowed to be reduced. In the first type, the aim is to bound the objective value by a prescribed fixed value [Formula: see text] at minimum cost. The aim of the other is to improve the objective value as much as possible within a given budget. An algorithm based on dynamic programming is proposed to solve the first problem in linear time. Then, this algorithm is applied as a subroutine to design an algorithm that solves the second type of problem in [Formula: see text] time, in which [Formula: see text] is a fixed number dependent on the problem parameters. Under the similarity assumption, this algorithm has a better complexity than the Nguyen algorithm (2013), which has quadratic time complexity. Some numerical experiments are conducted to validate this fact in practice.
