Algorithms for Generating XML Documents from Probabilistic XML

2012 ◽  
Vol 263-266 ◽  
pp. 1578-1583
Author(s):  
Yan Zhu ◽  
Hai Tao Ma

Uncertain relational data management has been investigated for some years, but there has been little work on uncertain XML. Its natural, highly flexible structure makes XML well suited to representing uncertain information. Based on possible-world semantics and a probabilistic model with independent-distribution and mutually exclusive-distribution nodes, this paper studies how to generate an instance from a probabilistic XML document and calculate its probability, which is one of the key problems of uncertain XML management. An algorithm with linear time complexity is proposed for generating an XML document from a probabilistic XML document and calculating its probability. Finally, experimental results demonstrate the correctness and efficiency of the algorithm.
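
The generation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the `Node` class, the `"ord"`/`"ind"`/`"mux"` kind names, and the `generate` function are hypothetical, and the returned value is the probability of the random choices made, which equals the probability of the generated document only when different choices cannot yield the same document. Each node is visited at most once, so the running time is linear in the size of the probabilistic document.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                      # "ord" (ordinary), "ind", or "mux" (assumed names)
    label: str = ""                # only meaningful for ordinary nodes
    children: list = field(default_factory=list)  # (child Node, probability) pairs


def generate(node, rng):
    """Sample one possible world below `node`.

    Returns (trees, prob). `trees` is a list of nested (label, [subtrees])
    tuples that stand in for `node` in the generated document, and `prob`
    is the probability of the random choices made along the way.
    """
    if node.kind == "ord":
        kids, prob = [], 1.0
        for child, _ in node.children:   # children of ordinary nodes are always kept
            sub, p = generate(child, rng)
            kids.extend(sub)
            prob *= p
        return [(node.label, kids)], prob

    if node.kind == "ind":
        trees, prob = [], 1.0
        for child, p_keep in node.children:
            if rng.random() < p_keep:    # keep each child independently
                sub, p = generate(child, rng)
                trees.extend(sub)
                prob *= p_keep * p
            else:
                prob *= 1.0 - p_keep
        return trees, prob

    if node.kind == "mux":
        r, acc = rng.random(), 0.0
        for child, p_pick in node.children:
            acc += p_pick
            if r < acc:                  # at most one child is chosen
                sub, p = generate(child, rng)
                return sub, p_pick * p
        return [], 1.0 - acc             # no child chosen

    raise ValueError(f"unknown node kind: {node.kind!r}")


# Usage: a root <a> whose child is chosen by a mux node between <b> and <c>.
doc = Node("ord", "a", [
    (Node("mux", children=[(Node("ord", "b"), 0.5), (Node("ord", "c"), 0.5)]), 1.0),
])
world, probability = generate(doc, random.Random(42))
```

Distributional nodes never appear in the output; their selected children are spliced into the nearest ordinary ancestor, which is why `generate` returns a list of trees rather than a single tree.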

2015 ◽  
Vol 2015 ◽  
pp. 1-11 ◽  
Author(s):  
Yue Zhao ◽  
Ye Yuan ◽  
Guoren Wang

This paper describes a keyword search method for probabilistic XML data based on ELM (extreme learning machine). Keyword search over a probabilistic XML document differs from search over a traditional XML document in that it must take possible-world semantics into account. A probabilistic XML document can be seen as a set of nodes consisting of ordinary nodes and distributional nodes. ELM performs well in text classification applications. XML is typical semistructured data whose labels are self-describing, so the label and context of a node can be treated as the text data of that node. ELM offers significant advantages such as fast learning speed, ease of implementation, and effective node classification. Set intersection can quickly compute the SLCA over the node sets classified by ELM. In this paper, we adopt ELM to classify nodes and compute probabilities. We propose two algorithms based on ELM and a probability threshold to improve overall performance. The experimental results verify the benefits of our methods according to various evaluation metrics.
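
As a rough illustration of computing SLCA (smallest lowest common ancestor) nodes by set intersection, the sketch below works on an ordinary, non-probabilistic tree whose nodes are identified by Dewey-style tuples. It leaves out the ELM classification and the probability threshold entirely, and the function names and the Dewey encoding are assumptions made for the example, not taken from the paper.

```python
def ancestors_or_self(dewey):
    """All prefixes of a Dewey id, e.g. (0, 1, 2) -> {(0,), (0, 1), (0, 1, 2)}."""
    return {dewey[:i] for i in range(1, len(dewey) + 1)}


def slca(keyword_matches):
    """Compute SLCA nodes for a keyword query.

    `keyword_matches` holds one set of Dewey ids per keyword: the nodes that
    directly contain that keyword. A node is an SLCA if its subtree contains
    every keyword and none of its proper descendants has the same property.
    """
    if not keyword_matches:
        return set()

    # For each keyword, the set of nodes whose subtree contains it.
    covers = []
    for matches in keyword_matches:
        cover = set()
        for dewey in matches:
            cover |= ancestors_or_self(dewey)
        covers.append(cover)

    # Nodes whose subtree contains every keyword.
    common = set.intersection(*covers)

    # `common` is closed under taking ancestors, so a node has a qualifying
    # proper descendant exactly when one of its children is in `common`.
    parents = {dewey[:-1] for dewey in common if len(dewey) > 1}
    return common - parents


# Usage: keyword 1 occurs at nodes 0.0.1 and 0.1.0, keyword 2 at 0.0.2 and 0.2.
result = slca([{(0, 0, 1), (0, 1, 0)}, {(0, 0, 2), (0, 2)}])
# result == {(0, 0)}
```

Materialising every ancestor set is simple but not the most economical choice; it is used here only to make the intersection step explicit.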


2014 ◽  
Vol 571-572 ◽  
pp. 575-579
Author(s):  
Hai Tao Ma ◽  
Chang Yong Yu ◽  
Chang Ming Xu ◽  
Miao Fang

We explored the subtree matching problem for probabilistic XML documents: finding the matches of an XML query tree over a probabilistic XML document, using the canonical tree edit distance as a similarity measure between subtrees. Probabilistic XML is a probability distribution model that captures uncertainty of both value and structure. Querying probabilistic XML documents is difficult: a naive algorithm that directly computes the tree edit distance between the query tree and each certain XML tree represented by the probabilistic XML document has exponential complexity. Based on the method of tree edit distance computation over certain XML subtrees, we defined a minimum solution to the edit distance computation, that is, the minimum cost of transforming the query tree into the probabilistic XML tree. Furthermore, we developed an algorithm, ASM (Algorithm of Subtree Matching), to compute the minimum solution. Finally, we proved that the complexity of ASM is linear in the size of the probabilistic XML document.


Author(s):  
Nirmal K. Nair ◽  
James H. Oliver

An efficient algorithm is presented to determine the blank shape necessary to manufacture a surface by press forming. The technique is independent of material properties and instead uses surface geometry and an area conservation constraint to generate a geometrically feasible blank shape. The algorithm is formulated as an approximate geometric interpretation of the reversal of the forming process. The primary applications for this technique are in preliminary surface design, assessment of manufacturability, and location of binder wrap. Since the algorithm exhibits linear time complexity, it is amenable to implementation as an interactive design aid. The algorithm is applied to two example surfaces and the results are discussed.


Author(s):  
Mikhail Krechetov ◽  
Jakub Marecek ◽  
Yury Maximov ◽  
Martin Takac

Low-rank methods for semi-definite programming (SDP) have gained a lot of interest recently, especially in machine learning applications. Their analysis often involves determinant-based or Schatten-norm penalties, which are difficult to implement in practice due to high computational efforts. In this paper, we propose Entropy-Penalized Semi-Definite Programming (EP-SDP), which provides a unified framework for a broad class of penalty functions used in practice to promote a low-rank solution. We show that EP-SDP problems admit an efficient numerical algorithm, having (almost) linear time complexity of the gradient computation; this makes it useful for many machine learning and optimization problems. We illustrate the practical efficiency of our approach on several combinatorial optimization and machine learning problems.


Author(s):  
Julissa Villanueva Llerena

Tractable Deep Probabilistic Models (TPMs) are generative models based on arithmetic circuits that allow for exact marginal inference in linear time. These models have obtained promising results in several machine learning tasks. Like many other models, TPMs can produce over-confident incorrect inferences, especially on regions with small statistical support. In this work, we will develop efficient estimators of the predictive uncertainty that are robust to data scarcity and outliers. We investigate two approaches. The first approach measures the variability of the output to perturbations of the model weights. The second approach captures the variability of the prediction to changes in the model architecture. We will evaluate the approaches on challenging tasks such as image completion and multilabel classification.
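
To make the claim of exact marginal inference in linear time concrete, here is a minimal sketch of bottom-up evaluation of a small tree-shaped arithmetic circuit (a sum-product network over two binary variables). The class names, the `bernoulli` helper, and the example weights are all hypothetical and are not drawn from the work described above. A variable is marginalised out by letting all of its indicator leaves evaluate to 1, so one pass over the circuit yields the marginal.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    var: str      # variable name
    value: int    # the leaf is the indicator [var == value]


@dataclass
class Product:
    children: list


@dataclass
class Sum:
    children: list   # (weight, child) pairs; weights are assumed to sum to 1


def evaluate(node, evidence):
    """Probability of `evidence` (a dict var -> value) under the circuit.

    Variables missing from `evidence` are marginalised out. For a
    tree-shaped circuit every node is visited exactly once.
    """
    if isinstance(node, Leaf):
        if node.var not in evidence:
            return 1.0                      # summed out
        return 1.0 if evidence[node.var] == node.value else 0.0
    if isinstance(node, Product):
        result = 1.0
        for child in node.children:
            result *= evaluate(child, evidence)
        return result
    return sum(weight * evaluate(child, evidence) for weight, child in node.children)


def bernoulli(var, p_one):
    """Univariate distribution with P(var = 1) = p_one."""
    return Sum([(p_one, Leaf(var, 1)), (1.0 - p_one, Leaf(var, 0))])


# A mixture of two fully factorised components over binary X and Y.
model = Sum([
    (0.6, Product([bernoulli("X", 0.9), bernoulli("Y", 0.2)])),
    (0.4, Product([bernoulli("X", 0.1), bernoulli("Y", 0.7)])),
])

p_x1 = evaluate(model, {"X": 1})             # 0.6*0.9 + 0.4*0.1, about 0.58
p_x1_y1 = evaluate(model, {"X": 1, "Y": 1})  # 0.6*0.18 + 0.4*0.07, about 0.136
```

If sub-circuits are shared, so that the structure is a DAG rather than a tree, the recursion would need memoisation to stay linear in the circuit size.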


Author(s):  
Zurinahni Zainol ◽  
Bing Wang

Designing a well-structured XML document is important for readability and maintainability and, more importantly, for avoiding both data redundancy and update anomalies. This paper proposes to improve and simplify XML structural design using a normalization process. To achieve this, Graphical Notation for Document Type Definition (GN-DTD) is used to describe the structure of an XML document at the schema level. Multiple levels of normal forms for GN-DTD are proposed, together with the corresponding normalization rules for transforming poorly designed XML documents into well-designed ones. A case study is presented to show the application of these normal forms and the normalization algorithm.


Author(s):  
Livia Predoiu

Recently, there has been increasing interest in formalisms for representing uncertain information on the Semantic Web. This interest is triggered by the observation that knowledge on the web is not always crisp and that we have to be able to deal with incomplete, inconsistent and vague information. The treatment of this kind of information requires new approaches for knowledge representation and reasoning on the web, as existing Semantic Web languages are based on classical logic, which is known to be inadequate for representing uncertainty in many cases. While different general approaches for extending Semantic Web languages with the ability to represent uncertainty are being explored, we focus our attention on probabilistic approaches. We survey existing proposals for extending Semantic Web languages, or the formalisms underlying them, in terms of their expressive power, reasoning capabilities and suitability for supporting typical tasks associated with the Semantic Web.


2020 ◽  
Vol 37 (06) ◽  
pp. 2050034
Author(s):  
Ali Reza Sepasian ◽  
Javad Tayyebi

This paper studies two types of reverse 1-center problems under a uniform linear cost function, where edge lengths may be reduced. In the first type, the aim is to bound the objective value by a prescribed fixed value [Formula: see text] at minimum cost. In the second type, the aim is to improve the objective value as much as possible within a given budget. An algorithm based on dynamic programming is proposed to solve the first problem in linear time. This algorithm is then used as a subroutine in an algorithm that solves the second type of problem in [Formula: see text] time, in which [Formula: see text] is a fixed number dependent on the problem parameters. Under the similarity assumption, this algorithm has a better complexity than the Nguyen algorithm (2013), which has quadratic time complexity. Some numerical experiments are conducted to validate this fact in practice.

