On the complexity of managing probabilistic XML data

Author(s):  
Pierre Senellart ◽  
Serge Abiteboul
Keyword(s):  
Xml Data ◽  
2015 ◽  
Vol 2015 ◽  
pp. 1-11 ◽  
Author(s):  
Yue Zhao ◽  
Ye Yuan ◽  
Guoren Wang

This paper describes a keyword search method for probabilistic XML data based on ELM (extreme learning machine). Keyword search on a probabilistic XML document differs from keyword search on a traditional XML document in that it must take possible world semantics into account. A probabilistic XML document can be seen as a set of nodes consisting of ordinary nodes and distributional nodes. ELM performs well in text classification applications. XML is typical semistructured data whose labels are self-describing, so the label and context of a node can be treated as the text data of that node. ELM offers significant advantages such as fast learning speed, ease of implementation, and effective node classification. Set intersection can quickly compute the SLCA (smallest lowest common ancestor) over the node sets classified by ELM. In this paper, we adopt ELM to classify nodes and compute probabilities. We propose two algorithms, based on ELM and a probability threshold, to improve overall performance. The experimental results verify the benefits of our methods according to various evaluation metrics.
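To make the possible world semantics mentioned above concrete, here is a minimal Python sketch of a probabilistic XML tree with ordinary and distributional nodes. It is not the authors' algorithm and involves no ELM. It assumes the two distributional node types common in the probabilistic XML literature, `ind` (each child kept independently) and `mux` (at most one child kept), and it assumes a keyword matches a node when it equals the node's label. The names `Node`, `prob_absent`, and `prob_present` are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Node:
    label: str
    kind: str = "ord"  # "ord" (ordinary), "ind" or "mux" (distributional)
    # List of (child, probability) pairs; the probability is ignored
    # for children of ordinary nodes, which are always present.
    children: list = field(default_factory=list)


def prob_absent(node, keyword):
    """Probability that no node labelled `keyword` survives in this subtree,
    given that `node` itself is present in the possible world."""
    if node.kind == "ord":
        if node.label == keyword:
            return 0.0
        return prod(prob_absent(c, keyword) for c, _ in node.children)
    if node.kind == "ind":
        # Each child is kept independently with probability p.
        return prod((1.0 - p) + p * prob_absent(c, keyword)
                    for c, p in node.children)
    if node.kind == "mux":
        # At most one child is kept; the leftover mass means "no child".
        none_chosen = 1.0 - sum(p for _, p in node.children)
        return none_chosen + sum(p * prob_absent(c, keyword)
                                 for c, p in node.children)
    raise ValueError(f"unknown node kind: {node.kind}")


def prob_present(node, keyword):
    """Probability that the keyword occurs in a random possible world."""
    return 1.0 - prob_absent(node, keyword)


# Illustrative document (made-up labels and probabilities).
doc = Node("movie", "ord", [
    (Node("", "ind", [(Node("xml"), 0.8), (Node("year"), 0.5)]), 1.0),
    (Node("", "mux", [(Node("xml"), 0.3), (Node("json"), 0.6)]), 1.0),
])
```

In this example, `prob_present(doc, "xml")` is approximately 0.86: the keyword is missing only when the `ind` node drops its `xml` child (probability 0.2) and the `mux` node does not select its `xml` child (probability 0.7). A threshold-based search of the kind the abstract mentions would then keep only results whose probability exceeds a chosen cutoff.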


Author(s):  
Chenjing Zhang ◽  
Le Chang ◽  
Chaofeng Sha ◽  
Xiaoling Wang ◽  
Aoying Zhou
Keyword(s):  
Xml Data ◽  

2014 ◽  
Vol 26 (4) ◽  
pp. 957-969 ◽  
Author(s):  
Jianxin Li ◽  
Chengfei Liu ◽  
Rui Zhou ◽  
Jeffrey Xu Yu

2009 ◽  
Vol 37 (4) ◽  
pp. 69-77 ◽  
Author(s):  
Benny Kimelfeld ◽  
Yehoshua Sagiv
Keyword(s):  
Xml Data ◽  

Data Mining ◽  
2013 ◽  
pp. 669-691 ◽  
Author(s):  
Evgeny Kharlamov ◽  
Pierre Senellart

This chapter deals with data mining in uncertain XML data models, whose uncertainty typically comes from imprecise automatic processes. We first review the literature on modeling uncertain data, starting with well-studied relational models and then moving to their semistructured counterparts. We focus on a specific probabilistic XML model, which allows representing arbitrary finite distributions of XML documents and has been extended to also allow continuous distributions of data values. We summarize previous work on querying this uncertain data model and show how to apply the corresponding techniques to several data mining tasks, exemplified through use cases on two running examples.

