ERX: a data model for collections of XML documents

Author(s):  
G. Psaila
Keyword(s):  
Data Mining ◽  
2013 ◽  
pp. 1-27
Author(s):  
Sangeetha Kutty ◽  
Richi Nayak ◽  
Tien Tran

With the growing number of XML documents across varied domains, it has become essential to find ways of extracting interesting information from them. Data mining techniques can be used to derive this information. However, because XML documents are semi-structured, mining them is affected by the data model used to represent them. In this chapter, we present an overview of the various models for representing XML documents, how these models are used for mining, and some of the issues and challenges inherent in them. The chapter also provides some insights into future data models of XML documents that effectively capture their two important features, structure and content, for mining.


Data Mining ◽  
2013 ◽  
pp. 669-691
Author(s):  
Evgeny Kharlamov ◽  
Pierre Senellart

This chapter deals with data mining in uncertain XML data models, whose uncertainty typically comes from imprecise automatic processes. We first review the literature on modeling uncertain data, starting with well-studied relational models and then moving to their semistructured counterparts. We focus on a specific probabilistic XML model that can represent arbitrary finite distributions of XML documents and has been extended to allow continuous distributions of data values as well. We summarize previous work on querying this uncertain data model and show how to apply the corresponding techniques to several data mining tasks, illustrated through use cases on two running examples.


Author(s):  
Luis Arevalo Rosado ◽  
Antonio Polo Marquez ◽  
Miryam Salas Sanchez
Keyword(s):  
