XML Document Clustering Using Structure-Preserving Flat Representation of XML Content and Structure

Author(s):  
Fedja Hadzic ◽  
Michael Hecker ◽  
Andrea Tagarelli



Author(s):  
Panagiotis Antonellis

The wide use of XML as the de facto standard for storing and exchanging information over the Internet has led a wide spectrum of heterogeneous applications to adopt XML as their information representation model. The heterogeneity of XML data sources has raised the problem of efficiently clustering a set of XML documents. However, traditional clustering algorithms cannot be applied because of the semistructured nature of XML, which combines structure and content features. Hence, special techniques that take XML semantics into account are needed to address the problem of XML clustering. The described approaches, based on structure, content, or both, successfully address the problem and can be applied efficiently in real-world applications.
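As a rough illustration of what a structure-based comparison can look like, the sketch below reduces each XML document to its set of root-to-node tag paths and compares two documents with the Jaccard coefficient. This is only one simple possibility and is not taken from the work summarized above; the function names `tag_paths` and `structural_similarity` and the sample documents are assumptions made for the example, and element text (the content features) is deliberately ignored.

```python
import xml.etree.ElementTree as ET


def tag_paths(xml_text):
    """Return the set of root-to-node tag paths of an XML document (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        # Extend the path with the current tag and record it.
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def structural_similarity(xml_a, xml_b):
    """Jaccard similarity of the two documents' tag-path sets (structure only)."""
    paths_a, paths_b = tag_paths(xml_a), tag_paths(xml_b)
    union = paths_a | paths_b
    return len(paths_a & paths_b) / len(union) if union else 1.0


# Illustrative documents, invented for this sketch.
doc_a = "<book><title>A</title><author>X</author></book>"
doc_b = "<book><title>B</title><year>2001</year></book>"
doc_c = "<order><item>pen</item></order>"

print(structural_similarity(doc_a, doc_b))  # shares /book and /book/title
print(structural_similarity(doc_a, doc_c))  # no shared paths
```

A pairwise similarity of this kind could then be fed to a standard clustering algorithm; a content-aware variant would additionally compare the text stored in matching paths.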



2013 ◽  
Vol 43 (3) ◽  
pp. 417-436 ◽  
Author(s):  
Elaheh Asghari ◽  
MohammadReza KeyvanPour




2011 ◽  
Vol 31 (2) ◽  
pp. 446-449
Author(s):  
Yong JIANG ◽  
Huai-liang TAN ◽  
Guang-wen LI


Author(s):  
Jean-Yves Vion-Dury

Externalizing document management is a problem when individual or corporate privacy must be ensured. Provided that the decryption key is not known on the service side, pure storage or archiving of encrypted documents is highly secure, but of little interest because no operation can be performed on the hosted data. Current document management systems therefore offer restricted privacy mechanisms, roughly based on secured communication channels and sometimes encrypted storage. However, many value-added processing operations require decrypting the document, and no formal guarantee is given regarding the safety of system behavior. Known issues include data remanence (information persisting on disk after file system deletion) and bugs or viruses acting at various levels of the software architecture. This paper describes a method that allows restricted (yet meaningful) processing of encrypted XML documents without a decryption phase. The proposed encryption process allows isomorphic encryption of both data (XML documents owned by customers) and operators (verification and transformation performed by the Service Provider), in such a way that full secrecy is ensured simply because the decoding key is not known by the Service Provider. Once transformed, operators can handle encrypted documents with equivalent results up to the decryption operation.


