STRUCTURAL CLASSIFICATION OF XML DOCUMENTS USING MULTISETS
2008 ◽ Vol 17 (05) ◽ pp. 1003-1022
In this paper, we investigate the problem of clustering XML documents based on their structure. We represent the paths in an XML document as a multiset and use the symmetric difference operation on multisets to define certain metrics. These metrics are then used to obtain a measure of similarity between any two documents in a collection. Our technique was successfully applied to real and synthesized XML documents, yielding high-quality clusterings.
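The abstract does not give the exact metrics, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a "path" is the sequence of tags from the root to an element, and it assumes one plausible normalization (size of the symmetric difference divided by size of the multiset union). The function names `path_multiset`, `symmetric_difference_size`, and `structural_distance` are hypothetical.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def path_multiset(xml_text: str) -> Counter:
    """Return the multiset of root-to-element tag paths (assumed path definition)."""
    root = ET.fromstring(xml_text)
    paths = Counter()
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        paths[path] += 1  # repeated paths raise the multiplicity
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths


def symmetric_difference_size(a: Counter, b: Counter) -> int:
    """Size of the multiset symmetric difference of a and b."""
    # Counter subtraction keeps only positive counts, so (a - b) + (b - a)
    # sums |a[p] - b[p]| over all paths p.
    return sum(((a - b) + (b - a)).values())


def structural_distance(a: Counter, b: Counter) -> float:
    """Assumed normalization: |a symdiff b| / |a union b|, in [0, 1]."""
    union_size = sum((a | b).values())  # union takes the max multiplicity
    return symmetric_difference_size(a, b) / union_size if union_size else 0.0


doc_a = "<lib><book><title/><author/></book><book><title/></book></lib>"
doc_b = "<lib><book><title/></book><cd><title/></cd></lib>"

ms_a = path_multiset(doc_a)
ms_b = path_multiset(doc_b)
print(symmetric_difference_size(ms_a, ms_b))  # 5
print(structural_distance(ms_a, ms_b))        # 0.625
```

A pairwise distance matrix built this way could then be passed to any standard clustering routine; which clustering algorithm the paper uses is not stated in the abstract.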
2010 ◽ Vol 66 (10) ◽ pp. 1190-1197
1995 ◽ Vol 59 (2) ◽ pp. 336-338
1917 ◽ Vol 28 (1) ◽ pp. 553-602
2020 ◽ Vol 8 (4) ◽ pp. 1408-1415