Mining frequent patterns from XML data: Efficient algorithms and design trade-offs

As XML is increasingly used for data storage and exchange, there is a pressing need for efficient algorithms that mine semistructured XML data. Mining XML data is considerably harder than mining relational data because of the structural complexity of XML. A naïve approach first converts the XML data into relational format; however, structural information may be lost during the conversion. Efficient and effective data mining algorithms that can be applied directly to XML data are therefore desirable.
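The loss of structure under a naïve conversion can be illustrated with a small sketch. It is not taken from the paper: the two sample documents and the helper names `flat_rows` and `path_rows` are hypothetical, and the "relational" view is simplified to one (tag, text) row per leaf element. Under that assumption, two documents that differ only in nesting flatten to the same rows, while a path-preserving view keeps them distinct.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that differ only in who is buyer and who is seller.
DOC_A = "<order><buyer><name>Ann</name></buyer><seller><name>Bob</name></seller></order>"
DOC_B = "<order><buyer><name>Bob</name></buyer><seller><name>Ann</name></seller></order>"


def flat_rows(xml_text):
    """Naive relational view: one (tag, text) row per leaf, ancestry discarded."""
    root = ET.fromstring(xml_text)
    return sorted((el.tag, el.text) for el in root.iter() if len(el) == 0)


def path_rows(xml_text):
    """Structure-preserving view: one (root-to-leaf path, text) row per leaf."""
    rows = []

    def walk(el, prefix):
        path = prefix + "/" + el.tag
        if len(el) == 0:  # leaf element: record its full path and text
            rows.append((path, el.text))
        for child in el:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return sorted(rows)


if __name__ == "__main__":
    print(flat_rows(DOC_A) == flat_rows(DOC_B))  # flattened rows coincide
    print(path_rows(DOC_A) == path_rows(DOC_B))  # path-based rows differ
```

Keeping the full path in each row is only one way to retain structure; it is shown here because it makes the difference between the two views easy to see.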

Mining frequent patterns from XML data

6th Asia-Pacific Symposium on Information and Telecommunication Technologies ◽

10.1109/apsitt.2005.203658 ◽

2005 ◽

Cited By ~ 2

Author(s):

C.N. Win ◽

Khin Haymar Saw Hla

Keyword(s):

Frequent Patterns ◽

Xml Data

Frequent Pattern Discovery and Association Rule Mining of XML Data

Data Mining ◽

10.4018/978-1-4666-2455-9.ch044 ◽

2013 ◽

pp. 859-879

Author(s):

Qin Ding ◽

Gnanasekaran Sundarraj

Keyword(s):

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Pattern Discovery ◽

Frequent Pattern ◽

Future Research ◽

Frequent Patterns ◽

Rule Mining ◽

Xml Data ◽

Art Research

Finding frequent patterns and association rules in large data sets has become a very important task in data mining. Various algorithms have been proposed to solve such problems, but most are applicable only to relational data. With the increasing use and popularity of XML, it is important, yet challenging, to find solutions for frequent pattern discovery and association rule mining of XML data. The challenge comes from the structural complexity of XML data. In this chapter, we provide an overview of the state-of-the-art research in content-based and structure-based mining of frequent patterns and association rules from XML data. We also discuss the challenges and issues, and provide our insight into solutions and future research directions.
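As a rough illustration of what structure-based frequent pattern mining can mean, the sketch below counts root-to-node tag paths across a small collection of documents and keeps those that meet a minimum support count. It is not an algorithm from the chapter: treating a pattern as a tag path is a simplifying assumption, and the names `tag_paths`, `frequent_paths`, `min_count` and the sample `DOCS` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical document collection; each string is one XML document.
DOCS = [
    "<book><title/><author/></book>",
    "<book><title/><price/></book>",
    "<book><title/><author/><price/></book>",
]


def tag_paths(xml_text):
    """Return the set of distinct root-to-node tag paths in one document."""
    paths = set()

    def walk(el, prefix):
        path = prefix + "/" + el.tag
        paths.add(path)
        for child in el:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


def frequent_paths(docs, min_count):
    """Map each path found in at least min_count documents to its document count."""
    counts = Counter()
    for doc in docs:
        # A path counts once per document, however often it repeats inside it.
        counts.update(tag_paths(doc))
    return {path: c for path, c in counts.items() if c >= min_count}


if __name__ == "__main__":
    for path, count in sorted(frequent_paths(DOCS, min_count=2).items()):
        print(path, count)
```

Content-based mining would additionally look at element text or attribute values; this sketch deliberately ignores them and considers tags only.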

Efficient algorithms for the mining of constrained frequent patterns from uncertain data

ACM SIGKDD Explorations Newsletter ◽

10.1145/1809400.1809425 ◽

2010 ◽

Vol 11 (2) ◽

pp. 123-130 ◽

Cited By ~ 9

Author(s):

Carson Kai-Sang Leung ◽

Dale A. Brajczuk

Keyword(s):

Uncertain Data ◽

Efficient Algorithms ◽

Frequent Patterns

Frequent Pattern Discovery and Association Rule Mining of XML Data

Advances in Data Mining and Database Management - XML Data Mining ◽

10.4018/978-1-61350-356-0.ch011 ◽

2011 ◽

pp. 243-263 ◽

Cited By ~ 1

Author(s):

Qin Ding ◽

Gnanasekaran Sundarraj

Keyword(s):

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Pattern Discovery ◽

Frequent Pattern ◽

Future Research ◽

Frequent Patterns ◽

Rule Mining ◽

Xml Data ◽

Art Research

Finding frequent patterns and association rules in large data sets has become a very important task in data mining. Various algorithms have been proposed to solve such problems, but most are applicable only to relational data. With the increasing use and popularity of XML, it is important, yet challenging, to find solutions for frequent pattern discovery and association rule mining of XML data. The challenge comes from the structural complexity of XML data. In this chapter, we provide an overview of the state-of-the-art research in content-based and structure-based mining of frequent patterns and association rules from XML data. We also discuss the challenges and issues, and provide our insight into solutions and future research directions.

Efficient Algorithms for Scheduling XML Data in a Mobile Wireless Broadcast Environment

2015 IEEE 21st International Conference on Parallel and Distributed Systems (ICPADS) ◽

10.1109/icpads.2015.96 ◽

2015 ◽

Author(s):

Yongrui Qin ◽

Hua Wang ◽

Ji Zhang ◽

Xiaohui Tao ◽

Wei Emma Zhang ◽

...

Keyword(s):

Efficient Algorithms ◽

Mobile Wireless ◽

Xml Data ◽

Wireless Broadcast

Efficient algorithms for mining constrained frequent patterns from uncertain data

Proceedings of the 1st ACM SIGKDD Workshop on Knowledge Discovery from Uncertain Data - U '09 ◽

10.1145/1610555.1610557 ◽

2009 ◽

Cited By ~ 12

Author(s):

Carson Kai-Sang Leung ◽

Dale A. Brajczuk

Keyword(s):

Uncertain Data ◽

Efficient Algorithms ◽

Frequent Patterns

Building XML Data Warehouse Based on Frequent Patterns in User Queries

Data Warehousing and Knowledge Discovery - Lecture Notes in Computer Science ◽

10.1007/978-3-540-45228-7_11 ◽

2003 ◽

pp. 99-108 ◽

Cited By ~ 6

Author(s):

Ji Zhang ◽

Tok Wang Ling ◽

Robert M. Bruckner ◽

A Min Tjoa

Keyword(s):

Data Warehouse ◽

Frequent Patterns ◽

Xml Data ◽

User Queries

Efficient Global Combine Operations in Multi-Port Message-Passing Systems

Parallel Processing Letters ◽

10.1142/s012962649300037x ◽

1993 ◽

Vol 03 (04) ◽

pp. 335-346 ◽

Cited By ~ 14

Author(s):

Jehoshua Bruck ◽

Ching-Tien Ho

Keyword(s):

Message Passing ◽

Efficient Algorithms ◽

Communication Model ◽

Communication Round ◽

Trade Offs ◽

Reduction Function

We present a class of efficient algorithms for global combine operations in k-port message-passing systems. In the k-port communication model, in each communication round, a processor can send data to k other processors and simultaneously receive data from k other processors. We consider algorithms for global combine operations in n processors with respect to a commutative and associative reduction function. Initially, each processor holds a vector of m data items and finally the result of the reduction function over the n vectors of data items, which is also a vector of m data items, is known to all n processors. We present three efficient algorithms that employ various trade-offs between the number of communication rounds and the number of data items transferred in sequence. For the case m=1, we have an algorithm which is optimal in both the number of communication rounds and the number of data items transferred in sequence.
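The k-port model and the global combine operation described above can be simulated in a few lines. The sketch below is not one of the paper's three algorithms; it is a generic dissemination-style scheme, written under the assumption that n is a power of k + 1, in which every processor sends its whole partial vector in every round. It therefore uses few communication rounds but makes no attempt to reduce the number of data items transferred. The function name `k_port_allreduce` and the sample data are hypothetical.

```python
import operator


def k_port_allreduce(vectors, k, op):
    """Simulate a global combine on n = len(vectors) processors with k ports.

    In each round, processor i sends its partial result to the k processors
    (i + j * stride) % n and receives from the k processors (i - j * stride) % n,
    for j = 1..k. Assumes op is commutative and associative, and that n is a
    power of k + 1 (an assumption of this sketch, not of the paper).
    Returns (final per-processor vectors, number of rounds).
    """
    n = len(vectors)
    size, rounds = 1, 0
    while size < n:
        size *= k + 1
        rounds += 1
    if size != n:
        raise ValueError("this sketch assumes n is a power of k + 1")

    state = [list(v) for v in vectors]
    stride = 1
    for _ in range(rounds):
        new_state = []
        for i in range(n):
            acc = state[i]
            for j in range(1, k + 1):
                incoming = state[(i - j * stride) % n]
                # Combine the two m-item vectors element by element.
                acc = [op(a, b) for a, b in zip(acc, incoming)]
            new_state.append(acc)
        state = new_state      # all processors advance simultaneously
        stride *= k + 1        # each partial now covers k + 1 times more inputs
    return state, rounds


if __name__ == "__main__":
    data = [[i, 10 * i] for i in range(9)]       # n = 9 processors, m = 2 items
    result, rounds = k_port_allreduce(data, k=2, op=operator.add)
    print(rounds, result[0])
```

After round r each processor holds the combination of (k + 1)^r consecutive inputs, so with n = (k + 1)^R the ranges never overlap and R rounds suffice. Sending only part of the vector per round, as the trade-offs in the abstract suggest, would need a different schedule.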