Efficient mining of association rules in large dynamic databases

Author(s):  
Edward Omiecinski ◽  
Ashok Savasere
2017 ◽  
Vol 26 (1) ◽  
pp. 69-85
Author(s):  
Mohammed M. Fouad ◽  
Mostafa G.M. Mostafa ◽  
Abdulfattah S. Mashat ◽  
Tarek F. Gharib

Abstract: Association rules provide important knowledge that can be extracted from transactional databases. Owing to the massive exchange of information nowadays, databases become dynamic and change rapidly and periodically: new transactions are added to the database and/or old transactions are updated or removed from the database. Incremental mining was introduced to overcome the problem of maintaining previously generated association rules in dynamic databases. In this paper, we propose an efficient algorithm (IMIDB) for incremental itemset mining in large databases. The algorithm utilizes the trie data structure for indexing dynamic database transactions. Performance comparison of the proposed algorithm to recently cited algorithms shows that a significant improvement of about two orders of magnitude is achieved by our algorithm. Also, the proposed algorithm exhibits linear scalability with respect to database size.
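The abstract does not describe IMIDB's internals, so the following is only a minimal sketch of the general idea it names: indexing transactions in a trie so that additions and removals update the index in place, without rescanning the database. The class and method names (TransactionTrie, add, remove, support) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    # Number of stored transactions whose sorted item list passes through this node.
    count: int = 0
    children: dict = field(default_factory=dict)


class TransactionTrie:
    """Illustrative prefix tree over sorted transactions (an assumption, not IMIDB itself)."""

    def __init__(self):
        self.root = TrieNode()
        self.size = 0  # number of transactions currently indexed

    def add(self, transaction):
        # Insert the transaction as a path of its items in sorted order.
        node = self.root
        for item in sorted(set(transaction)):
            node = node.children.setdefault(item, TrieNode())
            node.count += 1
        self.size += 1

    def remove(self, transaction):
        # Assumes the transaction was added earlier; raises KeyError if its path is absent.
        path = []
        node = self.root
        for item in sorted(set(transaction)):
            child = node.children[item]
            path.append((node, item, child))
            node = child
        for parent, item, child in path:
            child.count -= 1
            if child.count == 0:
                del parent.children[item]  # prune branches no transaction uses
        self.size -= 1

    def support(self, itemset):
        # Count the indexed transactions that contain every item of the itemset.
        items = sorted(set(itemset))
        if not items:
            return self.size
        return self._count(self.root, items, 0)

    def _count(self, node, items, i):
        target = items[i]
        total = 0
        for item, child in node.children.items():
            if item == target:
                if i + 1 == len(items):
                    total += child.count
                else:
                    total += self._count(child, items, i + 1)
            elif item < target:
                total += self._count(child, items, i)
            # item > target: paths are sorted, so the target cannot occur deeper.
        return total


trie = TransactionTrie()
for tx in (["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]):
    trie.add(tx)
print(trie.support({"a", "c"}))
trie.remove(["a", "c"])
print(trie.support({"a", "c"}))
```

In this sketch an update touches only the nodes along one transaction's path, which is the property that makes a trie index attractive for dynamic databases; how IMIDB actually derives frequent itemsets from its trie is not stated in the abstract.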


2005 ◽  
Vol 44 (05) ◽  
pp. 639-646 ◽  
Author(s):  
G. Gogou ◽  
P. D. Bamidis ◽  
I. Vlahavas ◽  
N. Maglaveras ◽  
S. Konias

Summary

Objectives: Contemporary literature offers an abundance of adaptive algorithms for mining association rules. However, most of this literature cannot deal with peculiarities such as missing values and dynamic data creation, which are frequently encountered in fields like medicine. This paper proposes an uncertainty rule method that uses an adaptive threshold for filling missing values in newly added records. The approach to mining uncertainty rules and filling missing values is particularly suitable for dynamic databases, like the ones used in home care systems.

Methods: In this study, a new data mining method named FiMV (Filling Missing Values) is presented, based on the mined uncertainty rules. Uncertainty rules have a structure quite similar to that of association rules and are extracted by an algorithm proposed in previous work, namely AURG (Adaptive Uncertainty Rule Generation). The main target was to implement an appropriate method for recovering missing values in a dynamic database, where new records are continuously added, without needing to specify any kind of threshold beforehand.

Results: The method was applied to a home care monitoring system database. Multiple missing values were randomly introduced into the attributes of each record in the initial dataset (at rates of 5-20%, in 5% increments). FiMV demonstrated 100% completion rates with over 90% success in each case, while the usual approaches, in which all records with missing values are ignored or thresholds are required, showed significantly reduced completion and success rates.

Conclusions: It is concluded that the proposed method is appropriate for the data-cleaning step of the Knowledge Discovery process in databases. Data cleaning, which is highly significant for the output efficiency of any data mining technique, can improve the quality of the mined information.
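The abstract does not give the details of AURG or FiMV, so the following is only a rough sketch of the general idea of filling a missing value from rules mined on existing records without a preset threshold. It assumes single-antecedent rules ranked by confidence and simply applies the most confident applicable rule; the function names and the attribute names (pulse, activity, alert) are hypothetical.

```python
from collections import Counter, defaultdict


def mine_rules(records, target):
    """Map each (attribute, value) antecedent to (most likely target value, confidence).

    Only records in which the target is present contribute to the counts.
    """
    counts = defaultdict(Counter)
    for rec in records:
        if rec.get(target) is None:
            continue
        for attr, val in rec.items():
            if attr != target and val is not None:
                counts[(attr, val)][rec[target]] += 1
    rules = {}
    for antecedent, seen in counts.items():
        value, hits = seen.most_common(1)[0]
        rules[antecedent] = (value, hits / sum(seen.values()))
    return rules


def fill_missing(record, target, rules):
    """Return a copy of record with target filled by the most confident applicable rule."""
    best = None
    for attr, val in record.items():
        if attr == target or val is None:
            continue
        rule = rules.get((attr, val))
        if rule is not None and (best is None or rule[1] > best[1]):
            best = rule
    filled = dict(record)
    if best is not None:
        filled[target] = best[0]  # no fixed cutoff: the best available rule is used
    return filled


# Hypothetical home care style records; the values are made up for illustration.
history = [
    {"pulse": "high", "activity": "rest", "alert": "yes"},
    {"pulse": "high", "activity": "walk", "alert": "yes"},
    {"pulse": "normal", "activity": "rest", "alert": "no"},
    {"pulse": "normal", "activity": "walk", "alert": "no"},
    {"pulse": "high", "activity": "rest", "alert": "yes"},
]
rules = mine_rules(history, "alert")
new_record = {"pulse": "normal", "activity": "rest", "alert": None}
print(fill_missing(new_record, "alert", rules))
```

Choosing the best available rule instead of comparing against a fixed minimum confidence is one simple way to avoid specifying a threshold beforehand; the adaptive threshold described in the abstract may work quite differently.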


2010 ◽  
Vol 58 (Supplement 1) ◽  
pp. 1-5 ◽  
Author(s):  
M. Jolánkai ◽  
F. Nyárai ◽  
K. Kassai

Long-term trials have a twofold role in life sciences, acting as both live laboratories and public collections. Long-term trials are not simply scientific curios or the honoured relics of a museum, but highly valuable live ecological models that can never be replaced or restarted if once terminated or suspended. These trials provide valuable and dynamic databases for solving scientific problems. The present paper is intended to give a brief summary of the crop production aspects of long-term trials.


2014 ◽  
Vol 1 (1) ◽  
pp. 339-342
Author(s):  
Mirela Danubianu ◽  
Dragos Mircea Danubianu

Abstract: Speech therapy can be viewed as a business in the logopaedic area that aims to offer services for correcting language. Proper treatment of speech impairments improves the efficiency of therapy, so a therapist must continuously learn how to adjust their therapy methods to the patient's characteristics. Using Information and Communication Technology in this area has allowed a lot of data to be collected regarding various aspects of treatment. These data can be used in a data mining process in order to find useful and usable patterns and models that help therapists improve their specific education. Clustering, classification or association rules can provide unexpected information that helps to complete the therapist's knowledge and to adapt the therapy to the patient's needs.

