A FUZZY DATA MINING ALGORITHM FOR INCREMENTAL MINING OF QUANTITATIVE SEQUENTIAL PATTERNS

Author(s):  
R. B. V. SUBRAMANYAM ◽  
A. GOSWAMI

In real-world applications, databases are constantly extended with large numbers of new transactions, so keeping the sequential patterns valid on the updated database is crucial. Existing data mining algorithms can incrementally mine sequential patterns from databases with binary values. However, temporal transactions with quantitative values are commonly seen in real-world applications. In addition, several methods have been proposed for representing uncertain data in a database. In this paper, a fuzzy data mining algorithm for incremental mining of sequential patterns from quantitative databases is proposed. The proposed algorithm, called IQSP, uses the fuzzy grid notion to generate fuzzy sequential patterns that are valid on the updated database, which contains the transactions of both the original database and the incremental database. It uses the information about sequential patterns already mined from the original database and thus avoids starting from scratch. It also minimizes the number of candidates to check, as well as the number of scans of the original database, by identifying the potential sequences in the incremental database.

Author(s):  
Yi-Chung Hu ◽  
Ruey-Shun Chen ◽  
Gwo-Hshiung Tzeng ◽  
Jia-Hourng Shieh

Since fuzzy knowledge representation can facilitate interaction between an expert system and its users, the effective construction of a fuzzy knowledge base is important. Fuzzy sequential patterns described in natural language are one type of fuzzy knowledge representation, and can thus be helpful in building a prototype fuzzy knowledge base. We define a fuzzy sequence as an ordered list of frequent fuzzy grids, and the length of a fuzzy sequence as the number of frequent fuzzy grids it contains. Frequent fuzzy grids and frequent fuzzy sequences are determined by comparing their individual fuzzy supports with the user-specified minimum fuzzy support. A fuzzy sequential pattern is a frequent fuzzy sequence that is not contained in any other frequent fuzzy sequence. In this paper, an effective algorithm called the Fuzzy Grids Based Sequential Patterns Mining Algorithm (FGBSPMA) is proposed to generate fuzzy sequential patterns. A numerical example analyzing user visits to websites demonstrates the usefulness of the proposed algorithm.
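The support test described in this abstract can be illustrated with a minimal sketch. It is not the FGBSPMA implementation: the triangular membership functions, the 0 to 10 value scale, the use of the product to combine membership degrees, and all names (`triangular`, `TERMS`, `fuzzy_support`, the `pages` and `minutes` attributes) are assumptions made for illustration only.

```python
from typing import Callable, Dict, List

Membership = Callable[[float], float]


def triangular(a: float, b: float, c: float) -> Membership:
    """Triangular membership function: 0 at a and c, 1 at the peak b (a < b < c)."""
    def mu(x: float) -> float:
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu


# Hypothetical linguistic terms on an assumed 0..10 scale.
TERMS: Dict[str, Membership] = {
    "low": triangular(-5, 0, 5),
    "medium": triangular(0, 5, 10),
    "high": triangular(5, 10, 15),
}


def fuzzy_support(records: List[Dict[str, float]], grid: Dict[str, str]) -> float:
    """Average degree to which the records belong to a fuzzy grid.

    A grid maps each attribute to one linguistic term. The degrees of the
    individual attributes are combined with the product (an assumption).
    """
    total = 0.0
    for rec in records:
        degree = 1.0
        for attr, term in grid.items():
            degree *= TERMS[term](rec[attr])
        total += degree
    return total / len(records)


def is_frequent(records: List[Dict[str, float]], grid: Dict[str, str],
                min_support: float) -> bool:
    """A grid is frequent when its fuzzy support reaches the minimum support."""
    return fuzzy_support(records, grid) >= min_support


# Made-up example data, not taken from the paper.
records = [
    {"pages": 0, "minutes": 10},
    {"pages": 5, "minutes": 10},
    {"pages": 10, "minutes": 5},
    {"pages": 5, "minutes": 7.5},
]
one_dim = fuzzy_support(records, {"pages": "medium"})
two_dim = fuzzy_support(records, {"pages": "medium", "minutes": "high"})
```

With this made-up data, the one-dimensional grid has a higher fuzzy support than the two-dimensional grid that extends it, which is why a minimum-support threshold can prune longer candidates.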


2011 ◽  
pp. 44-60 ◽  
Author(s):  
Tzung-Pei Hong ◽  
Ching-Yao Wang

Developing an efficient mining algorithm that can incrementally maintain discovered information as a database grows is quite important in the field of data mining. In the past, we proposed an incremental mining algorithm for maintenance of association rules as new transactions were inserted. Deletion of records in databases is, however, commonly seen in real-world applications. In this chapter, we first review the maintenance of association rules from data insertion and then attempt to extend it to solve the data deletion issue. The concept of pre-large itemsets is used to reduce the need for rescanning the original database and to save maintenance costs. A novel algorithm is proposed to maintain discovered association rules for deletion of records. The proposed algorithm doesn’t need to rescan the original database until a number of records have been deleted. If the database is large, then the number of deleted records allowed will be large too. Therefore, as the database grows, our proposed approach becomes increasingly efficient. This characteristic is especially useful for real-world applications.
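The claim that the number of tolerated deletions grows with the database size can be made concrete with a small sketch. The abstract does not give the thresholds or the bound, so everything here is an assumption: two support thresholds (a lower one for pre-large itemsets and an upper one for large itemsets), expressed as integer percentages, and a bound derived from the worst case in which no deleted record contains a small itemset. The function names are hypothetical.

```python
def classify(count: int, db_size: int, lower_pct: int, upper_pct: int) -> str:
    """Label an itemset by its support, using two assumed percentage thresholds."""
    if count * 100 >= upper_pct * db_size:
        return "large"
    if count * 100 >= lower_pct * db_size:
        return "pre-large"
    return "small"


def deletion_safety_bound(db_size: int, lower_pct: int, upper_pct: int) -> int:
    """Number of deletions after which a small itemset still cannot be large.

    Worst case: an itemset with count < lower * d loses no occurrences while
    t records are deleted. It stays below the upper threshold as long as
    lower * d <= upper * (d - t), i.e. t <= (upper - lower) * d / upper.
    Integer arithmetic avoids floating-point rounding in the floor.
    """
    return (upper_pct - lower_pct) * db_size // upper_pct


bound_small_db = deletion_safety_bound(100, lower_pct=40, upper_pct=50)
bound_large_db = deletion_safety_bound(1000, lower_pct=40, upper_pct=50)
```

Under these assumptions the bound scales linearly with the database size, which matches the abstract's remark that a larger database allows more deletions before a rescan is needed.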


2006 ◽  
Vol 05 (03) ◽  
pp. 243-257
Author(s):  
R. B. V. Subramanyam ◽  
A. Goswami

Incremental mining algorithms that derive the latest mining output by making use of previous mining results are attractive to business organisations. In this paper, a fuzzy data mining algorithm for incremental mining of frequent fuzzy grids from quantitative dynamic databases is proposed. It extends the traditional association rule problem by allowing a weight to be associated with each item in a transaction and with each transaction in a database to reflect the interest/intensity of items and transactions. It uses the information about fuzzy grids already mined from the original database and thus avoids starting from scratch. In addition, we deal with "weights-of-significance", which are automatically regulated as the incremental databases evolve and are absorbed into the original database. We maintain "hopeful fuzzy grids" and "frequent fuzzy grids", and our algorithm changes the status of previously discovered grids so that they reflect the pattern drift in the updated quantitative databases. Our heuristic approach avoids maintaining many "hopeful fuzzy grids" at the initial level. The algorithm is illustrated with a numerical example, and experimental results are also reported.


Author(s):  
Zhi-Hua Zhou

Data mining attempts to identify valid, novel, potentially useful, and ultimately understandable patterns from huge volumes of data. The mined patterns must be ultimately understandable because the purpose of data mining is to aid decision-making. If decision-makers cannot understand what a mined pattern means, then the pattern cannot be used well. Since most decision-makers are not data mining experts, the patterns should ideally be in a style comprehensible to non-experts. Comprehensibility, that is, the ability of a data mining algorithm to produce patterns understandable to human beings, is therefore an important factor.


2022 ◽  
Vol 14 (1) ◽  
pp. 0-0

Utility mining with negative item values has recently received interest in the data mining field due to its practical considerations. Previously, the utility values of item-sets have been assumed to be positive. However, in real-world applications an item-set may involve negative item values. This paper presents a method for redesigning the ordering policy by including high utility item-sets with negative items. Initially, a utility mining algorithm is used to find high utility item-sets. Then, an ordering policy is estimated for high utility items, considering defective and non-defective items. A numerical example is given to validate the results.
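A minimal sketch of how an item-set's utility might be computed when some items carry negative values. It assumes the common definition of utility as quantity times unit value, summed over the transactions that contain every item of the item-set. The data, the threshold, and the names (`unit_value`, `itemset_utility`, `is_high_utility`) are invented for illustration and are not taken from the paper.

```python
from typing import Dict, FrozenSet, List

# Hypothetical unit values; item "B" has a negative value.
unit_value: Dict[str, int] = {"A": 5, "B": -2, "C": 3}

# Each transaction maps an item to its purchased quantity (made-up data).
transactions: List[Dict[str, int]] = [
    {"A": 2, "B": 1},
    {"A": 1, "C": 4},
    {"A": 3, "B": 2, "C": 1},
]


def itemset_utility(itemset: FrozenSet[str]) -> int:
    """Sum quantity * unit value over transactions containing the whole item-set."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * unit_value[item] for item in itemset)
    return total


def is_high_utility(itemset: FrozenSet[str], min_utility: int) -> bool:
    """An item-set is high utility when its utility reaches the threshold."""
    return itemset_utility(itemset) >= min_utility


u_a = itemset_utility(frozenset({"A"}))
u_ab = itemset_utility(frozenset({"A", "B"}))
u_ac = itemset_utility(frozenset({"A", "C"}))
```

In this toy data, extending {A} with the negative item B lowers the utility, while extending it with C raises it, so utility is neither monotone nor anti-monotone in the item-set.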


Author(s):  
TZUNG-PEI HONG ◽  
CHAN-SHENG KUO ◽  
SHENG-CHAI CHI

Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. Most conventional data-mining algorithms identify the relationships among transactions using binary values. Transactions with quantitative values are, however, commonly seen in real-world applications. We previously proposed a fuzzy mining algorithm in which each attribute used only the linguistic term with the maximum cardinality in the mining process. The number of items was thus the same as the number of original attributes, which reduced the processing time. The fuzzy association rules derived in this way are, however, not complete. This paper thus modifies that approach and proposes a new fuzzy data-mining algorithm for extracting interesting knowledge from transactions stored as quantitative values. The proposed algorithm can derive a more complete set of rules, but requires more computation time than the previous method. A trade-off thus exists between computation time and the completeness of the rules. Choosing an appropriate method thus depends on the requirements of the application domain.
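The earlier strategy mentioned in this abstract, keeping only the linguistic term with the maximum cardinality for each attribute, can be sketched as follows. This is an illustration under assumptions, not the authors' code: the triangular membership functions, the 0 to 10 quantity scale, the scalar cardinality (sum of membership degrees), and all names and data are hypothetical.

```python
from typing import Callable, Dict, List

Membership = Callable[[float], float]


def triangular(a: float, b: float, c: float) -> Membership:
    """Triangular membership function: 0 at a and c, 1 at the peak b (a < b < c)."""
    def mu(x: float) -> float:
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu


# Hypothetical linguistic terms shared by all attributes.
TERMS: Dict[str, Membership] = {
    "low": triangular(-5, 0, 5),
    "middle": triangular(0, 5, 10),
    "high": triangular(5, 10, 15),
}


def cardinalities(values: List[float]) -> Dict[str, float]:
    """Scalar cardinality of each term: the sum of membership degrees."""
    return {name: sum(mu(v) for v in values) for name, mu in TERMS.items()}


def max_cardinality_term(values: List[float]) -> str:
    """Keep only the term with the largest cardinality for one attribute."""
    counts = cardinalities(values)
    return max(counts, key=counts.get)


# Made-up purchase quantities per attribute across four transactions.
quantities: Dict[str, List[float]] = {
    "milk": [0, 5, 5, 10],
    "bread": [10, 7.5, 10, 2.5],
}
best_terms = {attr: max_cardinality_term(vals) for attr, vals in quantities.items()}
```

Keeping one term per attribute leaves as many fuzzy items as attributes, which is the source of the speed-up described above. Retaining all terms instead would enlarge the candidate set and allow more rules to be found, at a higher computational cost.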


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Jiangang Sun ◽  
Xiaoran Jiang ◽  
Guoliang Yuan ◽  
Zhenhuai Chen

With the continuous improvement of living standards, the level of physical development of adolescents has improved significantly. However, the development of adolescents' physical functions and health is relatively slow and even appears to be declining. To address this problem and enhance young people's physical fitness and mental health, this paper proposes a novel data mining algorithm based on big data for monitoring adolescent students' physical health. Since big data technology has positive practical significance in promoting young people's healthy development and individual health rights, this article implements commonly used data mining algorithms on Hadoop/Spark big data processing platforms. Running the algorithms on different platforms and comparing the running times verified that the big data platform offers good computing performance for the data mining algorithms. The work provides a complete physical health data management system that can effectively store, process, and analyze adolescents' physical test data.

