A review of associative classification mining

2007 ◽  
Vol 22 (1) ◽  
pp. 37-65 ◽  
Author(s):  
FADI THABTAH

Abstract: Associative classification mining is a promising approach in data mining that uses association rule discovery techniques to construct classification systems, also known as associative classifiers. In the last few years, a number of associative classification algorithms have been proposed, e.g. CPAR, CMAR, MCAR, MMAC and others. These algorithms employ several different rule discovery, rule ranking, rule pruning, rule prediction and rule evaluation methods. This paper surveys and compares the state-of-the-art associative classification techniques with regard to the above criteria. Finally, future directions in associative classification, such as incremental learning and mining low-quality data sets, are also highlighted.
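As a rough illustration of the rule discovery and rule evaluation steps mentioned in this abstract, the sketch below computes the support and confidence of a single class association rule. The function name rule_stats, the row layout and the toy data are hypothetical and not taken from the paper; the definitions used are the usual ones (support = share of all training cases that match the antecedent and carry the class, confidence = share of matching cases that carry the class).

```python
from typing import FrozenSet, List, Tuple

# Hypothetical layout: (items in the training case, class label).
Row = Tuple[FrozenSet[str], str]

def rule_stats(rows: List[Row], antecedent: FrozenSet[str], label: str) -> Tuple[float, float]:
    """Return (support, confidence) of the rule `antecedent -> label`."""
    # Class labels of every training case that contains the antecedent.
    covered = [cls for items, cls in rows if antecedent <= items]
    # Covered cases whose class agrees with the rule consequent.
    correct = sum(1 for cls in covered if cls == label)
    support = correct / len(rows) if rows else 0.0
    confidence = correct / len(covered) if covered else 0.0
    return support, confidence

# Toy training data, invented for illustration only.
rows = [
    (frozenset({"outlook=sunny", "windy=no"}), "play"),
    (frozenset({"outlook=sunny", "windy=yes"}), "stay"),
    (frozenset({"outlook=rain", "windy=no"}), "play"),
    (frozenset({"outlook=sunny", "windy=no"}), "play"),
]

support, confidence = rule_stats(rows, frozenset({"outlook=sunny"}), "play")
print(support, confidence)
```

An associative classifier would typically keep only rules whose support and confidence exceed user-chosen thresholds before ranking and pruning them.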

2020 ◽  
Vol 24 ◽  
pp. 105-122
Author(s):  
González-Méndez Andy ◽  
Martín Diana ◽  
Morales Eduardo ◽  
García-Borroto Milton

Associative classification is a pattern recognition approach that integrates classification and association rule discovery to build accurate classification models. These models are formed by a collection of contrast patterns that fulfill some restrictions. In this paper, we present an experimental comparison of the impact of different restrictions on classification accuracy. To the best of our knowledge, this is the first time that such an analysis has been performed, and it yields some interesting findings about how restrictions affect the classification results. Contrasting these results with previously published papers, we found that their conclusions could be unintentionally biased by the restrictions they used. We found, for example, that the jumping restriction could severely damage the pattern quality in the presence of dataset noise. We also found that the minimal support restriction has a different effect on the accuracy of two associative classifiers, so deciding which one is best depends on the support value. This paper opens some interesting lines of research, mainly in the creation of new restrictions and new pattern types by joining different restrictions.
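The sensitivity of the jumping restriction to noise can be sketched with a toy example. The code below assumes the common definition of a jumping pattern (non-zero support in exactly one class and zero support in every other class); the function names and data are hypothetical and are not the authors' experimental setup.

```python
def class_supports(rows, pattern):
    """Per-class support of `pattern`: fraction of each class's cases containing it."""
    counts, totals = {}, {}
    for items, label in rows:
        totals[label] = totals.get(label, 0) + 1
        if pattern <= items:
            counts[label] = counts.get(label, 0) + 1
    return {label: counts.get(label, 0) / totals[label] for label in totals}

def is_jumping(rows, pattern):
    """True if the pattern occurs in exactly one class (assumed definition)."""
    supports = class_supports(rows, pattern)
    return sum(1 for s in supports.values() if s > 0) == 1

pattern = frozenset({"x"})

# Clean toy data: "x" appears only in the "pos" class.
clean = [
    (frozenset({"x", "y"}), "pos"),
    (frozenset({"x"}), "pos"),
    (frozenset({"x", "z"}), "pos"),
    (frozenset({"y"}), "neg"),
    (frozenset({"z"}), "neg"),
]

# One noisy (for example, mislabelled) case is enough to break the restriction.
noisy = clean + [(frozenset({"x", "z"}), "neg")]

print(is_jumping(clean, pattern), is_jumping(noisy, pattern))
```

In this sketch the pattern still separates the classes well after the noisy case is added, yet a strict jumping filter would discard it.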


2012 ◽  
Vol 11 (02) ◽  
pp. 1250011 ◽  
Author(s):  
Neda Abdelhamid ◽  
Aladdin Ayesh ◽  
Fadi Thabtah ◽  
Samad Ahmadi ◽  
Wael Hadi

Associative classification (AC) is a data mining approach that uses association rule discovery methods to build classification systems (classifiers). Several research studies reveal that AC normally generates more accurate classifiers than classic classification data mining approaches such as rule induction, probabilistic methods and decision trees. This paper proposes a new multiclass AC algorithm called MAC. The proposed algorithm employs a novel method for building the classifier that normally reduces the resulting classifier size, making it easier for end-users to understand and maintain. Experiments on 19 different data sets from the UCI data repository, using different common AC and traditional learning approaches, have been conducted with reference to classification accuracy and the number of rules derived. The results show that the proposed algorithm derives more predictive classifiers than rule induction (RIPPER) and decision tree (C4.5) algorithms and is highly competitive with a known AC algorithm named MCAR. Furthermore, MAC produces fewer rules than MCAR in normal circumstances (standard support and confidence thresholds) and in severe circumstances (low support and confidence thresholds) for most of the data sets considered in the experiments.


2014 ◽  
Vol 13 (03) ◽  
pp. 1450027 ◽  
Author(s):  
Neda Abdelhamid ◽  
Fadi Thabtah

Associative classification (AC) is a promising data mining approach that integrates classification and association rule discovery to build classification models (classifiers). In the last decade, several AC algorithms have been proposed such as Classification based Association (CBA), Classification based on Predicted Association Rule (CPAR), Multi-class Classification using Association Rule (MCAR), Live and Let Live (L3) and others. These algorithms use different procedures for rule learning, rule sorting, rule pruning, classifier building and class allocation for test cases. This paper sheds light on and critically compares common AC algorithms with reference to the abovementioned procedures. Moreover, data representation formats in AC mining are discussed along with potential new research directions.


2010 ◽  
Vol 09 (01) ◽  
pp. 55-64 ◽  
Author(s):  
Fadi Thabtah ◽  
Qazafi Mahmood ◽  
Lee McCluskey ◽  
Hussein Abdel-Jaber

Associative classification is a branch of data mining that employs association rule discovery methods in classification problems. In this paper, we introduce a novel data mining method called Looking at the Class (LC), which can be utilised in the associative classification approach. Unlike known algorithms in associative classification such as Classification based on Association rule (CBA), which combine disjoint itemsets regardless of their class labels in the training phase, our method joins only itemsets with similar class labels. This avoids many unnecessary itemset combinations during the learning step, and consequently results in large savings in computational time and memory. Moreover, a new prediction method that utilises multiple rules to make the prediction decision is also developed in this paper. The experimental results on different UCI datasets reveal that the LC algorithm outperformed CBA with respect to classification accuracy, memory usage, and execution time on most of the datasets we consider.
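The effect of joining only itemsets that share a class label can be sketched as follows. This is a minimal illustration of the idea as described in the abstract, not the LC implementation; the function join_candidates, the data layout and the toy itemsets are hypothetical, and the class-blind branch only mirrors the abstract's description of joining regardless of class.

```python
from itertools import combinations

def join_candidates(ruleitems, same_class_only):
    """Join pairs of k-itemsets into (k+1)-itemset candidates.

    ruleitems: list of (frozenset_of_items, class_label), all of the same size k.
    same_class_only: if True, skip pairs whose class labels differ.
    """
    candidates = set()
    for (items_a, class_a), (items_b, class_b) in combinations(ruleitems, 2):
        if same_class_only and class_a != class_b:
            continue  # class-aware join: never combine across classes
        merged = items_a | items_b
        if len(merged) == len(items_a) + 1:  # the pair differs in exactly one item
            candidates.add(merged)
    return candidates

# Toy frequent 1-itemsets with their associated class labels (invented).
frequent = [
    (frozenset({"a"}), "yes"),
    (frozenset({"b"}), "yes"),
    (frozenset({"c"}), "no"),
    (frozenset({"d"}), "no"),
]

blind = join_candidates(frequent, same_class_only=False)
aware = join_candidates(frequent, same_class_only=True)
print(len(blind), len(aware))
```

On this toy input the class-aware join produces a third as many candidates, which is the kind of saving in time and memory the abstract refers to.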


2006 ◽  
Vol 05 (01) ◽  
pp. 13-20 ◽  
Author(s):  
Fadi Thabtah

Classification based on association rule mining, also known as associative classification, is a promising approach in data mining that builds accurate classifiers. In this paper, a rule ranking process within the associative classification approach is investigated. Specifically, two common rule ranking methods in associative classification are compared with reference to their impact on accuracy. We also propose a new rule ranking procedure that adds more tie-breaking conditions to the existing methods in order to reduce random rule selection. In particular, our method looks at the class distribution frequency associated with the tied rules and favours those that are associated with the majority class. We compare the impact of the proposed rule ranking method and two other methods presented in associative classification against 14 highly dense classification data sets. Our results indicate the effectiveness of the proposed rule ranking method on the quality of the resulting classifiers for the majority of the benchmark problems we consider. This provides evidence that adding more appropriate constraints to break ties between rules positively affects the predictive power of the resulting associative classifiers.
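A ranking with an extra class-frequency tie-breaker might look like the sketch below. The ordering of the first three criteria (higher confidence, then higher support, then fewer antecedent items) is an assumption chosen for illustration; the abstract only states that class distribution frequency is added as a further tie-breaking condition. The Rule class and rank_rules function are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass
from typing import FrozenSet, List

@dataclass(frozen=True)
class Rule:
    antecedent: FrozenSet[str]
    label: str
    support: float
    confidence: float

def rank_rules(rules: List[Rule], training_labels: List[str]) -> List[Rule]:
    """Sort rules best-first, breaking remaining ties in favour of the more frequent class."""
    class_freq = Counter(training_labels)
    return sorted(
        rules,
        key=lambda r: (
            -r.confidence,         # assumed 1st criterion: higher confidence
            -r.support,            # assumed 2nd criterion: higher support
            len(r.antecedent),     # assumed 3rd criterion: fewer items
            -class_freq[r.label],  # added tie-breaker: majority class first
        ),
    )

# Two rules tied on confidence, support and length; only their classes differ.
rules = [
    Rule(frozenset({"a"}), "no", 0.2, 0.9),
    Rule(frozenset({"b"}), "yes", 0.2, 0.9),
]
training_labels = ["yes", "yes", "yes", "no"]

ranked = rank_rules(rules, training_labels)
print([r.label for r in ranked])
```

Without the last key the two rules would keep their input order, so which one fires first would depend on an arbitrary choice rather than on the data.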


Algorithms ◽  
2021 ◽  
Vol 14 (3) ◽  
pp. 76
Author(s):  
Estrella Lucena-Sánchez ◽  
Guido Sciavicco ◽  
Ionel Eduard Stan

Air quality modelling that relates meteorological, car traffic, and pollution data is a fundamental problem, approached in several different ways in the recent literature. In particular, a set of such data sampled at a specific location and during a specific period of time can be seen as a multivariate time series, and modelling the values of the pollutant concentrations can be seen as a multivariate temporal regression problem. In this paper, we propose a new method for symbolic multivariate temporal regression, and we apply it to several data sets that contain real air quality data from the city of Wrocław (Poland). Our experiments show that our approach is superior to classical ones, especially symbolic ones, both in statistical performance and in the interpretability of the results.


2012 ◽  
Vol 24 (06) ◽  
pp. 513-524
Author(s):  
Mohsen Alavash Shooshtari ◽  
Keivan Maghooli ◽  
Kambiz Badie

One of the main objectives of data mining, a promising multidisciplinary field in computer science, is to provide classification models for decision support. In the medical imaging domain, mammogram classification is a difficult diagnostic task that calls for the development of automated classification systems. Associative classification, as a special case of association rule mining, has been adopted in classification problems for years. In this paper, an associative classification framework based on parallel mining of image blocks is proposed for mammogram discrimination. Association rule mining is applied to a commonly used mammography image database to classify digital mammograms into three categories: normal, benign and malignant. To do so, images are first preprocessed, and features are then extracted from non-overlapping image blocks and discretized for rule discovery. Association rules are then discovered through parallel mining of the transactional databases that correspond to the image blocks, and are finally used within a unique decision-making scheme to predict the class of unknown samples. Experiments are conducted to assess the effectiveness of the proposed framework. The results show that the framework is successful in terms of accuracy, precision, and recall, and suggest that it could be used as the core of any future associative classifier to support mammogram discrimination.


1994 ◽  
Vol 84 (3) ◽  
pp. 668-691 ◽  
Author(s):  
David J. Wald ◽  
Thomas H. Heaton

Abstract: We have determined a source rupture model for the 1992 Landers earthquake (Mw 7.2) compatible with multiple data sets, spanning a frequency range from zero to 0.5 Hz. Geodetic survey displacements, near-field and regional strong motions, broadband teleseismic waveforms, and surface offset measurements have been used explicitly to constrain both the spatial and temporal slip variations along the model fault surface. Our fault parameterization involves a variable-slip, multiple-segment, finite-fault model which treats the diverse data sets in a self-consistent manner, allowing them to be inverted both independently and in unison. The high-quality data available for the Landers earthquake provide an unprecedented opportunity for direct comparison of rupture models determined from independent data sets that sample both a wide frequency range and a diverse spatial station orientation with respect to the earthquake slip and radiation pattern. In all models, consistent features include the following: (1) similar overall dislocation patterns and amplitudes with seismic moments of 7 to 8 × 10²⁶ dyne-cm (seismic potency of 2.3 to 2.7 km³); (2) very heterogeneous, unilateral strike slip distributed over a fault length of 65 km and over a width of at least 15 km, though slip is limited to shallower regions in some areas; (3) a total rupture duration of 24 sec and an average rupture velocity of 2.7 km/sec; and (4) substantial variations of slip with depth relative to measured surface offsets. The extended rupture length and duration of the Landers earthquake also allowed imaging of the propagating rupture front with better resolution than for those of prior shorter-duration, strike-slip events. Our imaging allows visualization of the rupture evolution, including local differences in slip durations and variations in rupture velocity.
Rupture velocity decreases markedly at shallow depths, as well as near regions of slip transfer from one fault segment to the next, as rupture propagates northwestward along the multiply segmented fault length. The rupture front slows as it reaches the northern limit of the Johnson Valley/Landers faults where slip is transferred to the southern Homestead Valley fault; an abrupt acceleration is apparent following the transfer. This process is repeated, and is more pronounced, as slip is again passed from the northern Homestead Valley fault to the Emerson fault. Although the largest surface offsets were observed at the northern end of the rupture, our modeling indicates that substantial rupture was also relatively shallow (less than 10 km) in this region.
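The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a rough consistency check under stated assumptions: it divides fault length by rupture duration as a crude average velocity for a unilateral rupture, takes potency as moment divided by rigidity, pairs the lower moment with the lower potency, and uses the Hanks and Kanamori moment magnitude relation with the moment in dyne-cm. None of this reproduces the authors' inversion.

```python
import math

# Values quoted in the abstract.
fault_length_km = 65.0
rupture_duration_s = 24.0
moments_dyne_cm = (7e26, 8e26)
potencies_km3 = (2.3, 2.7)

# Crude average rupture velocity for a unilateral rupture (assumption).
avg_velocity_km_s = fault_length_km / rupture_duration_s

# Potency = moment / rigidity, so rigidity = moment / potency (dyne/cm^2).
CM3_PER_KM3 = 1e15  # (1e5 cm)^3
rigidities = [
    m / (p * CM3_PER_KM3) for m, p in zip(moments_dyne_cm, potencies_km3)
]

# Moment magnitude, Hanks and Kanamori relation with M0 in dyne-cm.
magnitudes = [(2.0 / 3.0) * math.log10(m) - 10.7 for m in moments_dyne_cm]

print(f"average rupture velocity: {avg_velocity_km_s:.2f} km/s")
print("implied rigidity (dyne/cm^2):", [f"{r:.2e}" for r in rigidities])
print("moment magnitude:", [f"{mw:.2f}" for mw in magnitudes])
```

Under these assumptions the quoted length and duration agree with the stated 2.7 km/sec, the moment and potency ranges imply a rigidity near 3 × 10¹¹ dyne/cm², and both moments round to magnitude 7.2.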

