rule pruning
Recently Published Documents



Semantic Web ◽  
2021 ◽  
pp. 1-34
Author(s):  
Václav Zeman ◽  
Tomáš Kliegr ◽  
Vojtěch Svátek

AMIE+ is a state-of-the-art algorithm for learning rules from RDF knowledge graphs (KGs). Based on association rule learning, AMIE+ constituted a breakthrough in terms of speed on large data compared to the previous generation of ILP-based systems. In this paper we present several algorithmic extensions to AMIE+ that make it faster, as well as support for data pre-processing and model post-processing, which provides more comprehensive coverage of the linked data mining process than the original AMIE+ implementation. The main contributions are related to performance improvement: (1) a top-k approach, which addresses the combinatorial explosion that often results from a hand-set minimum support threshold, (2) a grammar for defining fine-grained patterns that reduce the size of the search space, and (3) a faster projection binding that reduces the number of repeated calculations. Other enhancements include the possibility to mine across multiple graphs, support for discretization of continuous values, and the selection of the most representative rules using proven rule pruning and clustering algorithms. Benchmarks show reductions in mining time of up to several orders of magnitude compared to AMIE+. An open-source implementation is available under the name RDFRules at https://github.com/propi/rdfrules.
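The top-k idea in contribution (1) can be illustrated with a minimal Python sketch. It is not taken from RDFRules: the function name `top_k_rules` and the `(rule, support)` input format are assumptions made for illustration. The sketch keeps the k best candidates in a min-heap, so the smallest support in the heap acts as a minimum support threshold that rises on its own instead of being set by hand.

```python
import heapq


def top_k_rules(candidates, k):
    """Keep the k candidates with the highest support.

    candidates: iterable of (rule, support) pairs (hypothetical format).
    Returns (rules ranked by support, final dynamic threshold).
    """
    heap = []  # min-heap of (support, rule); heap[0] is the weakest kept rule
    for rule, support in candidates:
        if len(heap) < k:
            heapq.heappush(heap, (support, rule))
        elif support > heap[0][0]:
            # Candidate beats the current threshold: swap out the weakest rule.
            heapq.heapreplace(heap, (support, rule))
        # Otherwise the candidate falls at or below the threshold and is skipped.
    threshold = heap[0][0] if len(heap) == k else 0
    ranked = sorted(heap, key=lambda pair: pair[0], reverse=True)
    return [(rule, support) for support, rule in ranked], threshold


candidates = [("r1", 5), ("r2", 12), ("r3", 7), ("r4", 3), ("r5", 9)]
best, threshold = top_k_rules(candidates, k=3)
print(best, threshold)
```

This sketch only filters a flat list of candidates. A real miner would presumably also use the rising threshold to stop refining rules that can no longer reach the top k.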


Electronics ◽  
2020 ◽  
Vol 9 (5) ◽  
pp. 794
Author(s):  
Ganesan Jothi ◽  
Hannah H. Inbarani ◽  
Ahmad Taher Azar ◽  
Anis Koubaa ◽  
Nashwa Ahmad Kamal ◽  
...  

Acute lymphoblastic leukemia is a well-known type of pediatric cancer that affects the blood and bone marrow. If left untreated, it becomes fatal as it proliferates into the circulatory system and other vital organs. Worldwide, leukemia affects both children and adults. Early diagnosis of leukemia is essential for the recovery of patients, particularly in the case of children. Computational tools for medical image analysis are therefore of significant use and have become a focus of research in medical image processing. The particle swarm optimization (PSO) algorithm is employed to segment the nucleus in the leukemia image. Texture, shape, and color features are extracted from the nucleus. In this article, an improved dominance soft set-based decision rules with pruning (IDSSDRP) algorithm is proposed to predict the blast and non-blast cells of leukemia. The approach proceeds in three distinct phases: (i) improved dominance soft set-based attribute reduction using the AND operation in multi-soft set theory, (ii) generation of decision rules using the dominance soft set, and (iii) rule pruning. The efficiency of the proposed system is compared with other benchmark classification algorithms. The research outcomes demonstrate that the derived rules efficiently classify cancer and non-cancer cells. Classification metrics are applied along with receiver operating characteristic (ROC) curve analysis to evaluate the efficiency of the proposed framework.


The knowledge discovery process relies on two essential data mining techniques, association and classification. Associative classification produces a large number of class association rules for a given observation. Pruning removes unnecessary class association rules without losing classification accuracy. These processes are significant but also challenging. The experimental results and limitations of existing class association rule mining techniques show that more pruning parameters need to be considered so that the size of the classifier can be further optimized. In this paper we survey various strategies for class association rule pruning and study their effects, which enables us to extract an efficient, compact, and high-confidence class association rule set. We also propose a pruning methodology.
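One widely used pruning strategy of the kind such surveys cover is database coverage pruning: rules are ranked by confidence and support, and a rule is kept only if it correctly classifies at least one training row not yet covered by a higher-ranked rule. The Python sketch below is a minimal illustration under assumed data structures (the dictionary keys and the function name `coverage_prune` are hypothetical); it is not the methodology proposed in the paper above.

```python
def coverage_prune(rules, rows):
    """Database coverage pruning over class association rules.

    rules: list of dicts with keys 'name', 'items' (frozenset), 'label',
           'conf', 'supp' (hypothetical structure).
    rows:  list of (frozenset_of_items, label) training examples.
    """
    # Rank by confidence, then support, then prefer shorter antecedents.
    ranked = sorted(rules, key=lambda r: (-r["conf"], -r["supp"], len(r["items"])))
    remaining = list(rows)
    kept = []
    for rule in ranked:
        if not remaining:
            break
        matched = [row for row in remaining if rule["items"] <= row[0]]
        # Keep the rule only if it classifies some uncovered row correctly.
        if any(label == rule["label"] for _, label in matched):
            kept.append(rule)
            remaining = [row for row in remaining if not rule["items"] <= row[0]]
    return kept


rows = [
    (frozenset({"a", "b"}), "yes"),
    (frozenset({"a"}), "yes"),
    (frozenset({"b"}), "no"),
    (frozenset({"c"}), "no"),
]
rules = [
    {"name": "R1", "items": frozenset({"a"}), "label": "yes", "conf": 1.0, "supp": 2},
    {"name": "R2", "items": frozenset({"a", "b"}), "label": "yes", "conf": 1.0, "supp": 1},
    {"name": "R3", "items": frozenset({"b"}), "label": "no", "conf": 0.5, "supp": 1},
    {"name": "R4", "items": frozenset({"c"}), "label": "no", "conf": 1.0, "supp": 1},
]
print([r["name"] for r in coverage_prune(rules, rows)])
```

In this toy data, R2 is dropped because every row it matches is already covered by the more general R1.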


Author(s):  
Hayder Naser Khraibet AL-Behadili ◽  
Ku Ruhana Ku-Mahamud ◽  
Rafid Sagban

Ant colony optimization (ACO) has been successfully applied to the data mining classification task through ant-mining algorithms. Exploration and exploitation are search strategies that guide the learning process of a classification model and generate a list of rules. Exploitation refers to the process of intensifying the search for neighbors in good regions, whereas exploration aims towards new promising regions during a search process. The existing balance between exploration and exploitation in the rule construction procedure is limited to the roulette wheel selection mechanism, which complicates rule generation. As a result, low-coverage complex rules with irrelevant terms are generated. This work proposes an enhanced rule pruning procedure for the ACO algorithm that can be used in rule-based classification. This procedure, called the annealing strategy, is an improvement of ant-mining algorithms in the rule construction procedure. Presented as a pre-pruning technique, the annealing strategy deals first with irrelevant terms before creating a complete rule through an annealing schedule. The proposed improvement was tested through benchmarking experiments, and results were compared with those of four of the most related ant-mining algorithms, namely, Ant-Miner, CAnt-Miner, TACO-Miner, and Ant-Miner with hybrid pruner. Results show that the proposed technique achieves better performance in terms of classification accuracy, model size, and computational time. The proposed annealing schedule can be used in other ACO variants for different applications to improve classification accuracy.


Author(s):  
Emad Alsukhni ◽  
Ahmed AlEroud ◽  
Ahmad A. Saifan

Association rule mining is a useful knowledge discovery technique for identifying co-occurrence patterns in transactional data sets. In this article, the authors propose an ontology-based framework to discover multi-dimensional association rules at different levels of a given ontology, based on user-defined pre-processing constraints that may be identified using (1) a hierarchy discovered in the datasets, (2) the dimensions of those datasets, or (3) the features of each dimension. The proposed framework has post-processing constraints to drill down or roll up based on the rule level, making it possible to check the validity of the discovered rules in terms of the support and confidence measures without re-applying association rule mining algorithms. The authors conducted several preliminary experiments to test the framework using the Titanic dataset by identifying the association rules after pre- and post-constraints are applied. The results show that the framework can be practically applied for rule pruning and for discovering novel association rules.
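For reference, the support and confidence measures mentioned above can be computed as in the following Python sketch. The function name `support_confidence` and the toy transactions are assumptions for illustration only; the rows are invented and are not drawn from the actual Titanic dataset.

```python
def support_confidence(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    transactions: list of sets of items (hypothetical toy encoding).
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    # Support: share of all transactions containing both sides of the rule.
    support = n_both / len(transactions)
    # Confidence: share of antecedent matches that also contain the consequent.
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Invented rows in the style of a passenger table (not real data).
transactions = [
    {"class=1st", "sex=female", "survived=yes"},
    {"class=1st", "sex=male", "survived=no"},
    {"class=3rd", "sex=female", "survived=yes"},
    {"class=3rd", "sex=male", "survived=no"},
]
print(support_confidence(transactions, {"sex=female"}, {"survived=yes"}))
print(support_confidence(transactions, {"class=1st"}, {"survived=yes"}))
```

A rule whose support or confidence falls below a user-chosen threshold would be a candidate for pruning.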


Author(s):  
Feng Guo ◽  
Shaozi Li ◽  
Ying Dai ◽  
Changle Zhou ◽  
Ying Lin

Spirit diagnosing is an important theory in TCM (Traditional Chinese Medicine), by which a TCM doctor can diagnose a patient's body state. However, this theory is complicated and difficult to master from books alone. To further the theory and skill of spirit diagnosing, the authors propose a remote education system that accepts videos from a user and returns an automatically diagnosed spirit state. The key technology in this system is eye feature computation for spirit diagnosing, in which rules describing the state of "the spirit" (in TCM, spirit refers to a person's mental state, which reflects their general physical condition) are mined from quantitative features of the human eyes. From videos capturing eye condition over a short period, a set of eye features is extracted. On this basis, attribute intervals of the eye feature space are generated by CAIM (class-attribute interdependence maximization). Candidate rules are then mined by association rule mining based on the cloud model. Finally, three complementary rule-pruning methods are modified and combined to trim the candidate rules. Cross-validation of the mined rules gives an average accuracy of 93%, which shows the high performance of the proposed method.
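The CAIM criterion used for discretization scores a candidate set of intervals by how strongly each interval is dominated by a single class. The Python sketch below computes the standard CAIM value from a quanta matrix; the function name `caim` and the example matrices are illustrative assumptions and do not come from the paper.

```python
def caim(quanta):
    """CAIM criterion for a discretization scheme.

    quanta[r][c] is the number of samples of class c that fall in interval r.
    Returns the mean over intervals of max_r**2 / M_r, where max_r is the
    largest class count in interval r and M_r is the interval's total count.
    """
    total = 0.0
    for row in quanta:
        interval_count = sum(row)
        if interval_count:  # skip empty intervals to avoid division by zero
            total += max(row) ** 2 / interval_count
    return total / len(quanta)


# Two intervals, two classes: perfectly separated vs. fully mixed.
print(caim([[10, 0], [0, 10]]))
print(caim([[5, 5], [5, 5]]))
```

Higher values indicate intervals that separate the classes more cleanly, so a discretizer would prefer the first scheme over the second.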

