A Conformity Measure Using Background Knowledge for Association Rules

Author(s):  
Hacène Cherfi ◽  
Amedeo Napoli ◽  
Yannick Toussaint

A text mining process using association rules generates a very large number of rules. According to domain experts, most of these rules convey common knowledge, that is, they associate terms that experts would likely relate to each other anyway. To focus the interpretation of results and discover new knowledge units, it is necessary to define criteria for classifying the extracted rules. Most rule classification methods are based on numerical quality measures. In this chapter, the authors introduce two classification methods: the first is based on a classical numerical approach, that is, quality measures, and the second is based on domain knowledge. This second, original approach classifies association rules according to qualitative criteria, using a domain model as background knowledge. The authors thereby extend the classical numerical approach in an effort to combine data mining and semantic techniques for the post-mining and selection of association rules. They mined a corpus of texts in molecular biology, present and compare the results of both approaches, and discuss the benefits of taking a knowledge domain model of the data into account.

2020 ◽  
Author(s):  
Harith Al-Sahaf ◽  
A Song ◽  
K Neshatian ◽  
Mengjie Zhang

Image classification is a complex but important task, especially in areas of machine vision and image analysis such as remote sensing and face recognition. One of the challenges in image classification is finding an optimal set of features for a particular task, because the choice of features has a direct impact on classification performance. However, the goodness of a feature is highly problem dependent, and domain knowledge is often required. To address these issues, we introduce a Genetic Programming (GP) based image classification method, Two-Tier GP, which operates directly on raw pixels rather than features. The first tier of a classifier automatically defines features from the raw image input, while the second tier makes the decision. Compared to conventional feature-based image classification methods, Two-Tier GP achieved better accuracies on a range of different tasks. Furthermore, by using the features defined by the first tier of these Two-Tier GP classifiers, conventional classification methods obtained higher accuracies than when classifying on manually designed features. Analysis of evolved Two-Tier image classifiers shows that genuine features are captured in the programs and that the mechanism for achieving high accuracy can be revealed. The Two-Tier GP method has clear advantages in image classification, such as high accuracy, good interpretability, and the removal of an explicit feature extraction process. © 2012 IEEE.


2019 ◽  
Vol 8 (2S11) ◽  
pp. 3448-3453

Classification is a data mining technique that assigns the items in a database to target classes. The aim of classification is to accurately find the target class for each instance of the data. Associative classification is a classification method that uses Class Association Rules for classification, and it is often found to be more accurate than some traditional classification methods. Its major disadvantage is the generation of redundant and weak class association rules. Weak class association rules increase the size and decrease the accuracy of the classifier. This paper proposes an efficient approach to building a compact and accurate classifier by using interestingness measures to prune rules. Interestingness measures play a vital role in reducing the size and increasing the accuracy of the classifier by pruning redundant or weak rules. Strong rules are retained and then used to build the classifier. The data used in this paper come from the University of California Irvine Machine Learning Repository. The results show that the proposed approach is effective and can produce a highly compact and accurate classifier.
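
The pruning idea described in this abstract can be illustrated with a minimal sketch. The abstract does not say which interestingness measures or thresholds the paper uses, so the choice of confidence and lift, the threshold values, the redundancy test, and the toy data below are all assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    antecedent: frozenset
    label: str
    support: float
    confidence: float
    lift: float


def score_rule(rows, antecedent, label):
    """Compute support, confidence and lift of the rule antecedent -> label."""
    n = len(rows)
    covered = [cls for items, cls in rows if antecedent <= items]
    hits = sum(1 for cls in covered if cls == label)
    label_freq = sum(1 for _, cls in rows if cls == label) / n
    support = hits / n
    confidence = hits / len(covered) if covered else 0.0
    lift = confidence / label_freq if label_freq else 0.0
    return Rule(antecedent, label, support, confidence, lift)


def prune(rules, min_conf=0.7, min_lift=1.0):
    """Drop weak rules, then drop rules made redundant by a more general rule.

    Thresholds are illustrative assumptions, not values from the paper.
    """
    strong = [r for r in rules if r.confidence >= min_conf and r.lift > min_lift]
    # Visit high-confidence, short rules first so general rules are kept first.
    strong.sort(key=lambda r: (-r.confidence, len(r.antecedent)))
    kept = []
    for r in strong:
        redundant = any(
            k.label == r.label
            and k.antecedent <= r.antecedent
            and k.confidence >= r.confidence
            for k in kept
        )
        if not redundant:
            kept.append(r)
    return kept


# Hypothetical toy data set: (attribute values, class label).
rows = [
    (frozenset({"sunny", "hot"}), "no"),
    (frozenset({"sunny", "mild"}), "no"),
    (frozenset({"rainy", "mild"}), "yes"),
    (frozenset({"rainy", "hot"}), "yes"),
    (frozenset({"sunny", "hot"}), "no"),
    (frozenset({"rainy", "mild"}), "yes"),
]

candidates = [
    score_rule(rows, frozenset({"sunny"}), "no"),
    score_rule(rows, frozenset({"sunny", "hot"}), "no"),
    score_rule(rows, frozenset({"hot"}), "no"),
    score_rule(rows, frozenset({"rainy"}), "yes"),
    score_rule(rows, frozenset({"mild"}), "yes"),
]

compact = prune(candidates)
for rule in compact:
    print(sorted(rule.antecedent), "->", rule.label, round(rule.confidence, 2))
```

In this toy run, the rules built on `hot` and `mild` fall below the confidence threshold, and the rule on `sunny` and `hot` together is dropped because the more general `sunny` rule already predicts the same class with equal confidence.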


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Ummul Hanan Mohamad ◽  
Mohammad Nazir Ahmad ◽  
Ahmad Mujahid Ubaidillah Zakaria

Purpose: This systematic literature review (SLR) presents an overview and analysis of existing ontology applications in the sharing economy (SE) domain. It discusses the main challenges in ontology development for this domain and highlights the key knowledge areas for SE subdomains, which provide a direction for developing SE ontology applications systematically. The SE is not as straightforward as the traditional economy. It transforms the existing economic ecosystem through peer-to-peer collaborations mediated by technology. Hence, the complexity of the SE domain accentuates the need to make SE domain knowledge more explicit.

Design/methodology/approach: For the review, the authors focus only on journal articles published from 2010 to 2020 that mention ontology as a solution to issues specific to the SE domain. The initial identification process produced 3,326 papers from 10 different databases.

Findings: After applying the inclusion and exclusion criteria, a final set of 11 articles was analyzed and classified. In SE, good ontology design and development is essential to manage digital platforms, deal with data heterogeneity and govern the interoperability of SE systems. Yet the preference for building application ontologies, the lack of perdurant design and the minimal use of existing standards for building SE common knowledge are deterring ontology development in this domain. From this review, an anatomy of the key SE subdomain areas is visualized as a reference for further systematic development of the domain ontology.

Originality/value: With the arrival of the Fourth Industrial Revolution (4IR), the sharing economy (SE) has become one of the important domains whose impact has been explosive, and its domain knowledge is complex. Yet a comprehensive overview and analysis of ontology applications in the SE domain is not available or well presented to the research community.


Author(s):  
Armand Armand ◽  
André Totohasina ◽  
Daniel Rajaonasy Feno

More than sixty interestingness measures have been proposed in the association rule mining literature since 1993. Given the importance of these measures, research on the normalization of probabilistic quality measures of association rules has already produced many tangible results that consolidate the various existing measures. This article recommends a simple way to perform this normalization. In the interest of a unified presentation, the article also offers a new concept of normalization function as an effective tool for solving the normalization problem for measures that already have their own normalization functions.


2020 ◽  
Vol 2020 ◽  
pp. 1-21
Author(s):  
Tian Wang ◽  
Ping Xi ◽  
Bifu Hu

Product modeling has been applied successfully in product engineering for geometric representation. In multidisciplinary analysis, however, application-driven models require specific knowledge and time-consuming adjustment work based on the geometric model. This paper proposes a novel modeling technology named computer-aided design-supporting-simulation (CADSS) to generate multiphysics domain models that support multidisciplinary design optimization processes. Multiphysics model representation was analyzed to verify the gaps among the parameters of different domain models. The multiphysics domain model architecture was then integrated from an optimization model, a design model, and a simulation model, taking the domain models' parameters into consideration. In addition, CADSS uses requirement space, domain knowledge, and software technology to describe the multidisciplinary model's parameters and their transition. Depending on the domain requirements, the CADSS system extracts the required knowledge by decomposing product functions and then embeds the domain knowledge into functional features using software technology. This research aims to complete the design cycle effectively and improve design quality by providing a consistent and concurrent modeling environment that generates an adaptable model for multiphysics simulation. The system is demonstrated by modeling a turbine blade design with multiphysics simulations including computational fluid dynamics (CFD), conjugate heat transfer (CHT), and finite element analysis (FEA). Moreover, the blade multiphysics simulation model is validated by the optimization design of the film hole. The results show that the high-fidelity multiphysics simulation model generated through CADSS can be adapted to subsequent simulations.


2020 ◽  
Vol 12 (11) ◽  
pp. 1775 ◽  
Author(s):  
Katarzyna Chrobak ◽  
Grzegorz Chrobak ◽  
Jan K. Kazak

The multitude of factors considered necessary for an informed choice of vineyard location can be overwhelming for the decision-maker. Is there still a place for the knowledge of an experienced winegrower in the era of precise measurements? The informative use of so-called common knowledge is possible owing to fuzzy-based techniques, which allow intuitive notions to be represented in terms of quantitative measures. This work uses tools based on fuzzy logic to cover the scope of common knowledge within the decision-making process. Owing to its flexibility and its ability to deal with imprecise input data while maintaining a simple construction, the fuzzy logic solution filled the gap between GIS data and the winegrower's experience. Based on data from the thematic literature, a set of rules was created to interpret the relationships between popular site selection criteria. The dynamics and manner of interaction between variables were determined using adequate membership functions. Pre-processing using GIS with remote sensing data was considered as a preliminary stage for the analysis. A graphical interface facilitates the work of a potential user of the system. The results indicate that classical analyses can be complemented by an alternative approach that replaces or extends the meaning of some variables using information based on feelings and perceptions. This research provides a basis for the further development of expert systems using broadly understood domain knowledge.
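
To make the role of membership functions and rules concrete, here is a minimal sketch of fuzzy rule evaluation. It assumes triangular membership functions, `min` as the fuzzy AND, and a weighted-average (zero-order Sugeno-style) output. The variables `slope_deg` and `sun_hours`, every breakpoint, and the rule scores are hypothetical values chosen for illustration and are not taken from the paper.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at or outside a and c, rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def suitability(slope_deg, sun_hours):
    """Combine hypothetical rules into a site suitability score in [0, 1].

    Returns None when no rule fires for the given inputs.
    """
    # Fuzzification: degrees of membership in intuitive categories (assumed shapes).
    gentle = triangular(slope_deg, 0, 8, 20)
    steep = triangular(slope_deg, 10, 30, 50)
    sunny = triangular(sun_hours, 1200, 2000, 2800)
    shaded = triangular(sun_hours, 0, 800, 1600)

    # Each rule is (firing strength, score assigned by the rule).
    rules = [
        (min(gentle, sunny), 1.0),  # gentle slope AND sunny -> very suitable
        (min(steep, sunny), 0.6),   # steep slope AND sunny -> moderately suitable
        (shaded, 0.1),              # shaded -> poorly suitable
    ]

    total = sum(strength for strength, _ in rules)
    if total == 0:
        return None
    # Weighted average of rule scores, weighted by firing strength.
    return sum(strength * score for strength, score in rules) / total


print(suitability(8, 2000))
print(suitability(14, 2000))
```

A site whose slope is partly "gentle" and partly "steep" fires two rules at once, and the score blends them in proportion to the firing strengths, which is how imprecise expert notions end up as a single quantitative measure.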


Dela ◽  
2021 ◽  
pp. 149-167
Author(s):  
Špela Vintar ◽  
Uroš Stepišnik

We describe a systematic and data-driven approach to karst terminology where knowledge from different textual sources is structured into a comprehensive multilingual knowledge representation. The approach is based on a domain model which is constructed in line with the frame-based approach to terminology and the analytical geomorphological method of describing karst phenomena. The domain model serves as a basis for annotating definitions and aggregating the information obtained from different definitions into a knowledge network. We provide examples of visual knowledge representations and demonstrate the advantages of a systematic and interdisciplinary approach to domain knowledge.

