The role of semantics in mining frequent patterns from knowledge bases in description logics with rules

We propose a new method for mining frequent patterns in a language that combines Semantic Web ontologies and rules. In particular, we consider a language that combines description logics (DLs) with DL-safe rules. This setting is important for the practical application of data mining to the Semantic Web. We focus on how the semantics of the representation formalism relates to the task of frequent pattern discovery, and at the core of our method we propose an algorithm that exploits the semantics of the combined knowledge base. We have developed a proof-of-concept data mining implementation of this algorithm. Using it, we have shown empirically that semantic tests against the combined knowledge base can make data mining faster by pruning useless candidate patterns before their evaluation. We have also shown that the quality of the resulting pattern set may improve: the patterns are more compact, and there are fewer of them. We conclude that exploiting the semantics of a chosen representation formalism is key to the design and application of (onto-)relational frequent pattern discovery methods.
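The idea of pruning candidate patterns with semantic tests before evaluating them can be illustrated with a minimal sketch. This is not the algorithm from the paper: the toy ontology, the class names, and the two tests (redundancy through subsumption, unsatisfiability through disjointness) are assumptions chosen only to show the general principle, and a real system would delegate these checks to a DL reasoner.

```python
from itertools import combinations

# Hypothetical toy ontology (not from the paper): subclass and disjointness axioms.
SUBCLASS_OF = {"Student": {"Person"}, "Professor": {"Person"}}
DISJOINT = {frozenset({"Student", "Professor"})}


def superclasses(cls):
    """Return all strict superclasses of cls by following SUBCLASS_OF transitively."""
    seen, stack = set(), [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_redundant(pattern):
    """A pattern is redundant if one of its classes is implied by another one."""
    return any(c in superclasses(d) for c in pattern for d in pattern if c != d)


def is_unsatisfiable(pattern):
    """A pattern is unsatisfiable if it contains two classes declared disjoint,
    directly or through their superclasses."""
    expanded = set(pattern)
    for c in pattern:
        expanded |= superclasses(c)
    return any(frozenset(pair) in DISJOINT for pair in combinations(expanded, 2))


def prune(candidates):
    """Keep only candidates that pass both semantic tests."""
    return [p for p in candidates if not is_redundant(p) and not is_unsatisfiable(p)]


candidates = [
    frozenset({"Person"}),
    frozenset({"Student"}),
    frozenset({"Student", "Person"}),      # redundant: Student implies Person
    frozenset({"Student", "Professor"}),   # unsatisfiable: declared disjoint
]
kept = prune(candidates)
```

In this sketch only the first two candidates survive, so the support of the other two never has to be counted against the data. That mirrors the two effects reported in the abstract: less evaluation work and a smaller, more compact result set.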


Building an OWL ontology with Xper3

Biodiversity Information Science and Standards ◽

10.3897/biss.2.25614 ◽

2018 ◽

Vol 2 ◽

pp. e25614 ◽

Cited By ~ 1

Author(s):

Florian Pellen ◽

Sylvain Bouquin ◽

Isabelle Mougenot ◽

Régine Vignes-Lebbe

Keyword(s):

Semantic Web ◽

Knowledge Base ◽

Expressive Power ◽

Knowledge Bases ◽

User Friendliness ◽

Standard Format ◽

Ontology Language ◽

Closed World ◽

Web Standards ◽

Description Framework

Xper3 (Vignes Lebbe et al. 2016) is a collaborative knowledge base publishing platform that, since its launch in November 2013, has been adopted by over 2,000 users (Pinel et al. 2017). This is mainly due to its user-friendly interface and the simplicity of its data model. The data are stored in MySQL relational databases, but the exchange format is the TDWG standard format SDD (Structured Descriptive Data; Hagedorn et al. 2005). However, each Xper3 knowledge base is a closed world that the author(s) may or may not share with the scientific community or the public by publishing content and/or identification keys (Kopfstein 2016). The explicit taxonomic, geographic and phenotypic limits of a knowledge base are not always well defined in the metadata fields. Conversely, terminology vocabularies, such as the Phenotype and Trait Ontology (PATO) and the Plant Ontology (PO), and software to edit them, such as Protégé and Phenoscape, are essential to the Semantic Web but difficult to handle for biologists without computer skills. These ontologies constitute open worlds and are themselves expressed as RDF (Resource Description Framework) triples. Protégé offers visualization and reasoning capabilities for these ontologies (Gennari et al. 2003, Musen 2015). Our challenge is to combine the user friendliness of Xper3 with the expressive power of OWL (Web Ontology Language), the W3C standard for building ontologies. We therefore focused on analyzing the representation of the same taxonomic contents under Xper3 and under different models in OWL. After this critical analysis, we chose a description model that allows automatic export of SDD to OWL and can be easily enriched. We will present the results obtained and their validation on two knowledge bases, one on parasitic crustaceans (Sacculina) and the second on current ferns and fossils (Corvez and Grand 2014). The evolution of the Xper3 platform and the perspectives offered by this link with Semantic Web standards will be discussed.


Data mining for building knowledge bases: techniques, architectures and applications

The Knowledge Engineering Review ◽

10.1017/s0269888916000047 ◽

2016 ◽

Vol 31 (2) ◽

pp. 97-123 ◽

Cited By ~ 4

Author(s):

Alfred Krzywicki ◽

Wayne Wobcke ◽

Michael Bain ◽

John Calvo Martinez ◽

Paul Compton

Keyword(s):

Data Mining ◽

Knowledge Base ◽

Question Answering ◽

Knowledge Bases ◽

Event Extraction ◽

Data Sources ◽

Small Scale ◽

Knowledge Mining ◽

Practical Applications ◽

Unstructured Text

Data mining techniques for extracting knowledge from text have been applied extensively to applications including question answering, document summarisation, event extraction and trend monitoring. However, current methods have mainly been tested on small-scale customised data sets for specific purposes. The availability of large volumes of data and high-velocity data streams (such as social media feeds) motivates the need to automatically extract knowledge from such data sources and to generalise existing approaches to more practical applications. Recently, several architectures have been proposed for what we call knowledge mining: integrating data mining for knowledge extraction from unstructured text (possibly making use of a knowledge base), and at the same time, consistently incorporating this new information into the knowledge base. After describing a number of existing knowledge mining systems, we review the state-of-the-art literature on both current text mining methods (emphasising stream mining) and techniques for the construction and maintenance of knowledge bases. In particular, we focus on mining entities and relations from unstructured text data sources, entity disambiguation, entity linking and question answering. We conclude by highlighting general trends in knowledge mining research and identifying problems that require further research to enable more extensive use of knowledge bases.


Web knowledge

Agency and the Semantic Web ◽

10.1093/oso/9780199292486.003.0008 ◽

2006 ◽

Author(s):

Christopher Walton

Keyword(s):

Semantic Web ◽

Knowledge Representation ◽

Knowledge Base ◽

Web Applications ◽

Semantic Network ◽

Knowledge Bases ◽

Web Standards ◽

Description Framework ◽

The Web

In the introductory chapter of this book, we discussed the means by which knowledge can be made available on the Web, that is, how knowledge can be represented in a form that a computer can process automatically. To recap, we identified two essential steps that were deemed necessary to achieve this task: 1. We discussed the need to agree on a suitable structure for the knowledge that we wish to represent. This is achieved through the construction of a semantic network, which defines the main concepts of the knowledge, and the relationships between these concepts. We presented an example network that contained the main concepts to differentiate between kinds of cameras. Our network is a conceptualization, or an abstract view of a small part of the world. A conceptualization is defined formally in an ontology, which is in essence a vocabulary for knowledge representation. 2. We discussed the construction of a knowledge base, which is a store of knowledge about a domain in machine-processable form; essentially a database of knowledge. A knowledge base is constructed through the classification of a body of information according to an ontology. The result will be a store of facts and rules that describe the domain. Our example described the classification of different camera features to form a knowledge base. The knowledge base is expressed formally in the language of the ontology over which it is defined. In this chapter we elaborate on these two steps to show how we can define ontologies and knowledge bases specifically for the Web. This will enable us to construct Semantic Web applications that make use of this knowledge. The chapter is devoted to a detailed explanation of the syntax and pragmatics of the RDF, RDFS, and OWL Semantic Web standards. The Resource Description Framework (RDF) is an established standard for knowledge representation on the Web. Together with the associated RDF Schema (RDFS) standard, it provides a language for representing simple ontologies and knowledge bases on the Web.
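As a rough illustration of how an ontology and a knowledge base fit together in RDF and RDFS, the sketch below stores facts as (subject, predicate, object) triples and derives new facts from subclass statements. The camera classes and the `ex:` prefix are hypothetical, plain Python tuples stand in for a real RDF library, and only two of the RDFS entailment rules are applied (subclass transitivity and type propagation).

```python
# Hypothetical camera facts, written as (subject, predicate, object) triples.
triples = {
    ("ex:DigitalSLR", "rdfs:subClassOf", "ex:DigitalCamera"),
    ("ex:DigitalCamera", "rdfs:subClassOf", "ex:Camera"),
    ("ex:myCamera", "rdf:type", "ex:DigitalSLR"),
}


def rdfs_closure(triples):
    """Apply two RDFS entailment rules until no new triple is produced:
    subclass transitivity, and propagation of rdf:type along rdfs:subClassOf."""
    closed = set(triples)
    while True:
        new = set()
        for s, p, o in closed:
            if p not in ("rdfs:subClassOf", "rdf:type"):
                continue
            for s2, p2, o2 in closed:
                # (s p o) and (o subClassOf o2) entail (s p o2) for both predicates.
                if p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, p, o2))
        if new <= closed:
            return closed
        closed |= new


closure = rdfs_closure(triples)
```

Here the subclass statements play the role of the ontology and the `rdf:type` statement plays the role of the knowledge base. After the closure is computed, `ex:myCamera` is also known to be an `ex:DigitalCamera` and an `ex:Camera`, although neither fact was stated explicitly.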


Frequent Pattern Discovery from OWL DLP Knowledge Bases

Managing Knowledge in a World of Networks - Lecture Notes in Computer Science ◽

10.1007/11891451_26 ◽

2006 ◽

pp. 287-302 ◽

Cited By ~ 4

Author(s):

Joanna Józefowska ◽

Agnieszka Ławrynowicz ◽

Tomasz Łukaszewski

Keyword(s):

Pattern Discovery ◽

Knowledge Bases ◽

Frequent Pattern


Ontologies in the Health Field

Data Mining and Medical Knowledge Management ◽

10.4018/978-1-60566-218-3.ch002 ◽

2011 ◽

pp. 37-56 ◽

Cited By ~ 1

Author(s):

Michel Simonet ◽

Radja Messai ◽

Gayo Diallo

Keyword(s):

Data Mining ◽

Information Retrieval ◽

Semantic Web ◽

Knowledge Bases ◽

Health Data ◽

Efficient Manner ◽

Common Definition ◽

Ontology Design ◽

Definition Of ◽

Health Field

Health data and knowledge had been structured through medical classifications and taxonomies long before ontologies acquired their pivotal status in the Semantic Web. Although there is no consensus on a common definition of an ontology, it is necessary to understand the main features of ontologies to be able to use them in a pertinent and efficient manner for data mining purposes. This chapter introduces the basic notions about ontologies, presents a survey of their use in medicine and explores some related issues: knowledge bases, terminology, and information retrieval. It also addresses the issues of ontology design, ontology representation, and the possible interaction between data mining and ontologies.


BIG DATA MINING FOR INTERESTING PATTERNS WITH MAP REDUCE TECHNIQUE

Asian Journal of Pharmaceutical and Clinical Research ◽

10.22159/ajpcr.2017.v10s1.19634 ◽

2017 ◽

Vol 10 (13) ◽

pp. 191

Author(s):

Nikhil Jamdar ◽

A Vijayalakshmi

Keyword(s):

Data Mining ◽

Pattern Mining ◽

Uncertain Data ◽

Frequent Pattern Mining ◽

Frequent Pattern ◽

Map Reduce ◽

Frequent Patterns ◽

Precise Data ◽

Big Data Mining ◽

Transactional Databases

Many data mining algorithms search for interesting patterns in transactional databases of precise data. Frequent pattern mining is a technique for finding frequently occurring items. Most techniques find all interesting patterns in a collection of precise data, where the items occurring in each transaction are known with certainty to the system. In many real-time applications, however, users are interested in only a tiny portion of the large set of frequent patterns. The proposed user-constrained mining approach helps find the frequent patterns in which the user is interested. It efficiently finds these patterns by applying user constraints to collections of uncertain data. Users specify their interests in the form of constraints, and the approach uses the MapReduce model to find uncertain frequent patterns that satisfy the user-specified constraints.
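A small sketch can make the combination of user constraints, uncertain data, and a map/reduce split more concrete. It is not the authors' implementation: the transactions, the constraint, the independence assumption between items, and the use of expected support as the frequency measure are all assumptions made for illustration, and the two phases run in a single process rather than on a cluster.

```python
from collections import defaultdict
from itertools import combinations
from math import prod

# Hypothetical uncertain transactions: each item carries an existence probability.
transactions = [
    {"bread": 0.9, "milk": 0.8, "beer": 0.5},
    {"bread": 0.6, "milk": 1.0},
    {"milk": 0.7, "beer": 0.4},
]


def satisfies_constraint(itemset):
    """Hypothetical user constraint: only patterns that contain 'milk'."""
    return "milk" in itemset


def map_phase(transaction, max_size=2):
    """Emit (itemset, probability) pairs for constrained itemsets in one transaction.
    Assuming independent items, an itemset's probability is the product of its items'."""
    for size in range(1, max_size + 1):
        for itemset in combinations(sorted(transaction), size):
            if satisfies_constraint(itemset):
                yield itemset, prod(transaction[i] for i in itemset)


def reduce_phase(pairs):
    """Sum the emitted probabilities per itemset to obtain its expected support."""
    support = defaultdict(float)
    for itemset, p in pairs:
        support[itemset] += p
    return dict(support)


def mine(transactions, min_expected_support):
    """Run both phases and keep itemsets whose expected support meets the threshold."""
    pairs = (pair for t in transactions for pair in map_phase(t))
    support = reduce_phase(pairs)
    return {k: v for k, v in support.items() if v >= min_expected_support}


frequent = mine(transactions, min_expected_support=1.0)
```

Applying the constraint inside the map phase means that itemsets the user does not care about are never emitted, which is where the saving comes from in this sketch. With the data above, `("milk",)` and `("bread", "milk")` meet the threshold, while `("beer", "milk")` does not.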


A Framework for Reasoning about Dynamic Axioms in Description Logics

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/233 ◽

2020 ◽

Author(s):

Bartosz Bednarczyk ◽

Stephane Demri ◽

Alessio Mansutti

Keyword(s):

Knowledge Representation ◽

Knowledge Base ◽

Data Structures ◽

Description Logics ◽

Knowledge Bases ◽

Dynamic Data Structures ◽

Dynamic Data ◽

Logical Connectives ◽

Consistency Problem ◽

Natural Means

Description logics are well-known logical formalisms for knowledge representation. We propose to enrich knowledge bases (KBs) with dynamic axioms that specify how the satisfaction of statements from the KBs evolves when the interpretation is decomposed or recomposed, providing a natural means to predict the evolution of interpretations. Our dynamic axioms borrow logical connectives from separation logics, well-known specification languages for verifying programs with dynamic data structures. In the paper, we focus on ALC and EL augmented with dynamic axioms, or with their subclass of positive dynamic axioms. We investigate the knowledge base consistency problem in the presence of dynamic axioms, which leads to interesting complexity results: among them, the problem is tractable for EL with positive dynamic axioms, whereas it is undecidable for EL with dynamic axioms.


Frequent Pattern Discovery and Association Rule Mining of XML Data

Data Mining ◽

10.4018/978-1-4666-2455-9.ch044 ◽

2013 ◽

pp. 859-879

Author(s):

Qin Ding ◽

Gnanasekaran Sundarraj

Keyword(s):

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Pattern Discovery ◽

Frequent Pattern ◽

Future Research ◽

Frequent Patterns ◽

Rule Mining ◽

Xml Data ◽

Art Research

Finding frequent patterns and association rules in large data sets has become a very important task in data mining. Various algorithms have been proposed to solve such problems, but most are applicable only to relational data. With the increasing use and popularity of XML representation, it is important, yet challenging, to find solutions to frequent pattern discovery and association rule mining of XML data. The challenge comes from the complexity of the structure in XML data. In this chapter, we provide an overview of the state-of-the-art research in content-based and structure-based mining of frequent patterns and association rules from XML data. We also discuss the challenges and issues, and offer our insights into solutions and future research directions.


Preference-Based Frequent Pattern Mining

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch073 ◽

2008 ◽

pp. 1280-1299

Author(s):

Moonjung Cho ◽

Jian Pei ◽

Haixun Wang ◽

Wei Wang

Keyword(s):

Data Mining ◽

General Framework ◽

Pattern Mining ◽

Frequent Pattern Mining ◽

Frequent Pattern ◽

Frequent Patterns ◽

Performance Study ◽

Important Data ◽

Mining Algorithms ◽

Extensive Performance

Frequent pattern mining is an important data-mining problem with broad applications. Although there are many in-depth studies on efficient frequent pattern mining algorithms and constraint pushing techniques, the effectiveness of frequent pattern mining remains a serious concern: it is non-trivial and often tricky to specify appropriate support thresholds and proper constraints. In this paper, we propose a novel theme of preference-based frequent pattern mining. A user can simply specify a preference instead of setting detailed parameters in constraints. We identify the problem of preference-based frequent pattern mining and formulate the preferences for mining. We develop an efficient framework to mine frequent patterns with preferences. Interestingly, many preferences can be pushed deep into the mining by properly employing the existing efficient frequent pattern mining techniques. We conduct an extensive performance study to examine our method. The results indicate that preference-based frequent pattern mining is effective and efficient. Furthermore, we extend our discussion from pattern-based frequent pattern mining to preference-based data mining in principle and draw a general framework.
