scholarly journals The role of semantics in mining frequent patterns from knowledge bases in description logics with rules

2010 ◽  
Vol 10 (3) ◽  
pp. 251-289 ◽  
Author(s):  
JOANNA JÓZEFOWSKA ◽  
AGNIESZKA ŁAWRYNOWICZ ◽  
TOMASZ ŁUKASZEWSKI

AbstractWe propose a new method for mining frequent patterns in a language that combines both Semantic Web ontologies and rules. In particular, we consider the setting of using a language that combines description logics (DLs) with DL-safe rules. This setting is important for the practical application of data mining to the Semantic Web. We focus on the relation of the semantics of the representation formalism to the task of frequent pattern discovery, and for the core of our method, we propose an algorithm that exploits the semantics of the combined knowledge base. We have developed a proof-of-concept data mining implementation of this. Using this we have empirically shown that using the combined knowledge base to perform semantic tests can make data mining faster by pruning useless candidate patterns before their evaluation. We have also shown that the quality of the set of patterns produced may be improved: the patterns are more compact, and there are fewer patterns. We conclude that exploiting the semantics of a chosen representation formalism is key to the design and application of (onto-)relational frequent pattern discovery methods.

2018 ◽  
Vol 2 ◽  
pp. e25614 ◽  
Author(s):  
Florian Pellen ◽  
Sylvain Bouquin ◽  
Isabelle Mougenot ◽  
Régine Vignes-Lebbe

Xper3 (Vignes Lebbe et al. 2016) is a collaborative knowledge base publishing platform that, since its launch in november 2013, has been adopted by over 2 thousand users (Pinel et al. 2017). This is mainly due to its user friendly interface and the simplicity of its data model. The data are stored in MySQL Relational DBs, but the exchange format uses the TDWG standard format SDD (Structured Descriptive DataHagedorn et al. 2005). However, each Xper3 knowledge base is a closed world that the author(s) may or may not share with the scientific community or the public via publishing content and/or identification key (Kopfstein 2016). The explicit taxonomic, geographic and phenotypic limits of a knowledge base are not always well defined in the metadata fields. Conversely terminology vocabularies, such as Phenotype and Trait Ontology PATO and the Plant Ontology PO, and software to edit them, such as Protégé and Phenoscape, are essential in the semantic web, but difficult to handle for biologist without computer skills. These ontologies constitute open worlds, and are expressed themselves by RDF triples (Resource Description Framework). Protégé offers vizualisation and reasoning capabilities for these ontologies (Gennari et al. 2003, Musen 2015). Our challenge is to combine the user friendliness of Xper3 with the expressive power of OWL (Web Ontology Language), the W3C standard for building ontologies. We therefore focused on analyzing the representation of the same taxonomic contents under Xper3 and under different models in OWL. After this critical analysis, we chose a description model that allows automatic export of SDD to OWL and can be easily enriched. We will present the results obtained and their validation on two knowledge bases, one on parasitic crustaceans (Sacculina) and the second on current ferns and fossils (Corvez and Grand 2014). The evolution of the Xper3 platform and the perspectives offered by this link with semantic web standards will be discussed.


2016 ◽  
Vol 31 (2) ◽  
pp. 97-123 ◽  
Author(s):  
Alfred Krzywicki ◽  
Wayne Wobcke ◽  
Michael Bain ◽  
John Calvo Martinez ◽  
Paul Compton

AbstractData mining techniques for extracting knowledge from text have been applied extensively to applications including question answering, document summarisation, event extraction and trend monitoring. However, current methods have mainly been tested on small-scale customised data sets for specific purposes. The availability of large volumes of data and high-velocity data streams (such as social media feeds) motivates the need to automatically extract knowledge from such data sources and to generalise existing approaches to more practical applications. Recently, several architectures have been proposed for what we callknowledge mining: integrating data mining for knowledge extraction from unstructured text (possibly making use of a knowledge base), and at the same time, consistently incorporating this new information into the knowledge base. After describing a number of existing knowledge mining systems, we review the state-of-the-art literature on both current text mining methods (emphasising stream mining) and techniques for the construction and maintenance of knowledge bases. In particular, we focus on mining entities and relations from unstructured text data sources, entity disambiguation, entity linking and question answering. We conclude by highlighting general trends in knowledge mining research and identifying problems that require further research to enable more extensive use of knowledge bases.


Author(s):  
Christopher Walton

In the introductory chapter of this book, we discussed the means by which knowledge can be made available on the Web. That is, the representation of the knowledge in a form by which it can be automatically processed by a computer. To recap, we identified two essential steps that were deemed necessary to achieve this task: 1. We discussed the need to agree on a suitable structure for the knowledge that we wish to represent. This is achieved through the construction of a semantic network, which defines the main concepts of the knowledge, and the relationships between these concepts. We presented an example network that contained the main concepts to differentiate between kinds of cameras. Our network is a conceptualization, or an abstract view of a small part of the world. A conceptualization is defined formally in an ontology, which is in essence a vocabulary for knowledge representation. 2. We discussed the construction of a knowledge base, which is a store of knowledge about a domain in machine-processable form; essentially a database of knowledge. A knowledge base is constructed through the classification of a body of information according to an ontology. The result will be a store of facts and rules that describe the domain. Our example described the classification of different camera features to form a knowledge base. The knowledge base is expressed formally in the language of the ontology over which it is defined. In this chapter we elaborate on these two steps to show how we can define ontologies and knowledge bases specifically for the Web. This will enable us to construct Semantic Web applications that make use of this knowledge. The chapter is devoted to a detailed explanation of the syntax and pragmatics of the RDF, RDFS, and OWL Semantic Web standards. The resource description framework (RDF) is an established standard for knowledge representation on the Web. Taken together with the associated RDF Schema (RDFS) standard, we have a language for representing simple ontologies and knowledge bases on the Web.


Author(s):  
Joanna Józefowska ◽  
Agnieszka Ławrynowicz ◽  
Tomasz Łukaszewski

Author(s):  
Michel Simonet ◽  
Radja Messai ◽  
Gayo Diallo

Health data and knowledge had been structured through medical classifications and taxonomies long before ontologies had acquired their pivot status of the Semantic Web. Although there is no consensus on a common definition of an ontology, it is necessary to understand their main features to be able to use them in a pertinent and efficient manner for data mining purposes. This chapter introduces the basic notions about ontologies, presents a survey of their use in medicine and explores some related issues: knowledge bases, terminology, and information retrieval. It also addresses the issues of ontology design, ontology representation, and the possible interaction between data mining and ontologies.


2017 ◽  
Vol 10 (13) ◽  
pp. 191
Author(s):  
Nikhil Jamdar ◽  
A Vijayalakshmi

There are many algorithms available in data mining to search interesting patterns from transactional databases of precise data. Frequent pattern mining is a technique to find the frequently occurred items in data mining. Most of the techniques used to find all the interesting patterns from a collection of precise data, where items occurred in each transaction are certainly known to the system. As well as in many real-time applications, users are interested in a tiny portion of large frequent patterns. So the proposed user constrained mining approach, will help to find frequent patterns in which user is interested. This approach will efficiently find user interested frequent patterns by applying user constraints on the collections of uncertain data. The user can specify their own interest in the form of constraints and uses the Map Reduce model to find uncertain frequent pattern that satisfy the user-specified constraints 


Author(s):  
Bartosz Bednarczyk ◽  
Stephane Demri ◽  
Alessio Mansutti

Description logics are well-known logical formalisms for knowledge representation. We propose to enrich knowledge bases (KBs) with dynamic axioms that specify how the satisfaction of statements from the KBs evolves when the interpretation is decomposed or recomposed, providing a natural means to predict the evolution of interpretations. Our dynamic axioms borrow logical connectives from separation logics, well-known specification languages to verify programs with dynamic data structures. In the paper, we focus on ALC and EL augmented with dynamic axioms, or to their subclass of positive dynamic axioms. The knowledge base consistency problem in the presence of dynamic axioms is investigated, leading to interesting complexity results, among which the problem for EL with positive dynamic axioms is tractable, whereas EL with dynamic axioms is undecidable.


Data Mining ◽  
2013 ◽  
pp. 859-879
Author(s):  
Qin Ding ◽  
Gnanasekaran Sundarraj

Finding frequent patterns and association rules in large data has become a very important task in data mining. Various algorithms have been proposed to solve such problems, but most algorithms are only applicable to relational data. With the increasing use and popularity of XML representation, it is of importance yet challenging to find solutions to frequent pattern discovery and association rule mining of XML data. The challenge comes from the complexity of the structure in XML data. In this chapter, we provide an overview of the state-of-the-art research in content-based and structure-based mining of frequent patterns and association rules from XML data. We also discuss the challenges and issues, and provide our insight for solutions and future research directions.


2008 ◽  
pp. 1280-1299
Author(s):  
Moonjung Cho ◽  
Jian Pei ◽  
Haixun Wang ◽  
Wei Wang

Frequent pattern mining is an important data-mining problem with broad applications. Although there are many in-depth studies on efficient frequent pattern mining algorithms and constraint pushing techniques, the effectiveness of frequent pattern mining remains a serious concern: It is non-trivial and often tricky to specify appropriate support thresholds and proper constraints. In this paper, we propose a novel theme of preference-based frequent pattern mining. A user simply can specify a preference instead of setting detailed parameters in constraints. We identify the problem of preference-based frequent pattern mining and formulate the preferences for mining. We develop an efficient framework to mine frequent patterns with preferences. Interestingly, many preferences can be pushed deep into the mining by properly employing the existing efficient frequent pattern mining techniques. We conduct an extensive performance study to examine our method. The results indicate that preference-based frequent pattern mining is effective and efficient. Furthermore, we extend our discussion from pattern-based frequent pattern mining to preference-based data mining in principle and draw a general framework.


Author(s):  
Qin Ding ◽  
Gnanasekaran Sundarraj

Finding frequent patterns and association rules in large data has become a very important task in data mining. Various algorithms have been proposed to solve such problems, but most algorithms are only applicable to relational data. With the increasing use and popularity of XML representation, it is of importance yet challenging to find solutions to frequent pattern discovery and association rule mining of XML data. The challenge comes from the complexity of the structure in XML data. In this chapter, we provide an overview of the state-of-the-art research in content-based and structure-based mining of frequent patterns and association rules from XML data. We also discuss the challenges and issues, and provide our insight for solutions and future research directions.


Sign in / Sign up

Export Citation Format

Share Document