Semi-Automatic Ontology Development
Published by IGI Global (ISBN 9781466601888, 9781466601895)
Total documents: 10 (five years: 0) · H-index: 2 (five years: 0)

Latest Publications

Author(s): Mariana Damova, Atanas Kiryakov, Maurice Grinberg, Michael K. Bergman, Frédérick Giasson, ...

The chapter introduces the process of developing two upper-level ontologies, PROTON and UMBEL, into reference ontologies and integrating them in the so-called Reference Knowledge Stack (RKS). It is argued that the RKS is an important step in the effort of the Linked Open Data (LOD) project to transform the Web into a global data space with diverse real data available for review and analysis. The RKS is intended to make interoperability between published datasets far more efficient than it is now. The approach discussed in the chapter consists of developing reference layers of upper-level ontologies by mapping them to selected LOD schemata and assigning instance data to them, so that they cover a reasonable portion of the LOD datasets. The chapter presents the methods (manual and semi-automatic) used in the creation of the RKS and gives examples that illustrate its advantages for managing highly heterogeneous data and its usefulness in real-life, knowledge-intensive applications.
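As a rough sketch of what a reference-layer mapping looks like in practice, the snippet below uses Python's rdflib to subsume a LOD schema class under upper-level reference classes; the PROTON, UMBEL, and DBpedia IRIs are illustrative assumptions, not the actual RKS mapping files.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDFS

# Illustrative namespaces; the concrete RKS mapping files are far larger.
PROTON = Namespace("http://proton.semanticweb.org/protontop#")
UMBEL = Namespace("http://umbel.org/umbel/rc/")
DBO = Namespace("http://dbpedia.org/ontology/")

g = Graph()
g.bind("proton", PROTON)
g.bind("umbel", UMBEL)
g.bind("dbo", DBO)

# Mapping a LOD schema class under the upper-level reference classes means
# instance data from the mapped dataset becomes queryable through a single
# reference vocabulary.
g.add((DBO.Person, RDFS.subClassOf, PROTON.Person))
g.add((DBO.Person, OWL.equivalentClass, UMBEL.Person))

print(g.serialize(format="turtle"))
```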


Author(s): Andreea Diosteanu, Armando Stellato, Andrea Turbati

In this chapter, the authors present Service Oriented Data Acquisition (SODA), a service-deployable open-source platform for retrieving and dynamically aggregating information extraction and knowledge acquisition software components. The motivation for creating such a system came from the observed gap between the wide availability of information analysis components for different frameworks (such as UIMA [Ferrucci & Lally, 2004] and GATE [Cunningham, Maynard, Bontcheva, & Tablan, 2002]) and the difficulty of discovering, retrieving, and integrating these components and embedding them into software systems for knowledge feeding. Surveying the research area, the authors found only a few solutions to this problem, all of which fall short in platform independence, collaboration, flexibility, and, most of all, openness. The solution they propose targets different kinds of users, from application developers, who benefit from a semantic repository of interconnectable information extraction and ontology feeding components, to end users, who can plug and play these components through SODA-compliant clients.
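The chapter describes SODA's architecture rather than its code; the toy registry below is a hypothetical illustration of the discover-and-aggregate idea only, with all class and method names invented for the sketch.

```python
from dataclasses import dataclass, field

@dataclass
class ComponentDescriptor:
    """Metadata describing a deployable information extraction component."""
    name: str
    framework: str                              # e.g. "UIMA" or "GATE"
    inputs: set = field(default_factory=set)    # annotation types consumed
    outputs: set = field(default_factory=set)   # annotation types produced

class Registry:
    """Toy semantic repository: publish, discover, and chain components."""

    def __init__(self):
        self._components = []

    def publish(self, desc):
        self._components.append(desc)

    def discover(self, needed_output):
        return [c for c in self._components if needed_output in c.outputs]

    def chain(self, start_input, goal_output):
        """Greedily aggregate components until the goal type is produced."""
        available, pipeline = {start_input}, []
        progress = True
        while goal_output not in available and progress:
            progress = False
            for c in self._components:
                if c not in pipeline and c.inputs <= available:
                    pipeline.append(c)
                    available |= c.outputs
                    progress = True
        return pipeline if goal_output in available else None

registry = Registry()
registry.publish(ComponentDescriptor("tokenizer", "GATE",
                                     {"text"}, {"tokens"}))
registry.publish(ComponentDescriptor("entity-tagger", "UIMA",
                                     {"tokens"}, {"entities"}))
print([c.name for c in registry.chain("text", "entities")])
# -> ['tokenizer', 'entity-tagger']
```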


Author(s): Christian M. Meyer, Iryna Gurevych

To construct their ontology OntoWiktionary, the authors present a two-step approach that involves (1) harvesting structured knowledge from Wiktionary and (2) ontologizing this knowledge, i.e., forming ontological concepts and relationships from the harvested entries. They evaluate their approach with human judgments and find the new ontology to be of good overall quality. To encourage further research in this field, the authors make the final OntoWiktionary publicly available and suggest integrating this novel resource with the Linked Data cloud as well as other existing ontology projects.
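As a hedged illustration of step (2), the toy function below merges word senses connected by synonym links into ontological concepts; the sense IDs, glosses, and data layout are invented and do not reflect the authors' actual pipeline.

```python
from itertools import count

# Invented sense IDs and glosses standing in for harvested Wiktionary data.
senses = {
    "boat/1": {"gloss": "a small vessel for travel on water",
               "synonyms": {"watercraft/1"}},
    "watercraft/1": {"gloss": "a vehicle that travels on water",
                     "synonyms": {"boat/1"}},
    "bank/1": {"gloss": "an institution that handles money",
               "synonyms": set()},
}

def ontologize(senses):
    """Merge senses connected by synonymy into one ontological concept."""
    concept_of, concepts, fresh = {}, {}, count(1)
    for sid in senses:
        if sid in concept_of:
            continue
        cid = "concept:%d" % next(fresh)
        stack = [sid]
        while stack:                    # flood-fill the synonym graph
            s = stack.pop()
            if s in concept_of:
                continue
            concept_of[s] = cid
            concepts.setdefault(cid, []).append(s)
            stack.extend(senses.get(s, {"synonyms": set()})["synonyms"])
    return concepts

print(ontologize(senses))
# -> {'concept:1': ['boat/1', 'watercraft/1'], 'concept:2': ['bank/1']}
```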


Author(s): Feten Baccar Ben Amar, Bilel Gargouri, Abdelmajid Ben Hamadou

In this chapter, the authors propose an approach for generating domain ontologies from LMF-standardized dictionaries (ISO 24613). First, it systematically derives the core of the target ontology from the explicit information in the LMF dictionary structure. Second, it enriches this core by exploiting textual sources with guided semantic fields available in the definitions and examples of lexical entries. The originality of this work lies not only in the use of a single, finely structured source containing multi-domain lexical knowledge at the morphological, syntactic, and semantic levels, which lends itself to ontological interpretation, but also in providing ontological elements with linguistic grounding. The approach also addresses the quality issue, which is of major importance in ontology engineering: a validation stage is integrated alongside the extraction modules to maintain the consistency of the generated ontologies. Furthermore, the approach was applied to a case study in the field of astronomy, with experiments carried out on the Arabic language. This choice is explained both by the great shortage of work on Arabic ontology development and by the availability of an LMF-standardized Arabic dictionary within the research team.
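A minimal sketch of the first step, under the simplifying assumption that every sense of a domain-flagged entry yields one candidate class (real LMF encodes these values as feat elements rather than attributes, so the fragment below is illustrative only):

```python
import xml.etree.ElementTree as ET

# Simplified LMF fragment; the attribute form keeps the sketch short.
LMF_SAMPLE = """
<LexicalResource>
  <Lexicon>
    <LexicalEntry>
      <Lemma writtenForm="star"/>
      <Sense id="star_astro">
        <Definition text="a luminous celestial body" domain="astronomy"/>
      </Sense>
    </LexicalEntry>
  </Lexicon>
</LexicalResource>
"""

def derive_core(lmf_xml, domain):
    """One candidate class per in-domain sense, with its gloss kept as
    linguistic grounding (e.g. an rdfs:comment on the generated class)."""
    root = ET.fromstring(lmf_xml)
    classes = []
    for entry in root.iter("LexicalEntry"):
        lemma = entry.find("Lemma").get("writtenForm")
        for sense in entry.iter("Sense"):
            definition = sense.find("Definition")
            if definition is not None and definition.get("domain") == domain:
                classes.append((sense.get("id"), lemma,
                                definition.get("text")))
    return classes

print(derive_core(LMF_SAMPLE, "astronomy"))
# -> [('star_astro', 'star', 'a luminous celestial body')]
```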


Author(s): Ernesto William De Luca

In this chapter, the author presents his approach to aggregating and maintaining multilingual Linked Data. He describes lexical resources and lexical Linked Data, presenting a hybridization that ports the largest lexical resource, EuroWordNet, to the Linked Open Data cloud and interlinks it with other lexical resources. He also presents LexiRes, an RDF/OWL tool that makes it possible to navigate this lexical information, helping authors of existing lexical resources delete or restructure concepts using automatic merging methods. The chapter concludes with a discussion of personalizing information according to user preferences, filtering relevant information while taking the user's multilingual background into account.
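As an illustrative sketch of the interlinking idea (not the LexiRes implementation), the snippet below publishes a multilingual synset as a SKOS concept and links it to a counterpart in another resource; all IRIs are example-only assumptions.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

# Example-only IRIs; real EuroWordNet/WordNet identifiers differ.
EWN = Namespace("http://example.org/eurowordnet/")
PWN = Namespace("http://example.org/wordnet/")

g = Graph()
g.bind("skos", SKOS)

# Publish a synset as a SKOS concept with labels in two languages ...
g.add((EWN["synset-dog-n"], RDF.type, SKOS.Concept))
g.add((EWN["synset-dog-n"], SKOS.prefLabel, Literal("dog", lang="en")))
g.add((EWN["synset-dog-n"], SKOS.prefLabel, Literal("Hund", lang="de")))

# ... and interlink it with the matching concept in another resource;
# such links are what make cross-resource navigation and restructuring
# tools possible.
g.add((EWN["synset-dog-n"], SKOS.exactMatch, PWN["synset-02084071-n"]))

print(g.serialize(format="turtle"))
```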


Author(s): Davide Eynard, Matteo Matteucci, Fabio Marfia

Ontologies are the basic building block of modern knowledge-based systems; however, the effort and expertise required to develop them often prevent their widespread adoption. In this chapter, the authors present a tool for the automatic discovery of basic ontologies, which they call seed ontologies, starting from a corpus of documents related to a specific domain of knowledge. These seed ontologies are not meant for direct use; rather, they bootstrap the knowledge acquisition process by providing a selection of relevant terms and fundamental relationships. The tool is modular and allows the integration of different methods and strategies for indexing the corpus, selecting relevant terms, and discovering hierarchies and other relationships among terms. Like any induction process, ontology learning from text is prone to errors, so the authors do not expect a 100% correct ontology; according to their evaluation, the result is closer to 80%, but this should be enough for a domain expert to complete the work with limited effort and in a short time.
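A toy sketch of two such pluggable steps, with an invented corpus and thresholds: candidate terms are selected by corpus frequency, and is-a links are guessed with a lexical-head heuristic ("neural network" is-a "network"). This illustrates the idea, not the authors' tool.

```python
import re
from collections import Counter

STOP = {"a", "an", "the", "is", "of", "for"}

corpus = [
    "a neural network is a network of artificial neurons",
    "a convolutional neural network is a neural network for images",
]

def candidate_terms(corpus, max_len=3):
    """Count stopword-free n-grams as candidate domain terms."""
    counts = Counter()
    for doc in corpus:
        tokens = re.findall(r"[a-z]+", doc.lower())
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                ngram = tokens[i:i + n]
                if not set(ngram) & STOP:
                    counts[" ".join(ngram)] += 1
    return counts

def head_hierarchy(terms):
    """Guess is-a links from lexical heads: 'neural network' is-a 'network'."""
    links = []
    for term in terms:
        words = term.split()
        for k in range(1, len(words)):
            head = " ".join(words[k:])
            if head in terms:
                links.append((term, head))
                break
    return links

terms = {t for t, c in candidate_terms(corpus).items() if c >= 2}
print(head_hierarchy(terms))   # -> [('neural network', 'network')]
```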


Author(s): Silvana Hartmann, György Szarvas, Iryna Gurevych

Collecting the specialized vocabulary of a particular domain (its terminology) is an important initial step in creating formalized domain knowledge representations (ontologies). Terminology extraction (TE) aims to automate this process by collecting the relevant domain vocabulary from existing lexical resources or collections of domain texts. In this chapter, the authors address the extraction of multiword terminology, as multiword terms are very frequent in terminology but typically poorly represented in standard lexical resources. They present their method for mining multiword terminology from Wikipedia, along with the freely available terminology resource they extracted using it. Terminology extraction based on Wikipedia exploits the advantages of a huge, multilingual, domain-transcending knowledge source whose large-scale structural information can identify potential multiword units without the need for linguistic processing tools. Thus, although evaluated on English, the proposed method is in principle applicable to all languages in Wikipedia.
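The structural cue can be illustrated with a short sketch: wiki link markup already segments multiword units, so candidates can be mined with plain string processing. The wikitext below is invented, and the real method also exploits titles, categories, and cross-language links at Wikipedia scale.

```python
import re

# Invented wikitext pages standing in for a Wikipedia dump.
pages = {
    "Supernova": "A [[supernova]] is a powerful [[stellar explosion]] "
                 "observed with a [[space telescope]].",
    "Planet": "A [[planet]] orbits a star; see [[planetary system]].",
}

def multiword_terms(pages):
    """Link anchors already segment multiword units: no parser required."""
    terms = set()
    for text in pages.values():
        for anchor in re.findall(r"\[\[([^\]|#]+)", text):
            anchor = anchor.strip().lower()
            if " " in anchor:               # keep only multiword anchors
                terms.add(anchor)
    return terms

print(sorted(multiword_terms(pages)))
# -> ['planetary system', 'space telescope', 'stellar explosion']
```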


Author(s): Francesca Fallucchi, Fabio Massimo Zanzotto

The authors propose probabilistic models for ontology learning that expand existing ontologies by taking into account both corpus-extracted evidence and the structure of the generated ontologies. The model exploits structural properties of target relations, such as transitivity, during learning. They then propose two extensions of the probabilistic models: a model for learning from a generic domain that can be exploited to extract new information in a specific domain, and an incremental ontology learning system that puts human validation in the learning loop. The latter provides a graphical user interface and a human-computer interaction workflow supporting the incremental learning loop.
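The role of transitivity can be sketched as follows (a simplified stand-in, not the authors' probabilistic model; all scores and the 0.5 threshold are invented): weak direct evidence for isa(a, c) is boosted when a likely two-step path a -> b -> c exists.

```python
# Corpus-extracted confidences that the isa relation holds; values invented.
evidence = {
    ("dog", "mammal"): 0.9,
    ("mammal", "animal"): 0.8,
    ("dog", "animal"): 0.3,   # weak direct evidence ...
}

def transitive_boost(evidence):
    """If isa(a,b) and isa(b,c) are likely, raise belief in isa(a,c)."""
    scores = dict(evidence)
    for (a, b1), p1 in evidence.items():
        for (b2, c), p2 in evidence.items():
            if b1 == b2 and a != c:
                key = (a, c)
                scores[key] = max(scores.get(key, 0.0), p1 * p2)
    return scores

accepted = {pair for pair, p in transitive_boost(evidence).items() if p > 0.5}
print(sorted(accepted))
# ('dog', 'animal') is rescued by the dog -> mammal -> animal path (0.72)
```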


Author(s): Ivan Bedini, Benjamin Nguyen, Christopher Matheus, Peter F. Patel-Schneider, Aidan Boran

One of the promises of the Semantic Web is to support applications that easily and seamlessly deal with heterogeneous data. Most data on the Web, however, is in the Extensible Markup Language (XML) format, and using XML requires applications to understand the format of each data source they access. Achieving the benefits of the Semantic Web involves transforming XML into the Semantic Web languages OWL (the Web Ontology Language) and RDF (the Resource Description Framework), a process that generally has manual or only semi-automatic components. In this chapter, the authors present a set of patterns that enable the automatic transformation of XML Schema into RDF and OWL, allowing much XML data to be used directly in the Semantic Web. They focus on a possible logical representation of XML Schema and present an implementation, including a comparison with related work.
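One classic pattern of this family, sketched below under illustrative names: an xsd:complexType becomes an owl:Class and its child elements become properties with that class as domain. The chapter's pattern catalogue is broader than this single rule.

```python
import xml.etree.ElementTree as ET
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF, RDFS

XS = "{http://www.w3.org/2001/XMLSchema}"
EX = Namespace("http://example.org/schema#")   # illustrative target IRI

XSD_SAMPLE = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Book">
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="author" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""

g = Graph()
for ctype in ET.fromstring(XSD_SAMPLE).iter(XS + "complexType"):
    cls = EX[ctype.get("name")]
    g.add((cls, RDF.type, OWL.Class))           # complexType -> owl:Class
    for elem in ctype.iter(XS + "element"):
        prop = EX[elem.get("name")]
        g.add((prop, RDF.type, OWL.DatatypeProperty))
        g.add((prop, RDFS.domain, cls))         # child element -> property

print(g.serialize(format="turtle"))
```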


Author(s): Elias Iosif, Georgios Petasis, Vangelis Karkaletsis

The authors present an ontology-based information extraction process that operates in a bootstrapping framework. The novelty of this approach lies in continuously extracting semantics from textual content in order to evolve the underlying ontology, while the evolved ontology in turn enhances the information extraction mechanism. This process was implemented in the context of the R&D project BOEMIE, and the BOEMIE system was evaluated on the athletics domain.
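The bootstrapping loop can be sketched schematically (this is not the BOEMIE system; the documents, extraction pattern, and fixed-point criterion are invented for illustration):

```python
import re

docs = [
    "Usain Bolt won the 100m final.",
    "The 100m final was won by Usain Bolt; Justin Gatlin was second.",
    "Justin Gatlin later raced against Asafa Powell.",
]

ontology = {"athletes": {"Usain Bolt"}}   # seed instance

def extract(docs, known):
    """Find new candidate names in documents that mention known athletes."""
    found = set()
    for doc in docs:
        if any(name in doc for name in known):
            # Hypothetical pattern: capitalized bigrams as candidate names.
            found |= set(re.findall(r"[A-Z][a-z]+ [A-Z][a-z]+", doc))
    return found - known

while True:                                # iterate to a fixed point
    new = extract(docs, ontology["athletes"])
    if not new:
        break
    ontology["athletes"] |= new            # the ontology evolution step

print(sorted(ontology["athletes"]))
# -> ['Asafa Powell', 'Justin Gatlin', 'Usain Bolt']
```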

