Web knowledge

Author(s):  
Christopher Walton

In the introductory chapter of this book, we discussed the means by which knowledge can be made available on the Web, that is, the representation of knowledge in a form that can be automatically processed by a computer. To recap, we identified two essential steps deemed necessary to achieve this task:

1. We discussed the need to agree on a suitable structure for the knowledge that we wish to represent. This is achieved through the construction of a semantic network, which defines the main concepts of the knowledge and the relationships between these concepts. We presented an example network that contained the main concepts needed to differentiate between kinds of cameras. Our network is a conceptualization, or an abstract view of a small part of the world. A conceptualization is defined formally in an ontology, which is in essence a vocabulary for knowledge representation.

2. We discussed the construction of a knowledge base, which is a store of knowledge about a domain in machine-processable form; essentially a database of knowledge. A knowledge base is constructed through the classification of a body of information according to an ontology. The result is a store of facts and rules that describe the domain. Our example described the classification of different camera features to form a knowledge base. The knowledge base is expressed formally in the language of the ontology over which it is defined.

In this chapter we elaborate on these two steps to show how we can define ontologies and knowledge bases specifically for the Web. This will enable us to construct Semantic Web applications that make use of this knowledge. The chapter is devoted to a detailed explanation of the syntax and pragmatics of the RDF, RDFS, and OWL Semantic Web standards. The Resource Description Framework (RDF) is an established standard for knowledge representation on the Web. Taken together with the associated RDF Schema (RDFS) standard, we have a language for representing simple ontologies and knowledge bases on the Web.
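
As a concrete illustration of the two steps, the following minimal sketch builds a tiny camera ontology and knowledge base with Python's rdflib library; the ex: namespace, the class names, and the resolution property are illustrative stand-ins, not the book's actual example.

```python
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/cameras#")
g = Graph()
g.bind("ex", EX)

# Step 1: the ontology -- concepts and their relationships (RDFS vocabulary).
g.add((EX.Camera, RDF.type, RDFS.Class))
g.add((EX.DigitalCamera, RDF.type, RDFS.Class))
g.add((EX.DigitalCamera, RDFS.subClassOf, EX.Camera))
g.add((EX.resolution, RDF.type, RDF.Property))
g.add((EX.resolution, RDFS.domain, EX.DigitalCamera))

# Step 2: the knowledge base -- facts classified according to the ontology.
g.add((EX.myCamera, RDF.type, EX.DigitalCamera))
g.add((EX.myCamera, EX.resolution, Literal("24 megapixels")))

print(g.serialize(format="turtle"))
```

The serialized Turtle output shows both layers in one graph: the schema triples defining the vocabulary, and the instance triples that use it.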

2018, Vol. 2, pp. e25614
Author(s):  
Florian Pellen ◽  
Sylvain Bouquin ◽  
Isabelle Mougenot ◽  
Régine Vignes-Lebbe

Xper3 (Vignes Lebbe et al. 2016) is a collaborative knowledge base publishing platform that, since its launch in November 2013, has been adopted by over 2,000 users (Pinel et al. 2017). This is mainly due to its user-friendly interface and the simplicity of its data model. The data are stored in MySQL relational databases, but the exchange format uses the TDWG standard format SDD (Structured Descriptive Data; Hagedorn et al. 2005). However, each Xper3 knowledge base is a closed world that the author(s) may or may not share with the scientific community or the public by publishing its content and/or identification keys (Kopfstein 2016). The explicit taxonomic, geographic, and phenotypic limits of a knowledge base are not always well defined in the metadata fields. Conversely, terminology vocabularies, such as the Phenotype and Trait Ontology (PATO) and the Plant Ontology (PO), and software to edit them, such as Protégé and Phenoscape, are essential in the Semantic Web but difficult to handle for biologists without computer skills. These ontologies constitute open worlds and are themselves expressed as RDF (Resource Description Framework) triples. Protégé offers visualization and reasoning capabilities for these ontologies (Gennari et al. 2003, Musen 2015). Our challenge is to combine the user-friendliness of Xper3 with the expressive power of OWL (Web Ontology Language), the W3C standard for building ontologies. We therefore focused on analyzing the representation of the same taxonomic content under Xper3 and under different models in OWL. After this critical analysis, we chose a description model that allows automatic export of SDD to OWL and can easily be enriched. We will present the results obtained and their validation on two knowledge bases, one on parasitic crustaceans (Sacculina) and the second on extant and fossil ferns (Corvez and Grand 2014). The evolution of the Xper3 platform and the perspectives opened by this link with Semantic Web standards will be discussed.
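
As a rough illustration of what an SDD-to-OWL export might produce, the sketch below encodes one descriptive character as an OWL property restriction using Python's rdflib; the character and state names are hypothetical, and this is only one plausible modeling pattern, not necessarily the description model the authors chose.

```python
from rdflib import BNode, Graph, Namespace, RDF, RDFS
from rdflib.namespace import OWL

EX = Namespace("http://example.org/xper3#")
g = Graph()
g.bind("ex", EX)

# A taxon class and a descriptive character modeled as an object property.
g.add((EX.Sacculina, RDF.type, OWL.Class))
g.add((EX.hasOrnamentation, RDF.type, OWL.ObjectProperty))

# "Sacculina has some spiny ornamentation" as an OWL property restriction.
restriction = BNode()
g.add((restriction, RDF.type, OWL.Restriction))
g.add((restriction, OWL.onProperty, EX.hasOrnamentation))
g.add((restriction, OWL.someValuesFrom, EX.Spines))
g.add((EX.Sacculina, RDFS.subClassOf, restriction))

print(g.serialize(format="turtle"))
```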


2010, Vol. 159, pp. 13-16
Author(s):  
Jun Zhai ◽  
Li Li Qu ◽  
Yan Chen ◽  
Jian Feng Li

Ontology is adopted as a standard for knowledge representation on the Semantic Web, and the Resource Description Framework (RDF) is used to add structure and meaning to Web applications. In order to incorporate fuzzy systems into the Semantic Web, this paper utilizes fuzzy ontology to formally represent fuzzy linguistic variables, taking into account the semantic relationships between fuzzy concepts. Each fuzzy rule is then described as an RDF resource with two properties, "IF" and "THEN", and the rule's antecedent and consequent are represented as RDF statements. Taking the fuzzy control system of an industrial washing machine as an example, a fuzzy system is built with ontology and RDF, showing that this approach enables distributed fuzzy applications on the Semantic Web.
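
A minimal sketch of this modeling pattern in Python's rdflib might look as follows; the rule, variable, and linguistic-value names are illustrative, and RDF reification (rdf:Statement) is used here as one way to represent the antecedent and consequent as RDF statements.

```python
from rdflib import BNode, Graph, Namespace, RDF

EX = Namespace("http://example.org/fuzzy#")
g = Graph()
g.bind("ex", EX)

# Antecedent as a reified RDF statement: (waterTemperature hasLinguisticValue High).
antecedent = BNode()
g.add((antecedent, RDF.type, RDF.Statement))
g.add((antecedent, RDF.subject, EX.waterTemperature))
g.add((antecedent, RDF.predicate, EX.hasLinguisticValue))
g.add((antecedent, RDF.object, EX.High))

# Consequent as a reified RDF statement: (washTime hasLinguisticValue Short).
consequent = BNode()
g.add((consequent, RDF.type, RDF.Statement))
g.add((consequent, RDF.subject, EX.washTime))
g.add((consequent, RDF.predicate, EX.hasLinguisticValue))
g.add((consequent, RDF.object, EX.Short))

# The fuzzy rule itself, an RDF resource with "IF" and "THEN" properties.
g.add((EX.rule1, RDF.type, EX.FuzzyRule))
g.add((EX.rule1, EX.IF, antecedent))
g.add((EX.rule1, EX.THEN, consequent))

print(g.serialize(format="turtle"))
```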


Author(s):  
Sebastian Hellmann ◽  
Jens Lehmann ◽  
Sören Auer

The vision of the Semantic Web aims to make use of semantic representations on the largest possible scale: the Web. Large knowledge bases such as DBpedia, OpenCyc, and GovTrack are emerging and freely available as Linked Data and SPARQL endpoints. Exploring and analyzing such knowledge bases is a significant hurdle for Semantic Web research and practice. As one possible direction for tackling this problem, the authors present an approach for obtaining complex class expressions from objects in knowledge bases by using machine learning techniques. The chapter describes in detail how to leverage existing techniques to achieve scalability on large knowledge bases available as SPARQL endpoints or Linked Data. The algorithms are made available in the open-source DL-Learner project, and this chapter presents several real-life scenarios in which they can be used by Semantic Web applications.
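
DL-Learner itself is a Java project, but the first step of such a workflow, gathering example instances from a SPARQL endpoint, can be sketched in Python with the SPARQLWrapper library; the endpoint and query below are illustrative.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Fetch candidate positive examples (here: scientists) from DBpedia's public
# SPARQL endpoint; a class-expression learner would take URIs like these as input.
endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
endpoint.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?person WHERE { ?person a dbo:Scientist } LIMIT 20
""")
endpoint.setReturnFormat(JSON)

results = endpoint.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["person"]["value"])
```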


Author(s):  
Marek Reformat ◽  
Ronald R. Yager ◽  
Zhan Li

The concept of the Semantic Web (Berners-Lee, 2001) introduces a new form of knowledge representation: the ontology. An ontology is a partially ordered set of words and concepts of a specific domain, and allows for defining different kinds of relationships existing among concepts. Such an approach promises the formation of an environment where information is easily accessible and understandable for any system, application, and/or human. A hierarchy of concepts (Yager, 2000) is a different and very interesting form of knowledge representation. The graph-like structure of the hierarchy provides a user with a suitable tool for identifying a variety of associations among concepts. These associations express a user's perceptions of relations among concepts, and lead to representing definitions of concepts in a human-like way. The Internet has become an overwhelming repository of documents. This enormous store of information will only be used effectively when users are equipped with systems capable of finding related documents quickly and accurately. The proposed work addresses that issue. It offers an approach that combines a hierarchy of concepts and ontologies for the task of identifying web documents in the environment of the Semantic Web. A user provides a simple query in the form of a hierarchy that only partially "describes" the documents they want to retrieve from the web. The hierarchy is treated as a "seed" representing the user's initial knowledge about concepts covered by the required documents. Ontologies are treated as supplementary knowledge bases. They are used to instantiate the hierarchy with concrete information, as well as to enhance it with new concepts initially unknown to the user. The proposed approach is used to design a prototype system for document identification in the web environment. The description of the system and the results of preliminary experiments are presented.
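
The core idea, a user-supplied seed hierarchy enriched with concepts drawn from an ontology, can be sketched in a few lines of Python; the concepts and the flat (narrower, broader) encoding of the ontology are hypothetical simplifications of the authors' approach.

```python
# A user's "seed" hierarchy: the user knows only that cars are vehicles.
seed = {"vehicle": ["car"]}

# Supplementary ontology knowledge as (narrower, broader) concept pairs.
ontology = [("car", "vehicle"), ("truck", "vehicle"), ("sedan", "car")]

def enrich(hierarchy, relations):
    """Add concepts the user did not know about under their broader terms."""
    for narrower, broader in relations:
        children = hierarchy.setdefault(broader, [])
        if narrower not in children:
            children.append(narrower)
    return hierarchy

print(enrich(seed, ontology))
# {'vehicle': ['car', 'truck'], 'car': ['sedan']}
```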


Author(s):  
Heiko Paulheim ◽  
Christian Bizer

Linked Data on the Web is either created from structured data sources (such as relational databases), from semi-structured sources (such as Wikipedia), or from unstructured sources (such as text). In the latter two cases, the generated Linked Data will likely be noisy and incomplete. In this paper, we present two algorithms that exploit statistical distributions of properties and types to enhance the quality of incomplete and noisy Linked Data sets: SDType adds missing type statements, and SDValidate identifies faulty statements. Neither of the algorithms uses external knowledge, i.e., they operate only on the data itself. We evaluate the algorithms on the DBpedia and NELL knowledge bases, showing that they are both accurate and scalable. Both algorithms have been used for building the DBpedia 3.9 release: with SDType, 3.4 million missing type statements were added, while with SDValidate, 13,000 erroneous RDF statements were removed from the knowledge base.
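
A much-simplified sketch of the intuition behind SDType follows: each property an entity carries casts a vote for likely types, according to precomputed type distributions. The distributions, property names, and uniform weighting below are illustrative; the actual algorithm also weights properties by how discriminative their distributions are.

```python
from collections import defaultdict

# Hypothetical per-property type distributions, precomputed over the whole
# knowledge base: how often the subject of each property carries each type.
type_dist = {
    "dbo:birthPlace": {"dbo:Person": 0.95, "dbo:Place": 0.03},
    "dbo:almaMater":  {"dbo:Person": 0.90, "dbo:Organisation": 0.05},
}

def predict_types(properties, threshold=0.5):
    """Weighted vote over the type distributions of an entity's properties."""
    votes = defaultdict(float)
    for prop in properties:
        for rdf_type, weight in type_dist.get(prop, {}).items():
            votes[rdf_type] += weight / len(properties)
    return {t: score for t, score in votes.items() if score >= threshold}

# An untyped entity appearing as the subject of these two properties is
# predicted to be a dbo:Person with a score of about 0.93.
print(predict_types(["dbo:birthPlace", "dbo:almaMater"]))
```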


Author(s):  
Komal Dhulekar ◽  
Madhuri Devrankar

The Semantic Web is a concept that enables better machine processing of information on the web, by structuring documents written for the web in such a way that they become understandable by machines. This can be used for creating more complex applications, such as intelligent browsers and more advanced web agents. Semantic modeling languages like the Resource Description Framework (RDF) and topic maps employ XML syntax to achieve this objective. New tools exploit cross-domain vocabularies to automatically extract and relate meta-information in a new context. Web ontology languages like DAML+OIL extend RDF with richer modeling primitives and provide a technological basis to enable the Semantic Web. The logic languages for the Semantic Web, which build on the foundations of RDF and the ontology languages, are also described. Together with digital signatures, they enable a web of trust, which assigns levels of trust to resources and access rights, and enables generating proofs for actions and resources on the web.
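
To make the point about XML syntax concrete, here is a minimal sketch using Python's rdflib that serializes two illustrative triples in RDF/XML; the namespace and terms are invented for the example.

```python
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/doc#")
g = Graph()
g.bind("ex", EX)
g.add((EX.page1, RDF.type, EX.Document))
g.add((EX.page1, EX.topic, Literal("Semantic Web")))

# RDF/XML, the XML syntax mentioned above; Turtle is a common alternative.
print(g.serialize(format="xml"))
```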


Author(s):  
Axel Polleres ◽  
Simon Steyskal

The World Wide Web Consortium (W3C), as the main standardization body for Web standards, has set a particular focus on publishing and integrating Open Data. In this chapter, the authors explain various standards from the W3C's Semantic Web activity and the potential role they play in the context of Open Data: RDF, as a standard data format for publishing and consuming structured information on the Web; the Linked Data principles for interlinking RDF data published across the Web and leveraging a Web of Data; and RDFS and OWL for describing the vocabularies used in RDF and the mappings between such vocabularies. The authors conclude with a review of current deployments of these standards on the Web, particularly within public Open Data initiatives, and discuss potential risks and challenges.
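
A small sketch of the last item, describing mappings between vocabularies with RDFS and OWL, using Python's rdflib; the two city vocabularies and their terms are hypothetical.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDFS

# Two hypothetical vocabularies used by different Open Data publishers.
A = Namespace("http://cityA.example.org/vocab#")
B = Namespace("http://cityB.example.org/vocab#")

g = Graph()
# Declare that the two publishers' terms denote the same or related things,
# so a consumer can integrate both datasets under one vocabulary.
g.add((A.ParkingLot, OWL.equivalentClass, B.CarPark))
g.add((A.capacity, RDFS.subPropertyOf, B.numberOfSpaces))

print(g.serialize(format="turtle"))
```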


Author(s):  
Salvador Lima ◽  
José Moreira

The Web is a crucial means for the dissemination of touristic information. However, most touristic information resources are stored directly in Web pages or in relational databases that are accessible only through ad hoc Web applications, so automated processes to search, extract, and interpret this information can hardly be implemented. Semantic Web technologies, which aim to represent background knowledge about Web resources in a computational form, can be an important contribution to the development of such automated processes. This chapter introduces the concept of the touristic object, giving special attention to the representation of temporal, spatial, and thematic knowledge. It also proposes a three-layered architecture for the representation of touristic objects on the Web. The central part is the domain layer, which defines a Semantic Model for Tourism (SeMoT) to describe concepts, relationships, and constraints using ontologies. The data layer supports the mapping of touristic information in relational databases into Resource Description Framework (RDF) virtual graphs following the SeMoT specification. The application layer deals with the integration of information from different data sources into a unified knowledge model, offering a common vocabulary to describe touristic information resources. Finally, we also show how to use this framework for planning touristic itineraries.
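
A rough sketch of what the data layer's relational-to-RDF mapping might look like in Python with rdflib; the SeMoT namespace URI, property names, and sample row are placeholders, since the real vocabulary is defined by the chapter's ontology.

```python
from rdflib import Graph, Literal, Namespace, RDF, URIRef

# Placeholder namespace standing in for the chapter's SeMoT vocabulary.
SEMOT = Namespace("http://example.org/semot#")

def row_to_rdf(row):
    """Map one relational row describing a touristic object to RDF triples."""
    g = Graph()
    g.bind("semot", SEMOT)
    obj = URIRef(f"http://example.org/object/{row['id']}")
    g.add((obj, RDF.type, SEMOT.TouristicObject))
    g.add((obj, SEMOT.name, Literal(row["name"])))
    g.add((obj, SEMOT.theme, Literal(row["theme"])))
    return g

g = row_to_rdf({"id": 42, "name": "Old Town boat tour", "theme": "nature"})
print(g.serialize(format="turtle"))
```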


Author(s):  
Leila Zemmouchi-Ghomari

Data play a central role in the effectiveness and efficiency of web applications, such as the Semantic Web. However, data are distributed across a very large number of online sources, so significant effort is needed to integrate them for proper utilization. A promising solution to this issue is the Linked Data initiative, which is based on four principles for publishing web data that favor interlinked and structured online data over the existing web of documents. The basic ideas, techniques, and applications of the Linked Data initiative are surveyed in this paper. The authors also discuss some open issues in Linked Data and potential directions for addressing these pending questions.
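
Two of the four principles, that HTTP URIs should dereference to useful information and that this information should link to further URIs, can be demonstrated with a short rdflib sketch; it assumes network access and that the DBpedia resource below still content-negotiates to RDF.

```python
from rdflib import Graph

# Principle 3 in practice: dereferencing an HTTP URI yields useful,
# standards-based data (rdflib requests an RDF serialization).
g = Graph()
g.parse("http://dbpedia.org/resource/Lisbon")

# Principle 4: the returned description links out to further URIs,
# which a client can dereference in turn to explore the Web of Data.
for subject, predicate, obj in list(g)[:5]:
    print(subject, predicate, obj)
```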


Author(s):  
Franck Cotton ◽  
Daniel Gillman

Linked Open Statistical Metadata (LOSM) is Linked Open Data (LOD) applied to statistical metadata. LOD is a model for identifying, structuring, interlinking, and querying data published directly on the web, building on the standards of the Semantic Web defined by the W3C. LOD uses the Resource Description Framework (RDF), a simple data model that expresses content as predicates linking resources to each other or to literal properties. The simplicity of the model makes it able to represent any data, including metadata. We define statistical data as data produced through some statistical process or intended for statistical analyses, and statistical metadata as metadata describing statistical data. LOSM promotes automated discovery of the meaning and structure of statistical data. Consequently, it helps with understanding and interpreting data and with preventing inadequate or flawed visualizations of statistical data. This enhances statistical literacy and efforts at visualizing statistics.
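
As a small illustration of statistical metadata expressed in this data model, the rdflib sketch below describes a hypothetical dataset with a title and a unit of measure; the stat: vocabulary is invented for the example (real deployments would use vocabularies such as the RDF Data Cube or DCAT).

```python
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import DCTERMS

# Invented statistics vocabulary and dataset URI, for illustration only.
STAT = Namespace("http://example.org/stat#")
EX = Namespace("http://example.org/data/")

g = Graph()
g.bind("stat", STAT)
g.add((EX.cpi2023, RDF.type, STAT.Dataset))
g.add((EX.cpi2023, DCTERMS.title, Literal("Consumer Price Index, 2023")))
g.add((EX.cpi2023, STAT.unitOfMeasure, Literal("index, 2015=100")))

print(g.serialize(format="turtle"))
```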

