Assessing Large-Scale, Cross-Domain Knowledge Bases for Semantic Search

Author(s):  
Aatif Ahmad Khan ◽  
Sanjay Kumar Malik

Semantic search refers to a set of approaches that use Semantic Web technologies for information retrieval, making the process machine-understandable and returning precise results. Knowledge bases (KBs) are the backbone of semantic search approaches: they provide machine-interpretable information for query processing and result retrieval. These KBs include Resource Description Framework (RDF) datasets and populated ontologies. This paper presents an assessment of the largest cross-domain KBs that are exploited in large-scale semantic search and are freely available on the Linked Open Data Cloud. Analyzing these datasets is a prerequisite for modeling effective semantic search approaches, because each dataset is suited to particular applications. Only large-scale, cross-domain datasets with more than 10 million RDF triples are considered. The survey reports the size of each dataset in triples, along with the triple data format(s) it supports, both of which matter when developing effective semantic search models.

2020 ◽  
Vol 1 (1) ◽  
pp. 428-444 ◽  
Author(s):  
Silvio Peroni ◽  
David Shotton

OpenCitations is an infrastructure organization for open scholarship dedicated to the publication of open citation data as Linked Open Data using Semantic Web technologies, thereby providing a disruptive alternative to traditional proprietary citation indexes. Open citation data are valuable for bibliometric analysis, increasing the reproducibility of large-scale analyses by enabling publication of the source data. Following brief introductions to the development and benefits of open scholarship and to Semantic Web technologies, this paper describes OpenCitations and its data sets, tools, services, and activities. These include the OpenCitations Data Model; the SPAR (Semantic Publishing and Referencing) Ontologies; OpenCitations’ open software of generic applicability for searching, browsing, and providing REST APIs over resource description framework (RDF) triplestores; Open Citation Identifiers (OCIs) and the OpenCitations OCI Resolution Service; the OpenCitations Corpus (OCC), a database of open downloadable bibliographic and citation data made available in RDF under a Creative Commons public domain dedication; and the OpenCitations Indexes of open citation data, of which the first and largest is COCI, the OpenCitations Index of Crossref Open DOI-to-DOI Citations, which currently contains over 624 million bibliographic citations and is receiving considerable usage by the scholarly community.


Author(s):  
Fabian Kirstein ◽  
Kyriakos Stefanidis ◽  
Benjamin Dittwald ◽  
Simon Dutkowski ◽  
Sebastian Urbanek ◽  
...  

2021 ◽  
Vol 13 (5) ◽  
pp. 124
Author(s):  
Jiseong Son ◽  
Chul-Su Lim ◽  
Hyoung-Seop Shim ◽  
Ji-Sun Kang

Despite the development of various technologies and systems that use artificial intelligence (AI) to solve disaster-related problems, difficult challenges remain. Data are the foundation for solving diverse disaster problems with AI, big data analysis, and similar methods, so these data deserve close attention. Disaster data are domain-specific by disaster type, heterogeneous, and lacking in interoperability. Open data related to disasters pose particular issues: sources and formats differ because the data are collected by different organizations, and the vocabularies used in each domain are inconsistent. This study proposes a knowledge graph to resolve the heterogeneity among various disaster data and provide interoperability among domains. Among disaster domains, we describe the knowledge graph for flooding disasters using Korean open datasets and cross-domain knowledge graphs. Furthermore, the proposed knowledge graph is used to assist in solving and managing disaster problems.


2017 ◽  
Vol 44 (2) ◽  
pp. 203-229 ◽  
Author(s):  
Javier D Fernández ◽  
Miguel A Martínez-Prieto ◽  
Pablo de la Fuente Redondo ◽  
Claudio Gutiérrez

The publication of semantic web data, commonly represented in Resource Description Framework (RDF), has experienced outstanding growth over the last few years. Data from all fields of knowledge are shared publicly and interconnected in active initiatives such as Linked Open Data. However, despite the increasing availability of applications managing large-scale RDF information such as RDF stores and reasoning tools, little attention has been given to the structural features emerging in real-world RDF data. Our work addresses this issue by proposing specific metrics to characterise RDF data. We specifically focus on revealing the redundancy of each data set, as well as common structural patterns. We evaluate the proposed metrics on several data sets, which cover a wide range of designs and models. Our findings provide a basis for more efficient RDF data structures, indexes and compressors.
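To make the idea of structural characterisation concrete, here is a minimal Python sketch of two simple measures over a set of triples: the out-degree of each subject and the frequency of each distinct predicate list. The measure definitions, the function names, and the `sample` data are illustrative assumptions and are not taken from the paper's own metric definitions.

```python
from collections import defaultdict

def subject_out_degrees(triples):
    """Count the distinct (predicate, object) pairs attached to each subject."""
    pairs = defaultdict(set)
    for s, p, o in triples:
        pairs[s].add((p, o))
    return {s: len(po) for s, po in pairs.items()}

def predicate_list_counts(triples):
    """Count how many subjects share each distinct set of predicates."""
    preds = defaultdict(set)
    for s, p, o in triples:
        preds[s].add(p)
    counts = defaultdict(int)
    for plist in preds.values():
        counts[frozenset(plist)] += 1
    return dict(counts)

# Hypothetical sample data, for illustration only.
sample = [
    ("ex:a", "foaf:name", '"A"'),
    ("ex:a", "foaf:knows", "ex:b"),
    ("ex:b", "foaf:name", '"B"'),
    ("ex:b", "foaf:knows", "ex:a"),
    ("ex:c", "foaf:name", '"C"'),
]

degrees = subject_out_degrees(sample)
pred_lists = predicate_list_counts(sample)
```

When many subjects share a few predicate lists, the data set is structurally repetitive, which is the kind of redundancy an index or compressor could exploit.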


Author(s):  
Jose María Alvarez Rodríguez ◽  
José Emilio Labra Gayo ◽  
Patricia Ordoñez de Pablos

The aim of this chapter is to present a proposal and a case study for describing information about organizations in a standard way using the Linked Data approach. Several models and ontologies have been proposed to formalize the data, structure, and behaviour of organizations. Nevertheless, these attempts have not been fully accepted for several reasons: (1) missing pieces for defining the status of the organization; (2) tangled parts in specifying the structure (concepts and relations) between the elements of the organization; (3) lack of text properties; and other factors. These divergences result in a set of incomplete approaches to formalizing data and information about organizations. Given the current trend of applying semantic web technologies and linked data to formalize, aggregate, and share domain-specific information, a new model for organizations that takes advantage of these initiatives is required in order to overcome existing barriers and exploit corporate information in a standard way. This work is especially relevant in that it aims to: (1) unify existing models to provide a common specification; (2) apply semantic web technologies and the Linked Data approach; (3) provide access to the information via standard protocols; and (4) offer new services that can exploit this information to trace the evolution and behaviour of the organization over time. Finally, this work can improve the clarity and transparency of scenarios in which organizations play a key role, such as e-procurement, e-health, or financial transactions.


Author(s):  
Floriano Scioscia ◽  
Michele Ruta ◽  
Giuseppe Loseto ◽  
Filippo Gramegna ◽  
Saverio Ieva ◽  
...  

The Semantic Web of Things (SWoT) aims to support smart semantics-enabled applications and services in pervasive contexts. Due to architectural and performance issues, most Semantic Web reasoners are impractical to port to such contexts: they are resource-intensive and designed primarily for standard inference tasks on large ontologies. In contrast, SWoT use cases generally require quick decision support through semantic matchmaking in resource-constrained environments. This paper describes Mini-ME (the Mini Matchmaking Engine), a mobile inference engine designed from the ground up for the SWoT. It supports Semantic Web technologies and implements both standard (subsumption, satisfiability, classification) and non-standard (abduction, contraction, covering, bonus, difference) inference services for moderately expressive knowledge bases. In addition to an architectural and functional description, usage scenarios and an experimental performance evaluation are presented on PC (against other popular Semantic Web reasoners), smartphone, and embedded single-board computer testbeds.


2008 ◽  
pp. 3309-3320
Author(s):  
Csilla Farkas

This chapter investigates the threat of unwanted Semantic Web inferences. We survey the current efforts to detect and remove unwanted inferences, identify research gaps, and recommend future research directions. We begin with a brief overview of Semantic Web technologies and reasoning methods, followed by a description of the inference problem in traditional databases. In the context of the Semantic Web, we study two types of inferences: (1) entailments defined by the formal semantics of the Resource Description Framework (RDF) and the RDF Schema (RDFS) and (2) inferences supported by semantic languages like the Web Ontology Language (OWL). We compare the Semantic Web inferences to the inferences studied in traditional databases. We show that the inference problem exists on the Semantic Web and that existing security methods do not fully prevent indirect data disclosure via inference channels.
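As an illustration of the first inference type, the following Python sketch applies two standard RDFS entailment rules, rdfs9 (class membership propagates to superclasses) and rdfs11 (rdfs:subClassOf is transitive), until no new triple appears. The function name `rdfs_closure` and the `ex:` resources are hypothetical and chosen only for illustration; they do not come from the chapter.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def rdfs_closure(triples):
    """Apply rdfs9 and rdfs11 repeatedly until a fixpoint is reached."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p != SUBCLASS:
                continue
            for s2, p2, o2 in inferred:
                # rdfs11: s subClassOf o, o subClassOf o2 => s subClassOf o2
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))
                # rdfs9: s2 type s, s subClassOf o => s2 type o
                if p2 == RDF_TYPE and o2 == s:
                    new.add((s2, RDF_TYPE, o))
        if new <= inferred:
            return inferred
        inferred |= new

# Hypothetical published triples, for illustration only.
published = {
    ("ex:alice", RDF_TYPE, "ex:OncologyPatient"),
    ("ex:OncologyPatient", SUBCLASS, "ex:Patient"),
    ("ex:Patient", SUBCLASS, "ex:Person"),
}

closure = rdfs_closure(published)
derived = closure - published  # facts never stated but entailed
```

Here `derived` holds triples that were never published explicitly, such as `ex:alice rdf:type ex:Patient`. This is the sense in which removing a sensitive triple from a data set is not sufficient if the remaining triples still entail it.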


2011 ◽  
pp. 1437-1461 ◽  
Author(s):  
Rui Lopes ◽  
Luís Carriço

Web accessibility is a hot topic today. Striving for social inclusion has resulted in the requirement to provide accessible content to all users. However, since each user is unique and the Web evolves in a decentralized way, little or nothing is known about the shape of the Web's accessibility at large scale, or from the point of view of each user. In this chapter the authors present the Web Accessibility Knowledge Framework as the foundation for specifying the relevant information about the accessibility of a Web page. This framework leverages Semantic Web technologies, side by side with audience modeling and accessibility metrics, as a way to study the Web as an entity with unique accessibility properties that depend on each user's point of view. Through this framework, the authors envision a set of queries that can help harness and infer this kind of knowledge from Web graphs.


Author(s):  
José Manuel Gómez-Pérez ◽  
Víctor Méndez

Since the use of electronic invoicing in business transactions was approved by the EU back in 2002, its application in Europe has grown considerably. However, despite the existence of standards like EDIFACT or UBL, widespread take up of electronic invoicing has been hindered by the enormous heterogeneity of proprietary solutions. In this chapter, the authors present an approach towards addressing the interoperability problem in electronic invoice exchange, based on ontologies and Semantic Web technologies. The authors propose methods and provide usable tools that leverage the knowledge of users of electronic invoicing systems by empowering them to define correspondences between sample electronic invoice data and a formal model of electronic invoicing represented as networked ontologies. The chapter follows a learn-by-example approach where, based on such correspondences, networked ontologies serve as a semantic hub for large-scale transformation of e-invoice data between heterogeneous e-invoicing formats and models. The approach has been evaluated through the development of a reference implementation and its deployment in the pharmaceutical sector.

