Accessing biological data in R with semantic web technologies

Author(s):  
Egon Willighagen

Background. Semantic Web technologies are increasingly used in biological database systems. Their improved expressiveness offers advantages in tracking provenance and allows knowledge to be annotated more explicitly. The list of semantic web standards needs a complementary set of tools that handle data in those formats so that they can be used in bioinformatics workflows. Methods. The approach proposed in this paper uses the Apache Jena library to create an environment where semantic web technologies can be used in the statistical environment R. The code is exposed as two R packages available from the Comprehensive R Archive Network (CRAN). The rJava library and a custom convenience class are used to bridge between R and the Jena library. Results. We present three examples showing how the Resource Description Framework (RDF) and SPARQL query standards can be employed in R. The first example takes input on BRCA1 SNPs from a BioMart and converts it into an RDF data set. The second example runs a query on an experimental remote SPARQL endpoint provided by UniProt and searches textual annotations of proteins encoded by the BRCA1 gene. The third example shows how the package can be used to handle RDF returned by OpenTox web services. Discussion. The two provided libraries bring basic semantic web technologies to R. This paper only shows examples from the biology domain, but we believe the approaches are generally applicable. The rrdf package currently exposes only a subset of key Apache Jena functionality, but it makes it easy to expose more of the library's functionality, such as shortest path finding. The rrdf libraries are freely available from CRAN under the Affero GNU Public License version 3: http://cran.r-project.org/web/packages/rrdf/.
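
The table-to-RDF conversion described in the first example can be illustrated with a minimal sketch. It is not the rrdf API and does not use Jena; it is plain Python (standard library only) that turns tabular rows into N-Triples lines. The namespace `http://example.org/snp/`, the column names, and the record `snp1` are hypothetical placeholders, not data from the paper.

```python
from typing import Dict, Iterable, List

# Hypothetical namespace for illustration only; not taken from the paper.
EX = "http://example.org/snp/"


def literal(value: str) -> str:
    """Return a plain N-Triples string literal with basic escaping."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def rows_to_ntriples(rows: Iterable[Dict[str, str]], id_column: str) -> List[str]:
    """Emit one triple per non-identifier cell: <row> <column> "value" ."""
    triples = []
    for row in rows:
        subject = f"<{EX}{row[id_column]}>"
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier becomes the subject, not a property
            triples.append(f"{subject} <{EX}{column}> {literal(str(value))} .")
    return triples


# Placeholder row standing in for a record retrieved from a BioMart.
rows = [{"id": "snp1", "gene": "BRCA1", "allele": "A/G"}]
lines = rows_to_ntriples(rows, "id")
print("\n".join(lines))
```

Each table row becomes a subject resource and each remaining column a predicate, which is one common way to map tabular data onto triples; the paper's own mapping may differ.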



2014
Author(s):  
Egon Willighagen

Background. Semantic Web technologies are increasingly used in biological database systems. The improved expressiveness shows advantages in tracking provenance and allowing knowledge to be more explicitly annotated. The list of semantic web standards needs a complementary set of tools to handle data in those formats to use them in bioinformatics workflows. Methods. The approach proposed in this paper uses the Apache Jena library to create an environment where semantic web technologies can be used in the statistical environment R. The code is exposed as two R packages available from the Comprehensive R Archive Network (CRAN). The rJava library and a custom convenience class are used to bridge between R and the Jena library. Results. We here present three examples showing how the Resource Description Framework (RDF) and SPARQL query standards can be employed in R. The first example takes input on BRCA1 SNPs from a BioMart and converts this into an RDF data set. The second example runs a query on an experimental remote SPARQL endpoint provided by UniProt, and searches textual annotations of proteins encoded by the BRCA1 gene. The third example shows how the package can be used to handle RDF returned by OpenTox web services. Discussion. The two provided libraries bring basic semantic web technologies to R. While only a subset of Apache Jena is currently exposed, it provides key methods to deal with RDF data and resources. The libraries are freely available from CRAN under the Affero GNU Public License version 3: http://cran.r-project.org/web/packages/rrdf/.


Author(s):  
Egon Willighagen

Background. Semantic Web technologies are increasingly used in biological database systems. The improved expressiveness shows advantages in tracking provenance and allowing knowledge to be more explicitly annotated. The list of semantic web standards needs a complementary set of tools to handle data in those formats to use them in bioinformatics workflows. Methods. The approach proposed in this paper uses the Apache Jena library to create an environment where semantic web technologies can be used in the statistical environment R. The code is exposed as two R packages available from the Comprehensive R Archive Network (CRAN). The rJava library and a custom convenience class are used to bridge between R and the Jena library. Results. We here present two examples showing how the Resource Description Framework (RDF) and SPARQL query standards can be employed in R. The first example takes input on BRCA1 SNPs from a BioMart and converts this into an RDF data set. The second example runs a query on an experimental remote SPARQL endpoint provided by UniProt, and searches textual annotations of proteins encoded by the BRCA1 gene. Discussion. The two provided libraries bring basic semantic web technologies to R. While only a subset of Apache Jena is currently exposed, it provides key methods to deal with RDF data and resources. The libraries are freely available from CRAN under the Affero GNU Public License version 3: http://cran.r-project.org/web/packages/rrdf/.


2008
pp. 3309-3320
Author(s):  
Csilla Farkas

This chapter investigates the threat of unwanted Semantic Web inferences. We survey the current efforts to detect and remove unwanted inferences, identify research gaps, and recommend future research directions. We begin with a brief overview of Semantic Web technologies and reasoning methods, followed by a description of the inference problem in traditional databases. In the context of the Semantic Web, we study two types of inferences: (1) entailments defined by the formal semantics of the Resource Description Framework (RDF) and the RDF Schema (RDFS) and (2) inferences supported by semantic languages like the Web Ontology Language (OWL). We compare the Semantic Web inferences to the inferences studied in traditional databases. We show that the inference problem exists on the Semantic Web and that existing security methods do not fully prevent indirect data disclosure via inference channels.
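
How an RDFS entailment can open an inference channel is easiest to see in a small sketch. The code below is an illustration under stated assumptions, not the chapter's method: it applies only two of the RDFS entailment rules (rdfs11, transitivity of `rdfs:subClassOf`, and rdfs9, type inheritance along `rdfs:subClassOf`) to a toy graph until no new triples appear. All `ex:` names are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply rdfs9 and rdfs11 repeatedly until a fixed point is reached."""
    inferred = set(triples)
    changed = True
    while changed:
        new = set()
        for (s, p, o) in inferred:
            for (s2, p2, o2) in inferred:
                if p2 != SUBCLASS or s2 != o:
                    continue
                if p == SUBCLASS:
                    # rdfs11: A subClassOf B, B subClassOf C => A subClassOf C
                    new.add((s, SUBCLASS, o2))
                elif p == RDF_TYPE:
                    # rdfs9: x type A, A subClassOf B => x type B
                    new.add((s, RDF_TYPE, o2))
        changed = not new <= inferred
        inferred |= new
    return inferred


# Hypothetical published graph: the publisher never states that
# ex:alice is an ex:Patient, but the schema makes it derivable.
published = {
    ("ex:alice", RDF_TYPE, "ex:OncologyPatient"),
    ("ex:OncologyPatient", SUBCLASS, "ex:Patient"),
    ("ex:Patient", SUBCLASS, "ex:Person"),
}
closure = rdfs_closure(published)
for triple in sorted(closure - published):
    print(triple)
```

Withholding the triple `ex:alice rdf:type ex:Patient` does not protect it here, because any consumer holding the schema can rederive it; this is the kind of indirect disclosure the chapter refers to.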


2020
Vol 1 (1)
pp. 428-444
Author(s):
Silvio Peroni
David Shotton

OpenCitations is an infrastructure organization for open scholarship dedicated to the publication of open citation data as Linked Open Data using Semantic Web technologies, thereby providing a disruptive alternative to traditional proprietary citation indexes. Open citation data are valuable for bibliometric analysis, increasing the reproducibility of large-scale analyses by enabling publication of the source data. Following brief introductions to the development and benefits of open scholarship and to Semantic Web technologies, this paper describes OpenCitations and its data sets, tools, services, and activities. These include the OpenCitations Data Model; the SPAR (Semantic Publishing and Referencing) Ontologies; OpenCitations’ open software of generic applicability for searching, browsing, and providing REST APIs over resource description framework (RDF) triplestores; Open Citation Identifiers (OCIs) and the OpenCitations OCI Resolution Service; the OpenCitations Corpus (OCC), a database of open downloadable bibliographic and citation data made available in RDF under a Creative Commons public domain dedication; and the OpenCitations Indexes of open citation data, of which the first and largest is COCI, the OpenCitations Index of Crossref Open DOI-to-DOI Citations, which currently contains over 624 million bibliographic citations and is receiving considerable usage by the scholarly community.


Author(s):  
Flavius Frasincar
Jethro Borsje
Frederik Hogenboom

This chapter describes Hermes, a framework for building personalized news services using Semantic Web technologies. The Hermes framework consists of four phases: classification, which categorizes news items with respect to a domain ontology; knowledge base updating, which keeps the knowledge base up-to-date based on the news information; news querying, which allows the user to search the news with concepts of interest; and results presentation, which shows the news results of the search process. Hermes is supported by a framework implementation, the Hermes News Portal, a tool that gives users personalized access to news items. The Hermes framework and its associated implementation aim at advancing the state of the art of semantic approaches for personalized news services by employing Semantic Web standards, exploiting and keeping up-to-date domain information, using advanced natural language processing techniques (e.g., ontology-based gazetteering and word sense disambiguation), and supporting time-based queries for expressing the desired news items.


Author(s):  
Anna Neovesky
Frederic von Vlahovits

The growing amount of openly available research data enables various possibilities of reuse. Data can be analyzed, visualized, or even further processed, enriched, and combined with other sources to enable new research questions and a different view on the material. Several catalogues and research tools aggregate collections on specific topics to make them searchable and reusable. To be able to bring together different collections, a common data standard is necessary. This article discusses how semantic web technologies can be used to connect digital as well as analogue music catalogues and music editions. The article first discusses how music can be searched using its characteristic melody and presents the approach of the open-source search engine for music incipits, IncipitSearch. Subsequently, the advantages of semantic web standards for musicology are highlighted. Then, the underlying RDF- and schema.org-based metadata standard, which is used to aggregate and distribute the data, is discussed. The article concludes with an outlook on research perspectives for digital musicology and musicology in general.


Author(s):  
E. Iadanza
F. Maietti
A. E. Ziri
R. Di Giulio
M. Medici
...  

Within the EU funded project INCEPTION – Inclusive Cultural Heritage in Europe through 3D semantic modelling, the key-targeted achievement is the development of a specific cloud based platform, in order to accomplish the main objectives of accessing, understanding and strengthening European Cultural Heritage by means of enriched 3D models. The whole INCEPTION project is based on the close connection between state-of-the-art architectural modeling technologies (BIM, Building Information Modeling) and the latest cutting-edge web technologies. The platform is grounded on semantic web technologies and makes extensive use of WebGL and RESTful APIs, in order to enrich heritage 3D models by using Semantic Web standards. The INCEPTION platform will be a space for interchange of information and for the dialogue among professionals, students, scholars, curators, non-expert users, etc. Furthermore, the Semantic Web structure interlinks the platform with external Cultural Heritage available linked data and makes it gradually enhanced by specific flexible data structures provided as project specific ontologies. The paper will describe solutions based on the match between BIM, Cloud and Semantic Web.

