HAMAP as SPARQL rules—A portable annotation pipeline for genomes and proteomes

GigaScience ◽  
2020 ◽  
Vol 9 (2) ◽  
Author(s):  
Jerven Bolleman ◽  
Edouard de Castro ◽  
Delphine Baratin ◽  
Sebastien Gehant ◽  
Beatrice A Cuche ◽  
...  

Abstract Background: Genome and proteome annotation pipelines are generally custom built and not easily reusable by other groups. This leads to duplication of effort, increased costs, and suboptimal annotation quality. One way to address these issues is to encourage the adoption of annotation standards and technological solutions that enable the sharing of biological knowledge and tools for genome and proteome annotation. Results: Here we demonstrate one approach to generate portable genome and proteome annotation pipelines that users can run without recourse to custom software. This proof of concept uses our own rule-based annotation pipeline HAMAP, which provides functional annotation for protein sequences to the same depth and quality as UniProtKB/Swiss-Prot, and the World Wide Web Consortium (W3C) standards Resource Description Framework (RDF) and SPARQL (a recursive acronym for the SPARQL Protocol and RDF Query Language). We translate complex HAMAP rules into the W3C standard SPARQL 1.1 syntax, and then apply them to protein sequences in RDF format using freely available SPARQL engines. This approach supports the generation of annotation that is identical to that generated by our own in-house pipeline, using standard, off-the-shelf solutions, and is applicable to any genome or proteome annotation pipeline. Conclusions: HAMAP SPARQL rules are freely available for download from the HAMAP FTP site, ftp://ftp.expasy.org/databases/hamap/sparql/, under the CC-BY-ND 4.0 license. The annotations generated by the rules are under the CC-BY 4.0 license. A tutorial and supplementary code to use HAMAP as SPARQL are available on GitHub at https://github.com/sib-swiss/HAMAP-SPARQL, and general documentation about HAMAP can be found on the HAMAP website at https://hamap.expasy.org.
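The core idea, a rule whose match condition over a sequence record triggers the construction of annotation triples (a SPARQL CONSTRUCT query), can be sketched with a toy in-memory triple store. The predicate names below are illustrative only, not the actual HAMAP/UniProt vocabulary:

```python
# Toy triple store: a rule is a pattern over triples plus triples to add.
# Predicates (ex:matchesProfile, up:recommendedName) are invented for
# illustration and do not reproduce the real HAMAP/UniProt RDF schema.

def apply_rule(triples, condition, template):
    """Mimic a SPARQL CONSTRUCT: for every subject satisfying all
    (predicate, object) pairs in `condition`, emit `template` triples."""
    subjects = {s for s, p, o in triples}
    new = []
    for s in sorted(subjects):
        if all((s, p, o) in triples for p, o in condition):
            new.extend((s, p, o) for p, o in template)
    return new

proteome = {
    ("prot:P1", "ex:matchesProfile", "hamap:MF_00001"),
    ("prot:P2", "ex:matchesProfile", "hamap:MF_00099"),
}

# Rule: sequences matching profile MF_00001 receive this annotation.
rule_condition = [("ex:matchesProfile", "hamap:MF_00001")]
rule_template = [("up:recommendedName", "Adenylate kinase")]

annotations = apply_rule(proteome, rule_condition, rule_template)
print(annotations)  # [('prot:P1', 'up:recommendedName', 'Adenylate kinase')]
```

In the real pipeline the same effect is achieved by running the published SPARQL 1.1 rules against the proteome loaded into any standard SPARQL engine.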

2018 ◽  
Vol 10 (8) ◽  
pp. 2613
Author(s):  
Dandan He ◽  
Zhongfu Li ◽  
Chunlin Wu ◽  
Xin Ning

Industrialized construction has raised the requirements of procurement methods used in the construction industry. The rapid development of e-commerce offers efficient and effective solutions; however, the large number of participants in the construction industry means that the data involved are complex, and problems arise related to volume, heterogeneity, and fragmentation. Thus, the sector lags behind others in the adoption of e-commerce. In particular, data integration has become a barrier preventing further development. Traditional e-commerce platforms, which consider data integration only for common product data, cannot meet the requirements of construction product data integration. This study aimed to build an information-integrated e-commerce platform for industrialized construction procurement (ICP) to overcome some of the shortcomings of existing platforms. We propose a platform based on Building Information Modelling (BIM) and linked data, taking an innovative approach to data integration. It uses industrialized construction technology to support product standardization, BIM to support the procurement process, and linked data to connect different data sources. The platform was validated using a case study. With the development of an e-commerce ontology, industrialized construction component information was extracted from BIM models and converted to Resource Description Framework (RDF) format. Related information from different data sources was also converted to RDF format, and SPARQL (SPARQL Protocol and RDF Query Language) queries were implemented. The platform provides a solution for the development of e-commerce platforms in the construction industry.
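The extraction step, flattening a BIM component record into RDF triples, can be sketched as follows. The base IRI and property names are invented for illustration; the paper's actual e-commerce ontology is not reproduced here:

```python
# Sketch: serialize a BIM component record as N-Triples-style lines.
# "http://example.org/" and the vocab terms are hypothetical placeholders.

def component_to_triples(component_id, properties):
    base = "http://example.org/"
    subject = f"<{base}component/{component_id}>"
    triples = []
    for prop, value in sorted(properties.items()):
        predicate = f"<{base}vocab#{prop}>"
        triples.append(f'{subject} {predicate} "{value}" .')
    return triples

wall_panel = {"type": "PrecastWallPanel", "width_mm": 3000, "supplier": "ACME"}
for line in component_to_triples("WP-01", wall_panel):
    print(line)
```

Once component data, supplier data, and pricing are all in RDF, a single SPARQL query can join them across sources, which is the data-integration point the platform relies on.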


Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 34 ◽  
Author(s):  
Maria-Evangelia Papadaki ◽  
Nicolas Spyratos ◽  
Yannis Tzitzikas

The continuous accumulation of multi-dimensional data and the development of Semantic Web and Linked Data published in the Resource Description Framework (RDF) bring new requirements for data analytics tools. Such tools should take into account the special features of RDF graphs, exploit the semantics of RDF, and support flexible aggregate queries. In this paper, we present an approach for applying analytics to RDF data based on a high-level functional query language, called HIFUN. In that language, each analytical query is a well-formed expression of a functional algebra whose definition is independent of the nature and structure of the data. We investigate how HIFUN can be used to ease the formulation of analytic queries over RDF data. We detail the applicability of HIFUN over RDF, as well as the transformations of data that may be required; we introduce the translation rules of HIFUN queries to SPARQL; and we describe a first implementation of the proposed model.
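The translation step can be sketched for the simplest case: an analytic query given as a grouping attribute, a measured attribute, and an aggregate operation becomes a SPARQL aggregate query. The form below is a loose approximation for illustration, not HIFUN's actual notation or the paper's full rule set:

```python
# Sketch: translate a (grouping, measure, aggregate) analytic query into
# a SPARQL GROUP BY query. Property IRIs are hypothetical placeholders.

def analytic_to_sparql(grouping_prop, measure_prop, agg_op):
    return (
        "SELECT ?g (%s(?m) AS ?result)\n"
        "WHERE {\n"
        "  ?x <%s> ?g .\n"
        "  ?x <%s> ?m .\n"
        "}\n"
        "GROUP BY ?g" % (agg_op.upper(), grouping_prop, measure_prop)
    )

print(analytic_to_sparql("http://ex.org/branch", "http://ex.org/quantity", "sum"))
```

A richer translation would also cover restriction and composition of grouping functions, which HIFUN's functional algebra supports; the sketch shows only the basic aggregation pattern.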


2017 ◽  
Vol 10 (13) ◽  
pp. 499
Author(s):  
Poornima N ◽  
Shivam Agrawal ◽  
Saleena B

Objective: Most current search engines follow informal keyword-based search. Finding the user's intention and improving the relevancy of results are the major issues faced by traditional keyword-based search. To solve the problems of traditional search and to boost the retrieval process, a framework for semantic-based information retrieval is proposed. Methods: Social and wine ontologies are used to capture the user's intention. The user's natural language queries are translated into SPARQL (SPARQL Protocol and RDF Query Language) queries for finding related items from those ontologies. Results: The proposed method shows a significant improvement over traditional search in terms of the number of searches required to retrieve a given number of relevant pages, as illustrated by a performance graph. Conclusion: Semantic-based search can understand the user's intention and gives better results than traditional search.


Author(s):  
Leila Zemmouchi-Ghomari

The data on the web is heterogeneous and distributed, which makes its integration a sine qua non for its effective exploitation within the context of the semantic web, or the so-called web of data. A promising solution for web data integration is the linked data initiative, which is based on four principles that aim to standardize the publication of structured data on the web. The objective of this chapter is to provide an overview of the essential aspects of this fairly recent and exciting field, including the model of linked data, the resource description framework (RDF); its query language, SPARQL (SPARQL Protocol and RDF Query Language); the available means of publication and consumption of linked data; and the existing applications and the issues not yet addressed in research.


2016 ◽  
Vol 31 (4) ◽  
pp. 391-413 ◽  
Author(s):  
Zongmin Ma ◽  
Miriam A. M. Capretz ◽  
Li Yan

Abstract The Resource Description Framework (RDF) is a flexible model for representing information about resources on the Web. As a W3C (World Wide Web Consortium) Recommendation, RDF has rapidly gained popularity. With the widespread acceptance of RDF on the Web and in the enterprise, a huge amount of RDF data is being proliferated and becoming available. Efficient and scalable management of RDF data is therefore of increasing importance. RDF data management has attracted attention in the database and Semantic Web communities. Much work has been devoted to proposing different solutions to store RDF data efficiently. This paper focuses on using relational databases and NoSQL (for 'not only SQL (Structured Query Language)') databases to store massive RDF data. A full up-to-date overview of the current state of the art in RDF data storage is provided in the paper.
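The simplest relational layout such surveys cover, a single three-column triple table, can be sketched with SQLite; a SPARQL basic graph pattern of two triple patterns then becomes a self-join. Real systems build on this idea with indexes, dictionary-encoded IRIs, or property tables:

```python
import sqlite3

# Minimal "triple table" layout: every RDF triple is one (s, p, o) row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:alice", "rdf:type", "ex:Person"),
        ("ex:alice", "ex:knows", "ex:bob"),
        ("ex:bob", "rdf:type", "ex:Person"),
    ],
)

# SPARQL pattern { ?x rdf:type ex:Person . ?x ex:knows ?y } as a self-join:
rows = conn.execute(
    """SELECT t1.s FROM triples t1
       JOIN triples t2 ON t1.s = t2.s
       WHERE t1.p = 'rdf:type' AND t1.o = 'ex:Person'
         AND t2.p = 'ex:knows'"""
).fetchall()
print(rows)  # [('ex:alice',)]
```

The self-join per triple pattern is exactly why naive triple tables scale poorly for long graph patterns, which motivates the alternative storage schemes the survey compares.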


2016 ◽  
Vol 15 (1) ◽  
pp. 8
Author(s):  
Made Pradnyana Ambara ◽  
Made Sudarma ◽  
I Nyoman Satya Kumara

Data warehouses in general, commonly known as traditional data warehouses, have several weaknesses that make the resulting data quality neither specific nor effective. A semantic data warehouse system is a solution to the problems of the traditional data warehouse, with advantages that include specific data quality management with a uniform data format to support good OLAP reporting, and more effective information retrieval using natural language keywords. Modeling the semantic data warehouse system with an ontology method produces a Resource Description Framework Schema (RDFS) logic model that is then transformed into a snowflake schema. The required academic reports are generated using Kimball's nine-step method, and semantic search uses a rule-based method. Testing was carried out using two methods: black box testing and a checklist questionnaire. From the results of this study it can be concluded that the semantic data warehouse system can support academic data processing, producing quality reports that support decision making. DOI: 10.24843/MITE.1501.02


2017 ◽  
Vol 1 (2) ◽  
pp. 84-103 ◽  
Author(s):  
Dong Wang ◽  
Lei Zou ◽  
Dongyan Zhao

Abstract The SPARQL Protocol and RDF Query Language (SPARQL) allows users to issue a structural query over a resource description framework (RDF) graph. However, the lack of a spatiotemporal query language limits the usage of RDF data in spatiotemporal-oriented applications. As spatiotemporal information continuously increases in RDF data, it is necessary to design an effective and efficient spatiotemporal RDF data management system. In this paper, we formally define spatiotemporal information-integrated RDF data, introduce a spatiotemporal query language that extends SPARQL with spatiotemporal assertions to query spatiotemporal information-integrated RDF data, and design a novel index and the corresponding query algorithm. The experimental results on a large, real RDF graph integrating spatial and temporal information (> 180 million triples) confirm the superiority of our approach: gst-store outperforms its competitors by more than 20%-30% in most cases.
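A minimal sketch of the temporal half of such an extension, assuming facts stamped with validity intervals and an interval-overlap assertion in the query (the entities, predicates, and data below are invented; the paper's language also covers spatial predicates and uses a dedicated index rather than a linear scan):

```python
# Toy temporal RDF: each fact is (s, p, o, valid_from, valid_to).

def overlaps(a_start, a_end, b_start, b_end):
    """True when intervals [a_start, a_end] and [b_start, b_end] intersect."""
    return a_start <= b_end and b_start <= a_end

facts = [
    ("ex:obama", "ex:presidentOf", "ex:USA", 2009, 2017),
    ("ex:trump", "ex:presidentOf", "ex:USA", 2017, 2021),
]

# "Who held ex:presidentOf ex:USA at some point during 2010-2012?"
answers = [s for s, p, o, t0, t1 in facts
           if p == "ex:presidentOf" and o == "ex:USA"
           and overlaps(t0, t1, 2010, 2012)]
print(answers)  # ['ex:obama']
```

The point of a spatiotemporal index is to answer this kind of assertion without scanning every triple, which is where the 20%-30% gains over competitors come from.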


2021 ◽  
Vol 50 (02) ◽  
Author(s):  
TẠ DUY CÔNG CHIẾN

In recent years, many applications in the semantic web, information retrieval, information extraction, and question answering have applied ontologies. To avoid conceptual and terminological confusion, an ontology is built as a taxonomy that identifies and distinguishes concepts as well as terminology. It accomplishes this by specifying a set of generic concepts that characterizes the domain, together with their definitions and interrelationships. There are several ways to represent ontologies, such as the Resource Description Framework (RDF), the Web Ontology Language (OWL), and databases, depending on the characteristics of the data. RDF and OWL are usually used when the data are objects with simple relationships among them; when the relationships among the objects are more complex, storing ontologies in databases is the better approach. However, relational databases do not sufficiently support semantic-oriented search through the Structured Query Language (SQL), and the search speed is slow. Therefore, this paper introduces an approach to extending query sentences for semantic-oriented search on a knowledge graph.
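One concrete way to extend an SQL query toward semantic-oriented search is a recursive common table expression that closes over the subclass hierarchy, so that a search for a general concept also finds instances of its subconcepts. This is a sketch under assumed table names, not the paper's actual method:

```python
import sqlite3

# Hypothetical relational encoding of a small knowledge graph.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subclass_of (child TEXT, parent TEXT);
CREATE TABLE instance_of (entity TEXT, class TEXT);
INSERT INTO subclass_of VALUES ('Dog', 'Mammal'), ('Mammal', 'Animal');
INSERT INTO instance_of VALUES ('rex', 'Dog'), ('tweety', 'Bird');
""")

# A plain query for class = 'Animal' finds nothing; the recursive CTE
# expands the query to the whole subclass closure of 'Animal'.
rows = conn.execute("""
WITH RECURSIVE sub(c) AS (
    VALUES ('Animal')
    UNION
    SELECT child FROM subclass_of JOIN sub ON parent = c
)
SELECT entity FROM instance_of WHERE class IN (SELECT c FROM sub)
""").fetchall()
print(rows)  # [('rex',)]
```

The query rewrite happens entirely in SQL, which illustrates how "extending query sentences" can add a degree of semantics on top of a relational store without changing the schema.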


2018 ◽  
Vol 3 (1) ◽  
pp. 3 ◽  
Author(s):  
Yongming Wang ◽  
Sharon Q Yang

For the past ten years libraries have been working diligently towards Linked Data and the Semantic Web. Due to the complexity and vast scope of Linked Data, many people have a hard time understanding its technical details and its potential for the library community. This paper aims to help librarians better understand some important concepts by explaining the basic Linked Data technologies, which consist of the Resource Description Framework (RDF), ontologies, and the query language. It also includes an overview of the achievements by libraries around the world in their efforts to turn library data into Linked Data, including those by the Library of Congress, OCLC, and some other national libraries. Some of the challenges and setbacks that libraries have encountered are analyzed and discussed. In spite of the difficulties, there is no way to turn back. Libraries will have to succeed.


2018 ◽  
Vol 6 (1) ◽  
pp. 226-239
Author(s):  
Guidedi Kaladzavi ◽  
Papa Fary Diallo ◽  
Cedric Bere ◽  
Olivier Corby ◽  
Isabelle Mirbel ◽  
...  

Considering the evolution of semantic-wiki-engine-based platforms, two main approaches can be distinguished: Ontologies for Wikis (OfW) and Wikis for Ontologies (WfO). The OfW vision requires existing ontologies to be imported. Most such systems use Resource Description Framework (RDF)-based stores in conjunction with a standard Structured Query Language (SQL) database to manage and query semantic data. However, a relational database is not an ideal type of storage for semantic data. A more natural data model for Semantic MediaWiki (SMW) is RDF, a data format that organizes information in graphs rather than in fixed database tables. This paper presents an ontology-based architecture which aims to implement this idea. The architecture mainly comprises three functional layers: a Web User Interface Layer, a Semantic Layer, and a Persistence Layer.

