RMLx: Mapping Interface for Integrating Open Data with Linked Data Exploration Environment

2017 ◽  
Author(s):  
Peb Ruswono Aryan ◽  
Fajar Juang Ekaputra ◽  
Kabul Kurniawan ◽  
Elmar Kiesling ◽  
A Min Tjoa

Recent advances in linked data generation through mapping languages such as RML (RDF Mapping Language) allow large-scale RDF data to be provided in a more automatic way. However, a considerable amount of data in open data portals remains inaccessible as linked data. This is because data portals host a large number of small datasets, which makes writing mapping descriptions tedious and error-prone. Moreover, these data sources require additional preprocessing before they can be mapped. To solve this challenge, we introduce extensions to RML to support the required tasks and develop RMLx, a visual web interface for creating RML mappings. Using this interface, the process of creating mapping descriptions can become faster and less error-prone. Furthermore, the process of linked data generation can be wrapped to enable integration with other data in a linked data exploration environment. We explore four different use cases to identify the requirements and then describe how these are met.
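The mapping idea in this abstract can be illustrated with a minimal sketch. It is not RML or RMLx itself: the `MAPPING` dictionary, the `example.org` IRIs, and the sample CSV are all assumptions made for illustration, loosely modelled on a subject template plus column-to-predicate pairs. The whitespace trimming stands in for the preprocessing step the abstract mentions.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical mapping, loosely modelled on an RML triples map:
# a subject IRI template plus column -> predicate pairs.
MAPPING = {
    "subject_template": "http://example.org/station/{id}",
    "predicates": {
        "name": "http://www.w3.org/2000/01/rdf-schema#label",
        "city": "http://example.org/vocab/city",
    },
}

def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))

def csv_to_ntriples(csv_text, mapping):
    """Yield one N-Triples line per non-empty mapped cell."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Preprocessing: trim stray whitespace around cell values.
        row = {k: (v or "").strip() for k, v in row.items()}
        # Percent-encode values before placing them in the IRI template.
        subject = mapping["subject_template"].format(
            **{k: quote(v, safe="") for k, v in row.items()})
        for column, predicate in mapping["predicates"].items():
            if row.get(column):
                literal = escape_literal(row[column])
                yield f'<{subject}> <{predicate}> "{literal}" .'

# Illustrative input only; the second row has an empty "city" cell.
SAMPLE = "id,name,city\n1, Wien Hbf ,Vienna\n2,Graz Hbf,\n"
triples = list(csv_to_ntriples(SAMPLE, MAPPING))
for line in triples:
    print(line)
```

Keeping the mapping as data rather than code mirrors the declarative style of mapping languages: the same function can serve many small datasets by swapping the mapping alone.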

Heritage ◽  
2020 ◽  
Vol 3 (3) ◽  
pp. 875-890
Author(s):  
Ikrom Nishanbaev ◽  
Erik Champion ◽  
David A. McMeekin

Recently, many Resource Description Framework (RDF) data generation tools have been developed to convert geospatial and non-geospatial data into RDF data. Furthermore, there are several interlinking frameworks that find semantically equivalent geospatial resources in related RDF data sources. However, many existing Linked Open Data sources are currently sparsely interlinked. Also, many RDF generation and interlinking frameworks require a solid knowledge of Semantic Web and Geospatial Semantic Web concepts to successfully deploy them. This article comparatively evaluates features and functionality of the current state-of-the-art geospatial RDF generation tools and interlinking frameworks. This evaluation is specifically performed for cultural heritage researchers and professionals who have limited expertise in computer programming. Hence, a set of criteria has been defined to facilitate the selection of tools and frameworks. In addition, the article provides a methodology to generate geospatial cultural heritage RDF data and to interlink it with the related RDF data. This methodology uses a CIDOC Conceptual Reference Model (CRM) ontology and interlinks the RDF data with DBpedia. Although this methodology has been developed for cultural heritage researchers and professionals, it may also be used by other domain professionals.


Epidemiologia ◽  
2021 ◽  
Vol 2 (3) ◽  
pp. 315-324
Author(s):  
Juan M. Banda ◽  
Ramya Tekumalla ◽  
Guanyu Wang ◽  
Jingyuan Yu ◽  
Tuo Liu ◽  
...  

As the COVID-19 pandemic continues to spread worldwide, an unprecedented amount of open data is being generated for medical, genetics, and epidemiological research. The unparalleled rate at which many research groups around the world are releasing data and publications on the ongoing pandemic is allowing other scientists to learn from local experiences and data generated on the front lines of the COVID-19 pandemic. However, there is a need to integrate additional data sources that map and measure the role of social dynamics of such a unique worldwide event in biomedical, biological, and epidemiological analyses. For this purpose, we present a large-scale curated dataset of over 1.12 billion tweets, growing daily, related to COVID-19 chatter generated from 1 January 2020 to 27 June 2021 at the time of writing. This data source provides a freely available additional data source for researchers worldwide to conduct a wide and diverse number of research projects, such as epidemiological analyses, emotional and mental responses to social distancing measures, the identification of sources of misinformation, stratified measurement of sentiment towards the pandemic in near real time, among many others.


2017 ◽  
Vol 44 (2) ◽  
pp. 203-229 ◽  
Author(s):  
Javier D Fernández ◽  
Miguel A Martínez-Prieto ◽  
Pablo de la Fuente Redondo ◽  
Claudio Gutiérrez

The publication of semantic web data, commonly represented in Resource Description Framework (RDF), has experienced outstanding growth over the last few years. Data from all fields of knowledge are shared publicly and interconnected in active initiatives such as Linked Open Data. However, despite the increasing availability of applications managing large-scale RDF information such as RDF stores and reasoning tools, little attention has been given to the structural features emerging in real-world RDF data. Our work addresses this issue by proposing specific metrics to characterise RDF data. We specifically focus on revealing the redundancy of each data set, as well as common structural patterns. We evaluate the proposed metrics on several data sets, which cover a wide range of designs and models. Our findings provide a basis for more efficient RDF data structures, indexes and compressors.


2018 ◽  
Vol 186 ◽  
pp. 12013 ◽  
Author(s):  
Luisa Schiavone ◽  
Federico Morando ◽  

CoBiS is a network of 65 libraries. The project is a pilot for Piedmont that aims to provide the Committee with an infrastructure for LOD publishing by creating a triplification pipeline designed to be easy to automate and replicate. This is being realized with open source technologies, such as the RML mapping language or the JARQL tool, which uses Linked Data to describe the conversion of XML, JSON or tabular data into RDF. The first challenge was to make heterogeneous data sources interoperable: they come from four different library software systems (Clavis, Erasmo, SBNWeb and BIBLIOWin 5.0web) and contain different types of data (bibliographic, multimedia, and archival). The information contained in the catalogs is progressively interlinked with external data sources, such as Wikidata, VIAF, LoC and BNF authority files, Wikipedia and the Dizionario Biografico degli Italiani. The partners of the CoBiS LOD Project are the National Institute for Astrophysics (INAF), the Turin Academy of Sciences, the Olivetti Historical Archives Association, the Alpine Club National Library, the Deputazione Subalpina di Storia Patria, and the National Institute for Metrological Research (INRIM). The technical realization of the project is entrusted to Synapta, and it is partially sponsored by the Piedmont Region.


Author(s):  
Bonnie MacKellar ◽  
Christina Schweikert ◽  
Soon Ae Chun

Patients often want to participate in relevant clinical trials for new or more effective alternative treatments. The clinical search system made available by the NIH is a step forward in supporting the patient's decision making, but it is difficult to use and requires the patient to sift through lengthy text descriptions for relevant information. In addition, patients deciding whether to pursue a given trial often want more information, such as drug information. The authors' overall aim is to develop an intelligent patient-centered clinical trial decision support system. Their approach is to integrate Open Data sources related to clinical trials using the Semantic Web's Linked Data framework. The linked data representation, in terms of RDF triples, allows the development of a clinical trial knowledge base that includes entities from different open data sources and relationships among entities. The authors consider Open Data sources such as clinical trials provided by NIH as well as the drug side effects dataset SIDER. The authors use UMLS (Unified Medical Language System) to provide consistent semantics and ontological knowledge for clinical trial related entities and terms. The authors' semantic approach is a step toward a cognitive system that not only provides patient-centered integrated data search but also allows automated reasoning in search, analysis and decision making using the semantic relationships embedded in the Linked Data. The authors present the development of their integrated clinical trial knowledge base and a prototype patient-centered Clinical Trial Decision Support System. The prototype includes semantic search and query with reasoning ability, as well as semantic-link browsing, in which exploring one concept leads easily to other concepts via links and can provide visual search for end users.


Semantic Web ◽  
2020 ◽  
pp. 1-25
Author(s):  
Andre Gomes Regino ◽  
Julio Cesar dos Reis ◽  
Rodrigo Bonacin ◽  
Ahsan Morshed ◽  
Timos Sellis

RDF data has been extensively deployed to describe various types of resources in a structured way. Links between data elements described by RDF models are the core of the Semantic Web. The rising amount of structured data published in public RDF repositories, also known as Linked Open Data, demonstrates the success of the global and unified dataset proposed by the vision of the Semantic Web. Nowadays, semi-automatic algorithms build connections among these datasets by exploring a variety of methods. Interconnected open data demands automatic methods and tools to maintain its consistency over time. Updating linked data is considered a key process because such structured datasets evolve. However, data-changing operations might affect well-formed links, which makes it difficult to maintain the consistency of connections over time. In this article, we present a thorough survey that provides a systematic review of the state of the art in link maintenance in linked open data evolution scenarios. We conduct a detailed analysis of the literature to characterise and understand the methods and algorithms responsible for detecting, fixing and updating links between RDF data. Our investigation provides a categorisation of existing approaches and describes and discusses existing studies. The results reveal an absence of comprehensive solutions suited to fully detect, warn about and automatically maintain the consistency of linked data over time.
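To make the detection part of link maintenance concrete, here is a minimal sketch. It is not an approach taken from the survey: it assumes links are plain (source, target) IRI pairs and that a snapshot of the target dataset's resources after an update is available as a set. The function name `split_links` and all IRIs are hypothetical.

```python
def split_links(links, current_targets):
    """Separate links whose target still exists from dangling ones.

    links: iterable of (source_iri, target_iri) pairs.
    current_targets: set of IRIs present in the target dataset
    after its latest update.
    """
    valid, broken = [], []
    for source, target in links:
        bucket = valid if target in current_targets else broken
        bucket.append((source, target))
    return valid, broken

# Illustrative data: resource b/11 was removed by an update.
links = [
    ("http://example.org/a/1", "http://example.org/b/10"),
    ("http://example.org/a/2", "http://example.org/b/11"),
]
targets_after_update = {"http://example.org/b/10"}

valid, broken = split_links(links, targets_after_update)
print("valid:", valid)
print("broken:", broken)
```

This covers only the simplest case, a removed target. Fixing or updating a link, for example after a resource is renamed or merged, needs additional information about what changed.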


2021 ◽  
Vol 9 ◽  
Author(s):  
Kirti Sundar Sahu ◽  
Shannon E. Majowicz ◽  
Joel A. Dubin ◽  
Plinio Pelegrini Morita

Recent advances in technology have led to the rise of new-age data sources (e.g., Internet of Things (IoT), wearables, social media, and mobile health). IoT is becoming ubiquitous, and data generation is accelerating globally. Other health research domains have used IoT as a data source, but its potential has not been thoroughly explored and utilized systematically in public health surveillance. This article summarizes the existing literature on the use of IoT as a data source for surveillance. It presents the shortcomings of current data sources and how NextGen data sources, including the large-scale applications of IoT, can meet the needs of surveillance. The opportunities and challenges of using these modern data sources in public health surveillance are also explored. These IoT data ecosystems are being generated with minimal effort by the device users and benefit from high granularity, objectivity, and validity. Advances in computing are now bringing IoT-based surveillance into the realm of possibility. The potential advantages of IoT data include high-frequency, high volume, zero effort data collection methods, with a potential to have syndromic surveillance. In contrast, the critical challenges to mainstream this data source within surveillance systems are the huge volume and variety of data, fusing data from multiple devices to produce a unified result, and the lack of multidisciplinary professionals to understand the domain and analyze the domain data accordingly.


Author(s):  
D. Ulutaş Karakol ◽  
G. Kara ◽  
C. Yılmaz ◽  
Ç. Cömert

Abstract. Large amounts of spatial data are held in relational databases. Spatial data in relational databases must be converted to RDF for semantic web applications. Spatial data is an important key factor for creating spatial RDF data. Linked Data is the way users most prefer to publish and share data from relational databases on the Web. To define the semantics of the data, links are provided to vocabularies (ontologies or other external web resources) that are common conceptualizations for a domain. Linking the data of a resource vocabulary with globally published concepts of domain resources combines different data sources and datasets; makes data more understandable, discoverable and usable; improves data interoperability and integration; enables automatic reasoning; and prevents data duplication. The need to convert relational data to RDF is becoming apparent owing to the semantic expressiveness of Semantic Web technologies. Ontologies are one of the key factors of the Semantic Web. An ontology is an “explicit specification of a conceptualization”. The semantics of spatial data relies on ontologies. Linking spatial data from relational databases to web data sources, so that machine-readable interlinked data can be shared on the Web, is not an easy task. Tim Berners-Lee, the inventor of the World Wide Web and an advocate of the Semantic Web and Linked Data, laid down the Linked Data design principles. Based on these rules, firstly, spatial data in relational databases must be converted to RDF with the use of supporting tools. Secondly, spatial RDF data must be linked to upper-level domain ontologies and related web data sources. Thirdly, external data sources (ontologies and web data sources) must be determined and spatial RDF data must be linked to the related data sources. Finally, spatial linked data must be published on the web.
The main contribution of this study is to determine the requirements for finding RDF links and to identify the deficiencies in creating or publishing linked spatial data. To achieve this objective, this study examines existing approaches, conversion tools and web data sources for converting relational data to spatial RDF. In this paper, we investigate the current state of spatial RDF data, standards, open source platforms (particularly D2RQ, Geometry2RDF, TripleGeo, GeoTriples, Ontop, etc.) and web data sources. Moreover, we describe the process of converting spatial data to RDF and linking it to web data sources. The linking of spatial RDF data to web data sources is demonstrated with an example use case: road data has been linked to one of the popular related web data sources, DBpedia. Silk, a tool for discovering relationships between data items within different Linked Data sources, is used as the link discovery framework. Link discovery tools such as LIMES and Silk were also evaluated, and their results were compared on the matching/linking task. As a result, linked road data is shared and represented as an information resource on the web and enriched with definitions from different related resources. In this way, road datasets are also linked through the related classes, individuals, spatial relations and properties they cover, such as construction date, road length and coordinates.
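The link discovery step can be sketched in a few lines. This is not how Silk or LIMES work internally and does not use their APIs: it assumes each resource has a single text label and uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure. The function `discover_links`, the threshold, and all IRIs and labels are hypothetical.

```python
from difflib import SequenceMatcher

def discover_links(local, remote, threshold=0.85):
    """Return (local_iri, remote_iri, score) for each local resource whose
    best-matching remote label reaches the similarity threshold."""
    links = []
    for local_iri, local_label in local.items():
        best_iri, best_score = None, 0.0
        for remote_iri, remote_label in remote.items():
            # Case-insensitive similarity ratio in [0, 1].
            score = SequenceMatcher(
                None, local_label.lower(), remote_label.lower()).ratio()
            if score > best_score:
                best_iri, best_score = remote_iri, score
        if best_iri is not None and best_score >= threshold:
            links.append((local_iri, best_iri, round(best_score, 2)))
    return links

# Illustrative labels only; the IRIs are placeholders, not real resources.
roads = {"http://example.org/road/1": "Black Sea Coastal Road"}
external = {
    "http://example.org/external/A": "Black Sea coastal road",
    "http://example.org/external/B": "Silk Road",
}

for local_iri, remote_iri, score in discover_links(roads, external):
    print(f"<{local_iri}> owl:sameAs <{remote_iri}>  # similarity {score}")
```

For spatial data, a real linking rule would typically combine label similarity with geometric criteria such as distance between coordinates. The label-only comparison here is a simplification.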

