Handling qualitative preferences in SPARQL over virtual ontology-based data access

Semantic Web ◽  
2022 ◽  
pp. 1-24
Author(s):  
Marlene Goncalves ◽  
David Chaves-Fraga ◽  
Oscar Corcho

With the increase in data volume in the heterogeneous datasets being published under Open Data initiatives, new operators are needed to help users find the subset of data that best satisfies their preference criteria. Quantitative approaches such as top-k queries may not be the most appropriate, as they require the user to assign weights, which may not be known beforehand, to a scoring function. Under the qualitative approach, which includes the well-known skyline, preference criteria are in certain cases more intuitive and can be expressed more naturally. In this paper, we address the problem of evaluating SPARQL qualitative preference queries in an Ontology-Based Data Access (OBDA) setting, which provides uniform access over multiple heterogeneous data sources. Our main contribution is Morph-Skyline++, a framework for processing SPARQL qualitative preferences by directly querying relational databases. It implements a technique that translates SPARQL qualitative preference queries directly into queries that can be evaluated by a relational database management system. We evaluate our approach over different scenarios, reporting the effects of data distribution, data size, and query complexity on the performance of the proposed technique in comparison with state-of-the-art techniques. The results suggest that execution time can be reduced by up to two orders of magnitude compared with current techniques, scaling up to larger datasets while precisely identifying the result set.
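The skyline operator named in the abstract can be sketched in a few lines of plain Python: a record dominates another if it is at least as good on every criterion and strictly better on at least one, and the skyline is the set of non-dominated records. This is a generic block-nested-loop sketch, not the Morph-Skyline++ algorithm; the hotel data and the lower-is-better convention are illustrative assumptions.

```python
# Minimal block-nested-loop skyline, assuming lower values are preferred
# on every criterion. The example records are invented, not from the paper.

def dominates(a, b):
    """a dominates b if a is <= b on all criteria and < b on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(records):
    result = []
    for r in records:
        if any(dominates(s, r) for s in result):
            continue                                          # r is dominated
        result = [s for s in result if not dominates(r, s)]   # drop points r dominates
        result.append(r)
    return result

# e.g. hotels as (price, distance): cheaper and closer is better
hotels = [(120, 3.0), (80, 5.0), (95, 2.0), (150, 1.0), (80, 5.5)]
print(skyline(hotels))  # [(80, 5.0), (95, 2.0), (150, 1.0)]
```

No record in the result is worse than another on both price and distance, which is exactly the weight-free preference semantics the abstract contrasts with top-k scoring.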

2020 ◽  
Vol 9 (8) ◽  
pp. 474
Author(s):  
Linfang Ding ◽  
Guohui Xiao ◽  
Diego Calvanese ◽  
Liqiu Meng

In a variety of applications relying on geospatial data, getting insights into heterogeneous geodata sources is crucial for decision making, but often challenging, because it typically requires combining information from different sources via data integration techniques and then making sense of the combined data via sophisticated analysis methods. To address this challenge, we draw on two well-established research areas, data integration and geovisual analytics, and propose an ontology-based approach to decouple the challenges of data access and analytics. Our framework consists of two modules centered around an ontology: (1) an ontology-based data integration (OBDI) module, in which mappings specify the relationship between the underlying data and a domain ontology; (2) a geovisual analytics (GeoVA) module, designed for exploring the integrated data by explicitly making use of standard ontologies. In this framework, ontologies play a central role by providing a coherent view over the heterogeneous data and by acting as a mediator for visual analysis tasks. We test our framework in a scenario investigating the spatiotemporal patterns of meteorological and traffic data from several open data sources. Initial studies show that our approach is feasible for the exploration and understanding of heterogeneous geospatial data.
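The core OBDI idea above, mappings that relate ontology terms to the underlying sources so that analytics never touches the raw schemas, can be illustrated with a toy query-unfolding sketch. All table, column, and class names here are hypothetical; real OBDI systems express such mappings in languages like R2RML rather than Python dictionaries.

```python
# Toy illustration of ontology-based data integration: mappings relate
# ontology classes/properties to SQL over the sources, and an ontology-level
# request is unfolded into SQL. All names are invented for illustration.

MAPPINGS = {
    # ontology class -> (source table, {ontology property: source column})
    "WeatherStation": ("met.stations",    {"id": "station_id", "temperature": "temp_c"}),
    "TrafficSensor":  ("traffic.sensors", {"id": "sensor_id",  "flow": "veh_per_h"}),
}

def unfold(cls, properties):
    """Rewrite an ontology-level request into SQL using the mappings."""
    table, cols = MAPPINGS[cls]
    select = ", ".join(f"{cols[p]} AS {p}" for p in properties)
    return f"SELECT {select} FROM {table}"

print(unfold("WeatherStation", ["id", "temperature"]))
# SELECT station_id AS id, temp_c AS temperature FROM met.stations
```

The analytics module only ever mentions ontology terms such as `temperature`; swapping the underlying source changes the mapping, not the analysis code, which is the decoupling the abstract argues for.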


2016 ◽  
Author(s):  
Ninoshka K. Singh ◽  
Darrell O Ricke

Major companies, healthcare professionals, the military, and other scientists and innovators now sense that fitness and health data from wearable biosensors will likely yield new discoveries and insights into the physiological, cognitive, and emotional health status of an individual. The ability to collect, process, and correlate data simultaneously from a set of heterogeneous biosensor sources may be a key factor in informing the development of new technologies for reducing health risks, improving health status, and possibly preventing and predicting disease. The challenge is getting easy access to heterogeneous data from a set of disparate sensors in a single, integrated wearable monitoring system. Oftentimes, the data recorded by commercial biosensing devices are locked within each manufacturer's proprietary platform. Summary data are available for some devices as free downloads or only as part of annual premium memberships, while access to raw measurements is generally unavailable, especially from a custom-developed application that may include prototype biosensors. In this paper, we explore key ideas on how to leverage the design features of Bluetooth Low Energy to ease the integration of disparate biosensors at the sensor communication layer. This component is intended to fit into a larger, multi-layered, open data framework that can provide additional data management and analytics capabilities for consumers and scientists alike at all layers of the data access model typically employed in a body sensor network system.
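One of the Bluetooth Low Energy design features that eases sensor integration is its standardized GATT characteristics: any heart-rate sensor exposing the standard Heart Rate Measurement characteristic (UUID 0x2A37) encodes its value the same way, so one parser serves many vendors. A sketch of that parser follows; per the GATT specification, bit 0 of the flags byte selects an 8-bit or a little-endian 16-bit heart-rate value (further optional fields such as energy expended are omitted here).

```python
# Sketch of parsing the standard Bluetooth GATT Heart Rate Measurement
# characteristic (UUID 0x2A37). Bit 0 of the flags byte selects whether
# the heart-rate value is a uint8 or a little-endian uint16.

def parse_heart_rate(payload: bytes) -> int:
    flags = payload[0]
    if flags & 0x01:                         # 16-bit heart-rate value
        return int.from_bytes(payload[1:3], "little")
    return payload[1]                        # 8-bit heart-rate value

print(parse_heart_rate(bytes([0x00, 72])))          # uint8 format  -> 72
print(parse_heart_rate(bytes([0x01, 0x2C, 0x01])))  # uint16 format -> 300
```

Because the encoding is fixed by the standard rather than by each manufacturer, this kind of parser sits naturally at the sensor communication layer of the framework the abstract describes.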


2021 ◽  
Vol 13 (5) ◽  
pp. 124
Author(s):  
Jiseong Son ◽  
Chul-Su Lim ◽  
Hyoung-Seop Shim ◽  
Ji-Sun Kang

Despite the development of various technologies and systems that use artificial intelligence (AI) to solve disaster-related problems, difficult challenges remain. Data are the foundation for solving diverse disaster problems using AI, big data analysis, and similar methods, so we must focus on these data. Disaster data are domain-specific by disaster type, heterogeneous, and lack interoperability. In particular, open data related to disasters raise several issues: the source and format of the data differ because they are collected by different organizations, and the vocabularies used in each domain are inconsistent. This study proposes a knowledge graph to resolve the heterogeneity among various disaster data and provide interoperability among domains. Among disaster domains, we describe a knowledge graph for flooding disasters built from Korean open datasets and cross-domain knowledge graphs. Furthermore, the proposed knowledge graph is used to help solve and manage disaster problems.
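The interoperability gain a knowledge graph provides can be shown with a minimal triple-store sketch: once flood records and weather observations share graph vocabulary, a cross-domain question becomes a simple join over triples. The triples, prefixes, and term names below are invented for illustration and are not the study's actual Korean datasets or vocabulary.

```python
# Minimal illustration of linking heterogeneous disaster records through a
# shared graph vocabulary. All triples and term names are invented examples.

triples = {
    ("flood:Event_2020_07", "rdf:type",        "dis:FloodEvent"),
    ("flood:Event_2020_07", "dis:occurredIn",  "geo:Seoul"),
    ("weather:Obs_991",     "dis:observedAt",  "geo:Seoul"),
    ("weather:Obs_991",     "dis:rainfallMm",  "310"),
}

def objects(subject, predicate):
    return {o for s, p, o in triples if s == subject and p == predicate}

# Cross-domain join: where did the flood occur, and what rainfall was observed there?
place = next(iter(objects("flood:Event_2020_07", "dis:occurredIn")))
obs = {s for s, p, o in triples if p == "dis:observedAt" and o == place}
rain = {o for s, p, o in triples if s in obs and p == "dis:rainfallMm"}
print(place, rain)  # geo:Seoul {'310'}
```

The flood record and the weather observation come from different "organizations" here, yet the shared place identifier `geo:Seoul` lets one query traverse both, which is the heterogeneity resolution the abstract targets.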


1999 ◽  
Vol 33 (3) ◽  
pp. 55-66 ◽  
Author(s):  
L. Charles Sun

An interactive data access and retrieval system, developed at the U.S. National Oceanographic Data Center (NODC) and available at http://www.nodc.noaa.gov, is presented in this paper. The purposes of this paper are: (1) to illustrate the procedures for quality control and loading of oceanographic data into the NODC ocean databases and (2) to describe the development of a system to manage, visualize, and disseminate the NODC data holdings over the Internet. The objective of the system is to provide easy access to data required by data assimilation models. With advances in the scientific understanding of ocean dynamics, data assimilation models require the synthesis of data from a variety of sources. Modern intelligent data systems usually involve integrating distributed heterogeneous data and information sources. As the repository for oceanographic data, NOAA's National Oceanographic Data Center is in a unique position to develop such a data system. In support of data assimilation needs, NODC has developed a system to facilitate browsing of the oceanographic environmental data and information available online at NODC. Users may select oceanographic data based on geographic areas, time periods, and measured parameters. Once the selection is complete, users may produce a station location plot, produce plots of the parameters, or retrieve the data.
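The selection step described above, filtering station records by geographic area and time period before plotting or retrieval, amounts to a bounding-box and date-range filter. A minimal sketch follows; the station records and field names are invented examples, not NODC's actual data model.

```python
# Sketch of selecting oceanographic station records by geographic bounding
# box and time period. Records and field names are invented examples.
from datetime import date

stations = [
    {"id": "ST01", "lat": 36.5, "lon": -122.0, "time": date(1998, 6, 1), "temp_c": 14.2},
    {"id": "ST02", "lat": 10.0, "lon": -140.0, "time": date(1998, 6, 3), "temp_c": 27.8},
    {"id": "ST03", "lat": 35.9, "lon": -121.4, "time": date(1997, 1, 9), "temp_c": 13.1},
]

def select(records, lat_range, lon_range, start, end):
    return [r for r in records
            if lat_range[0] <= r["lat"] <= lat_range[1]
            and lon_range[0] <= r["lon"] <= lon_range[1]
            and start <= r["time"] <= end]

hits = select(stations, (30, 40), (-125, -120), date(1998, 1, 1), date(1998, 12, 31))
print([r["id"] for r in hits])  # ['ST01']
```

The selected records can then feed a station location plot or parameter plots, mirroring the workflow the abstract outlines.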


Author(s):  
Денис Валерьевич Сикулер

This article reviews ten Internet resources for finding data for various tasks related to machine learning and artificial intelligence. Both widely known sites (such as Kaggle and the Registry of Open Data on AWS) and less popular or highly specialized resources (such as The Big Bad NLP Database and Common Crawl) are covered. All of the resources provide free access to data, and in most cases not even registration is required. For each resource, the characteristics and features concerning searching for and obtaining datasets are described. The following sites are included in the review: Kaggle, Google Research, Microsoft Research Open Data, Registry of Open Data on AWS, Harvard Dataverse Repository, Zenodo, the Open Data portal of the Russian Federation, World Bank, The Big Bad NLP Database, and Common Crawl.


2019 ◽  
pp. 254-277 ◽  
Author(s):  
Ying Zhang ◽  
Chaopeng Li ◽  
Na Chen ◽  
Shaowen Liu ◽  
Liming Du ◽  
...  

Since large amounts of geospatial data are produced by various sources, geospatial data integration is difficult because of the shortage of semantics. Although standardised data formats and data access protocols, such as the Web Feature Service (WFS), enable end-users to access heterogeneous data stored in different formats from various sources, the process is still time-consuming and ineffective due to the lack of semantics. To solve this problem, a prototype for geospatial data integration is proposed that addresses four problems: geospatial data retrieving, modeling, linking, and integrating. Four kinds of geospatial data sources are used to evaluate the performance of the proposed approach. The experimental results illustrate that the proposed linking method achieves high performance in generating matched candidate record pairs in terms of Reduction Ratio (RR), Pairs Completeness (PC), Pairs Quality (PQ), and F-score. The integration results show that each data source gains substantial Complementary Completeness (CC) and Increased Completeness (IC).


2019 ◽  
pp. 230-253
Author(s):  
Ying Zhang ◽  
Chaopeng Li ◽  
Na Chen ◽  
Shaowen Liu ◽  
Liming Du ◽  
...  

Since large amounts of geospatial data are produced by various sources and stored in incompatible formats, geospatial data integration is difficult because of the shortage of semantics. Although standardised data formats and data access protocols, such as the Web Feature Service (WFS), enable end-users to access heterogeneous data stored in different formats from various sources, the process is still time-consuming and ineffective due to the lack of semantics. To solve this problem, a prototype for geospatial data integration is proposed that addresses four problems: geospatial data retrieving, modeling, linking, and integrating. First, we provide a uniform integration paradigm for users to retrieve geospatial data. Then, we align the retrieved geospatial data in the modeling process to eliminate heterogeneity with the help of Karma. Our main contribution focuses on the third problem. Previous work defined a set of semantic rules for performing the linking process. However, geospatial data exhibit specific geospatial relationships that are significant for linking but cannot be handled by Semantic Web techniques directly. We take advantage of these unique features of geospatial data to implement the linking process. In addition, previous work runs into a complicated problem when the geospatial data sources are in different languages. In contrast, our proposed linking algorithms are endowed with a translation function, which saves the cost of translating among geospatial sources in different languages. Finally, the geospatial data are integrated by eliminating data redundancy and combining the complementary properties of the linked records. We adopt four geospatial data sources, namely OpenStreetMap (OSM), Wikimapia, USGS, and EPA, to evaluate the performance of the proposed approach. The experimental results illustrate that the proposed linking method achieves high performance in generating matched candidate record pairs in terms of Reduction Ratio (RR), Pairs Completeness (PC), Pairs Quality (PQ), and F-score. The integration results show that each data source gains substantial Complementary Completeness (CC) and Increased Completeness (IC).
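The evaluation measures named above are standard in the record linkage literature: Reduction Ratio measures how many of all possible pairs the candidate generation pruned away, Pairs Completeness and Pairs Quality are the recall and precision of the candidate set with respect to the true matches, and F-score is their harmonic mean. A sketch of how they are computed, with invented numbers:

```python
# Reduction Ratio (RR), Pairs Completeness (PC), Pairs Quality (PQ), and
# F-score as commonly defined in the record linkage literature.
# The candidate pairs and true matches below are invented examples.

def linkage_metrics(total_pairs, candidates, true_matches):
    found = candidates & true_matches
    rr = 1 - len(candidates) / total_pairs          # fraction of pairs pruned away
    pc = len(found) / len(true_matches)             # recall of the candidate set
    pq = len(found) / len(candidates)               # precision of the candidate set
    f = 2 * pc * pq / (pc + pq) if pc + pq else 0.0
    return rr, pc, pq, f

candidates = {(1, "a"), (2, "b"), (3, "c"), (4, "d")}   # pairs the linker proposed
true_matches = {(1, "a"), (2, "b"), (5, "e")}           # gold-standard matches
print(linkage_metrics(100, candidates, true_matches))
```

With 100 possible pairs, 4 candidates, and 2 of the 3 true matches found, this yields RR = 0.96, PC ≈ 0.67, PQ = 0.5, and F ≈ 0.57: a good linking method keeps RR high without sacrificing PC.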



Author(s):  
Mariana Damova ◽  
Atanas Kiryakov ◽  
Maurice Grinberg ◽  
Michael K. Bergman ◽  
Frédérick Giasson ◽  
...  

The chapter introduces the process of developing two upper-level ontologies, PROTON and UMBEL, into reference ontologies and integrating them into the so-called Reference Knowledge Stack (RKS). It is argued that the RKS is an important step in the efforts of the Linked Open Data (LOD) project to transform the Web into a global data space with diverse real data available for review and analysis. The RKS is intended to make interoperability between published datasets much more efficient than it is now. The approach discussed in the chapter consists of developing reference layers of upper-level ontologies by mapping them to certain LOD schemata and assigning instance data to them so that they cover a reasonable portion of the LOD datasets. The chapter presents the methods (manual and semi-automatic) used in creating the RKS and gives examples that illustrate its advantages for managing highly heterogeneous data and its usefulness in real-life knowledge-intensive applications.


Author(s):  
Roel During ◽  
Marcel Pleijte ◽  
Rosalie I. van Dam ◽  
Irini E. Salverda

Open data and citizen-led initiatives can be both friends and foes. Where official data are available and 'open', they not only encourage increased public participation but can also generate the production and scrutiny of new material, potentially of benefit to the original provider and others, official or otherwise. In this way, official open data can be seen to improve democracy or, more accurately, so-called 'participative democracy'. On the other hand, the public is not always eager to share personal information in the most open ways, even though private and sometimes sensitive information is required to initiate projects of societal benefit in difficult times. Many citizens appear content to channel personal information exchange via social media instead of putting it on public websites: the perceived benefits of sharing and complete openness do not outweigh the disadvantages or fear of regulation. This is caused by various sources of contingency, such as the different appeals made to citizens in the discourses on the participation society and on representative democracy, which call for social openness in the former and privacy protection in the latter. Moreover, the discourse on open data is an economic argument fighting the rules of privacy rather than a promotion of open data as a prerequisite for social action. Civil servants acknowledge that access to open data via all sorts of apps could contribute to the mushrooming of public initiatives, but they are reluctant to release person-related sensitive information. The authors describe and discuss this dilemma in the context of recent case studies from the Netherlands concerning governmental programmes on open data and citizens' initiatives, highlighting the governance constraints and uncertainties as well as citizens' concerns about data access and data sharing.
It is shown that openness has a different meaning and understanding in the participation society and in representative democracy, i.e. the tension between sharing private social information and transparency. Looking at openness from both sides reveals a double contingency: understandings of and intentions about this openness invoke mutually reinforcing uncertainties, which hamper citizens' eagerness to participate. The chapter concludes with a practical recommendation for improving data governance.

