J-CO: A Platform-Independent Framework for Managing Geo-Referenced JSON Data Sets

Giuseppe Psaila; Paolo Fosci

doi:10.3390/electronics10050621

Electronics ◽

10.3390/electronics10050621 ◽

2021 ◽

Vol 10 (5) ◽

pp. 621

Author(s):

Giuseppe Psaila ◽

Paolo Fosci

Keyword(s):

Query Language ◽

Open Data ◽

Internet Technology ◽

Data Sets ◽

Specific Storage ◽

Current State ◽

Execution Engine ◽

Share Data ◽

Cloud Servers ◽

Computational Resources

Internet and mobile technologies have enabled the production and diffusion of massive data sets concerning almost every aspect of everyday life. Remarkable examples are social media and apps for volunteered information production, as well as Open Data portals on which public administrations publish authoritative and (often) geo-referenced data sets. In this context, JSON has become the most popular standard for representing and exchanging possibly geo-referenced data sets over the Internet. Analysts wishing to manage, integrate and cross-analyze such data sets need a framework that allows them to access possibly remote storage systems for JSON data sets and to retrieve and query data sets by means of a single query language (independent of the specific storage technology), exploiting possibly remote computational resources (such as cloud servers) while working comfortably on their PC in their office, more or less unaware of the real location of resources. In this paper, we present the current state of the J-CO Framework, a platform-independent and analyst-oriented software framework to manipulate and cross-analyze possibly geo-tagged JSON data sets. The paper presents the general approach behind the J-CO Framework and illustrates the query language by means of a simple, yet non-trivial, example of geographical cross-analysis. The paper also presents the novel features introduced by the re-engineered version of the execution engine and the most recent components, i.e., the storage service for large single JSON documents and the user interface that allows analysts to comfortably share data sets and computational resources with other analysts possibly working in different places around the globe. Finally, the paper reports the results of an experimental campaign, which show that the execution engine performs more than satisfactorily, indicating that the framework can actually be used by analysts to process JSON data sets.

Towards Flexible Retrieval, Integration and Analysis of JSON Data Sets through Fuzzy Sets: A Case Study

Information ◽

10.3390/info12070258 ◽

2021 ◽

Vol 12 (7) ◽

pp. 258

Author(s):

Paolo Fosci ◽

Giuseppe Psaila

Keyword(s):

Fuzzy Sets ◽

Query Language ◽

Traditional Approach ◽

Open Data ◽

Real Data ◽

Data Sets ◽

Practical Case ◽

Innovative Capabilities ◽

Potential Applications

How to exploit the incredible variety of JSON data sets currently available on the Internet, for example, on Open Data portals? The traditional approach would require getting them from the portals, then storing them into some JSON document store and integrating them within the document store. However, once data are integrated, the lack of a query language that provides flexible querying capabilities could prevent analysts from successfully completing their analysis. In this paper, we show how the J-CO Framework, a novel framework that we developed at the University of Bergamo (Italy) to manage large collections of JSON documents, is a unique and innovative tool that provides analysts with querying capabilities based on fuzzy sets over JSON data sets. Its query language, called J-CO-QL, is continuously evolving to increase potential applications; the most recent extensions give analysts the capability to retrieve data sets directly from web portals as well as constructs to apply fuzzy set theory to JSON documents and to provide analysts with the capability to perform imprecise queries on documents by means of flexible soft conditions. This paper presents a practical case study in which real data sets are retrieved, integrated and analyzed to effectively show the unique and innovative capabilities of the J-CO Framework.
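
The idea of a flexible soft condition over JSON documents, as described in the abstract above, can be illustrated with a minimal Python sketch. This is not J-CO-QL syntax: the field names (`temperature`, `humidity`), the linear membership functions and their breakpoints, and the threshold are all assumptions made up for illustration. The fuzzy AND is taken as the minimum of the two membership degrees, which is one common choice.

```python
import json

def ramp_up(x, lo, hi):
    """Membership degree rising linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def ramp_down(x, lo, hi):
    """Membership degree falling linearly from 1 at lo to 0 at hi."""
    return 1.0 - ramp_up(x, lo, hi)

def soft_select(docs, threshold):
    """Keep documents whose degree of being 'hot AND dry' reaches the threshold."""
    selected = []
    for doc in docs:
        hot = ramp_up(doc["temperature"], 20.0, 30.0)   # hypothetical breakpoints
        dry = ramp_down(doc["humidity"], 30.0, 70.0)    # hypothetical breakpoints
        degree = min(hot, dry)  # fuzzy AND modeled as the minimum
        if degree >= threshold:
            selected.append({**doc, "membership": degree})
    return selected

# Hypothetical JSON data set of sensor readings.
raw = """[
  {"id": 1, "temperature": 32, "humidity": 25},
  {"id": 2, "temperature": 25, "humidity": 50},
  {"id": 3, "temperature": 18, "humidity": 20}
]"""
docs = json.loads(raw)
result = soft_select(docs, 0.5)
print([(d["id"], d["membership"]) for d in result])
```

Unlike a crisp filter such as `temperature > 30`, each selected document carries a degree of membership, so results can be ranked or thresholded differently without rewriting the condition.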

Smarter Open Government Data for Society 5.0: Are Your Open Data Smart Enough?

Sensors ◽

10.3390/s21155204 ◽

2021 ◽

Vol 21 (15) ◽

pp. 5204

Author(s):

Anastasija Nikiforova

Keyword(s):

Industry 4.0 ◽

Economic Value ◽

Open Data ◽

Digital Data ◽

Open Government ◽

Data Sets ◽

Time Data ◽

Open Government Data ◽

Information And Communication ◽

Government Data

Nowadays, governments launch open government data (OGD) portals that provide data that can be accessed and used by everyone for their own needs. Although the potential economic value of open (government) data is assessed in millions and billions, not all open data are reused. Moreover, the open (government) data initiative as well as users’ intent for open (government) data are changing continuously, and today, in line with IoT and smart city trends, real-time data and sensor-generated data are of greater interest to users. These “smarter” open (government) data are also considered to be one of the crucial drivers for the sustainable economy, and might have an impact on information and communication technology (ICT) innovation and become a creativity bridge in developing a new ecosystem in Industry 4.0 and Society 5.0. The paper inspects OGD portals of 60 countries in order to understand the correspondence of their content to the Society 5.0 expectations. The paper provides a report on the extent to which countries provide these data, focusing on some open (government) data success facilitating factors for both the portal in general and data sets of interest in particular. The presence of “smarter” data, their level of accessibility, availability, currency and timeliness, as well as support for users, are analyzed. A list of the most competitive countries by data category is provided. This makes it possible to understand which OGD portals respond to users’ needs and to Industry 4.0 and Society 5.0 requests by opening and updating data for further potential reuse, which is essential in the digital data-driven world.

Mapping Public Urban Green Spaces Based on OpenStreetMap and Sentinel-2 Imagery Using Belief Functions

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10040251 ◽

2021 ◽

Vol 10 (4) ◽

pp. 251

Author(s):

Christina Ludwig ◽

Robert Hecht ◽

Sven Lautenbach ◽

Martin Schorcht ◽

Alexander Zipf

Keyword(s):

Vegetation Index ◽

Normalized Difference Vegetation Index ◽

Open Data ◽

Green Spaces ◽

Data Sets ◽

Urban Green ◽

Urban Green Spaces ◽

Urban Quality Of Life ◽

Shafer Theory ◽

Sentinel 2

Public urban green spaces are important for the urban quality of life. Still, comprehensive open data sets on urban green spaces are not available for most cities. Because Sentinel-2 satellite imagery and OpenStreetMap (OSM) data are open and globally available, their potential for urban green space mapping is high, but it is limited by their respective uncertainties. Sentinel-2 imagery cannot distinguish public from private green spaces, and its spatial resolution of 10 m fails to capture fine-grained urban structures, while in OSM green spaces are not mapped consistently or with the same level of completeness everywhere. To address these limitations, we propose to fuse these data sets under explicit consideration of their uncertainties. The Sentinel-2 derived Normalized Difference Vegetation Index was fused with OSM data using the Dempster–Shafer theory to enhance the detection of small vegetated areas. The distinction between public and private green spaces was achieved using a Bayesian hierarchical model and OSM data. The analysis was performed based on land use parcels derived from OSM data and tested for the city of Dresden, Germany. The overall accuracy of the final map of public urban green spaces was 95% and was mainly influenced by the uncertainty of the public accessibility model.
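
The two building blocks named in the abstract above, the Normalized Difference Vegetation Index and Dempster–Shafer fusion, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-hypothesis frame (`green`, `other`, plus `either` for ignorance) and the mass values are invented here, and the abstract does not say how the authors derive masses from NDVI or OSM tags.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from near-infrared and red reflectance."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total

def combine(m1, m2):
    """Dempster's rule of combination on the frame {green, other}.

    Each mass function assigns belief to 'green', 'other', and 'either'
    (the whole frame, i.e. ignorance); the three values sum to 1.
    """
    conflict = m1["green"] * m2["other"] + m1["other"] * m2["green"]
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    norm = 1.0 - conflict
    return {
        "green": (m1["green"] * m2["green"] + m1["green"] * m2["either"]
                  + m1["either"] * m2["green"]) / norm,
        "other": (m1["other"] * m2["other"] + m1["other"] * m2["either"]
                  + m1["either"] * m2["other"]) / norm,
        "either": (m1["either"] * m2["either"]) / norm,
    }

# Hypothetical evidence for one parcel: one mass function per data source.
mass_from_ndvi = {"green": 0.6, "other": 0.1, "either": 0.3}
mass_from_osm = {"green": 0.5, "other": 0.2, "either": 0.3}

fused = combine(mass_from_ndvi, mass_from_osm)
print(round(ndvi(0.5, 0.1), 3), {k: round(v, 3) for k, v in fused.items()})
```

When both sources lean towards `green`, the fused belief in `green` exceeds either input and the mass left on `either` shrinks, which is how combining two uncertain sources can sharpen the classification.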

A Hybrid Approach Combining R*-Tree and k-d Trees to Improve Linked Open Data Query Performance

Applied Sciences ◽

10.3390/app11052405 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2405

Author(s):

Yuxiang Sun ◽

Tianyi Zhao ◽

Seulgi Yoon ◽

Yongju Lee

Keyword(s):

Flash Memory ◽

Query Language ◽

Hybrid Approach ◽

Open Data ◽

Main Memory ◽

Linked Open Data ◽

Index Structure ◽

Identification Algorithm ◽

Distributed Computing Systems ◽

Query Performance

The Semantic Web has recently gained traction with the use of Linked Open Data (LOD) on the Web. Although numerous state-of-the-art methodologies, standards, and technologies are applicable to the LOD cloud, many issues persist. Because the LOD cloud is based on graph-based Resource Description Framework (RDF) triples and the SPARQL query language, we cannot directly adopt traditional techniques employed for database management systems or distributed computing systems. This paper addresses how the LOD cloud can be efficiently organized, retrieved, and evaluated. We propose a novel hybrid approach that combines the index and live exploration approaches for improved LOD join query performance. Using a two-step index structure combining a disk-based 3D R*-tree with the extended multidimensional histogram and flash memory-based k-d trees, we can efficiently discover interlinked data distributed across multiple resources. Because this method rapidly prunes numerous false hits, the performance of join query processing is remarkably improved. We also propose a hot-cold segment identification algorithm to identify regions of high interest. The proposed method is compared with existing popular methods on real RDF datasets. Results indicate that our method outperforms the existing methods because it can quickly obtain target results by reducing unnecessary data scanning and reducing the amount of main memory required to load filtering results.
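
To show how a k-d tree prunes candidates during a range query, the kind of pruning the abstract above relies on, here is a generic in-memory Python sketch. It is not the authors' flash-memory-based structure and says nothing about how RDF triples are mapped to coordinates; the 2D points and the query box are hypothetical.

```python
def build(points, depth=0):
    """Build a k-d tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return {
        "point": ordered[mid],
        "axis": axis,
        "left": build(ordered[:mid], depth + 1),       # coordinates <= split value
        "right": build(ordered[mid + 1:], depth + 1),  # coordinates >= split value
    }

def range_search(node, lo, hi, found=None):
    """Collect all points p with lo[i] <= p[i] <= hi[i] on every axis."""
    if found is None:
        found = []
    if node is None:
        return found
    point, axis = node["point"], node["axis"]
    if all(l <= c <= h for l, c, h in zip(lo, point, hi)):
        found.append(point)
    # Descend only into subtrees that can still intersect the query box.
    if lo[axis] <= point[axis]:
        range_search(node["left"], lo, hi, found)
    if point[axis] <= hi[axis]:
        range_search(node["right"], lo, hi, found)
    return found

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(points)
hits = sorted(range_search(tree, lo=(3, 1), hi=(8, 5)))
print(hits)
```

The two comparisons against the split value are where the pruning happens: a subtree lying entirely outside the query box on the current axis is never visited.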

The current state and future of internet technology-based hypertension management in Japan

Hypertension Research ◽

10.1038/s41440-020-00591-0 ◽

2020 ◽

Author(s):

Junichi Yatabe ◽

Midori Sasaki Yatabe ◽

Atsuhiro Ichihara

Keyword(s):

Internet Technology ◽

Hypertension Management ◽

Current State

Enhancing transparency through open government data: the case of data portals and their features and capabilities

Online Information Review ◽

10.1108/oir-05-2020-0204 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Martin Lněnička ◽

Renata Machova ◽

Jolana Volejníková ◽

Veronika Linhartová ◽

Radka Knezackova ◽

...

Keyword(s):

Open Data ◽

Open Government ◽

Data Sets ◽

Web Content ◽

Content Type ◽

Domain Experts ◽

Computer Mediated ◽

Open Government Data ◽

Decision Making Processes ◽

Government Data

Purpose: The purpose of this paper was to draw on evidence from computer-mediated transparency and examine the argument that open government data and national data infrastructures represented by open data portals can help in enhancing transparency by providing various relevant features and capabilities for stakeholders' interactions. Design/methodology/approach: The developed methodology consisted of a two-step strategy to investigate research questions. First, a web content analysis was conducted to identify the most common features and capabilities provided by existing national open data portals. The second step involved performing the Delphi process by surveying domain experts to measure the diversity of their opinions on this topic. Findings: Identified features and capabilities were classified into categories and ranked according to their importance. By formalizing these feature-related transparency mechanisms through which stakeholders work with data sets, we provided recommendations on how to incorporate them into designing and developing open data portals. Social implications: The creation of appropriate open data portals aims to fulfil the principles of open government and enables stakeholders to effectively engage in the policy and decision-making processes. Originality/value: By analyzing existing national open data portals and validating the feature-related transparency mechanisms, this paper fills this gap in existing literature on designing and developing open data portals for transparency efforts.

Asymmetric Open Government Data (OGD) framework in India

Digital Policy Regulation and Governance ◽

10.1108/dprg-11-2017-0059 ◽

2018 ◽

Vol 20 (5) ◽

pp. 434-448 ◽

Cited By ~ 3

Author(s):

Stuti Saxena

Keyword(s):

Economic Value ◽

Open Data ◽

Developed Countries ◽

Open Government ◽

Data Sets ◽

Indian States ◽

Content Type ◽

Open Government Data ◽

Government Data ◽

The Government

Purpose: With the ongoing drive towards Open Government Data (OGD) initiatives across the globe, governments have been keen on pursuing their OGD policies to ensure transparency, collaboration and efficiency in administration. As a developing country, India has recently adopted the OGD policy (www.data.gov.in); however, the percolation of this policy in the States has remained slow. This paper aims to highlight the “asymmetry” in the OGD framework as far as the Indian States are concerned. In addition, the study assesses the contribution of “Open Citizens” in furthering the OGD initiatives of the country. Design/methodology/approach: The study follows an exploratory qualitative case study approach based on documentary analysis, drawing evidentiary support from five Indian States (Uttar Pradesh, Telangana, West Bengal, Sikkim and Gujarat) to assess the nature and scope of the OGD framework. Further, a conceptualization of the “Open Citizen” framework is provided to emphasize the need for aware, informed and pro-active citizens to spearhead the OGD initiatives in the country. Findings: While the National OGD portal has a substantial number of data sets across different sectors, the States are lagging behind in the adoption and implementation of OGD policies, and while Telangana and Sikkim have been the frontrunners in adopting OGD policies in a rudimentary manner, others are yet to catch up with them. Further, there is “asymmetry” in terms of the individual contribution of the government bodies to the open data sets, where some government bodies are more reluctant to share their data sets than others. Practical implications: The study concludes that governments need to institutionalize the OGD framework in the country, and all the States should appreciate the requirement of adopting a robust OGD policy for furthering transparency, collaboration and efficiency in administration.
Social implications: As “Open Citizens”, citizens ought to be pro-active and contribute towards the open data sets, which would go a long way in deriving social and economic value out of these data sets. Originality/value: While there are many studies on OGD in the West, studies focused on developing countries are starkly lacking. This study plugs this gap by attempting a comparative analysis of the OGD frameworks across Indian States. In addition, the study provides a conceptualization of “Open Citizen” (OC), which may be tapped for further research in developing and developed countries to ascertain the linkage between OGD and OC.

SD-UNet: Stripping down U-Net for Segmentation of Biomedical Images on Platforms with Low Computational Budgets

Diagnostics ◽

10.3390/diagnostics10020110 ◽

2020 ◽

Vol 10 (2) ◽

pp. 110 ◽

Cited By ~ 4

Author(s):

Pius Kwao Gadosey ◽

Yujian Li ◽

Enock Adjei Agyekum ◽

Ting Zhang ◽

Zhaoying Liu ◽

...

Keyword(s):

Neural Network ◽

Network Architecture ◽

Tumor Segmentation ◽

Biomedical Data ◽

Electron Microscopic ◽

Brain Tumor Segmentation ◽

Current State ◽

Expected Performance ◽

Computational Resources ◽

Medical Segmentation

In image segmentation tasks in computer vision, achieving high accuracy while requiring fewer computations and faster inference is a big challenge. This is especially important in medical imaging tasks, but one metric is usually compromised for the other. To address this problem, this paper presents an extremely fast, small and computationally effective deep neural network called Stripped-Down UNet (SD-UNet), designed for the segmentation of biomedical data on devices with limited computational resources. By making use of depthwise separable convolutions in the entire network, we design a lightweight deep convolutional neural network architecture inspired by the widely adopted U-Net model. To compensate for the expected performance degradation in the process, we introduce a weight standardization algorithm with the group normalization method. We demonstrate that SD-UNet has three major advantages: (i) smaller model size (23x smaller than U-Net); (ii) 8x fewer parameters; and (iii) faster inference time with a computational complexity lower than 8M floating point operations (FLOPs). Experiments on the benchmark dataset of the International Symposium on Biomedical Imaging (ISBI) challenge for segmentation of neuronal structures in electron microscopic (EM) stacks and the Medical Segmentation Decathlon (MSD) challenge brain tumor segmentation (BraTS) dataset show that the proposed model achieves comparable and sometimes better results compared to the current state-of-the-art.
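
Why depthwise separable convolutions shrink a network, as the abstract above describes, can be seen from a simple weight count. The sketch below compares a standard convolution with its depthwise separable replacement for a single layer; the layer sizes (3x3 kernel, 64 input channels, 128 output channels) are assumptions chosen for illustration and are not taken from SD-UNet, and bias terms are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard 2D convolution: one kernel x kernel x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out

def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution.

    Depthwise step: one kernel x kernel filter per input channel.
    Pointwise step: a 1x1 convolution mixing c_in channels into c_out.
    """
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise

# Hypothetical layer shape, not taken from the paper.
kernel, c_in, c_out = 3, 64, 128
standard = standard_conv_params(kernel, c_in, c_out)
separable = separable_conv_params(kernel, c_in, c_out)
print(standard, separable, round(standard / separable, 2))
```

For this hypothetical layer the separable form needs 8,768 weights instead of 73,728. The savings for a whole network depend on its actual layer shapes, so this per-layer ratio should not be read as the source of the figures reported in the abstract.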

Matrix application for multi-radar processing of radar data arrays

Radio Industry (Russia) ◽

10.21778/2413-9599-2020-30-3-99-111 ◽

2020 ◽

Vol 30 (3) ◽

pp. 99-111

Author(s):

D. A. Palguyev ◽

A. N. Shentyabin

Keyword(s):

Processing Time ◽

Radar Data ◽

Practical Implementation ◽

Relative Reduction ◽

Data Sets ◽

Processing Efficiency ◽

Complex Information ◽

Computational Resources ◽

Crucial Part ◽

Processing Device

In the processing of dynamically changing data, for example, radar data (RD), a crucial part is played by the representation of various data sets containing information about the routes and attributes of air objects. In the practical implementation of the computational process, it previously seemed natural to process RD in data arrays by the elementwise search method. However, representing data arrays as matrices and using matrix math allows optimal calculations to be formed during tertiary processing. Forming matrices and working with them requires a significant computational resource, so the authors assume that a gain in calculation time may be achieved only when the arrays contain a large amount of data, at least several thousand messages. The article shows the sequences of the most frequently repeated operations of tertiary network processing, such as searching for and replacing an array element. The simulation results show that the processing efficiency (relative reduction of processing time and saving of computing resources) with the use of matrices, in comparison with elementwise search and replacement, increases in proportion to the number of messages received by the information processing device. The most significant gain is observed when processing several thousand messages (array elements). Thus, the use of matrices and the mathematical apparatus of matrix math for processing arrays of dynamically changing data can reduce processing time and save computational resources. The proposed matrix method of organizing calculations can also find its place in the modeling of complex information systems.

Risk Analysis of Setting up a Restaurant at NYC

10.5121/csit.2021.110703 ◽

2021 ◽

Author(s):

Santoshi Laxmi Reddy Ellanki ◽

John Jenq

Keyword(s):

New York ◽

Risk Analysis ◽

Open Data ◽

Data Sets ◽

Rating Data

In this report, a system was developed that can predict the outcome of opening a restaurant in NYC based on various NYC open data sets, such as 311 calls, New York Police crime records and restaurant rating data. The data sets were preprocessed and cleaned before analysis to improve the quality of our results.
