A Heterogeneous Geospatial Data Retrieval Method Using Knowledge Graph

Information resources have increased rapidly in the big data era. Geospatial data plays an indispensable role in spatially informed analyses, yet data from different domains remain relatively isolated. Relational data alone is therefore inadequate for handling semantic intricacies and retrieving geospatial data. In light of this, this paper proposes a heterogeneous retrieval method based on a knowledge graph. The method has three advantages: (1) the semantic knowledge of geospatial data is considered; (2) more of the information required by users can be obtained; (3) data retrieval speed can be improved. First, implicit semantic knowledge is studied and applied to construct a knowledge graph that integrates the semantics of multi-source heterogeneous geospatial data. Then, query expansion rules and mappings between knowledge and the database are designed to construct retrieval statements and obtain related spatial entities. Finally, effectiveness and efficiency are verified through comparative analysis and practical application. The experiment indicates that the method can automatically construct database retrieval statements and retrieve more relevant data. Additionally, users depend less on the data storage mode and on Structured Query Language (SQL) syntax. This work should facilitate the sharing and outreach of geospatial knowledge for various spatial studies.
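The query expansion and knowledge-to-database mapping described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the concept hierarchy `NARROWER`, the mapping `CONCEPT_TO_TABLE`, the table names, and the `feature_type` column are hypothetical and are not taken from the paper.

```python
# Hypothetical knowledge graph fragment: concept -> narrower concepts.
NARROWER = {
    "water body": ["river", "lake"],
    "river": ["stream"],
}

# Hypothetical mapping from concepts to database tables.
CONCEPT_TO_TABLE = {
    "river": "hydro_lines",
    "stream": "hydro_lines",
    "lake": "hydro_polygons",
}


def expand(concept):
    """Return the concept plus all narrower concepts, depth-first, without duplicates."""
    seen, stack, out = set(), [concept], []
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        out.append(current)
        # Reverse so children are visited in their listed order.
        stack.extend(reversed(NARROWER.get(current, [])))
    return out


def build_queries(concept):
    """Build one parameterized SQL statement per table reached by the expansion."""
    by_table = {}
    for term in expand(concept):
        table = CONCEPT_TO_TABLE.get(term)
        if table is not None:
            by_table.setdefault(table, []).append(term)
    queries = []
    for table, terms in by_table.items():
        placeholders = ", ".join("?" for _ in terms)
        # Table names come from the fixed mapping above; values are bound as parameters.
        sql = f"SELECT * FROM {table} WHERE feature_type IN ({placeholders})"
        queries.append((sql, terms))
    return queries
```

A user who asks for "water body" never needs to know which tables hold rivers or lakes: `build_queries("water body")` yields one statement for `hydro_lines` and one for `hydro_polygons`.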


FAIR.ReD: Semantic knowledge graph infrastructure for the life sciences

Biodiversity Information Science and Standards, 2019, Vol 3. DOI: 10.3897/biss.3.37206

Author(s): Lars Vogt, Sören Auer, Thomas Bartolomaeus, Pier Luigi Buttigieg, Peter Grobe, ...

Keyword(s): Data Management, Data Storage, Additional Data, Source Code, Life Sciences, Data Access, Semantic Knowledge, Knowledge Graph, Machine Readable, Semantic Knowledge Graph

We would like to present FAIR Research Data: Semantic Knowledge Graph Infrastructure for the Life Sciences (in short, FAIR.ReD), a project initiative that is currently being evaluated for funding. FAIR.ReD is a software environment for developing data management solutions according to the FAIR (Findable, Accessible, Interoperable, Reusable; Wilkinson et al. 2016) data principles. It utilizes what we call a Data Sea Storage, which employs the idea of Data Lakes to decouple data storage from data access but modifies it by storing data in a semantically structured format, as either semantic graphs or semantic tables, instead of in their native form. Storage follows a top-down approach, resulting in a standardized storage model that allows data to be shared across all FAIR.ReD Knowledge Graph Applications (KGAs) connected to the same Sea, with newly developed KGAs automatically having access to all contents in the Sea. In contrast, access and export of data follow a bottom-up approach that allows the specification of additional data models to meet the varying domain-specific and programmatic needs for accessing structured data. The FAIR.ReD engine enables bidirectional data conversion between the two storage models and any additional data model, which will substantially reduce the conversion workload for data-rich institutes (Fig. 1). Moreover, with the possibility to store data in semantic tables, FAIR.ReD provides high-performance storage for incoming data streams such as sensor data. FAIR.ReD KGAs are modularly organized. Modules can be edited using the FAIR.ReD editor and combined to form coherent KGAs. The editor allows domain experts to develop their own modules and KGAs without any programming experience, thus also allowing smaller projects and individual researchers to build their own FAIR data management solution.
Contents from FAIR.ReD KGAs can be published under a Creative Commons license as documents, micropublications, or nanopublications, each receiving its own DOI. A publication life cycle is implemented in FAIR.ReD and allows published contents to be updated with corrections or additions without overwriting the originally published version. Together with the fact that data and metadata are semantically structured and machine-readable, this means that all contents from FAIR.ReD KGAs will comply with the FAIR Guiding Principles. Because all FAIR.ReD KGAs provide access to semantic knowledge graphs in both a human-readable and a machine-readable version, FAIR.ReD seamlessly integrates the complex RDF (Resource Description Framework) world with a more intuitively comprehensible presentation of data in the form of data entry forms, charts, and tables. Guided by use cases, the FAIR.ReD environment will be developed using semantic programming, where the source code of an application is stored in its own ontology. The set of source code ontologies of a KGA and its modules provides the steering logic for running the KGA. With this clear separation of steering logic from interpretation logic, semantic programming follows the idea of separating the main layers of an application, analogous to the separation of interpretation logic and presentation logic. Each KGA and module is specified exactly in this way, and their source code ontologies are stored in the Data Sea. Thus, all data and metadata are semantically transparent and so is the data management application itself, which substantially improves their sustainability on all levels of data processing and storage.


Mobile Software Assurance Informed through Knowledge Graph Construction: The OWASP Threat of Insecure Data Storage

Journal of Computer Science Research, 2020, Vol 2 (2). DOI: 10.30564/jcsr.v2i2.1765

Author(s): Suzanna Schmeelk, Lixin Tao

Keyword(s): Data Storage, Program Analysis, Web Application, Security Analysis, Knowledge Graph, Healthcare Applications, Sensitive Data, Knowledge Graphs, Mobile Malware Detection, Software Assurance

Many organizations, to save costs, are moving to the Bring Your Own Mobile Device (BYOD) model and adopting applications built by third parties at an unprecedented rate. Our research examines software assurance methodologies, focusing specifically on the security analysis coverage of program analysis for mobile malware detection, mitigation, and prevention. This research focuses on secure software development of Android applications by developing knowledge graphs for threats reported by the Open Web Application Security Project (OWASP). OWASP maintains lists of the top ten security threats to web and mobile applications. We develop knowledge graphs based on the two most recent top ten threat years and show how the knowledge graph relationships can be discovered in mobile application source code. We analyze 200+ healthcare applications from GitHub to understand their software assurance with respect to one of the OWASP top ten mobile threats, the threat of “Insecure Data Storage.” We find that many of the applications store personally identifying information (PII) in potentially vulnerable places, leaving users exposed to a higher risk of losing their sensitive data.


A Binary Volumetric Data Retrieval Method Based on Neighboring Voxel Pattern Descriptors

2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2018. DOI: 10.1109/smc.2018.00710

Author(s): Motofumi Suzuki

Keyword(s): Data Retrieval, Volumetric Data, Retrieval Method


A Set of Integral Grid-Coding Algebraic Operations Based on GeoSOT-3D

ISPRS International Journal of Geo-Information, 2021, Vol 10 (7), pp. 489. DOI: 10.3390/ijgi10070489

Author(s): Kaihua Hou, Chengqi Cheng, Bo Chen, Chi Zhang, Liesong He, ...

Keyword(s): Real Time, Spatial Information, Data Retrieval, Geospatial Data, Algebraic Operation, Real Time Processing, C Language, Time Processing, Binary Operations, Set Operations

As the amount of collected spatial information (2D/3D) increases, real-time processing of this massive data is among the urgent issues that need to be dealt with. Discretizing the physical earth into a digital gridded earth and assigning an integral computable code to each grid cell has become an effective way to accelerate real-time processing. Researchers have proposed optimization algorithms for spatial calculations in specific scenarios. However, a complete set of algorithms for real-time processing using grid coding is still lacking. To address this issue, a carefully designed, integral grid-coding algebraic operation framework for GeoSOT-3D (a multilayer latitude and longitude grid model) is proposed. By converting traditional floating-point calculations based on latitude and longitude into binary operations, the complexity of the algorithm is greatly reduced. We then present the detailed algorithms that were designed, including basic operations, vector operations, code conversion operations, spatial operations, metric operations, topological relation operations, and set operations. To verify the feasibility and efficiency of these algorithms, we developed an experimental platform in C++ (it includes the major algorithms, and more may be added in the future). Then, we generated random data and conducted experiments. The experimental results show that the computing framework is feasible and can significantly improve the efficiency of spatial processing. The algebraic operation framework is expected to support large geospatial data retrieval and analysis, and experience a revival, on top of parallel and distributed computing, in an era of large geospatial data.
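The idea of replacing floating-point coordinate arithmetic with binary operations on integer grid codes can be sketched as follows. This is not the GeoSOT-3D encoding: as an assumption for illustration, it uses a generic Z-order (Morton) code that interleaves three quantized axes, and all function names are hypothetical.

```python
def quantize(value, lo, hi, bits):
    """Map a value in [lo, hi] to an integer cell index in [0, 2**bits - 1]."""
    cells = 1 << bits
    index = int((value - lo) / (hi - lo) * cells)
    return min(max(index, 0), cells - 1)  # clamp so that value == hi stays in range


def interleave3(x, y, z, bits):
    """Interleave the low `bits` bits of x, y, z into one integer code (3 bits per level)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def parent(code, levels=1):
    """Code of the enclosing cell `levels` levels coarser: a single right shift."""
    return code >> (3 * levels)


def contains(coarse_code, coarse_level, fine_code, fine_level):
    """Containment test between cells using only an integer shift and a comparison."""
    if fine_level < coarse_level:
        return False
    return (fine_code >> (3 * (fine_level - coarse_level))) == coarse_code
```

Once coordinates are quantized, hierarchy and containment queries need no floating-point work at all; for example, `parent(interleave3(3, 0, 0, 2))` equals `interleave3(1, 0, 0, 1)`.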


IDSM ChemWebRDF: SPARQLing small-molecule datasets

Journal of Cheminformatics, 2021, Vol 13 (1). DOI: 10.1186/s13321-021-00515-1

Author(s): Jakub Galgonek, Jiří Vondrášek

Keyword(s): Data Storage, Small Molecule, Web Application, Query Language, Data Interoperability, Sparql Endpoint, Data Source, Rdf Data, Relational Form, Federated Queries

The Resource Description Framework (RDF), together with well-defined ontologies, significantly increases data interoperability and usability. The SPARQL query language was introduced to retrieve requested RDF data and to explore links between them. Among other useful features, SPARQL supports federated queries that combine multiple independent data source endpoints. This allows users to obtain insights that are not possible using only a single data source. Owing to all of these useful features, many biological and chemical databases present their data in RDF and support SPARQL querying. In our project, we primarily focused on the PubChem, ChEMBL and ChEBI small-molecule datasets. These datasets are already being exported to RDF by their creators. However, none of them has an official and currently supported SPARQL endpoint. This omission makes it difficult to construct complex or federated queries that could access all of the datasets, thus underutilising the main advantage of the availability of RDF data. Our goal is to address this gap by integrating the datasets into one database called the Integrated Database of Small Molecules (IDSM) that will be accessible through a SPARQL endpoint. Beyond that, we will also focus on increasing mutual interoperability of the datasets. To realise the endpoint, we decided to implement an in-house developed SPARQL engine based on the PostgreSQL relational database for data storage. In our approach, data are stored in the traditional relational form, and the SPARQL engine translates incoming SPARQL queries into equivalent SQL queries. An important feature of the engine is that it optimises the resulting SQL queries. Together with optimisations performed by PostgreSQL, this allows efficient evaluation of SPARQL queries. The endpoint provides not only querying in the dataset, but also the compound substructure and similarity search supported by our Sachem project.
Although the endpoint is accessible from an internet browser, it is mainly intended to be used for programmatic access by other services, for example as a part of federated queries. For regular users, we offer a rich web application called ChemWebRDF using the endpoint. The application is publicly available at https://idsm.elixir-czech.cz/chemweb/.
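The translation of SPARQL patterns into SQL over relational storage can be illustrated with a toy sketch. It is not the IDSM engine: as assumptions for illustration, it uses SQLite instead of PostgreSQL, a single hypothetical `triples(s, p, o)` table, made-up `ex:` identifiers, and it handles only basic triple patterns whose variable names are plain alphanumeric words.

```python
import sqlite3


def translate(patterns):
    """Translate triple patterns such as ("?c", "rdf:type", "ex:Compound") into SQL.

    Terms starting with "?" are variables; every other term is a constant.
    Each pattern becomes one alias of the triples table, joined on shared variables.
    """
    select, where, params, first_seen = [], [], [], {}
    for i, pattern in enumerate(patterns):
        for column, term in zip("spo", pattern):
            ref = f"t{i}.{column}"
            if term.startswith("?"):
                if term in first_seen:
                    # A repeated variable turns into a join condition.
                    where.append(f"{ref} = {first_seen[term]}")
                else:
                    first_seen[term] = ref
                    select.append(f"{ref} AS {term[1:]}")
            else:
                where.append(f"{ref} = ?")
                params.append(term)
    tables = ", ".join(f"triples AS t{i}" for i in range(len(patterns)))
    sql = f"SELECT {', '.join(select) or '1'} FROM {tables}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


def demo():
    """Run a two-pattern query against a small in-memory triple table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany(
        "INSERT INTO triples VALUES (?, ?, ?)",
        [
            ("ex:aspirin", "rdf:type", "ex:Compound"),
            ("ex:aspirin", "ex:name", "Aspirin"),
            ("ex:water", "rdf:type", "ex:Compound"),
            ("ex:water", "ex:name", "Water"),
            ("ex:lab1", "rdf:type", "ex:Lab"),
        ],
    )
    sql, params = translate(
        [("?c", "rdf:type", "ex:Compound"), ("?c", "ex:name", "?name")]
    )
    rows = sorted(conn.execute(sql, params).fetchall())
    conn.close()
    return rows
```

A production engine would map RDF classes onto dedicated relational tables and optimise the generated SQL, as the abstract describes; the single generic triple table here only keeps the sketch short.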


Study of privacy-preserving framework for cloud storage

Computer Science and Information Systems, 2011, Vol 8 (3), pp. 801-819. DOI: 10.2298/csis100327029r. Cited By ~ 12

Author(s): Huang Ruwei, Gui Xiaolin, Yu Si, Zhuang Wei

Keyword(s): Data Storage, Cloud Storage, Security Analysis, Bloom Filter, Data Access, Data Retrieval, Privacy Preserving, Index Structure, Structure Generation, Access Right

To implement a privacy-preserving, efficient and secure data storage and access environment for cloud storage, the following problems must be considered: the data index structure, generation and management of keys, data retrieval, handling of changes to users' access rights and of dynamic operations on data, and interactions among participants. To solve these problems, an interactive protocol among participants is introduced, an extirpation-based key derivation algorithm (EKDA) is designed to manage the keys, and a double hashed and weighted Bloom Filter (DWBF) is proposed to retrieve the encrypted keywords. These are combined with lazy revocation, a multi-tree structure, and asymmetric and symmetric encryption to form a privacy-preserving, efficient and secure framework for cloud storage. The experiment and security analysis show that EKDA can reduce the communication and storage overheads efficiently, that DWBF supports ciphertext retrieval and can reduce communication, storage and computation overhead as well, and that the proposed framework is privacy-preserving while supporting data access efficiently.
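For readers unfamiliar with the underlying data structure, a plain Bloom filter with double hashing can be sketched as below. This is a generic textbook construction, not the paper's DWBF: the weighting and the handling of encrypted keywords are omitted, and the class name, parameters, and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter; probe positions are derived as h1 + i * h2 (double hashing)."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never zero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May return a false positive, but never a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )
```

After `bf = BloomFilter()` and `bf.add("cloud")`, the test `"cloud" in bf` is always true, while a keyword that was never added is reported as present only with a small false-positive probability.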


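RANCANG BAGUN SISTEM INFORMASI AGENDA KERJA UNTUK ANGGOTA KEPOLISIAN BERBASIS KINERJA MENGGUNAKAN METODE BERORIENTASI OBJEK [Design of a Performance-Based Work Agenda Information System for Police Members Using an Object-Oriented Method]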

Elkom : Jurnal Elektronika dan Komputer, 2021, Vol 14 (2), pp. 181-189. DOI: 10.51903/elkom.v14i2.473

Author(s): Edy Siswanto, Sugiarto Sugiarto

Keyword(s): Data Processing, Data Storage, Data Retrieval, Integrated System, The Body, Police Chief, Activity Data, Public Servant, The Law, The Republic

Within the National Police, the Polsekta/Polsek (sector police) play an important role in protecting, nurturing, and serving the community and in enforcing the law in the Tegowanu Police area. The Tegowanu Police are therefore required to serve the community, since one of the main tasks of the Republic of Indonesia Police is to act as a public servant. As one of the law enforcement officers, the National [...] The old work agenda system still uses manual methods and takes a lot of time, which slows down the performance of Polsek members. Documentation of activities carried out by Tegowanu Police officers cannot be seen by the police chief and other members, because photo documentation is stored only on the officer's computer. Another problem is that community activity data and activity schedules are not stored in a database, so data processing is not well integrated. Designing a performance-based activity information system using the object-oriented method at the Tegowanu Police, Grobogan Resort, is expected to help integrate data processing, to speed up reporting and data retrieval whenever data is needed, and to create a system integrated with the database. The information system to be built can speed up the processing and sending of information on Tegowanu Police activities to the head of the Sector Police.


Performance Benchmarking of Key-Value Store NoSQL Databases

International Journal of Electrical and Computer Engineering (IJECE), 2018, Vol 8 (6), pp. 5333. DOI: 10.11591/ijece.v8i6.pp5333-5341

Author(s): Omoruyi Osemwegie, Kennedy Okokpujie, Nsikan Nkordeh, Charles Ndujiuba, Samuel John, ...

Keyword(s): Data Storage, Web Applications, Query Language, Research Work, Database Systems, Nosql Databases, Performance Benchmarking, Nosql Database, Structured Query Language, Web Developers

Increasing requirements for scalability and elasticity of data storage for web applications have made Not Only Structured Query Language (NoSQL) databases increasingly valuable to web developers. One such NoSQL database solution is Redis. A budding alternative to Redis is SSDB, which is also a key-value store but is disk-based. The aim of this research work is to benchmark both databases (Redis and SSDB) using the Yahoo Cloud Serving Benchmark (YCSB). YCSB is a platform that has been used to compare and benchmark similar NoSQL database systems. Both databases were given variable workloads to identify the throughput of all given operations. The results show that SSDB gives better throughput than Redis for the majority of operations.
