An ETL Framework for Online Analytical Processing of Linked Open Data

Author(s):  
Hiroyuki Inoue ◽  
Toshiyuki Amagasa ◽  
Hiroyuki Kitagawa
Algorithms ◽  
2021 ◽  
Vol 14 (9) ◽  
pp. 265
Author(s):  
Irya Wisnubhadra ◽  
Safiza Kamal Baharin ◽  
Nurul A. Emran ◽  
Djoko Budiyanto Setyohadi

The accessibility of devices that track the positions of moving objects has attracted many researchers to Mobility Online Analytical Processing (Mobility OLAP). Mobility OLAP makes use of trajectory data warehousing techniques, which typically represent the path of a moving object as positions recorded at particular points in time. Semantic Web (SW) users have published a large number of moving object datasets that include spatial and non-spatial data. These data are available as open data and require advanced analysis to aid in decision making. However, current SW technologies support advanced analysis only for multidimensional data warehouses and Online Analytical Processing (OLAP) over static spatial and non-spatial SW data. The existing technology does not support the modeling of moving object facts, the creation of basic mobility analytical queries, or the definition of fundamental operators and functions for moving object types. This article introduces the QB4MobOLAP vocabulary, which enables the analysis of mobility data stored in RDF cubes, and defines Mobility OLAP operators and SPARQL user-defined functions. The QB4MobOLAP vocabulary and the Mobility OLAP operators are evaluated by applying them to a practical use case of transportation analysis involving 8826 triples, of which approximately 7000 are fact triples. Each fact triple contains nearly 1000 temporal data points (equivalent to 7 million records in conventional databases). Executing six representative spatiotemporal analytics queries shows that the model is practical, simple, and expressive, and that it performs well enough to support executive decisions on transportation analysis.
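
The data volume and the roll-up idea mentioned in this abstract can be illustrated with a minimal Python sketch. It does not use the QB4MobOLAP vocabulary, RDF, or SPARQL; the `Fact` structure, the function names, and the sample values are hypothetical and chosen only for illustration. It shows why about 7000 facts with about 1000 temporal points each correspond to roughly 7 million flat records, and how trajectory samples might be rolled up to an hourly distance per vehicle.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import hypot


# Hypothetical in-memory stand-in for one moving-object fact:
# a vehicle id plus a list of (timestamp_seconds, x, y) samples.
@dataclass
class Fact:
    vehicle: str
    samples: list  # [(t, x, y), ...] sorted by t


def flat_record_count(num_facts: int, points_per_fact: int) -> int:
    """Rows needed if every temporal point became one relational record."""
    return num_facts * points_per_fact


def rollup_by_hour(facts):
    """Roll up from individual samples to (vehicle, hour) -> distance travelled."""
    totals = defaultdict(float)
    for fact in facts:
        # Walk consecutive sample pairs, i.e. the segments of the trajectory.
        for (t0, x0, y0), (t1, x1, y1) in zip(fact.samples, fact.samples[1:]):
            hour = t0 // 3600  # assign each segment to the hour it starts in
            totals[(fact.vehicle, hour)] += hypot(x1 - x0, y1 - y0)
    return dict(totals)


# Illustrative data only: one vehicle sampled every 30 minutes.
facts = [
    Fact("bus-1", [(0, 0.0, 0.0), (1800, 3.0, 4.0), (3600, 3.0, 4.0), (5400, 6.0, 8.0)]),
]

print(flat_record_count(7000, 1000))
print(rollup_by_hour(facts))
```

Assigning a segment to the hour in which it starts is a simplification; a fuller treatment would split segments that cross an hour boundary.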


Author(s):  
Roberta Macêdo Marques Gouveia ◽  
Charles Nicollas Cavalcante Freitas

IMPLEMENTATION OF A DATA WAREHOUSE FOR ANALYSIS OF OPEN GOVERNMENTAL DATA IN DISTANCE EDUCATION

Abstract: The paper presents the implementation of a multidimensional educational database for the analysis of Brazilian higher education, focusing on Distance Education (Educação a Distância, EaD). To support decision making, a Data Warehouse (DW) was designed, based on fact constellation dimensional modeling and the application of OnLine Analytical Processing (OLAP) technologies. High-granularity open data from the e-MEC and Universidade Aberta do Brasil (Open University of Brazil) systems were used. The work is based on the development of an analytical computing environment that aims to profile the educational institutions active in EaD, as well as university students and courses, especially in the knowledge area "exact and earth sciences" and the subarea "computing". The analyses carried out with the DW show that distance education is increasingly accessible in Brazil, supporting the expansion of higher education into the country's interior and its democratization.

Keywords: Distance Education. Open Data. Data Warehouse. Dimensional Modeling.
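
As a rough illustration of fact constellation modeling, the sketch below builds two fact tables that share one dimension table and rolls each up by region. It uses Python's standard `sqlite3` module with an in-memory database. The table names, columns, and figures are hypothetical and are not taken from the paper or from the e-MEC data.

```python
import sqlite3

# In-memory database; the schema and rows are illustrative assumptions only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_institution (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
-- Two fact tables sharing the same dimension form a (tiny) fact constellation.
CREATE TABLE fact_enrollment (institution_id INTEGER, year INTEGER, students INTEGER);
CREATE TABLE fact_course_offer (institution_id INTEGER, year INTEGER, courses INTEGER);
INSERT INTO dim_institution VALUES
    (1, 'Inst A', 'Northeast'), (2, 'Inst B', 'Northeast'), (3, 'Inst C', 'South');
INSERT INTO fact_enrollment VALUES (1, 2016, 120), (2, 2016, 80), (3, 2016, 50);
INSERT INTO fact_course_offer VALUES (1, 2016, 4), (2, 2016, 2), (3, 2016, 3);
""")


def students_by_region(conn):
    """Roll up the enrollment fact from institution level to region level."""
    return conn.execute("""
        SELECT d.region, SUM(f.students)
        FROM fact_enrollment AS f
        JOIN dim_institution AS d ON d.id = f.institution_id
        GROUP BY d.region
        ORDER BY d.region
    """).fetchall()


def courses_by_region(conn):
    """Roll up the course-offer fact along the same shared dimension."""
    return conn.execute("""
        SELECT d.region, SUM(f.courses)
        FROM fact_course_offer AS f
        JOIN dim_institution AS d ON d.id = f.institution_id
        GROUP BY d.region
        ORDER BY d.region
    """).fetchall()


print(students_by_region(conn))
print(courses_by_region(conn))
```

Because both facts join to the same `dim_institution` table, their regional totals can be compared directly, which is the main practical benefit of sharing dimensions across fact tables.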


Author(s):  
Caio Saraiva Coneglian ◽  
José Eduardo Santarem Segundo

The emergence of new technologies has introduced means of disseminating information and making it available more efficiently. An initiative called Europeana has been promoting this adaptation of information objects within the Web, and more specifically within Linked Data. The present study therefore aims to discuss the relationship between the Digital Humanities and Linked Open Data, as embodied by Europeana. To this end, we use an exploratory methodology that examines questions related to the Europeana data model, EDM, by means of SPARQL. As results, we gain an understanding of the characteristics of the EDM through the use of SPARQL. We also identify the importance that the concept of Digital Humanities has within the context of Europeana.

Keywords: Semantic Web. Linked open data. Digital humanities. Europeana. EDM.

Link: https://periodicos.ufsc.br/index.php/eb/article/view/1518-2924.2017v22n48p88/33031


2021 ◽  
Vol 11 (5) ◽  
pp. 2405
Author(s):  
Yuxiang Sun ◽  
Tianyi Zhao ◽  
Seulgi Yoon ◽  
Yongju Lee

The Semantic Web has recently gained traction with the use of Linked Open Data (LOD) on the Web. Although numerous state-of-the-art methodologies, standards, and technologies are applicable to the LOD cloud, many issues persist. Because the LOD cloud is based on graph-based Resource Description Framework (RDF) triples and the SPARQL query language, we cannot directly adopt traditional techniques employed for database management systems or distributed computing systems. This paper addresses how the LOD cloud can be efficiently organized, retrieved, and evaluated. We propose a novel hybrid approach that combines the index and live exploration approaches for improved LOD join query performance. Using a two-step index structure combining a disk-based 3D R*-tree with the extended multidimensional histogram and flash memory-based k-d trees, we can efficiently discover interlinked data distributed across multiple resources. Because this method rapidly prunes numerous false hits, the performance of join query processing is remarkably improved. We also propose a hot-cold segment identification algorithm to identify regions of high interest. The proposed method is compared with existing popular methods on real RDF datasets. Results indicate that our method outperforms the existing methods because it quickly obtains target results by reducing unnecessary data scanning and reduces the amount of main memory required to load filtering results.
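
The general filter-and-refine idea behind pruning false hits can be sketched in a few lines of Python. This is not the paper's index: a plain uniform grid stands in for the 3D R*-tree, histogram, and k-d tree structures, the data are 2D points rather than RDF triples, and all names and values are hypothetical. The coarse grid step discards most points cheaply, and the exact distance test then removes the remaining false hits.

```python
from collections import defaultdict
from math import dist, floor


def build_grid(points, cell):
    """Coarse index: bucket each 2D point into a square cell of side `cell`."""
    grid = defaultdict(list)
    for p in points:
        grid[(floor(p[0] / cell), floor(p[1] / cell))].append(p)
    return grid


def range_query(grid, cell, center, radius):
    """Return (hits, number_of_candidates) for a circular range query."""
    cx, cy = center
    lo_x, hi_x = floor((cx - radius) / cell), floor((cx + radius) / cell)
    lo_y, hi_y = floor((cy - radius) / cell), floor((cy + radius) / cell)

    # Filter step: only cells overlapping the query's bounding box are read.
    candidates = []
    for gx in range(lo_x, hi_x + 1):
        for gy in range(lo_y, hi_y + 1):
            candidates.extend(grid.get((gx, gy), []))

    # Refine step: the exact test removes false hits that the filter let through.
    hits = [p for p in candidates if dist(p, center) <= radius]
    return hits, len(candidates)


# Illustrative data only.
points = [(1, 1), (2, 2), (4, 4), (9, 9), (20, 20)]
grid = build_grid(points, cell=5)
hits, scanned = range_query(grid, cell=5, center=(1, 1), radius=2)
print(hits, scanned)
```

In this example only three of the five points are scanned, and one of those three, `(4, 4)`, is a false hit that the refine step rejects.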

