view materialization
Recently Published Documents


TOTAL DOCUMENTS

33
(FIVE YEARS 6)

H-INDEX

6
(FIVE YEARS 0)

2021 ◽  
Vol 14 (13) ◽  
pp. 3281-3294
Author(s):  
Theofilos Mailis ◽  
Yannis Kotidis ◽  
Stamatis Christoforidis ◽  
Evgeny Kharlamov ◽  
Yannis Ioannidis

Knowledge Graphs (KGs) are collections of interconnected and annotated entities that have become powerful assets for data integration, search enhancement, and other industrial applications. Knowledge Graphs such as DBPEDIA may contain billions of triple relations and are queried intensively, with millions of queries per day. A prominent approach to enhancing query answering on Knowledge Graph databases is view materialization, i.e., the materialization of an appropriate set of computations that will improve query performance. We study the problem of view materialization and propose a view selection methodology for processing query workloads with more than a million queries. Our approach relies heavily on subgraph pattern mining techniques that allow us to create efficient summarizations of massive query workloads while also identifying the candidate views for materialization. At the core of our work is the correspondence between the view selection problem and that of maximizing a nondecreasing submodular set function subject to a knapsack constraint. The latter leads to a tractable view-selection process for native triple stores that allows a (1 - e^{-1})-approximation of the optimal selection of views. Our experimental evaluation shows that all the steps of the view-selection process are completed in a few minutes, while the corresponding rewritings accelerate 67.68% of the queries in the DBPEDIA query workload. Those queries are executed in 2.19% of their initial time on average.
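To illustrate the kind of selection problem the abstract refers to, the following is a minimal sketch of a cost-benefit greedy heuristic for maximizing a nondecreasing submodular function under a knapsack (budget) constraint. It is not the authors' algorithm: the names `coverage_benefit`, `greedy_knapsack`, `covers`, and `costs` are hypothetical, the benefit is modeled simply as the number of workload queries a set of views can answer, and this simplified greedy carries a weaker guarantee than the (1 - e^{-1}) bound, which is usually obtained with a more elaborate variant based on partial enumeration.

```python
from typing import Callable, Dict, Set


def coverage_benefit(selected: Set[str], covers: Dict[str, Set[int]]) -> int:
    """Number of distinct workload queries answered by the selected views.

    Coverage functions are nondecreasing and submodular, which is why they
    serve as the illustrative benefit here (an assumption, not the paper's model).
    """
    covered: Set[int] = set()
    for view in selected:
        covered |= covers[view]
    return len(covered)


def greedy_knapsack(
    costs: Dict[str, float],
    benefit: Callable[[Set[str]], float],
    budget: float,
) -> Set[str]:
    """Pick views by best marginal benefit per unit cost until the budget is hit.

    Assumes every cost is strictly positive.
    """
    chosen: Set[str] = set()
    spent = 0.0
    remaining = set(costs)
    while remaining:
        base = benefit(chosen)
        best, best_ratio = None, 0.0
        for view in sorted(remaining):  # sorted only to make ties deterministic
            if spent + costs[view] > budget:
                continue  # this view no longer fits in the remaining budget
            ratio = (benefit(chosen | {view}) - base) / costs[view]
            if ratio > best_ratio:
                best, best_ratio = view, ratio
        if best is None:
            break  # nothing affordable improves the benefit
        chosen.add(best)
        spent += costs[best]
        remaining.discard(best)
    # Safeguard: a single affordable view may beat the whole greedy set.
    singles = sorted(v for v in costs if costs[v] <= budget)
    if singles:
        top = max(singles, key=lambda v: benefit({v}))
        if benefit({top}) > benefit(chosen):
            return {top}
    return chosen


# Hypothetical candidate views: which query ids each can answer, and its storage cost.
covers = {"v1": {1, 2, 3}, "v2": {3, 4}, "v3": {5}, "v4": {1, 2, 3, 4, 5, 6}}
costs = {"v1": 3.0, "v2": 2.0, "v3": 1.0, "v4": 7.0}
picked = greedy_knapsack(costs, lambda s: coverage_benefit(s, covers), budget=6.0)
print(sorted(picked))
```

With a budget of 6, the view `v4` is too expensive to select even though it covers the most queries, so the sketch falls back to the three cheaper views.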



2021 ◽  
Vol 12 (2) ◽  
pp. 17-37
Author(s):  
Akshay Kumar ◽  
T. V. Vijay Kumar

Big data comprises voluminous and heterogeneous data that has a limited level of trustworthiness. This data is used to generate valuable information that can be used for decision making. However, decision-making queries on Big data take a long time to process, resulting in high response times. For effective and efficient decision making, this response time needs to be reduced. View materialization has been used successfully to reduce query response time in the context of a data warehouse. Selection of such views is a complex problem vis-à-vis Big data and is the focus of this paper. In this paper, the Big data view selection problem is formulated as a bi-objective optimization problem with the two objectives being the minimization of the query evaluation cost and the minimization of the update processing cost. Accordingly, a Big data view selection algorithm that selects Big data views for a given query workload, using the vector evaluated genetic algorithm, is proposed. The proposed algorithm aims to generate views that are able to reduce the response time of decision-making queries.
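The bi-objective formulation and the vector-evaluated selection step can be sketched as follows. This is an illustrative sketch only, not the paper's cost model or implementation: the functions `objectives` and `vega_mating_pool` and all input arrays are hypothetical, a view selection is encoded as a tuple of bits, and binary tournament selection is used for brevity in place of the proportional selection of the original vector evaluated genetic algorithm.

```python
import random
from typing import Callable, List, Sequence, Tuple

Selection = Tuple[int, ...]  # bit i is 1 when candidate view i is materialized


def objectives(
    sel: Selection,
    base: Sequence[float],                 # cost of each query with no views
    via_view: Sequence[Sequence[float]],   # via_view[q][i]: cost of query q using view i
    freq: Sequence[float],                 # frequency of each query in the workload
    upd: Sequence[float],                  # update (maintenance) cost of each view
) -> Tuple[float, float]:
    """Return (query evaluation cost, update processing cost); both are minimized."""
    q_cost = 0.0
    for q, b in enumerate(base):
        # Each query uses the cheapest option among the base data and selected views.
        best = min([b] + [via_view[q][i] for i, bit in enumerate(sel) if bit])
        q_cost += freq[q] * best
    u_cost = sum(upd[i] for i, bit in enumerate(sel) if bit)
    return q_cost, u_cost


def vega_mating_pool(
    population: List[Selection],
    score: Callable[[Selection], Tuple[float, ...]],
    rng: random.Random,
) -> List[Selection]:
    """Split a non-empty population into one group per objective and select
    within each group using only that group's objective."""
    shuffled = list(population)
    rng.shuffle(shuffled)
    k = len(score(shuffled[0]))  # number of objectives
    pool: List[Selection] = []
    for obj in range(k):
        group = shuffled[obj::k]
        for _ in range(len(group)):
            a, b = rng.choice(group), rng.choice(group)
            pool.append(a if score(a)[obj] <= score(b)[obj] else b)
    return pool


# Hypothetical workload: two queries, two candidate views.
base = [10.0, 20.0]
via_view = [[4.0, 10.0], [20.0, 5.0]]
freq = [2.0, 1.0]
upd = [3.0, 1.0]


def score(sel: Selection) -> Tuple[float, float]:
    return objectives(sel, base, via_view, freq, upd)


population = [(0, 0), (1, 0), (0, 1), (1, 1)]
pool = vega_mating_pool(population, score, random.Random(0))
print(score((1, 1)), len(pool))
```

The mating pool would then go through crossover and mutation to form the next generation; those steps are omitted here.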



2021 ◽  
Vol 34 (2) ◽  
pp. 1-28
Author(s):  
Akshay Kumar ◽  
T. V. Vijay Kumar

Big data views, in the context of a distributed file system (DFS), are defined over structured, semi-structured and unstructured data that are voluminous in nature, with the purpose of reducing the response time of queries over Big data. As the size of semi-structured and unstructured data in Big data is very large compared to structured data, a framework based on query attributes on Big data can be used to identify Big data views. Materializing Big data views can improve the query response time and facilitate efficient distribution of data over the DFS-based application. Since not all Big data views can be materialized, a subset of Big data views should be selected for materialization. The purpose of view selection for materialization is to improve query response time subject to resource constraints. The Big data view materialization problem is defined as a bi-objective problem with two objectives: minimization of the query evaluation cost and minimization of the update processing cost, with a constraint on the total size of the materialized views. This problem is addressed in this paper using the multi-objective genetic algorithm NSGA-II. The experimental results show that the proposed NSGA-II based Big data view selection algorithm is able to select reasonably good quality views for materialization.
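The ranking idea behind NSGA-II, combined with the size constraint mentioned above, can be sketched as follows. This is a simplified illustration and not the paper's implementation: `dominates` and `pareto_fronts` are hypothetical names, the objective values and sizes are made-up numbers, infeasible candidates are simply filtered out before ranking, and the straightforward repeated-scan ranking below stands in for NSGA-II's fast non-dominated sort (crowding distance is omitted).

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (query evaluation cost, update processing cost)


def dominates(a: Point, b: Point) -> bool:
    """a dominates b if it is no worse in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_fronts(points: Sequence[Point], candidates: List[int]) -> List[List[int]]:
    """Rank candidate indices into successive non-dominated fronts (front 0 is best)."""
    remaining = list(candidates)
    fronts: List[List[int]] = []
    while remaining:
        front = [
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        ]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts


# Hypothetical candidate view sets: objective values and total stored size.
points = [(28, 3), (40, 0), (13, 4), (25, 1), (30, 4)]
sizes = [5, 0, 9, 4, 12]
capacity = 10

# Enforce the size constraint by discarding selections that do not fit.
feasible = [i for i, s in enumerate(sizes) if s <= capacity]
fronts = pareto_fronts(points, feasible)
print(fronts)  # [[1, 2, 3], [0]]
```

Candidate 4 is excluded because its size exceeds the capacity, and candidate 0 lands in the second front because candidate 3 is cheaper on both objectives.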



Author(s):  
Ashwin Verma ◽  
Pronaya Bhattacharya ◽  
Umesh Bodkhe ◽  
Akhilesh Ladha ◽  
Sudeep Tanwar


2021 ◽  
Vol 2 (1) ◽  
pp. 61-85
Author(s):  
Akshay Kumar ◽  
T. V. Vijay Kumar

Advances in technology have resulted in the generation of a large volume of heterogeneous big data for large enterprises engaged in e-commerce, healthcare, education, etc. This data is being created at a rapid rate but is low in veracity. It includes large sets of semi-structured and unstructured data and is stored over a distributed file system (DFS). This data can be processed in a fault-tolerant manner using several frameworks, tools, and advanced database technologies. Big data can provide important information, which can be used for business decision making. View materialization, which has been widely studied for structured databases and data warehouses, has been extended to big data to enhance the efficiency of big data query processing. This paper focuses on the selection of big data views for materialization. The big data views can be identified by extracting a set of query attributes from the query workload of an enterprise. The query attributes are interrelated, resulting in the creation of alternate access paths for query evaluation. The cost of query processing using big data views involves the integrity of different data types of heterogeneous big data, frequency of queries, change in the size of big data, selected sets of big data materialized views, and updates on big data and these sets of materialized views. The cost of query processing is computed using the stored size of big data views on the DFS system, which is a consistent processing framework of DFS. A big data view selection algorithm that is capable of selecting views from structured, semi-structured, and unstructured data has been proposed in this paper. The proposed algorithm would select big data views that would result in faster processing of most user queries, resulting in efficient decision making.





Author(s):  
M. E. Megahed ◽  
Rasha M. Ismail ◽  
Nagwa L. Badr ◽  
Mohamed Fahmy Tolba


2014 ◽  
Vol 26 (10) ◽  
pp. 2439-2452
Author(s):  
Hidayet Aksu ◽  
Mustafa Canim ◽  
Yuan-Chi Chang ◽  
Ibrahim Korpeoglu ◽  
Ozgur Ulusoy



