View Materialization Over Big Data

2021 ◽  
Vol 2 (1) ◽  
pp. 61-85
Author(s):  
Akshay Kumar ◽  
T. V. Vijay Kumar

Advances in technology have resulted in the generation of a large volume of heterogeneous big data for large enterprises engaged in e-commerce, healthcare, education, etc. This data is created at a rapid rate but is low in veracity. It includes large sets of semi-structured and unstructured data and is stored over a distributed file system (DFS). This data can be processed in a fault-tolerant manner using several frameworks, tools, and advanced database technologies. Big data can provide important information, which can be used for business decision making. View materialization, which has been widely studied for structured databases and data warehouses, has been extended to big data to enhance the efficiency of big data query processing. This paper focuses on the selection of big data views for materialization. The big data views can be identified by extracting a set of query attributes from the query workload of an enterprise. The query attributes are interrelated, resulting in the creation of alternate access paths for query evaluation. The cost of query processing using big data views depends on the integrity of different data types of heterogeneous big data, the frequency of queries, the change in the size of big data, the selected sets of big data materialized views, and updates on big data and these sets of materialized views. The cost of query processing is computed using the stored size of big data views on the DFS system, which is a consistent processing framework of DFS. A big data view selection algorithm that is capable of selecting views from structured, semi-structured, and unstructured data is proposed in this paper. The proposed algorithm selects big data views that result in faster processing of most user queries, leading to efficient decision making.
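
As a rough illustration of how candidate views might be identified from query attributes, the sketch below counts how often each attribute set occurs in a workload and checks whether a view covers a query. The function names, the sample workload, and the superset rule for "alternate access paths" are assumptions made for illustration; the abstract does not specify the authors' actual procedure.

```python
from collections import Counter


def candidate_views(workload):
    """Count how often each distinct attribute set appears in the workload.

    Each distinct attribute set is treated as one candidate view
    (an assumption made for this sketch).
    """
    return Counter(frozenset(attrs) for attrs in workload)


def can_answer(view, query):
    """A view can serve a query when it covers all of the query's attributes."""
    return frozenset(query) <= frozenset(view)


# Hypothetical workload: each query is reduced to the attributes it touches.
workload = [
    {"customer", "region"},
    {"customer", "region"},
    {"customer"},
    {"product", "region"},
]

views = candidate_views(workload)
```

Under this rule, a view over `{"customer", "region"}` is an alternate access path for the query over `{"customer"}`, which is one way to read the interrelation between query attributes described above.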

2021 ◽  
Vol 34 (2) ◽  
pp. 1-28
Author(s):  
Akshay Kumar ◽  
T. V. Vijay Kumar

Big data views, in the context of a distributed file system (DFS), are defined over voluminous structured, semi-structured and unstructured data with the purpose of reducing the response time of queries over Big data. As the size of semi-structured and unstructured data in Big data is very large compared to structured data, a framework based on query attributes on Big data can be used to identify Big data views. Materializing Big data views can improve query response time and facilitate efficient distribution of data over the DFS-based application. Since not all Big data views can be materialized, a subset of Big data views should be selected for materialization. The purpose of view selection for materialization is to improve query response time subject to resource constraints. The Big data view materialization problem is defined as a bi-objective problem with two objectives, minimization of the query evaluation cost and minimization of the update processing cost, with a constraint on the total size of the materialized views. This problem is addressed in this paper using the multi-objective genetic algorithm NSGA-II. The experimental results show that the proposed NSGA-II-based Big data view selection algorithm is able to select reasonably good quality views for materialization.
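
The bi-objective formulation can be made concrete with a small sketch. The example below is not NSGA-II: it enumerates every subset of three hypothetical views, discards those that exceed the size limit, and keeps the selections that are not Pareto-dominated on (query cost, update cost). All view names, sizes, and costs are invented, and the additive cost model is a simplifying assumption.

```python
from itertools import combinations

# Hypothetical per-view figures: (size, query cost saving, update cost).
VIEWS = {
    "v1": (40, 50, 10),
    "v2": (30, 30, 5),
    "v3": (50, 45, 30),
}
BASE_QUERY_COST = 100  # assumed query cost with no view materialized
SIZE_LIMIT = 80        # assumed limit on total materialized view size


def evaluate(selection):
    """Return (total size, query cost, update cost) for a set of views."""
    size = sum(VIEWS[v][0] for v in selection)
    query_cost = BASE_QUERY_COST - sum(VIEWS[v][1] for v in selection)
    update_cost = sum(VIEWS[v][2] for v in selection)
    return size, query_cost, update_cost


def dominates(a, b):
    """True if a is no worse than b on both objectives and better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front():
    """Enumerate feasible selections and keep the non-dominated ones."""
    feasible = {}
    for r in range(len(VIEWS) + 1):
        for selection in combinations(sorted(VIEWS), r):
            size, query_cost, update_cost = evaluate(selection)
            if size <= SIZE_LIMIT:
                feasible[selection] = (query_cost, update_cost)
    return {
        sel: obj
        for sel, obj in feasible.items()
        if not any(dominates(other, obj) for other in feasible.values())
    }
```

Exhaustive enumeration only works for a handful of views. A genetic algorithm such as NSGA-II searches the same space heuristically and ranks candidates by the same kind of dominance test.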


2021 ◽  
pp. 67-74
Author(s):  
Liudmyla Zubyk ◽  
Yaroslav Zubyk

Big data is one of the modern tools that have greatly impacted world industry. It also plays an important role in determining the ways in which businesses and organizations formulate their strategies and policies. However, very limited academic research has been conducted into forecasting based on big data due to the difficulties in capturing, collecting, handling, and modeling unstructured data, which is normally characterized by its confidentiality. We define big data in the context of an ecosystem for future forecasting in business decision-making. It can be difficult for a single organization to possess all of the necessary capabilities to derive strategic business value from its findings. That is why different organizations will build and operate their own analytics ecosystems or tap into existing ones. An analytics ecosystem comprising a symbiosis of data, applications, platforms, talent, partnerships, and third-party service providers lets organizations be more agile and adapt to changing demands. Organizations participating in analytics ecosystems can examine, learn from, and influence not only their own business processes, but also those of their partners. Architectures of popular platforms for forecasting based on big data are presented in this issue.


Web Services ◽  
2019 ◽  
pp. 1430-1443
Author(s):  
Louise Leenen ◽  
Thomas Meyer

Governments, military forces and other organisations responsible for cybersecurity deal with vast amounts of data that have to be understood in order to lead to intelligent decision making. Due to the vast amounts of information pertinent to cybersecurity, automation is required for processing and decision making, specifically to present advance warning of possible threats. The ability to detect patterns in vast data sets and to understand the significance of detected patterns is essential in the cyber defence domain. Big data technologies supported by semantic technologies can improve cybersecurity, and thus cyber defence, by providing support for the processing and understanding of the huge amounts of information in the cyber environment. The term big data analytics refers to advanced analytic techniques such as machine learning, predictive analysis, and other intelligent processing techniques applied to large data sets that contain different data types. The purpose is to detect patterns, correlations, trends and other useful information. Semantic technologies are a knowledge representation paradigm in which the meaning of data is encoded separately from the data itself. The use of semantic technologies such as logic-based systems to support decision making is becoming increasingly popular. However, most automated systems are currently based on syntactic rules. These rules are generally not sophisticated enough to deal with the complexity of the decisions that have to be made. The incorporation of semantic information allows for increased understanding and sophistication in cyber defence systems. This paper argues that both big data analytics and semantic technologies are necessary to provide countermeasures against cyber threats. An overview of the use of semantic technologies and big data technologies in cyber defence is provided, and important areas for future research in the combined domains are discussed.


Author(s):  
Pedro Caldeira Neves ◽  
Jorge Rodrigues Bernardino

The amount of data in our world has been exploding, and big data represents a fundamental shift in business decision-making. Analyzing such so-called big data is today a keystone of competition and the success of organizations depends on fast and well-founded decisions taken by relevant people in their specific area of responsibility. Business analytics (BA) represents a merger between data strategy and a collection of decision support technologies and mechanisms for enterprises aimed at enabling knowledge workers such as executives, managers, and analysts to make better and faster decisions. The authors review the concept of BA as an open innovation strategy and address the importance of BA in revolutionizing knowledge towards economics and business sustainability. Using big data with open source business analytics systems generates the greatest opportunities to increase competitiveness and differentiation in organizations. In this chapter, the authors describe and analyze business intelligence and analytics (BI&A) and four popular open source systems – BIRT, Jaspersoft, Pentaho, and SpagoBI.


Author(s):  
Jorge Bernardino ◽  
Pedro Caldeira Neves

Supporting decision making to improve business performance is a crucial yet challenging task in enterprise management. The amount of data in our world has been exploding, and Big Data represents a fundamental shift in business decision-making. Analyzing such so-called Big Data is becoming a keystone of competition, and the success of organizations depends on fast and well-founded decisions taken by relevant people in their specific area of responsibility. Business Intelligence (BI) is a collection of decision support technologies for enterprises aimed at enabling knowledge workers such as executives, managers, and analysts to make better and faster decisions. We review the concept of BI as an open innovation strategy and address the importance of BI in revolutionizing knowledge towards economics and business sustainability. Using Big Data with Open Source Business Intelligence Systems will generate the biggest opportunities to increase competitiveness and differentiation in organizations. In this chapter, we describe and analyze four popular open source BI systems: Jaspersoft, Jedox, Pentaho and Actuate/BIRT.


2019 ◽  
Vol 11 ◽  
pp. 184797901989077 ◽  
Author(s):  
Kiran Adnan ◽  
Rehan Akbar

In the recent era of big data, a huge volume of unstructured data is being produced in various forms such as audio, video, images, text, and animation. Effective use of this unstructured big data is a laborious and tedious task. Information extraction (IE) systems help to extract useful information from this large variety of unstructured data. Several techniques and methods have been presented for IE from unstructured data. However, numerous studies conducted on IE from a variety of unstructured data are limited to single data types such as text, image, audio, or video. This article reviews the existing IE techniques along with their subtasks, limitations, and challenges for the variety of unstructured data, highlighting the impact of unstructured big data on IE techniques. To the best of our knowledge, no comprehensive study has been conducted to investigate the limitations of existing IE techniques for the variety of unstructured big data. The objective of the structured review presented in this article is twofold. First, it presents an overview of IE techniques for a variety of unstructured data such as text, image, audio, and video in one place. Second, it investigates the limitations of these existing IE techniques due to the heterogeneity, dimensionality, and volume of unstructured big data. The review finds that advanced techniques for IE, particularly for multifaceted unstructured big data sets, are the utmost requirement of organizations to manage big data and derive strategic information. Further, potential solutions are also presented to improve unstructured big data IE systems for future research. These solutions will help to increase the efficiency and effectiveness of the data analytics process in terms of context-aware analytics systems, data-driven decision-making, and knowledge management.


Author(s):  
Joshua Devadason ◽  
Rehan Akbar

Big data is a valuable asset for organisations, as analysing it helps them to understand their customers, changes within their business environment, market analysis and future trends. Big data is multifaceted (of different data types and versatile) and mostly exists in unstructured formats. The extraction of value from this data is challenging. The usability and productivity of this multifaceted unstructured data are greatly compromised. A number of factors and associated reasons affect the usability of unstructured big data. The present research work investigates these factors and the associated reasons behind the usability issues of multifaceted unstructured big data. The identification of these factors contributes to developing solutions that improve the usability of highly unstructured big data. A detailed study of existing literature followed by a survey questionnaire has been conducted to identify the factors and their reasons. Descriptive statistics have been used to analyse and interpret the data and results.


Author(s):  
Zhaohao Sun

Intelligent big data analytics is an emerging paradigm in the age of big data, analytics, and artificial intelligence (AI). This chapter explores intelligent big data analytics from a managerial perspective. More specifically, it first looks at the age of trinity and argues that intelligent big data analytics is at the center of the age of trinity. This chapter then proposes a managerial framework of intelligent big data analytics, which consists of intelligent big data analytics as a science, technology, system, service, and management for improving business decision making. Then it examines intelligent big data analytics for management taking into account four managerial functions: planning, organizing, leading, and controlling. The proposed approach in this chapter might facilitate the research and development of intelligent big data analytics, big data analytics, business intelligence, artificial intelligence, and data science.


Web Services ◽  
2019 ◽  
pp. 431-458 ◽  
Author(s):  
Jorge Bernardino ◽  
Pedro Caldeira Neves

Supporting decision making to improve business performance is a crucial yet challenging task in enterprise management. The amount of data in our world has been exploding, and Big Data represents a fundamental shift in business decision-making. Analyzing such so-called Big Data is becoming a keystone of competition, and the success of organizations depends on fast and well-founded decisions taken by relevant people in their specific area of responsibility. Business Intelligence (BI) is a collection of decision support technologies for enterprises aimed at enabling knowledge workers such as executives, managers, and analysts to make better and faster decisions. We review the concept of BI as an open innovation strategy and address the importance of BI in revolutionizing knowledge towards economics and business sustainability. Using Big Data with Open Source Business Intelligence Systems will generate the biggest opportunities to increase competitiveness and differentiation in organizations. In this chapter, we describe and analyze four popular open source BI systems: Jaspersoft, Jedox, Pentaho and Actuate/BIRT.


2014 ◽  
Author(s):  
Patrick L. David ◽  
Patrick D. Roberts

Recent strides in data analytics have uncovered interesting and actionable correlations across many different industries. Organizations are finding opportunities for making more intelligent business decisions by enhancing data with new insights and sources of information. In many cases these insights are gleaned through deeper analytics of existing data. The relatively large amount of information generated through the shipbuilding enterprise, coupled with rapidly advancing methods for optimizing data capture, points to a rapid convergence on exploiting data analytics for enhanced business decision making. An ad-hoc working group was formed consisting of multiple US shipyards with broad representation across the NSRP to investigate opportunities to leverage modern data analytics.

