Intelligent Management and Efficient Operation of Big Data

Author(s):  
José Moura ◽  
Fernando Batista ◽  
Elsa Cardoso ◽  
Luís Nunes

This chapter details how Big Data can be used and implemented in networking and computing infrastructures. Specifically, it addresses three main aspects: the timely extraction of relevant knowledge from heterogeneous and often unstructured large data sources; the enhancement of the performance of the processing and networking (cloud) infrastructures that are the foundational pillars of Big Data applications and services; and novel ways to manage network infrastructures efficiently with high-level composed policies that support the transmission of large amounts of data with distinct requirements (video vs. non-video). A case study involving an intelligent management solution to route data traffic with diverse requirements across a wide-area Internet Exchange Point is presented, discussed in the context of Big Data, and evaluated.
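The chapter's IXP case study is not reproduced here, but a minimal sketch can illustrate the kind of high-level policy it describes. The following Ryu/OpenFlow controller (an illustrative assumption, not the authors' implementation) steers "video" traffic onto a dedicated output port and everything else onto a default port; the port numbers and the RTSP-port heuristic for spotting video are hypothetical.

```python
# A minimal sketch of policy-based routing of video vs. non-video
# traffic with OpenFlow. Port numbers and the video heuristic
# (UDP destination port 554, RTSP) are illustrative assumptions.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3

VIDEO_PORT = 2      # hypothetical switch port reserved for video flows
DEFAULT_PORT = 1    # hypothetical switch port for everything else

class PolicyRouter(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def on_switch_ready(self, ev):
        dp = ev.msg.datapath
        parser = dp.ofproto_parser

        # Higher-priority rule: UDP traffic to the RTSP port is "video".
        video = parser.OFPMatch(eth_type=0x0800, ip_proto=17, udp_dst=554)
        self._add_flow(dp, priority=10, match=video, out_port=VIDEO_PORT)

        # Catch-all rule: every other flow takes the default path.
        self._add_flow(dp, priority=1, match=parser.OFPMatch(),
                       out_port=DEFAULT_PORT)

    def _add_flow(self, dp, priority, match, out_port):
        ofp, parser = dp.ofproto, dp.ofproto_parser
        actions = [parser.OFPActionOutput(out_port)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS,
                                             actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=priority,
                                      match=match, instructions=inst))
```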


The previous chapter overviewed big data, including its types, sources, analytic techniques, and applications. This chapter briefly discusses the architectural components that deal with huge volumes of data. The complexity of big data types calls for a logical architecture, with layers and high-level components, for building a big data solution that relates data sources to atomic patterns. The dimensions of the approach are volume, variety, velocity, veracity, and governance. The layers of the architecture are the big data sources, the data massaging and storage layer, the analysis layer, and the consumption layer. Big data sources are the data collected from various sources on which data scientists perform analytics. Data can come from internal and external sources: internal sources comprise transactional data, device sensors, business documents, internal files, etc.; external sources include social network profiles, geographical data, data stores, etc. Data massaging is the preprocessing of the extracted data, for example removal of missing values, dimensionality reduction, and noise removal, to attain a useful format for storage. The analysis layer provides insight through the preferred analytics techniques and tools; the analytics methods, issues to be considered, requirements, and tools are discussed at length. The consumption layer, which delivers the resulting business insight, can serve consumers such as retail marketing, the public sector, financial bodies, and the media. Finally, a case study of architectural drivers is applied to a retail industry application, and its challenges and use cases are discussed.
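As a concrete illustration of the data massaging layer, the sketch below applies the three preprocessing steps named above (missing-value handling, noise removal, dimensionality reduction) with pandas and scikit-learn; the library choices, imputation strategy, and parameters are illustrative assumptions, not prescribed by the chapter.

```python
# A minimal sketch of a "data massaging" step: impute missing values,
# clip outliers as crude noise removal, and reduce dimensionality with
# PCA before the result is stored. All parameter choices are assumed.
import pandas as pd
from sklearn.decomposition import PCA

def massage(raw: pd.DataFrame, n_components: int = 10) -> pd.DataFrame:
    # 1. Missing values: keep numeric columns, impute gaps with medians.
    numeric = raw.select_dtypes("number")
    filled = numeric.fillna(numeric.median())

    # 2. Noise removal: clip each column to its 1st-99th percentile band.
    low, high = filled.quantile(0.01), filled.quantile(0.99)
    clipped = filled.clip(lower=low, upper=high, axis=1)

    # 3. Dimensionality reduction: project onto the leading principal axes.
    k = min(n_components, clipped.shape[1])
    reduced = PCA(n_components=k).fit_transform(clipped)
    return pd.DataFrame(reduced, index=raw.index,
                        columns=[f"pc{i}" for i in range(k)])
```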


Author(s):  
Bernard Tuffour Atuahene ◽  
Sittimont Kanjanabootra ◽  
Thayaparan Gajendran

Big data applications consist of i) collecting data from big data sources, ii) storing and processing the data, and iii) analysing the data to gain insights that create organisational benefit. The influx of digital technologies and digitization into the construction process includes big data as one newly emerging digital technology adopted in the construction industry. Big data application is at a nascent stage in construction, and there is a need to understand the tangible benefits that big data can offer the industry. This study explores the benefits of big data in the construction industry. Using a qualitative case study design, construction professionals in an Australian construction firm were interviewed. The research highlights that the benefits of big data include the reduction of litigation amongst project stakeholders, the enablement of near real-time communication, and the facilitation of effective subcontractor selection. By implication, on a broader scale, these benefits can improve contract management, procurement, and the management of construction projects. This study contributes to an ongoing discourse on big data application and, more generally, digitization in the construction industry.


Author(s):  
José Moura ◽  
Carlos Serrão

This chapter reviews the most important aspects of how computing infrastructures should be configured and intelligently managed to fulfil the most notable security requirements of Big Data applications. One of these is privacy. It is a pertinent aspect to address because users share more and more personal data and content through their devices and computers with social networks and public clouds, so a secure framework for social networks is a very active research topic. This topic is addressed in one of the chapter's two case-study sections. In addition, traditional security mechanisms such as firewalls and demilitarized zones are not suitable for computing systems that support Big Data. SDN is an emergent management solution that could become a convenient mechanism to implement security in Big Data systems, as we show through a second case study at the end of the chapter, which also discusses current relevant work and identifies open issues.
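To make the SDN-as-security-mechanism idea concrete, here is a minimal sketch, assuming a Ryu controller and hypothetical subnet addresses, of replacing a perimeter firewall with flow rules that admit only trusted clients to a Big Data cluster. It illustrates the general technique, not the chapter's case study.

```python
# A minimal sketch of an SDN "firewall" for a Big Data cluster: admit
# traffic to the cluster subnet only from a whitelisted subnet and drop
# everything else. Subnets and port roles are hypothetical.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3

CLUSTER = "10.0.2.0/24"   # hypothetical Hadoop cluster subnet
TRUSTED = "10.0.1.0/24"   # hypothetical trusted client subnet

class ClusterGuard(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def on_switch_ready(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser

        # Allow trusted -> cluster traffic, forwarded by normal L2/L3 processing.
        allow = parser.OFPMatch(eth_type=0x0800,
                                ipv4_src=TRUSTED, ipv4_dst=CLUSTER)
        actions = [parser.OFPActionOutput(ofp.OFPP_NORMAL)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=20,
                                      match=allow, instructions=inst))

        # Drop any other traffic aimed at the cluster: a flow entry with
        # no instructions discards matching packets.
        deny = parser.OFPMatch(eth_type=0x0800, ipv4_dst=CLUSTER)
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=10,
                                      match=deny, instructions=[]))
```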


2014 ◽  
Vol 08 (03) ◽  
pp. 319-333
Author(s):  
David Alfred Ostrowski

Big Data has become ubiquitous across all areas of research, enabling new applications that were not possible earlier. Unlike software development that relies on traditional data sources, Big Data applications present their own unique challenges in appropriately harnessing the utility of the Apache Hadoop architecture. In this paper, we introduce the fundamental concepts of Hadoop and explore its usage as well as its future direction. We also present our strategy for exploring the Hadoop architecture, including addressing issues of scalability, customization of code, and utilization of programming techniques.
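For readers new to the Hadoop programming model the paper introduces, the classic word-count example below shows a mapper and reducer written for Hadoop Streaming; it is a generic textbook illustration, not code from the paper.

```python
# wc.py -- word count for Hadoop Streaming (a standard illustration of
# the MapReduce model, not code from the paper).
import sys
from itertools import groupby

def mapper():
    # Emit "word<TAB>1" for every token read from stdin.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

def reducer():
    # Hadoop delivers mapper output sorted by key, so equal words arrive
    # adjacently and can be summed with a single pass of groupby.
    pairs = (line.rstrip("\n").split("\t", 1) for line in sys.stdin)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        print(f"{word}\t{sum(int(count) for _, count in group)}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```

A typical Hadoop Streaming invocation (paths hypothetical) would be: hadoop jar hadoop-streaming.jar -input /logs -output /counts -mapper "python3 wc.py map" -reducer "python3 wc.py reduce" -file wc.py.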


2021 ◽  
Author(s):  
Anuja S. ◽  
Malathy C.

Abstract: In today's world, most private and public sector organizations deal with massive amounts of raw data, which include information and knowledge in their hidden layers. In addition, the format, scale, variety, and velocity of the generated data make it difficult to apply algorithms efficiently. This complexity necessitates sophisticated methods, strategies, and algorithms to meet the challenges of managing raw data. Big data query optimization (BDQO) requires businesses to define, diagnose, forecast, prescribe, and cognize hidden growth opportunities, guiding them toward achieving market value. BDQO uses advanced analytical methods to extract information from an ever-growing volume of data, reducing the difficulty of the decision-making process. Hadoop, Apache Hive, NoSQL, MapReduce, and HPCC are the technologies used in big data applications to manage large data. Because big data provides scalability, consuming data for query processing is less costly. However, small businesses are often unable to query large databases: joining tables with millions of tuples can take hours. Parallelism, which attacks the problem by using more processors, is a potential solution, but small businesses operating on a shoestring budget cannot afford the extra hardware. There are many techniques to tackle the problem, and the technologies used in the big data query optimization process are discussed in depth in this paper.
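The parallelism argument can be made concrete with a toy sketch: partition a table by row, aggregate the partitions on separate worker processes, then merge the partial results, much as MapReduce does at scale. The synthetic table, column meanings, and worker count below are hypothetical.

```python
# A toy illustration of parallel aggregation: counting rows per key
# across worker processes. Data and parameters are hypothetical.
from collections import Counter
from multiprocessing import Pool

def aggregate(chunk):
    # Count rows per customer_id within one partition.
    counts = Counter()
    for customer_id, _amount in chunk:
        counts[customer_id] += 1
    return counts

def parallel_count(rows, workers=4):
    # Round-robin split of the rows into one chunk per worker.
    chunks = [rows[i::workers] for i in range(workers)]
    with Pool(workers) as pool:
        partials = pool.map(aggregate, chunks)
    # Merge the partial counts, analogous to a reduce phase.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

if __name__ == "__main__":
    rows = [(i % 1000, 9.99) for i in range(1_000_000)]  # synthetic "table"
    print(parallel_count(rows).most_common(3))
```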


Author(s):  
Desta Haileselassie Hagos

The rapid growth of Cloud Computing has brought with it major new challenges in the automated manageability, dynamic network reconfiguration, provisioning, scalability, and flexibility of virtual networks. OpenFlow-enabled Software-Defined Networking (SDN) alleviates these key challenges by abstracting lower-level functionality, removing the complexities of the underlying hardware through the separation of the data and control planes. SDN provides efficient, dynamic, automated network management, higher availability, and application provisioning through programmable interfaces, all of which are critical for flexible and scalable cloud-based services. In this study, the author explores broadly useful open technologies and methodologies for applying OpenFlow-enabled SDN to scalable cloud-based services and a variety of diverse applications. The approach introduces new research challenges in the design and implementation of advanced techniques for bringing SDN-enabled components and big data applications into a cloud environment in a dynamic setting. Some of these challenges are pressing concerns for cloud providers managing virtual networks and data centers, while others complicate the development and deployment of cloud-hosted applications from the perspective of developers and end users. However, the growing demand for manageable, scalable, and flexible clouds necessitates effective solutions to these challenges. Hence, through real-world research validation use cases, this paper explores useful mechanisms for the role and potential of OpenFlow-enabled SDN and its direct benefit for scalable cloud-based services. Finally, it demonstrates the impact of an OpenFlow-enabled SDN that fully embraces the opportunities and challenges of cloud infrastructures, improving the system performance of Hadoop-based big data applications by using OpenFlow's network control capabilities to relieve network congestion.
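A minimal sketch of the congestion-relief idea follows, assuming a Ryu controller, hypothetical port numbers and threshold, and the default YARN MapReduce shuffle port: poll port statistics and, when the primary path saturates, reroute Hadoop shuffle traffic onto a backup port. This illustrates the general technique, not the paper's implementation.

```python
# A sketch of OpenFlow-driven congestion relief for Hadoop traffic.
# Port numbers, the byte-rate threshold, and the 10 s polling interval
# are assumptions; 13562 is the default YARN MapReduce shuffle port.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
from ryu.lib import hub
from ryu.ofproto import ofproto_v1_3

PRIMARY, BACKUP = 1, 2            # hypothetical switch ports
THRESHOLD = 100 * 1024 * 1024     # bytes per 10 s poll (assumed)
SHUFFLE_PORT = 13562              # default YARN MapReduce shuffle port

class CongestionRerouter(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.datapaths = {}
        self.last_tx_bytes = {}
        hub.spawn(self._poll)

    @set_ev_cls(ofp_event.EventOFPStateChange, MAIN_DISPATCHER)
    def _register(self, ev):
        # Track switches as they connect (disconnects omitted for brevity).
        self.datapaths[ev.datapath.id] = ev.datapath

    def _poll(self):
        # Periodically request statistics for the primary port.
        while True:
            for dp in self.datapaths.values():
                parser = dp.ofproto_parser
                dp.send_msg(parser.OFPPortStatsRequest(dp, 0, PRIMARY))
            hub.sleep(10)

    @set_ev_cls(ofp_event.EventOFPPortStatsReply, MAIN_DISPATCHER)
    def _on_stats(self, ev):
        dp = ev.msg.datapath
        for stat in ev.msg.body:
            delta = stat.tx_bytes - self.last_tx_bytes.get(dp.id, stat.tx_bytes)
            self.last_tx_bytes[dp.id] = stat.tx_bytes
            if delta > THRESHOLD:       # primary path looks congested
                self._reroute_shuffle(dp)

    def _reroute_shuffle(self, dp):
        # Steer Hadoop shuffle traffic onto the backup port.
        ofp, parser = dp.ofproto, dp.ofproto_parser
        match = parser.OFPMatch(eth_type=0x0800, ip_proto=6,
                                tcp_dst=SHUFFLE_PORT)
        actions = [parser.OFPActionOutput(BACKUP)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=30,
                                      match=match, instructions=inst))
```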



Author(s):  
Anh D. Ta ◽  
Marcus Tanque ◽  
Montressa Washington

Given the emergence of big data technology and its rising popularity, it is important to ensure that the use of this avant-garde technology directly addresses the enterprise goals required to maximize the return on investment (ROI). This chapter presents a specification framework for the process of transforming enterprise data into wisdom or actionable information through the use of big data technology. The framework is based on proven methodologies and consists of three components: Specify, Design, and Refine. It provides a systematic, top-down process for extrapolating big data requirements from high-level technical and enterprise goals, as well as a process for managing the quality of, and the relationships between, raw data sources and big data products.
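As a sketch of how the Specify step might record traceability from an enterprise goal down to the raw data sources feeding a big data product, consider the following data model; all names, fields, and the quality metric are hypothetical illustrations, not taken from the chapter.

```python
# A hypothetical data model for the Specify step: trace an enterprise
# goal to its big data product and the raw sources behind it, so the
# Refine step can manage source quality. Names are illustrative only.
from dataclasses import dataclass, field

@dataclass
class DataSource:
    name: str
    quality_score: float  # e.g., completeness, from 0.0 to 1.0

@dataclass
class BigDataRequirement:
    enterprise_goal: str                 # high-level business objective
    product: str                         # big data product serving the goal
    sources: list[DataSource] = field(default_factory=list)

    def lowest_quality_source(self) -> DataSource:
        # Flag the weakest link between raw data and the data product.
        return min(self.sources, key=lambda s: s.quality_score)

req = BigDataRequirement(
    enterprise_goal="Reduce warranty costs by 10%",
    product="warranty-claims dashboard",
    sources=[DataSource("dealer CRM", 0.92),
             DataSource("telematics feed", 0.71)],
)
print(req.lowest_quality_source().name)  # -> telematics feed
```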

