A Survey on Graph Database Management Techniques for Huge Unstructured Data

Over the last decade, data analysis, data management, and big data have played a major role from both social and business perspectives. The graph database is currently a trending research topic. A graph database is preferred for handling the dynamic and complex relationships in connected data and can offer better results. Every data element is represented as a node. For example, on a social media site, a person is represented as a node with properties such as name, age, likes, and dislikes, and nodes are connected to one another by relationships, represented as edges. Graph databases are expected to benefit businesses and social networking sites that generate huge volumes of unstructured data, since such Big Data requires proper and efficient computational techniques to handle it. This paper reviews the existing graph data computational techniques and research work in order to outline future research directions in graph database management.
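
The node, property, and edge model described in this abstract can be illustrated with a minimal in-memory sketch. It is not taken from the surveyed paper: the class `PropertyGraph`, its method names, the relationship labels, and the sample people are all hypothetical, and a real graph database would add persistence, indexing, and a query language on top of this idea.

```python
class PropertyGraph:
    """Tiny in-memory property graph: nodes carry properties, edges carry a relationship type."""

    def __init__(self):
        self.nodes = {}   # node id -> dict of properties
        self.edges = {}   # source id -> list of (relationship, target id)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, relationship, target):
        # Both endpoints must already exist as nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before they are connected")
        self.edges.setdefault(source, []).append((relationship, target))

    def neighbors(self, node_id, relationship=None):
        """Return ids reachable over one outgoing edge, optionally filtered by relationship."""
        return [
            target
            for rel, target in self.edges.get(node_id, [])
            if relationship is None or rel == relationship
        ]


# Hypothetical social-media data: each person is a node with properties.
graph = PropertyGraph()
graph.add_node("alice", name="Alice", age=30, likes=["hiking"], dislikes=["spam"])
graph.add_node("bob", name="Bob", age=27, likes=["chess"], dislikes=[])
graph.add_node("carol", name="Carol", age=35, likes=["music"], dislikes=[])

# Relationships between people are stored as typed, directed edges.
graph.add_edge("alice", "FRIEND_OF", "bob")
graph.add_edge("alice", "FOLLOWS", "carol")

friends_of_alice = graph.neighbors("alice", "FRIEND_OF")
```

Storing edges per source node makes a one-hop traversal a single dictionary lookup, which is the access pattern that relationship-heavy queries rely on.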

An analytical study of information extraction from unstructured and multidimensional big data

Journal Of Big Data ◽

10.1186/s40537-019-0254-8 ◽

2019 ◽

Vol 6 (1) ◽

Cited By ~ 7

Author(s):

Kiran Adnan ◽

Rehan Akbar

Keyword(s):

Big Data ◽

Information Extraction ◽

Data Analytics ◽

Data Extraction ◽

Research Work ◽

Big Data Analytics ◽

Unstructured Data ◽

Future Research ◽

Data Types ◽

Future Research Directions

Abstract: The process of information extraction (IE) is used to extract useful information from unstructured or semi-structured data. Big data raises new challenges for IE techniques with the rapid growth of multifaceted, also called multidimensional, unstructured data. Traditional IE systems are too inefficient to deal with this deluge of unstructured big data. The volume and variety of big data demand improved computational capabilities from these IE systems. It is necessary to understand the capabilities and limitations of the existing IE techniques related to data pre-processing, data extraction and transformation, and representation for huge volumes of multidimensional unstructured data. Numerous studies have been conducted on IE, addressing the challenges and issues for different data types such as text, image, audio and video. Very little consolidated research has investigated the task-dependent and task-independent limitations of IE covering all data types in a single study. This research work addresses this limitation and presents a systematic literature review of state-of-the-art techniques for a variety of big data, consolidating all data types. Recent challenges of IE are also identified and summarized. Potential solutions are proposed, giving future research directions in big data IE. The research is significant in terms of recent trends and challenges related to big data analytics. The outcome of the research and its recommendations will help to improve big data analytics by making it more productive.

An experimental investigation of the impact of using big data analytics on customers’ performance measurement

Accounting Research Journal ◽

10.1108/arj-04-2020-0080 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Marwa Rabe Mohamed Elkmash ◽

Magdy Gamal Abdel-Kader ◽

Bassant Badr El Din

Keyword(s):

Big Data ◽

Performance Measurement ◽

Data Analytics ◽

Positive Impact ◽

Big Data Analytics ◽

Unstructured Data ◽

Future Research ◽

Web Based ◽

Content Type ◽

The Impact

Purpose: This study investigates the impact of big data analytics (BDA) as a mechanism that could develop the ability to measure customers’ performance. To accomplish the research aim, the theoretical discussion was developed by combining the diffusion of innovation theory with the technology acceptance model (TAM), which is less developed in the research field of this study.

Design/methodology/approach: Empirical data were obtained using Web-based quasi-experiments with 104 Egyptian accounting professionals. The Wilcoxon signed-rank test and the chi-square goodness-of-fit test were used to analyze the data.

Findings: The empirical results indicate that measuring customers’ performance based on BDA increases the organizations’ ability to analyze customers’ unstructured data, decreases the cost of analyzing that data, increases the ability to handle customers’ problems quickly, minimizes the time spent analyzing customers’ data and obtaining customers’ performance reports, and controls managers’ bias when they measure customer satisfaction. The study findings supported the accounting professionals’ acceptance of BDA through the TAM elements: the intention to use (R), perceived usefulness (U) and perceived ease of use (E).

Research limitations/implications: This study has several limitations that could be addressed in future research. First, this study focuses on customers’ performance measurement (CPM) only and ignores other performance measurements such as employees’ performance measurement and financial performance measurement. Future research can examine these areas. Second, this study conducts a Web-based experiment with Master of Business Administration students as participants; researchers could conduct a laboratory experiment and report whether there are differences. Third, owing to the novelty of the topic, there was a lack of theoretical evidence for developing the study’s hypotheses.

Practical implications: This study provides much-needed empirical evidence of the positive impact of BDA on CPM efficiency through the proposed framework (i.e. CPM and BDA framework). Furthermore, this study contributes to improving the performance measurement process, and thus the decision-making process, with meaningful and proper insights gained by collecting and analyzing customers’ unstructured data. On a practical level, a company could eventually use this study’s results and the new insights to make better decisions and develop its policies.

Originality/value: This study is significant because it provides much-needed empirical evidence of the positive impact of BDA on CPM efficiency. The study findings will contribute to enhancing the performance measurement process through the ability to gather and analyze customers’ unstructured data.

Applications of Big Data in the Digital India: Opportunities and Challenges

IRA-International Journal of Technology & Engineering (ISSN 2455-4480) ◽

10.21013/jte.v3.n3.p7 ◽

2016 ◽

Vol 3 (3) ◽

Author(s):

Vinay Kumar ◽

Arpana Chaturvedi

Keyword(s):

Big Data ◽

Social Networking ◽

Exponential Growth ◽

Social Networking Sites ◽

Unstructured Data ◽

Threat Perception ◽

Data Repository ◽

Huge Amount ◽

Real Challenge ◽

Area Of Application

With the advent of Social Networking Sites (SNS), large volumes of data are generated daily. Most of these data are multimedia and unstructured, and they are growing exponentially. This exponential growth in the variety, volume and complexity of structured and unstructured data leads to the concept of big data. Managing big data and harnessing its benefits is a real challenge. With increasing access to big data repositories for various applications, security and access control are further aspects that need to be considered when managing big data. We discuss areas of application of big data, the opportunities it provides, and the challenges faced in managing such huge amounts of data for various applications. Issues related to the security of big data against different threat perceptions are also discussed.

The Impact of Big Data on Security

Big Data ◽

10.4018/978-1-4666-9840-6.ch068 ◽

2016 ◽

pp. 1495-1518

Author(s):

Mohammad Alaa Hussain Al-Hamami

Keyword(s):

Social Media ◽

Big Data ◽

Management System ◽

Database Management ◽

Database Systems ◽

Structured Data ◽

Database Management System ◽

Unstructured Data ◽

And Behavior ◽

The Impact

Big Data is comprised systems, to remain competitive by techniques emerging due to Big Data. Big Data includes structured, semi-structured, and unstructured data. Structured data are data formatted for use in a database management system. Semi-structured and unstructured data include all types of unformatted data, including multimedia and social media content. Among practitioners and applied researchers, the reaction to data available through blogs, Twitter, Facebook, or other social media can be described as a “data rush” promising new insights about consumers' choices and behavior and many other issues. In the past, Big Data was used only by very large organizations, governments and large enterprises that had the ability to create their own infrastructure for hosting and mining large amounts of data. This chapter shows the requirements for Big Data environments to be protected using the same rigorous security strategies applied to traditional database systems.

Surveillance of Type I and II Diabetic Subjects on Physical Characteristics

Advances in Web Technologies and Engineering - Challenges and Opportunities for the Convergence of IoT, Big Data, and Cloud Computing ◽

10.4018/978-1-7998-3111-2.ch017 ◽

2021 ◽

pp. 308-344

Author(s):

Rohit Rastogi ◽

Devendra Kumar Chaturvedi ◽

Parul Singhal

Keyword(s):

Big Data ◽

Data Analysis ◽

Electronic Health Records ◽

Analysis Data ◽

Healthcare Systems ◽

Clinical Analysis ◽

Big Data Analysis ◽

Diagnostic Information ◽

Unstructured Data ◽

Type I

The Delhi and NCR healthcare systems are rapidly registering electronic health records and making diagnostic information available electronically. Furthermore, clinical analysis is rapidly advancing, large quantities of information are being examined, and the new insights gained are part of the analysis of this technology, experienced as big data. It provides tools for storing, managing, studying, and assimilating large amounts of robust, structured, and unstructured data generated by existing medical organizations. Recently, data analysis has been used to help provide care. The present study aimed to analyse diabetes with the latest IoT and big data analysis techniques, and its correlation with stress (TTH), on human health. The authors have tried to include age, gender, and insulin factor and their correlation with diabetes. In conclusion, TTH cases increase with age in males and do not follow the pattern of diabetes variation with age, while in females the TTH pattern varies in the same way as diabetes (i.e., an increasing trend up to the age of 60, then a decreasing one).

Limitations of information extraction methods and techniques for heterogeneous unstructured big data

International Journal of Engineering Business Management ◽

10.1177/1847979019890771 ◽

2019 ◽

Vol 11 ◽

pp. 184797901989077 ◽

Cited By ~ 1

Author(s):

Kiran Adnan ◽

Rehan Akbar

Keyword(s):

Big Data ◽

Information Extraction ◽

Extraction Methods ◽

Unstructured Data ◽

Future Research ◽

Data Sets ◽

Data Types ◽

Efficiency And Effectiveness ◽

Single Data ◽

The Impact

In the recent era of big data, a huge volume of unstructured data is being produced in various forms such as audio, video, images, text, and animation. Making effective use of this unstructured big data is a laborious and tedious task. Information extraction (IE) systems help to extract useful information from this large variety of unstructured data. Several techniques and methods have been presented for IE from unstructured data. However, the numerous studies conducted on IE from unstructured data are limited to single data types such as text, image, audio, or video. This article reviews the existing IE techniques along with their subtasks, limitations, and challenges for the variety of unstructured data, highlighting the impact of unstructured big data on IE techniques. To the best of our knowledge, no comprehensive study has investigated the limitations of existing IE techniques for the variety of unstructured big data. The objective of the structured review presented in this article is twofold. First, it presents an overview of IE techniques for a variety of unstructured data such as text, image, audio, and video in one place. Second, it investigates the limitations of these existing IE techniques that arise from the heterogeneity, dimensionality, and volume of unstructured big data. The review finds that organizations urgently need advanced IE techniques, particularly for multifaceted unstructured big data sets, to manage big data and derive strategic information. Further, potential solutions are presented to improve unstructured big data IE systems in future research. These solutions will help to increase the efficiency and effectiveness of the data analytics process in terms of context-aware analytics systems, data-driven decision-making, and knowledge management.

Big Data Storage System Based on a Distributed Hash Tables System

International Journal of Database Management Systems ◽

10.5121/ijdms.2020.12501 ◽

2020 ◽

Vol 12 (5) ◽

pp. 1-9

Author(s):

Telesphore Tiendrebeogo ◽

Mamadou Diarra

Keyword(s):

Big Data ◽

Data Storage ◽

Storage System ◽

Research Work ◽

Future Research ◽

Distributed Hash Tables ◽

Hash Tables ◽

Wide Range ◽

Data Storage System ◽

Big Data Storage

Big Data is unavoidable, considering that digital is the predominant form of communication in the daily life of the consumer. Mastering what is at stake and ensuring the quality of the data must be a priority so as not to distort the strategies that arise from processing it with the aim of deriving profit. To achieve this, a great deal of research work has been carried out by companies, and several platforms have been created. MapReduce, one of the enabling technologies, has proven to be applicable to a wide range of fields. However, despite its importance, recent work has shown its limitations. To remedy this, Distributed Hash Tables (DHT) have been used. Thus, this document not only analyses the and MapReduce implementations and Top-Level Domain (TLD)s in general, but it also provides a description of a model of DHT as well as some guidelines for planning future research.
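
The abstract does not describe the DHT model it proposes, so the following is only a generic sketch of the idea behind a distributed hash table, using consistent hashing to decide which storage node owns a key. The class names `HashRing` and `DistributedStore`, the node names, and the `replicas` parameter are assumptions introduced for illustration, and the "nodes" here are plain dictionaries in one process rather than networked machines.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hashing ring: each node owns several virtual positions."""

    def __init__(self, nodes, replicas=50):
        if not nodes:
            raise ValueError("at least one node is required")
        ring = []
        for node in nodes:
            for i in range(replicas):
                ring.append((_hash(f"{node}#{i}"), node))
        ring.sort()
        self._positions = [position for position, _ in ring]
        self._owners = [node for _, node in ring]

    def node_for(self, key: str) -> str:
        # The first virtual position clockwise from the key's hash owns the key;
        # the modulo wraps around to the start of the ring.
        index = bisect.bisect_right(self._positions, _hash(key)) % len(self._positions)
        return self._owners[index]


class DistributedStore:
    """Key-value store that routes each key to the node chosen by the ring."""

    def __init__(self, node_names):
        self.ring = HashRing(node_names)
        self.storage = {name: {} for name in node_names}  # stand-in for remote nodes

    def put(self, key, value):
        self.storage[self.ring.node_for(key)][key] = value

    def get(self, key):
        return self.storage[self.ring.node_for(key)].get(key)


store = DistributedStore(["node-a", "node-b", "node-c"])  # hypothetical node names
store.put("user:42", {"name": "Alice"})
value = store.get("user:42")
```

Because lookups hash the key the same way as writes, any client that knows the node list can locate a value without a central index, which is the property that makes DHTs attractive for large-scale storage.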

Factors Affecting The Usability Of Unstructured Big Data

Journal of Independent Studies and Research - Computing ◽

10.31645/09 ◽

2020 ◽

Author(s):

Joshua Devadason ◽

Rehan Akbar

Keyword(s):

Big Data ◽

Research Work ◽

Business Environment ◽

Market Analysis ◽

Unstructured Data ◽

Future Trends ◽

Survey Questionnaire ◽

Data Types ◽

Factors Affecting ◽

Number Of Factors

Big data is a valuable asset for organisations, as analysing it helps them understand their customers, changes within their business environment, market analysis and future trends. Big data is multifaceted (comprising different, versatile data types) and mostly exists in unstructured formats. Extracting value from this data is challenging, and the usability and productivity of this multifaceted unstructured data are greatly compromised. A number of factors and associated reasons affect the usability of unstructured big data. The present research work investigates these factors and the reasons behind the usability issues of multifaceted unstructured big data. Identifying these factors contributes to developing solutions that improve the usability of highly unstructured big data. A detailed study of the existing literature, followed by a survey questionnaire, was conducted to identify the factors and their reasons. Descriptive statistics were used to analyse and interpret the data and results.

Information Extraction from Multifaceted Unstructured Big Data

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1074.0882s819 ◽

2019 ◽

Vol 8 (2S8) ◽

pp. 1398-1404

Keyword(s):

Big Data ◽

Information Extraction ◽

Data Analytics ◽

Extraction Methods ◽

Vital Role ◽

Unstructured Data ◽

Digital Data ◽

High Rate ◽

Smart Manufacturing ◽

Future Research

In the era of digital globalization, a huge volume and variety of data is being produced at a very high rate. Every day, the world produces around 2.5 quintillion bytes of data. According to IDC, by 2020, over 40 zettabytes of data will be generated and reproduced. Digital data have become a deluge, overwhelming every field of information technology (IT), business, science and engineering. These fields are shifting to smart and advanced technologies such as smart manufacturing industries, data-aware medical sciences, and other smart applications. These applications support industries in data-driven decision making, big data storage, and complex analysis of large data sets. They also contribute to the big data deluge, in which the variety of data requires industries to use advanced IT approaches. Unstructured data make up 95% of the digital universe. It is rich data, as it contains information that can play a vital role in improving big data analytics. The heterogeneity, complexity, lack of structured information, poor quality and scalability of unstructured data make it difficult to adapt traditional information extraction techniques. Information extraction can play a vital role in transforming unstructured data into useful information. A multistep pipeline with data preprocessing steps, extraction methods and representation is essential to improving unstructured data analytics. In this regard, this paper presents a short review of the information extraction process with respect to input data type, extraction methods with their corresponding techniques, and representation of extracted information. The issues with unstructured data, the challenges of information extraction from multifaceted unstructured big data, and future research directions are also discussed.
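
The multistep pipeline mentioned in this abstract (preprocessing, extraction, representation) can be sketched for the simplest case of plain text. This is an illustrative assumption rather than the paper's method: the function names, the regular expressions, the choice of emails and ISO dates as target fields, and the sample note are all hypothetical, and real IE systems for images, audio, or video need far richer techniques.

```python
import json
import re

# Hypothetical extraction targets: email addresses and ISO-formatted dates.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
ISO_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def preprocess(raw: str) -> str:
    """Step 1: normalize whitespace so later patterns see clean text."""
    return re.sub(r"\s+", " ", raw).strip()


def extract(text: str) -> dict:
    """Step 2: pull structured fields out of the free text."""
    return {"emails": EMAIL.findall(text), "dates": ISO_DATE.findall(text)}


def represent(text: str, fields: dict) -> str:
    """Step 3: emit a structured record (JSON) that downstream analytics can load."""
    return json.dumps({"source_length": len(text), **fields}, sort_keys=True)


# Made-up unstructured input with irregular spacing.
raw_note = "  Ticket opened 2019-08-14 by\n jane.doe@example.com;\tfollow-up due 2019-08-21.  "
clean = preprocess(raw_note)
fields = extract(clean)
record = represent(clean, fields)
```

Keeping the three stages as separate functions means an extractor for another data type can be swapped in without touching the preprocessing or the output format.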

Data Preparation for Big Data Analytics

Advances in Business Information Systems and Analytics - Enterprise Big Data Engineering, Analytics, and Management ◽

10.4018/978-1-5225-0293-7.ch010 ◽

2016 ◽

pp. 157-170 ◽

Cited By ~ 3

Author(s):

Andreas Schmidt ◽

Martin Atzmueller ◽

Martin Hollender

Keyword(s):

Big Data ◽

Real World ◽

Data Analytics ◽

Big Data Analytics ◽

Unstructured Data ◽

Future Research ◽

Data Preparation ◽

Chemical Production ◽

Specific Project

This chapter provides an overview of methods for preprocessing structured and unstructured data in the scope of Big Data. Specifically, it summarizes the corresponding methods in the context of a real-world dataset in a petro-chemical production setting and describes state-of-the-art methods for data preparation for Big Data Analytics. Furthermore, the chapter discusses experiences and first insights from a specific project setting with respect to a real-world case study. Finally, interesting directions for future research are outlined.
