Information Extraction from Multifaceted Unstructured Big Data

2019 ◽  
Vol 8 (2S8) ◽  
pp. 1398-1404

In the era of digital globalization, a huge volume and variety of data is being produced at a very high rate. Every day, the world produces around 2.5 quintillion bytes of data. According to IDC, by 2020, over 40 zettabytes of data will be generated and reproduced. Digital data have become a deluge, overwhelming every field of information technology (IT), business, science and engineering. These fields are shifting to smart and advanced technologies such as smart manufacturing industries, data-aware medical sciences, and other smart applications. These applications support industries in data-driven decision making, big data storage, and complex analysis of large data sets. They also contribute to the big data deluge, in which the variety of data requires industries to use advanced IT approaches. 95% of the digital universe is unstructured data. It is rich data, as it contains information that can play a vital role in improving big data analytics. The heterogeneity, complexity, lack of structured information, poor quality and scalability of unstructured data make it difficult to adapt traditional information extraction techniques. Information extraction can play a vital role in transforming unstructured data into useful information. A multistep pipeline with data preprocessing steps, extraction methods and representation is an utmost requirement for improving unstructured data analytics. In this regard, this paper presents a short review of the information extraction process with respect to input data type, extraction methods with their corresponding techniques, and representation of extracted information. The issues with unstructured data, the challenges to information extraction from multifaceted unstructured big data, and future research directions are also discussed.

2019 ◽  
Vol 11 ◽  
pp. 184797901989077 ◽  
Author(s):  
Kiran Adnan ◽  
Rehan Akbar

During the recent era of big data, a huge volume of unstructured data is being produced in various forms such as audio, video, images, text, and animation. Effective use of this unstructured big data is a laborious and tedious task. Information extraction (IE) systems help to extract useful information from this large variety of unstructured data. Several techniques and methods have been presented for IE from unstructured data. However, numerous studies conducted on IE from a variety of unstructured data are limited to single data types such as text, image, audio, or video. This article reviews the existing IE techniques along with their subtasks, limitations, and challenges for the variety of unstructured data, highlighting the impact of unstructured big data on IE techniques. To the best of our knowledge, no comprehensive study has been conducted to investigate the limitations of existing IE techniques for the variety of unstructured big data. The objective of the structured review presented in this article is twofold. First, it presents an overview of IE techniques for a variety of unstructured data such as text, image, audio, and video in one place. Second, it investigates the limitations of these existing IE techniques due to the heterogeneity, dimensionality, and volume of unstructured big data. The review finds that advanced techniques for IE, particularly for multifaceted unstructured big data sets, are an utmost requirement for organizations to manage big data and derive strategic information. Further, potential solutions are presented to improve unstructured big data IE systems in future research. These solutions will help to increase the efficiency and effectiveness of the data analytics process in terms of context-aware analytics systems, data-driven decision-making, and knowledge management.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Kiran Adnan ◽  
Rehan Akbar

Abstract The process of information extraction (IE) is used to extract useful information from unstructured or semi-structured data. Big data raises new challenges for IE techniques with the rapid growth of multifaceted, also called multidimensional, unstructured data. Traditional IE systems are inefficient at dealing with this huge deluge of unstructured big data. The volume and variety of big data demand improvements in the computational capabilities of these IE systems. It is necessary to understand the competency and limitations of the existing IE techniques related to data pre-processing, data extraction and transformation, and representations for huge volumes of multidimensional unstructured data. Numerous studies have been conducted on IE, addressing the challenges and issues for different data types such as text, image, audio and video. Very limited consolidated research work has been conducted to investigate the task-dependent and task-independent limitations of IE covering all data types in a single study. This research work addresses this limitation and presents a systematic literature review of state-of-the-art techniques for a variety of big data, consolidating all data types. Recent challenges of IE are also identified and summarized. Potential solutions are proposed, giving future research directions in big data IE. The research is significant in terms of recent trends and challenges related to big data analytics. The outcome of the research and its recommendations will help to improve big data analytics by making it more productive.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Marwa Rabe Mohamed Elkmash ◽  
Magdy Gamal Abdel-Kader ◽  
Bassant Badr El Din

Purpose This study aims to investigate and explore the impact of big data analytics (BDA) as a mechanism that could develop the ability to measure customers’ performance. To accomplish the research aim, the theoretical discussion was developed through the combination of the diffusion of innovation theory with the technology acceptance model (TAM), which is less developed for the research field of this study. Design/methodology/approach Empirical data was obtained using Web-based quasi-experiments with 104 Egyptian accounting professionals. Further, the Wilcoxon signed-rank test and the chi-square goodness-of-fit test were used to analyze the data. Findings The empirical results indicate that measuring customers’ performance based on BDA increases the organizations’ ability to analyze the customers’ unstructured data, decreases the cost of customers’ unstructured data analysis, increases the ability to handle the customers’ problems quickly, minimizes the time spent analyzing the customers’ data and obtaining the customers’ performance reports, and controls managers’ bias when they measure customer satisfaction. The study findings supported the accounting professionals’ acceptance of BDA through the TAM elements: the intention to use (R), perceived usefulness (U) and the perceived ease of use (E). Research limitations/implications This study has several limitations that could be addressed in future research. First, this study focuses on customers’ performance measurement (CPM) only and ignores other performance measurements such as employees’ performance measurement and financial performance measurement. Future research can examine these areas. Second, this study conducts a Web-based experiment with Master of Business Administration students as the study’s participants; researchers could conduct a laboratory experiment and report whether there are differences. Third, owing to the novelty of the topic, there was a lack of theoretical evidence in developing the study’s hypotheses.
Practical implications This study succeeds in providing the much-needed empirical evidence for BDA’s positive impact in improving CPM efficiency through the proposed framework (i.e. CPM and BDA framework). Furthermore, this study contributes to the improvement of the performance measurement process, and thus of the decision-making process, with meaningful and proper insights through the capability of collecting and analyzing the customers’ unstructured data. On a practical level, a company could eventually use this study’s results and the new insights to make better decisions and develop its policies. Originality/value This study holds significance as it provides the much-needed empirical evidence for BDA’s positive impact in improving CPM efficiency. The study findings will contribute to the enhancement of the performance measurement process through the ability to gather and analyze the customers’ unstructured data.


Author(s):  
Andreas Schmidt ◽  
Martin Atzmueller ◽  
Martin Hollender

This chapter provides an overview of methods for preprocessing structured and unstructured data in the scope of Big Data. Specifically, this chapter summarizes the corresponding methods in the context of a real-world dataset in a petro-chemical production setting. The chapter describes state-of-the-art methods for data preparation for Big Data Analytics. Furthermore, the chapter discusses experiences and first insights in a specific project setting with respect to a real-world case study. Finally, interesting directions for future research are outlined.


Sensors ◽  
2021 ◽  
Vol 21 (7) ◽  
pp. 2282
Author(s):  
Shikah J. Alsunaidi ◽  
Abdullah M. Almuhaideb ◽  
Nehad M. Ibrahim ◽  
Fatema S. Shaikh ◽  
Kawther S. Alqudaihi ◽  
...  

The COVID-19 epidemic has caused a large number of human losses and havoc in the economic, social, societal, and health systems around the world. Controlling such an epidemic requires understanding its characteristics and behavior, which can be identified by collecting and analyzing the related big data. Big data analytics tools play a vital role in building the knowledge required for making decisions and taking precautionary measures. However, due to the vast amount of data available on COVID-19 from various sources, there is a need to review the roles of big data analysis in controlling the spread of COVID-19, presenting the main challenges and directions of COVID-19 data analysis, as well as providing a framework on the related existing applications and studies to facilitate future research on COVID-19 analysis. Therefore, in this paper, we conduct a literature review to highlight the contributions of several studies in the domain of COVID-19-based big data analysis. The study presents a taxonomy of several applications used to manage and control the pandemic. Moreover, this study discusses several challenges encountered when analyzing COVID-19 data. The findings of this paper suggest valuable future directions to be considered for further research and applications.


2015 ◽  
Vol 2015 ◽  
pp. 1-16 ◽  
Author(s):  
Ashwin Belle ◽  
Raghuram Thiagarajan ◽  
S. M. Reza Soroushmehr ◽  
Fatemeh Navidi ◽  
Daniel A. Beard ◽  
...  

The rapidly expanding field of big data analytics has started to play a pivotal role in the evolution of healthcare practices and research. It has provided tools to accumulate, manage, analyze, and assimilate large volumes of disparate, structured, and unstructured data produced by current healthcare systems. Big data analytics has recently been applied towards aiding the process of care delivery and disease exploration. However, the adoption rate and research development in this space are still hindered by some fundamental problems inherent within the big data paradigm. In this paper, we discuss some of these major challenges with a focus on three upcoming and promising areas of medical research: image, signal, and genomics based analytics. Recent research which targets utilization of large volumes of medical data while combining multimodal data from disparate sources is discussed. Potential areas of research within this field which have the ability to provide meaningful impact on healthcare delivery are also examined.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Rajesh Kumar Singh ◽  
Saurabh Agrawal ◽  
Abhishek Sahu ◽  
Yigit Kazancoglu

Purpose The proposed article is aimed at exploring the opportunities, challenges and possible outcomes of incorporating big data analytics (BDA) into the health-care sector. The purpose of this study is to find the research gaps in the literature and to investigate the scope of incorporating new strategies in the health-care sector for increasing the efficiency of the system. Design/methodology/approach For a state-of-the-art literature review, a systematic literature review has been carried out to find out research gaps in the field of healthcare using big data (BD) applications. A detailed research methodology including material collection, descriptive analysis and categorization is utilized to carry out the literature review. Findings BD analysis is rapidly being adopted in the health-care sector for utilizing precious information available in terms of BD. However, it puts forth certain challenges that need to be focused upon. The article identifies and explains the challenges thoroughly. Research limitations/implications The proposed study will provide useful guidance to health-care sector professionals for managing the health-care system. It will help academicians and physicians in evaluating, improving and benchmarking health-care strategies through BDA in the health-care sector. One of the limitations of the study is that it is based on a literature review, and more in-depth studies may be carried out for the generalization of results. Originality/value There are certain effective tools available in the market today that are currently being used by both small and large businesses and corporations. One of them is BD, which may be very useful for the health-care sector. A comprehensive literature review is carried out for research papers published between 1974 and 2021.

