Framework to extract context vectors from unstructured data using big data analytics

The rapidly expanding field of big data analytics has started to play a pivotal role in the evolution of healthcare practices and research. It has provided tools to accumulate, manage, analyze, and assimilate large volumes of disparate, structured, and unstructured data produced by current healthcare systems. Big data analytics has recently been applied to aid the process of care delivery and disease exploration. However, the adoption rate and research development in this space are still hindered by some fundamental problems inherent in the big data paradigm. In this paper, we discuss some of these major challenges with a focus on three upcoming and promising areas of medical research: image-, signal-, and genomics-based analytics. Recent research that targets the use of large volumes of medical data while combining multimodal data from disparate sources is discussed. Potential areas of research within this field that could have a meaningful impact on healthcare delivery are also examined.

An experimental investigation of the impact of using big data analytics on customers’ performance measurement

Accounting Research Journal ◽ 10.1108/arj-04-2020-0080 ◽ 2021 ◽ Vol ahead-of-print (ahead-of-print)

Author(s): Marwa Rabe Mohamed Elkmash ◽ Magdy Gamal Abdel-Kader ◽ Bassant Badr El Din

Keyword(s): Big Data ◽ Performance Measurement ◽ Data Analytics ◽ Positive Impact ◽ Big Data Analytics ◽ Unstructured Data ◽ Future Research ◽ Web Based ◽ Content Type ◽ The Impact

Purpose: This study investigates the impact of big data analytics (BDA) as a mechanism that could develop the ability to measure customers’ performance. To accomplish the research aim, the theoretical discussion combines the diffusion of innovation theory with the technology acceptance model (TAM), which is less developed in the research field of this study.

Design/methodology/approach: Empirical data were obtained using Web-based quasi-experiments with 104 Egyptian accounting professionals. The Wilcoxon signed-rank test and the chi-square goodness-of-fit test were used to analyze the data.

Findings: The empirical results indicate that measuring customers’ performance based on BDA increases the organizations’ ability to analyze customers’ unstructured data, decreases the cost of analyzing that data, increases the ability to handle customers’ problems quickly, minimizes the time spent analyzing customers’ data and obtaining customers’ performance reports, and controls managers’ bias when they measure customer satisfaction. The findings support the accounting professionals’ acceptance of BDA through the TAM elements: intention to use (R), perceived usefulness (U) and perceived ease of use (E).

Research limitations/implications: This study has several limitations that could be addressed in future research. First, it focuses on customers’ performance measurement (CPM) only and ignores other performance measurements such as employees’ performance measurement and financial performance measurement. Future research can examine these areas. Second, this study conducts a Web-based experiment with Master of Business Administration students as participants; researchers could conduct a laboratory experiment and report whether there are differences. Third, owing to the novelty of the topic, there was a lack of theoretical evidence for developing the study’s hypotheses.

Practical implications: This study provides much-needed empirical evidence of BDA’s positive impact on CPM efficiency through the proposed framework (i.e. the CPM and BDA framework). Furthermore, it contributes to improving the performance measurement process, and thus the decision-making process, with meaningful and proper insights gained through the capability of collecting and analyzing customers’ unstructured data. On a practical level, a company could eventually use this study’s results and the new insights to make better decisions and develop its policies.

Originality/value: This study is significant because it provides much-needed empirical evidence of BDA’s positive impact on CPM efficiency. The findings will contribute to enhancing the performance measurement process through the ability to gather and analyze customers’ unstructured data.
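
For readers unfamiliar with the Wilcoxon signed-rank test named in the methodology, the following is a minimal sketch of how its test statistic is computed for paired ratings. The data and the function name are hypothetical and are not taken from the study; zero differences are dropped and tied absolute differences share an average rank, which is one common convention.

```python
def wilcoxon_signed_rank_statistic(before, after):
    """Return (W+, W-): rank sums of positive and negative paired differences."""
    # Paired differences; pairs with no change are dropped.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    # Order by absolute size so that ranks can be assigned.
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        # Tied values share the average of the 1-based ranks i+1 .. j+1.
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical ratings from six participants, without and with a BDA report.
before = [3, 4, 2, 5, 3, 4]
after = [4, 4, 4, 3, 5, 5]
w_plus, w_minus = wilcoxon_signed_rank_statistic(before, after)
print(w_plus, w_minus)
```

The smaller of the two sums is usually compared against a critical value or converted to a p-value; a statistics library would normally handle that step.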

Architecture of modern platforms for big data analytics

Advanced Information Technology ◽ 10.17721/ait.2021.1.09 ◽ 2021 ◽ pp. 67-74

Author(s): Liudmyla Zubyk ◽ Yaroslav Zubyk

Keyword(s): Big Data ◽ Data Analytics ◽ Business Processes ◽ Service Providers ◽ Big Data Analytics ◽ Business Value ◽ Third Party ◽ Unstructured Data ◽ Business Decision ◽ World Industry

Big data is one of the modern tools that have greatly affected world industry. It also plays an important role in determining the ways in which businesses and organizations formulate their strategies and policies. However, very little academic research has been conducted into forecasting based on big data, owing to the difficulties in capturing, collecting, handling, and modeling unstructured data, which is normally characterized by its confidentiality. We define big data in the context of an ecosystem for future forecasting in business decision-making. It can be difficult for a single organization to possess all of the capabilities necessary to derive strategic business value from its findings. That is why different organizations will build and operate their own analytics ecosystems or tap into existing ones. An analytics ecosystem comprising a symbiosis of data, applications, platforms, talent, partnerships, and third-party service providers lets organizations be more agile and adapt to changing demands. Organizations participating in analytics ecosystems can examine, learn from, and influence not only their own business processes, but also those of their partners. Architectures of popular platforms for forecasting based on big data are presented in this issue.

EOR/IOR Screening with Big Data Analytics and Natural Language Processing for Unstructured Data: A Statistical Approach

10.2118/181117-ms ◽ 2016

Author(s): Sardar Afra ◽ Mohammadali Tarrahi

Keyword(s): Big Data ◽ Natural Language Processing ◽ Natural Language ◽ Language Processing ◽ Data Analytics ◽ Statistical Approach ◽ Big Data Analytics ◽ Unstructured Data

Big Data Analytics for Processing Real-time Unstructured Data from CCTV in Traffic Management

2020 International Conference on Data Science and Its Applications (ICoDSA) ◽ 10.1109/icodsa50139.2020.9212858 ◽ 2020

Author(s): Faqih Hamami ◽ Iqbal Ahmad Dahlan ◽ Setya Widyawan Prakosa ◽ Khamal Fauzan Somantri

Keyword(s): Big Data ◽ Real Time ◽ Traffic Management ◽ Data Analytics ◽ Big Data Analytics ◽ Unstructured Data

SIDeKa: The Role of Information Technology for Knowledge Creation

KnE Life Sciences ◽ 10.18502/kls.v2i6.1078 ◽ 2017 ◽ Vol 2 (6) ◽ pp. 570

Author(s): Cungki Kusdarjito

Keyword(s): Information Technology ◽ Big Data ◽ Data Collection ◽ Knowledge Creation ◽ Data Analytics ◽ Big Data Analytics ◽ Unstructured Data ◽ Village Level ◽ The Village

The advancement of big data analytics is paving the way for knowledge creation based on very large, unstructured data. Currently, information is scattered and growing tremendously; it is abundant but difficult to interpret. Consequently, traditional approaches are no longer suitable for data that are unstructured yet very rich in information. This situation differs from the earlier role of information technology, in which information was based on structured data, stored locally and, in more advanced forms, retrieved through the internet. Meanwhile, in Indonesia data are collected by many institutions with different measurement standards. Data collection is top-down, carried out through surveys that are expensive yet unreliable, and the results are stored exclusively by the respective institutions. SIDeKa (Sistem Informasi Desa dan Kawasan/Village and Regional Information System), which is connected nationally, is proposed as a system of data collection at the village level, prepared by local people. Using SIDeKa, data reliability and readiness can be improved at the local level. The goal of SIDeKa is that local people not only have information at hand about their own village, such as poverty level, production, commodity prices, the area of cultivated land, and disease outbreaks, but also have information from neighboring villages or even at the national level. For the government, data reliability will improve policy effectiveness. This paper discusses the implementation and role of SIDeKa, which was initiated in 2015, for knowledge creation at the village level, especially for agricultural activities.

Keywords: big data analytics; SIDeKa; unstructured data.

Lachesis

Proceedings of the VLDB Endowment ◽ 10.14778/3457390.3457392 ◽ 2021 ◽ Vol 14 (8) ◽ pp. 1262-1275

Author(s): Jia Zou ◽ Amitabh Das ◽ Pratik Barhate ◽ Arun Iyengar ◽ Binhang Yuan ◽ ...

Keyword(s): Big Data ◽ Reinforcement Learning ◽ Data Analytics ◽ Big Data Analytics ◽ Learning Model ◽ Functional Dependency ◽ Unstructured Data ◽ Significant Challenge ◽ Reinforcement Learning Model

Partitioning is effective in avoiding expensive shuffling operations. However, it remains a significant challenge to automate this process for Big Data analytics workloads that make extensive use of user-defined functions (UDFs), where sub-computations are harder to reuse for partitioning than in relational applications. In addition, functional dependency, which is widely used for partitioning selection, is often unavailable in the unstructured data that is ubiquitous in UDF-centric analytics. We propose the Lachesis system, which represents UDF-centric workloads as workflows of analyzable and reusable sub-computations. Lachesis further adopts a deep reinforcement learning model to infer which sub-computations should be used to partition the underlying data. This analysis is then applied to automatically optimize the storage of the data across applications to improve performance and users' productivity.
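
As a rough illustration of why partitioning avoids shuffling, the sketch below hash-partitions two small datasets on the same key and then joins them one partition pair at a time. It is not the Lachesis system: the function names, the data, and the choice of CRC32 as the hash are assumptions made for this example, and the key extractors passed in stand in for user-defined functions.

```python
import zlib
from collections import defaultdict


def partition_by_key(records, key_fn, num_partitions):
    """Hash-partition records into num_partitions lists using key_fn(record)."""
    partitions = [[] for _ in range(num_partitions)]
    for record in records:
        # crc32 is deterministic across runs, unlike the built-in str hash.
        idx = zlib.crc32(str(key_fn(record)).encode("utf-8")) % num_partitions
        partitions[idx].append(record)
    return partitions


def local_join(left_parts, right_parts, left_key, right_key):
    """Join co-partitioned data pairwise; no record leaves its partition."""
    joined = []
    for left, right in zip(left_parts, right_parts):
        # Build a lookup table for the right side of this partition only.
        index = defaultdict(list)
        for r in right:
            index[right_key(r)].append(r)
        for l in left:
            for r in index.get(left_key(l), []):
                joined.append((l, r))
    return joined


# Hypothetical datasets; the lambdas play the role of key-extracting UDFs.
orders = [{"user": "a", "item": 1}, {"user": "b", "item": 2}, {"user": "a", "item": 3}]
users = [{"user": "a", "city": "X"}, {"user": "b", "city": "Y"}]

order_parts = partition_by_key(orders, lambda r: r["user"], 4)
user_parts = partition_by_key(users, lambda r: r["user"], 4)
result = local_join(order_parts, user_parts, lambda r: r["user"], lambda r: r["user"])
print(len(result))
```

Because both datasets use the same key function and partition count, matching records always land at the same partition index. Choosing which sub-computation to use as that key function is the decision the abstract says Lachesis automates.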

Data Preparation for Big Data Analytics

Advances in Business Information Systems and Analytics - Enterprise Big Data Engineering, Analytics, and Management ◽ 10.4018/978-1-5225-0293-7.ch010 ◽ 2016 ◽ pp. 157-170 ◽ Cited By ~ 3

Author(s): Andreas Schmidt ◽ Martin Atzmueller ◽ Martin Hollender

Keyword(s): Big Data ◽ Real World ◽ Data Analytics ◽ Big Data Analytics ◽ Unstructured Data ◽ Future Research ◽ Data Preparation ◽ Chemical Production ◽ Specific Project

This chapter provides an overview of methods for preprocessing structured and unstructured data in the scope of Big Data. Specifically, it summarizes the corresponding methods in the context of a real-world dataset from a petrochemical production setting. The chapter describes state-of-the-art methods of data preparation for Big Data Analytics. Furthermore, it discusses experiences and first insights from a specific project setting with respect to a real-world case study. Finally, interesting directions for future research are outlined.

An analytical study of information extraction from unstructured and multidimensional big data

Journal of Big Data ◽ 10.1186/s40537-019-0254-8 ◽ 2019 ◽ Vol 6 (1) ◽ Cited By ~ 7

Author(s): Kiran Adnan ◽ Rehan Akbar

Keyword(s): Big Data ◽ Information Extraction ◽ Data Analytics ◽ Data Extraction ◽ Research Work ◽ Big Data Analytics ◽ Unstructured Data ◽ Future Research ◽ Data Types ◽ Future Research Directions

The process of information extraction (IE) is used to extract useful information from unstructured or semi-structured data. Big data raises new challenges for IE techniques with the rapid growth of multifaceted, also called multidimensional, unstructured data. Traditional IE systems are inefficient at dealing with this deluge of unstructured big data. The volume and variety of big data demand improved computational capabilities from these IE systems. It is necessary to understand the competency and limitations of the existing IE techniques related to data pre-processing, data extraction and transformation, and representations for huge volumes of multidimensional unstructured data. Numerous studies have been conducted on IE, addressing the challenges and issues for different data types such as text, image, audio and video. Very little consolidated research has been conducted to investigate the task-dependent and task-independent limitations of IE covering all data types in a single study. This research work addresses this limitation and presents a systematic literature review of state-of-the-art techniques for a variety of big data, consolidating all data types. Recent challenges of IE are also identified and summarized. Potential solutions are proposed, giving future research directions in big data IE. The research is significant in terms of recent trends and challenges related to big data analytics. The outcome of the research and its recommendations will help improve big data analytics by making it more productive.

The role of data science in healthcare advancements: applications, benefits, and future prospects

Irish Journal of Medical Science (1971 -) ◽ 10.1007/s11845-021-02730-z ◽ 2021

Author(s): Sri Venkat Gunturi Subrahmanya ◽ Dasharathraj K. Shetty ◽ Vathsala Patil ◽ B. M. Zeeshan Hameed ◽ Rahul Paul ◽ ...

Keyword(s): Data Mining ◽ Decision Making ◽ Big Data ◽ Data Analytics ◽ Data Science ◽ Big Data Analytics ◽ Machine Learning Algorithms ◽ Unstructured Data ◽ Healthcare Applications ◽ Interdisciplinary Field

Data science is an interdisciplinary field that extracts knowledge and insights from structured and unstructured data, using scientific methods, data mining techniques, machine-learning algorithms, and big data. The healthcare industry generates large datasets of useful information on patient demography, treatment plans, results of medical examinations, insurance, etc. The data collected from Internet of Things (IoT) devices attract the attention of data scientists. Data science helps to process, manage, analyze, and assimilate the large quantities of fragmented, structured, and unstructured data created by healthcare systems. These data require effective management and analysis to yield factual results. The processes of data cleansing, data mining, data preparation, and data analysis used in healthcare applications are reviewed and discussed in the article. The article provides an insight into the status and prospects of big data analytics in healthcare, highlights the advantages, describes the frameworks and techniques used, outlines the challenges currently faced, and discusses viable solutions. Data science and big data analytics can provide practical insights and aid strategic decision-making concerning the health system. They help build a comprehensive view of patients, consumers, and clinicians. Data-driven decision-making opens up new possibilities to boost healthcare quality.
