Big Data Analytics in Medicine and Healthcare

Abstract: This paper surveys big data, highlighting big data analytics in medicine and healthcare. The big data characteristics of value, volume, velocity, variety, veracity, and variability are described. Big data analytics in medicine and healthcare covers the integration and analysis of large amounts of complex heterogeneous data, such as various -omics data (genomics, epigenomics, transcriptomics, proteomics, metabolomics, interactomics, pharmacogenomics, diseasomics), biomedical data, and electronic health records data. We underline the challenging issues of big data privacy and security. With regard to big data characteristics, some directions for using suitable and promising open-source distributed data processing software platforms are given.


Privacy-Aware Data Forensics of VRUs Using Machine Learning and Big Data Analytics

Security and Communication Networks ◽

10.1155/2021/3320436 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Muhammad Babar ◽

Muhammad Usman Tariq ◽

Ahmed S. Almasoud ◽

Mohammad Dahman Alshehri

Keyword(s):

Machine Learning ◽

Big Data ◽

Traffic Control ◽

Data Analytics ◽

Data Privacy ◽

Big Data Analytics ◽

Processing Unit ◽

Privacy And Security ◽

User Data ◽

Data Ingestion

The current expansion of big data has enabled the realization of AI and machine learning. With the rise of big data and machine learning, the idea of improving the accuracy and enhancing the efficacy of AI applications is also gaining prominence. In traffic applications, machine learning solutions provide improved safety in hazardous traffic circumstances. The existing architectures face various challenges, of which data privacy is the foremost for vulnerable road users (VRUs). The key reason for failure in traffic control for pedestrians is flawed handling of user privacy. User data are at risk and are prone to several privacy and security gaps. If an intruder succeeds in infiltrating the system, exposed data can be maliciously manipulated, fabricated, and misrepresented for illegitimate purposes. In this study, an architecture based on machine learning is proposed to analyze and process big data efficiently in a secure environment. The proposed model considers the privacy of users during big data processing. The proposed architecture is a layered framework with a parallel and distributed module that uses machine learning on big data to achieve secure big data analytics. The architecture includes a distinct unit for privacy management that uses a machine learning classifier. A stream processing unit is also integrated with the architecture to process the information. The proposed system is realized using real-time datasets from various sources and experimentally tested with reliable datasets that demonstrate the effectiveness of the proposed architecture. The data ingestion results are also highlighted, along with training and validation results.


4. Big data analytics

Big Data: A Very Short Introduction ◽

10.1093/actrade/9780198779575.003.0004 ◽

2017 ◽

pp. 44-58

Author(s):

Dawn E. Holmes

Keyword(s):

Big Data ◽

Data Analytics ◽

Big Data Analytics ◽

Processing System ◽

Distributed Data ◽

New Paradigm ◽

Distributed Data Processing ◽

Customer Preferences ◽

Classical Statistics ◽

Core Functionality

‘Big data analytics’ argues that big data is only useful if we can extract useful information from it. It looks at some of the techniques used to discover useful information from big data, such as customer preferences or how fast an epidemic is spreading. Big data analytics is changing rapidly as the size of the datasets increases and classical statistics makes room for this new paradigm. An example of big data analytics is the algorithmic method called MapReduce, a distributed data processing system that forms part of the core functionality of the Hadoop Ecosystem. Amazon, Google, Facebook, and many others use Hadoop to store and process their data.
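The MapReduce idea mentioned in this abstract can be illustrated with a minimal single-machine sketch. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-count task are assumptions chosen for illustration, and a real deployment would distribute the map and reduce steps across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Mapper: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data analytics", "big data platforms"])
print(counts)
```

Because each mapper call depends only on its own document and each reducer call only on one key's values, both phases can in principle run in parallel, which is the property that distributed systems exploit.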


Risks in Adoption and Implementation of Big Data Analytics

International Journal of Risk and Contingency Management ◽

10.4018/ijrcm.2021070101 ◽

2021 ◽

Vol 10 (3) ◽

pp. 1-11

Author(s):

Rajasekhara Mouly Potluri ◽

Narasimha Rao Vajjhala

Keyword(s):

Big Data ◽

Data Analytics ◽

Data Privacy ◽

Small And Medium Enterprises ◽

Big Data Analytics ◽

Small Companies ◽

Privacy And Security ◽

Adoption And Implementation ◽

Research Outcome ◽

Medium Enterprises

The research investigates the risks in adopting and implementing big data analytics in Indian micro, small, and medium enterprises (MSMEs). The researchers designed a survey questionnaire to collect responses from managers working in 50 Indian MSMEs representing five vital commercial sectors. The application and use of big data analytics pose several significant problems for small companies, as the investment in hardware and software resources is substantial. The study's findings provide empirical evidence of five critical challenges that Indian MSMEs face while adopting and implementing big data analytics: lack of human resources, data privacy and security, shortage of technological resources, lack of awareness, and financial implications. The findings emphasize the challenges that MSMEs face while leveraging the benefits of big data analytics. The research outcome will support MSMEs' organizational leadership in planning and developing short-term and long-term information systems strategies.


Real-time stream processing for Big Data

it - Information Technology ◽

10.1515/itit-2016-0002 ◽

2016 ◽

Vol 58 (4) ◽

Cited By ~ 7

Author(s):

Wolfram Wingerath ◽

Felix Gessert ◽

Steffen Friedrich ◽

Norbert Ritter

Keyword(s):

Big Data ◽

Data Analytics ◽

Big Data Analytics ◽

Sensor Data ◽

Distributed Data ◽

Qualitative Comparison ◽

Data Repositories ◽

Fine Grained ◽

Distributed Data Processing ◽

Trade Offs

Abstract: With the rise of Web 2.0 and the Internet of Things, it has become feasible to track all kinds of information over time, in particular fine-grained user activities and sensor data on their environment and even their biometrics. However, while efficiency remains mandatory for any application trying to cope with huge amounts of data, only part of the potential of today's Big Data repositories can be exploited using traditional batch-oriented approaches, as the value of data often decays quickly and high latency becomes unacceptable in some applications. In the last couple of years, several distributed data processing systems have emerged that deviate from the batch-oriented approach and tackle data items as they arrive, thus acknowledging the growing importance of timeliness and velocity in Big Data analytics. In this article, we give an overview of the state of the art of stream processors for low-latency Big Data analytics and conduct a qualitative comparison of the most popular contenders, namely Storm and its abstraction layer Trident, Samza, and Spark Streaming. We describe their respective underlying rationales and the guarantees they provide, and discuss the trade-offs that come with selecting one of them for a particular task.
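The contrast between batch-oriented processing and handling data items as they arrive can be sketched in a few lines. The example below is a plain Python generator, not code for Storm, Samza, or Spark Streaming; the function name `running_averages` and the `(sensor_id, value)` event shape are assumptions made for illustration.

```python
from collections import defaultdict


def running_averages(events):
    """Consume (sensor_id, value) events one at a time and yield the
    updated per-sensor mean after each event, without storing the stream."""
    counts = defaultdict(int)
    means = defaultdict(float)
    for sensor_id, value in events:
        counts[sensor_id] += 1
        # Incremental mean update: new_mean = old_mean + (x - old_mean) / n
        means[sensor_id] += (value - means[sensor_id]) / counts[sensor_id]
        yield sensor_id, means[sensor_id]


# A finite iterator stands in for an unbounded event source.
stream = iter([("s1", 10.0), ("s2", 4.0), ("s1", 20.0), ("s1", 30.0)])
results = list(running_averages(stream))
print(results)
```

A batch job would wait for the whole dataset before computing each mean; here an up-to-date result is available after every event, at the cost of keeping only a small amount of state per sensor.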


Leveraging Distributed Data Over Big Data Analytics Platform for Healthcare Services

2018 2nd International Conference on Trends in Electronics and Informatics (ICOEI) ◽

10.1109/icoei.2018.8553827 ◽

2018 ◽

Cited By ~ 4

Author(s):

Ramesh Mande ◽

G. JayaLakshmi ◽

Kalyan Chakravarti Yelavarti

Keyword(s):

Big Data ◽

Data Analytics ◽

Big Data Analytics ◽

Healthcare Services ◽

Distributed Data


I-Os in the Vanguard of Big Data Analytics and Privacy

Industrial and Organizational Psychology ◽

10.1017/iop.2015.83 ◽

2015 ◽

Vol 8 (4) ◽

pp. 555-563 ◽

Cited By ~ 7

Author(s):

Adam J. Ducey ◽

Nigel Guenole ◽

Sara P. Weiner ◽

Hailey A. Herleman ◽

Robert E. Gibby ◽

...

Keyword(s):

Big Data ◽

Informed Consent ◽

Data Analytics ◽

Data Privacy ◽

Big Data Analytics ◽

Global Scale ◽

Data Sources ◽

Critical Elements ◽

The World

In this response to Guzzo, Fink, King, Tonidandel, and Landis (2015), we suggest industrial–organizational (I-O) psychologists join business analysts, data scientists, statisticians, mathematicians, and economists in creating the vanguard of expertise as we acclimate to the reality of analytics in the world of big data. We enthusiastically accept their invitation to share our perspective that extends the discussion in three key areas of the focal article—that is, big data sources, logistic and analytic challenges, and data privacy and informed consent on a global scale. In the subsequent sections, we share our thoughts on these critical elements for advancing I-O psychology's role in leveraging and adding value from big data.


Big Data Analytics in Online Structural Health Monitoring

International Journal of Prognostics and Health Management ◽

10.36001/ijphm.2016.v7i4.2462 ◽

2020 ◽

Vol 7 (4) ◽

Author(s):

Guowei Cai ◽

Sankaran Mahadevan

Keyword(s):

Big Data ◽

Structural Health Monitoring ◽

Health Monitoring ◽

Data Analytics ◽

Structural Damage ◽

Big Data Analytics ◽

High Volume ◽

Heterogeneous Data ◽

Sensor Technology ◽

Structural Health

This manuscript explores the application of big data analytics in online structural health monitoring. As smart sensor technology is making progress and low-cost online monitoring is increasingly possible, large quantities of highly heterogeneous data can be acquired during the monitoring, thus exceeding the capacity of traditional data analytics techniques. This paper investigates big data techniques to handle the high-volume data obtained in structural health monitoring. In particular, we investigate the analysis of infrared thermal images for structural damage diagnosis. We explore the MapReduce technique to parallelize the data analytics and efficiently handle the high volume, high velocity, and high variety of information. In our study, MapReduce is implemented with the Spark platform, and image processing functions such as the uniform filter and the Sobel filter are wrapped in the mappers. The methodology is illustrated with concrete slabs, using actual experimental data with induced damage.
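The idea of wrapping an image filter inside a mapper can be sketched as follows. This is not the authors' Spark implementation: it uses Python's built-in `map` and `functools.reduce` as single-machine stand-ins, small nested lists instead of thermal images, and hypothetical names (`sobel_magnitude`, `mapper`, `reducer`) throughout.

```python
from functools import reduce

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude at each interior pixel of a 2D list."""
    rows, cols = len(image), len(image[0])
    result = []
    for r in range(1, rows - 1):
        row_out = []
        for c in range(1, cols - 1):
            gx = sum(SOBEL_X[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            row_out.append((gx * gx + gy * gy) ** 0.5)
        result.append(row_out)
    return result


def mapper(record):
    """Map one (image_id, image) record to (image_id, peak gradient)."""
    image_id, image = record
    magnitudes = sobel_magnitude(image)
    return image_id, max(max(row) for row in magnitudes)


def reducer(best, candidate):
    """Keep whichever record has the larger peak gradient."""
    return candidate if candidate[1] > best[1] else best


flat = [[5] * 4 for _ in range(4)]          # uniform image, no edges
edge = [[0, 0, 9, 9] for _ in range(4)]     # sharp vertical edge
mapped = list(map(mapper, [("flat", flat), ("edge", edge)]))
strongest = reduce(reducer, mapped)
print(mapped, strongest)
```

Each mapper call touches only one image, so the same functions could be handed to a distributed framework's map and reduce operations; how images are partitioned and loaded there is outside the scope of this sketch.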


A Long Short Term Memory with Peephole Connections and Generative Adversarial Network Based Collaborative Methodology to Identify Outliers in ECG Dataset

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9273 ◽

2020 ◽

Vol 17 (8) ◽

pp. 3798-3803

Author(s):

M. D. Anto Praveena ◽

B. Bharathi

Keyword(s):

Time Series ◽

Big Data ◽

Data Analytics ◽

Time Series Data ◽

Short Term Memory ◽

Big Data Analytics ◽

Data Preprocessing ◽

Heterogeneous Data ◽

Series Data ◽

Outlier Identification

Big data analytics has become a growing field, and it plays a pivotal role in healthcare and research practices. Big data analytics in healthcare covers the integration and analysis of vast amounts of dynamic heterogeneous data. Patients' medical records include several kinds of data, including medical conditions, medications, and test findings. One of the major challenges of analytics and prediction in healthcare is data preprocessing, and within data preprocessing, outlier identification and correction is an important challenge. Outliers are extreme values that deviate from the other values of an attribute; they may simply be experimental errors or novelties. Outlier identification is the process of identifying data objects whose behavior differs from what is expected. Detecting outliers in time series data differs from detecting them in ordinary data. Time series data are data recorded over a series of time periods. Outliers in this kind of data are identified and removed to produce a quality dataset. In the proposed work, a hybrid outlier detection algorithm, an extended LSTM-GAN, is used to recognize outliers in time series data. The proposed extended algorithm achieved better performance in time series analysis on ECG dataset processing compared with traditional methodologies.
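To make the notion of an outlier in time series data concrete, the sketch below flags points that deviate strongly from the recent past. It deliberately swaps in a much simpler technique than the LSTM-GAN described in the abstract: a rolling median with median absolute deviation (MAD). The function name, window size, threshold, and `min_scale` floor are all assumptions chosen for illustration, and the sample values are made up rather than taken from an ECG dataset.

```python
from statistics import median


def rolling_outliers(series, window=5, threshold=3.0, min_scale=0.05):
    """Return indices whose value deviates from the median of the preceding
    `window` points by more than `threshold` times the window's median
    absolute deviation (floored at `min_scale` so flat windows do not
    flag every small change)."""
    flagged = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        center = median(history)
        mad = median(abs(x - center) for x in history)
        scale = max(mad, min_scale)
        if abs(series[i] - center) / scale > threshold:
            flagged.append(i)
    return flagged


# Made-up signal with one obvious spike at index 6.
signal = [1.0, 1.1, 0.9, 1.0, 1.05, 1.0, 8.0, 1.0, 0.95, 1.1]
outlier_indices = rolling_outliers(signal)
print(outlier_indices)
```

Unlike a static threshold on the whole dataset, the comparison here is always against neighboring points in time, which is why time series outlier detection is treated separately from outlier detection on unordered data.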


Big Data Analytics in Healthcare

International Journal of Big Data and Analytics in Healthcare ◽

10.4018/ijbdah.2020010102 ◽

2020 ◽

Vol 5 (1) ◽

pp. 19-27

Author(s):

Jaimin Navinchandra Undavia ◽

Atul Manubhai Patel

Keyword(s):

Big Data ◽

Data Analytics ◽

Big Data Analytics ◽

Heterogeneous Data ◽

Simple Type ◽

Healthcare Industry ◽

Technological Advancement ◽

Huge Amount ◽

High Level ◽

Almost All

Technological advancement has opened up various ways to collect data through automatic mechanisms. One such mechanism collects a huge amount of data without any further maintenance or human intervention. The health industry sector has been confronted by the need to manage the big data being produced by various sources, which are well known for producing high volumes of heterogeneous data. A high level of sophistication has been incorporated into almost all industries, and healthcare is one of them. The article shows that a huge amount of data exists in the healthcare industry and that the data generated there are neither homogeneous nor of a simple type. The various sources and objectives of the data are also highlighted and discussed. As data come from various sources, they must be versatile in nature in all aspects. Big data analytics has therefore rightly and meaningfully penetrated the healthcare industry, and its impact is also highlighted.


Using Distributed Data over HBase in Big Data Analytics Platform for Clinical Services

Computational and Mathematical Methods in Medicine ◽

10.1155/2017/6120820 ◽

2017 ◽

Vol 2017 ◽

pp. 1-16 ◽

Cited By ~ 5

Author(s):

Dillon Chrimes ◽

Hamid Zamani

Keyword(s):

Big Data ◽

Data Analytics ◽

Big Data Analytics ◽

Patient Data ◽

Clinical Event ◽

Distributed Data ◽

Patient Records ◽

Clinical Services ◽

Hospital System ◽

Study Objective

Big data analytics (BDA) is important for reducing healthcare costs. However, there are many challenges in data aggregation, maintenance, integration, translation, analysis, and security/privacy. The study objective, to establish an interactive BDA platform with simulated patient data using open-source software technologies, was achieved by constructing a platform framework with the Hadoop Distributed File System (HDFS) using HBase (a key-value NoSQL database). Distributed data structures were generated from benchmarked hospital-specific metadata of nine billion patient records. At optimized iteration, HDFS ingestion of HFiles to HBase store files showed sustained availability over hundreds of iterations; however, completing MapReduce to HBase required a week for 10 TB and a month for 30 TB (three billion indexed patient records). Inconsistencies found in MapReduce limited the capacity to generate and replicate data efficiently. Apache Spark and Drill showed high performance, with high usability for technical support but poor usability for clinical services. A hospital system based on patient-centric data was challenging to implement in HBase, in that not all data profiles were fully integrated with the complex patient-to-hospital relationships. Nevertheless, we recommend using HBase to secure patient data while querying entire hospital volumes in a simplified clinical event model across clinical services.
