Predictors of Outpatients’ No-Show: Big Data Analytics using Apache Spark

Abstract Outpatients who fail to attend their appointments have a negative impact on healthcare outcomes. Healthcare organizations therefore face new opportunities, one of which is improving the quality of healthcare. The main challenge is predictive analysis using techniques capable of handling the huge volume of data generated. We propose a big data framework for identifying outpatients’ no-shows via feature engineering and machine learning (MLlib) on the Spark platform. This study evaluates the performance of five machine learning techniques using data from 2,011,813 outpatient visits. Across several experiments and different validation methods, Gradient Boosting (GB) performed best, raising accuracy and ROC to 79% and 81%, respectively. In addition, we show that exploring and evaluating the performance of machine learning models with various evaluation methods is critical, as prediction accuracy can differ significantly. The aim of this paper is to explore factors that affect the no-show rate and that can be used to formulate predictions using big data machine learning techniques.
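The abstract's point that the choice of evaluation method matters can be illustrated with a small sketch. This is not the authors' Spark MLlib pipeline: it is plain Python, the ten labels and both sets of scores are hypothetical toy values, and `accuracy` and `roc_auc` are helper names introduced here only for illustration. It shows two scoring rules that reach the same accuracy on imbalanced data while differing widely in area under the ROC curve.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of visits whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    This equals the probability that a randomly chosen positive (no-show)
    gets a higher score than a randomly chosen negative; ties count half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = no-show, 0 = attended (8 of 10 attended).
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Assumption: a trivial rule that gives every visit the same low score,
# so it always predicts "attended".
constant_scores = [0.1] * 10

# Assumption: a rule that ranks the two no-shows near the top but makes
# two threshold errors (one false alarm, one missed no-show).
ranking_scores = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.6, 0.2, 0.7, 0.4]

for name, scores in [("constant", constant_scores), ("ranking", ranking_scores)]:
    print(name, accuracy(labels, scores), roc_auc(labels, scores))
```

Both rules classify 8 of the 10 toy visits correctly, yet only the second one separates no-shows from attended visits, which is why reporting a ranking metric alongside accuracy is useful when one class dominates.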

Predictors of Outpatients’ No-Show: Big Data Analytics using Apache Spark

10.21203/rs.3.rs-33216/v3 ◽

2020 ◽

Author(s):

Tahani Daghistani ◽

Huda AlGhamdi ◽

Riyad Alshammari ◽

Raed H. AlHazme

Keyword(s):

Machine Learning ◽

Big Data ◽

Negative Impact ◽

Big Data Analytics ◽

Quality Of Healthcare ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Healthcare Organizations ◽

Data Framework ◽

Learning Techniques

Predictors of Outpatients’ No-Show: Big Data Analytics using Apache Spark

10.21203/rs.3.rs-33216/v2 ◽

2020 ◽

Author(s):

Tahani Daghistani ◽

Huda AlGhamdi ◽

Riyad Alshammari ◽

Raed H. AlHazme

Keyword(s):

Machine Learning ◽

Big Data ◽

Negative Impact ◽

Big Data Analytics ◽

Quality Of Healthcare ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Healthcare Organizations ◽

Data Framework ◽

Learning Techniques

Abstract Outpatients who fail to attend their appointments have a negative impact on healthcare outcomes. Healthcare organizations therefore face new opportunities, one of which is improving the quality of healthcare. The main challenge is predictive analysis using techniques capable of handling the huge volume of data generated. We propose a big data framework for identifying outpatients’ no-shows via feature engineering and machine learning (MLlib) on the Spark platform. This study evaluates the performance of five machine learning techniques using the outpatient visit data (2,011,813, 19). Across several experiments and different validation methods, Gradient Boosting (GB) performed best, raising accuracy and ROC to 79% and 81%, respectively. In addition, we show that exploring and evaluating the performance of machine learning models with various evaluation methods is critical, as prediction accuracy can differ significantly. The aim of this paper is to explore factors that affect the no-show rate and that can be used to formulate predictions using big data machine learning techniques.

Predictors of Outpatients’ No-Show: Big Data Analytics using Apache Spark

10.21203/rs.3.rs-33216/v4 ◽

2020 ◽

Author(s):

Tahani Daghistani ◽

Huda AlGhamdi ◽

Riyad Alshammari ◽

Raed H. AlHazme

Keyword(s):

Machine Learning ◽

Big Data ◽

Negative Impact ◽

Big Data Analytics ◽

Quality Of Healthcare ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Healthcare Organizations ◽

Data Framework ◽

Learning Techniques

Predictors of Outpatients’ No-Show: Big Data Analytics using Apache Spark

10.21203/rs.3.rs-33216/v1 ◽

2020 ◽

Author(s):

Tahani Daghistani ◽

Huda AlGhamdi ◽

Riyad Alshammari ◽

Raed H. AlHazme

Keyword(s):

Machine Learning ◽

Big Data ◽

Negative Impact ◽

Big Data Analytics ◽

Quality Of Healthcare ◽

Machine Learning Techniques ◽

Healthcare Organizations ◽

Data Framework ◽

Huge Data ◽

Learning Techniques

Big Data Analytics Processes in Industrial Internet of Things Systems: Sensing and Computing Technologies, Machine Learning Techniques, and Autonomous Decision-Making Algorithms

Journal of Self-Governance and Management Economics ◽

10.22381/jsme7420194 ◽

2019 ◽

Vol 7 (4) ◽

pp. 28 ◽

Cited By ~ 4

Keyword(s):

Machine Learning ◽

Decision Making ◽

Big Data ◽

Data Analytics ◽

Big Data Analytics ◽

Machine Learning Techniques ◽

Industrial Internet Of Things ◽

Autonomous Decision ◽

Learning Techniques ◽

Industrial Internet

Application Of Machine Learning Techniques, Big Data Analytics In Health Care Sector – A Literature Survey

2018 2nd International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) ◽

10.1109/i-smac.2018.8653654 ◽

2018 ◽

Author(s):

M. Sughasiny ◽

J. Rajeshwari

Keyword(s):

Machine Learning ◽

Health Care ◽

Big Data ◽

Data Analytics ◽

Big Data Analytics ◽

Health Care Sector ◽

Literature Survey ◽

Machine Learning Techniques ◽

Learning Techniques

Big Mac: A Distributed PaaS Framework for on Demand Big Data Processing Using Machine Learning Techniques

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.8634 ◽

2020 ◽

Vol 17 (1) ◽

pp. 92-100

Author(s):

Balanand Jha ◽

Kumar Abhishek ◽

Akshay Deepak ◽

Prakhar Shrivastav ◽

Suraj Thakre ◽

...

Keyword(s):

Machine Learning ◽

Big Data ◽

Data Analytics ◽

Big Data Analytics ◽

Machine Learning Techniques ◽

Computing Power ◽

On Demand ◽

Learning Techniques ◽

Ease Of Access ◽

Start Ups

In the age of start-ups and technical research, the demand for high-end computing power and large amounts of storage is ever increasing. Machine learning techniques have become an inseparable part of big data analytics. Setting up one’s own infrastructure to handle this scale is usually not feasible because of the high cost and the lack of the required expertise. As a solution to this problem, this paper proposes a system for big data analytics and machine learning based on the Hadoop and Spark frameworks that also supports Operating System (OS) Rental Services. The Machine Learning (ML) services offer the option either to use existing popular built-in models or to create one’s own model. The OS Rental services give users rented access to high-end infrastructure from their low-end devices. The entire implementation has been made open source for ease of access and to facilitate extensibility.

REVIEW OF MACHINE LEARNING TECHNIQUES FOR VOLUMINOUS INFORMATION MANAGEMENT

Journal of Soft Computing Paradigm - September 2019 ◽

10.36548/jscp.2019.2.005 ◽

2019 ◽

Vol 2019 (2) ◽

pp. 103-112

Author(s):

Dr. Pasumpon Pandian

Keyword(s):

Machine Learning ◽

Big Data ◽

Data Analytics ◽

Learning Algorithms ◽

Big Data Analytics ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Technological Growth ◽

Rapid Pace

Rapid technological growth has paved the way for big data, a term that denotes the exponential growth of information. Big data analytics has emerged as a promising technology that offers enhanced insight into the huge data sets produced in diverse areas. This review covers the methods of big data analytics and machine learning for handling huge data flows. An overview of how machine learning algorithms are used in the analytics of high-volume data supports a deeper and richer analysis of the gathered information, extracting what is valuable and turning it into actionable information. The paper reviews the role of machine learning algorithms in the analytics of high-volume data.

Industrial Big Data Analytics for Cognitive Internet of Things: Wireless Sensor Networks, Smart Computing Algorithms, and Machine Learning Techniques

Analysis and Metaphysics ◽

10.22381/am1820193 ◽

2019 ◽

Vol 18 (0) ◽

pp. 23 ◽

Cited By ~ 7

Keyword(s):

Machine Learning ◽

Big Data ◽

Data Analytics ◽

Big Data Analytics ◽

Machine Learning Techniques ◽

Wireless Sensor ◽

Learning Techniques ◽

Industrial Big Data ◽

Smart Computing ◽

Cognitive Internet Of Things

Credit Rating Forecasting Using Machine Learning Techniques

Advances in Data Mining and Database Management - Managerial Perspectives on Intelligent Big Data Analytics ◽

10.4018/978-1-5225-7277-0.ch010 ◽

2019 ◽

pp. 180-198 ◽

Cited By ~ 1

Author(s):

Mark Wallis ◽

Kuldeep Kumar ◽

Adrian Gepp

Keyword(s):

Machine Learning ◽

Big Data ◽

Data Analytics ◽

Credit Ratings ◽

Relative Effectiveness ◽

Big Data Analytics ◽

Parametric Models ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Missing Variables

Credit ratings are an important metric for business managers and a contributor to economic growth. Forecasting such ratings might be a suitable application of big data analytics. As machine learning is one of the foundations of intelligent big data analytics, this chapter presents a comparative analysis of traditional statistical models and popular machine learning models for the prediction of Moody's long-term corporate debt ratings. Machine learning techniques such as artificial neural networks, support vector machines, and random forests generally outperformed their traditional counterparts in terms of both overall accuracy and the Kappa statistic. The parametric models may be hindered by missing variables and restrictive assumptions about the underlying distributions in the data. This chapter reveals the relative effectiveness of non-parametric big data analytics to model a complex process that frequently arises in business, specifically determining credit ratings.
