IDMPF: intelligent diabetes mellitus prediction framework using machine learning

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Leila Ismail ◽  
Huned Materwala

Purpose Machine learning is an intelligent methodology used for prediction and has shown promising results in predictive classification. One of the critical areas in which machine learning can save lives is diabetes prediction. Diabetes is a chronic disease and one of the top 10 causes of death worldwide. The total number of people with diabetes is expected to reach 700 million in 2045, a 51.18% increase compared to 2019. These are alarming figures, and accurate diabetes prediction has therefore become urgent. Design/methodology/approach Health professionals and stakeholders are striving for classification models to support the prognosis of diabetes and formulate strategies for prevention. The authors conduct a literature review of machine learning models and propose an intelligent framework for diabetes prediction. Findings The authors provide a critical analysis of machine learning models, and propose and evaluate an intelligent machine learning-based architecture for diabetes prediction. Using this framework, the authors implement and evaluate the decision tree (DT)-based random forest (RF) and support vector machine (SVM) learning models for diabetes prediction, as these are the most widely used approaches in the literature. Originality/value This paper provides a novel intelligent diabetes mellitus prediction framework (IDMPF) using machine learning. The framework is the result of a critical examination of prediction models in the literature and their application to diabetes. The authors identify the training methodologies, model evaluation strategies and challenges in diabetes prediction, and propose solutions within the framework. The research results can be used by health professionals, stakeholders, students and researchers working in the diabetes prediction area.

2022 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Dinda Thalia Andariesta ◽  
Meditya Wasesa

Purpose This research presents machine learning models for predicting international tourist arrivals in Indonesia during the COVID-19 pandemic using multisource Internet data. Design/methodology/approach To develop the prediction models, this research utilizes multisource Internet data from TripAdvisor travel forum and Google Trends. Temporal factors, posts and comments, search queries index and previous tourist arrivals records are set as predictors. Four sets of predictors and three distinct data compositions were utilized for training the machine learning models, namely artificial neural networks (ANNs), support vector regression (SVR) and random forest (RF). To evaluate the models, this research uses three accuracy metrics, namely root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE). Findings Prediction models trained using multisource Internet data predictors have better accuracy than those trained using single-source Internet data or other predictors. In addition, using more training sets that cover the phenomenon of interest, such as COVID-19, will enhance the prediction model's learning process and accuracy. The experiments show that the RF models have better prediction accuracy than the ANN and SVR models. Originality/value First, this study pioneers the practice of a multisource Internet data approach in predicting tourist arrivals amid the unprecedented COVID-19 pandemic. Second, the use of multisource Internet data to improve prediction performance is validated with real empirical data. Finally, this is one of the few papers to provide perspectives on the current dynamics of Indonesia's tourism demand.
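
The three accuracy metrics named in the abstract above (RMSE, MAE and MAPE) can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the function names and the arrival figures are hypothetical, and MAPE is expressed in percent and assumes no actual value is zero.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (undefined if any actual value is 0)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical monthly arrivals (illustrative numbers, not taken from the study).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

RMSE penalizes large misses more heavily than MAE, while MAPE is scale-free, which is presumably why studies like this one report all three side by side.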


2022 ◽  
Vol 2022 ◽  
pp. 1-10
Author(s):  
Raja Krishnamoorthi ◽  
Shubham Joshi ◽  
Hatim Z. Almarzouki ◽  
Piyush Kumar Shukla ◽  
Ali Rizwan ◽  
...  

Diabetes is a chronic disease that continues to be a significant and global concern since it affects the entire population’s health. It is a metabolic disorder that leads to high blood sugar levels and many other problems such as stroke, kidney failure, and heart and nerve problems. Several researchers have attempted to construct an accurate diabetes prediction model over the years. However, this subject still faces significant open research issues due to a lack of appropriate data sets and prediction approaches, which pushes researchers to use big data analytics and machine learning (ML)-based methods. Applying four different machine learning methods, the research tries to overcome the problems and investigate healthcare predictive analytics. The study’s primary goal was to see how big data analytics and machine learning-based techniques may be used in diabetes. The examination of the results shows that the suggested ML-based framework may achieve a score of 86. Health experts and other stakeholders are working to develop categorization models that will aid in the prediction of diabetes and the formulation of preventative initiatives. The authors perform a review of the literature on machine models and suggest an intelligent framework for diabetes prediction based on their findings. Machine learning models are critically examined, and an intelligent machine learning-based architecture for diabetes prediction is proposed and evaluated by the authors. In this study, the authors utilize our framework to develop and assess decision tree (DT)-based random forest (RF) and support vector machine (SVM) learning models for diabetes prediction, which are the most widely used techniques in the literature at the time of writing. It is proposed in this study that a unique intelligent diabetes mellitus prediction framework (IDMPF) is developed using machine learning. 
The framework was developed after a rigorous review of existing prediction models in the literature and an examination of their applicability to diabetes. Using the framework, the authors describe the training procedures, model assessment strategies, and issues associated with diabetes prediction, as well as the solutions they propose. The findings of this study may be utilized by health professionals, stakeholders, students, and researchers who are involved in diabetes prediction research and development. The proposed work gives 83% accuracy with the minimum error rate.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Lei Li ◽  
Desheng Wu

Purpose The infraction of securities regulations (ISRs) by listed firms in their day-to-day operations and management has become a common problem. This paper proposes several machine learning approaches to forecast the infraction risk of listed corporates, addressing supervision that is currently neither effective nor precise. Design/methodology/approach The proposed research framework for forecasting ISRs includes data collection and cleaning, feature engineering, data split, prediction approach application and model performance evaluation. We select Logistic Regression, Naïve Bayes, Random Forest, Support Vector Machines, Artificial Neural Network and Long Short-Term Memory Networks (LSTMs) as ISRs prediction models. Findings The research results show that the proposed models that include prior infractions predict ISRs significantly better than those without them, especially for large sample sets. The results also indicate that, when judging whether a company has infractions, we should pay attention to novel artificial intelligence methods, previous infractions of the company, and large data sets. Originality/value The findings could be utilized, to a certain degree, to address the problem of identifying listed corporates' ISRs. Overall, the results elucidate the value of prior infractions of securities regulations (ISRs). This shows the importance of including more data sources when constructing distress models, rather than only building increasingly complex models on the same data. This is also beneficial to the regulatory authorities.


10.2196/19489 ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. e19489
Author(s):  
Tahmina Nasrin Poly ◽  
Md.Mohaimenul Islam ◽  
Muhammad Solihuddin Muhtar ◽  
Hsuan-Chia Yang ◽  
Phung Anh (Alex) Nguyen ◽  
...  

Background Computerized physician order entry (CPOE) systems are incorporated into clinical decision support systems (CDSSs) to reduce medication errors and improve patient safety. Automatic alerts generated from CDSSs can directly assist physicians in making useful clinical decisions and can help shape prescribing behavior. Multiple studies reported that approximately 90%-96% of alerts are overridden by physicians, which raises questions about the effectiveness of CDSSs. There is intense interest in developing sophisticated methods to combat alert fatigue, but there is no consensus on the optimal approaches so far. Objective Our objective was to develop machine learning prediction models to predict physicians’ responses in order to reduce alert fatigue from disease medication–related CDSSs. Methods We collected data from a disease medication–related CDSS from a university teaching hospital in Taiwan. We considered prescriptions that triggered alerts in the CDSS between August 2018 and May 2019. Machine learning models, such as artificial neural network (ANN), random forest (RF), naïve Bayes (NB), gradient boosting (GB), and support vector machine (SVM), were used to develop prediction models. The data were randomly split into training (80%) and testing (20%) datasets. Results A total of 6453 prescriptions were used in our model. The ANN machine learning prediction model demonstrated excellent discrimination (area under the receiver operating characteristic curve [AUROC] 0.94; accuracy 0.85), whereas the RF, NB, GB, and SVM models had AUROCs of 0.93, 0.91, 0.91, and 0.80, respectively. The sensitivity and specificity of the ANN model were 0.87 and 0.83, respectively. Conclusions In this study, ANN showed substantially better performance in predicting individual physician responses to an alert from a disease medication–related CDSS, as compared to the other models. 
To our knowledge, this is the first study to use machine learning models to predict physician responses to alerts; furthermore, it can help to develop sophisticated CDSSs in real-world clinical settings.
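
The evaluation protocol described in the abstract above (a random 80%/20% split, then sensitivity, specificity and accuracy on the held-out part) can be sketched as follows. This is a minimal sketch in plain Python under stated assumptions: the helper names, the fixed seed and the toy labels are hypothetical, label 1 is taken to be the positive class, and no model from the study is reproduced.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold out `test_fraction` of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity_specificity_accuracy(y_true, y_pred):
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), accuracy = (TP+TN)/N."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(y_true)


# Hypothetical example: ten record ids, split 80/20.
train_ids, test_ids = train_test_split(range(10))

# Hypothetical true and predicted labels (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
print(sensitivity_specificity_accuracy(y_true, y_pred))
```

Fixing the seed makes the split reproducible; the abstract does not say whether the original split was seeded or stratified.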


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Sanjay Sehgal ◽  
Ritesh Kumar Mishra ◽  
Florent Deisting ◽  
Rupali Vashisht

Purpose The main aim of the study is to identify some critical microeconomic determinants of financial distress and to design a parsimonious distress prediction model for an emerging economy like India. In doing so, the authors also attempt to compare the forecasting accuracy of alternative distress prediction techniques. Design/methodology/approach In this study, the authors use two alternative accounting information-based definitions of financial distress to construct a measure of financial distress. The authors then use the binomial logit model and two other popular machine learning–based models, namely artificial neural network and support vector machine, to compare the distress prediction accuracy rate of these alternative techniques for the Indian corporate sector. Findings The study’s empirical results suggest that five financial ratios, namely return on capital employed, cash flows to total liability, asset turnover ratio, fixed assets to total assets and debt to equity ratio, together with a measure of firm size (log total assets), play a highly significant role in distress prediction. The study’s findings suggest that machine learning-based models, namely support vector machine (SVM) and artificial neural network (ANN), are superior in terms of their prediction accuracy compared to the simple binomial logit model. Results also suggest that one-year-ahead forecasts are relatively better than the two-year-ahead forecasts. Practical implications The findings of the study have some important practical implications for creditors, policymakers, regulators and other stakeholders. First, rather than monitoring and collecting information on a long list of predictor variables, only the six most important variables need to be monitored to track the transition of a healthy firm into financial distress. Second, our six-factor model can be used to devise a sound early warning system for corporate financial distress. 
Third, machine learning–based distress prediction models have superior prediction accuracy over the commonly used time series model in the available literature for distress prediction involving a binary dependent variable. Originality/value This study is one of the first comprehensive attempts to investigate and design a parsimonious distress prediction model for the emerging Indian economy, which is currently facing high levels of corporate financial distress. Unlike previous studies, the authors use two different accounting information-based measures of financial distress in order to identify an effective way of measuring financial distress. Some of the determinants of financial distress identified in this study are different from those in the popular distress prediction models used in the literature. Our distress prediction model can be useful for distress prediction in other emerging markets.


Author(s):  
Cemil Kuzey ◽  
Ali Uyar ◽  
Dursun Delen

Purpose The paper aims to identify and critically analyze the factors influencing cost system functionality (CSF) using several machine learning techniques including decision trees, support vector machines and logistic regression. Design/methodology/approach The study used a self-administered survey method to collect the necessary data from companies conducting business in Turkey. Several prediction models are developed and tested; a series of sensitivity analyses is performed on the developed prediction models to assess the ranked importance of factors/variables. Findings Certain factors/variables influence CSF much more than others. The findings of the study suggest that utilization of management accounting practices requires a functional cost system, which is supported by a comprehensive cost data management process (i.e. acquisition, storage and utilization). Research limitations/implications First, the underlying data were collected using a questionnaire survey; thus, they are subjective and reflect the perceptions of the respondents, whereas ideally they would reflect the firms' practices objectively. Second, the authors measured CSF on a “Yes” or “No” basis, which does not allow survey respondents to reply in between; thus, it might have limited the choices of the respondents. Third, the Likert scales adopted in the measurement of the other constructs might limit the answers of the respondents. Practical implications Information technology plays a very important role in the success of CSF practices. That is, successful implementation of a functional cost system relies heavily on a fully integrated information infrastructure capable of constantly feeding CSF with accurate, relevant and timely data. 
Originality/value In addition to providing evidence regarding the factors underlying CSF based on a broad range of industries, this study also illustrates the viability of machine learning methods as a research framework to critically analyze domain-specific data.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Habeeb Balogun ◽  
Hafiz Alaka ◽  
Christian Nnaemeka Egwim

Purpose This paper seeks to assess the performance levels of BA-GS-LSSVM compared to popular standalone algorithms used to build NO2 prediction models. The purpose of this paper is to pre-process a relatively large set of NO2 data from Internet of Things (IoT) sensors with time-corresponding weather and traffic data and to use the data to develop NO2 prediction models using BA-GS-LSSVM and popular standalone algorithms to allow for a fair comparison. Design/methodology/approach This research installed and used data from 14 IoT emission sensors to develop machine learning predictive models for NO2 pollution concentration. The authors used big data analytics infrastructure to retrieve the large volume of data collected in tens of seconds for over 5 months. Weather data from the UK meteorology department and traffic data from the department for transport were collected and merged for the corresponding time and location where the pollution sensors exist. Findings The results show that the hybrid BA-GS-LSSVM outperforms all other standalone machine learning predictive models for NO2 pollution. Practical implications This paper's hybrid model provides a basis for giving an informed decision on the NO2 pollutant avoidance system. Originality/value This research installed and used data from 14 IoT emission sensors to develop machine learning predictive models for NO2 pollution concentration.


2019 ◽  
Vol 53 (4) ◽  
pp. 397-421 ◽  
Author(s):  
Guru Prasad Bhandari ◽  
Ratneshwer Gupta ◽  
Satyanshu Kumar Upadhyay

Purpose Software fault prediction is an important concept that can be applied at an early stage of the software life cycle. Effective prediction of faults may improve the reliability and testability of software systems. As service-oriented architecture (SOA)-based systems become more and more complex, the interaction between participating services increases. The component services may generate enormous reports and fault information. Although considerable research has focused on developing fault-proneness prediction models in service-oriented systems (SOS) using machine learning (ML) techniques, there has been little work on assessing how effective the source code metrics are for fault prediction. The paper aims to discuss this issue. Design/methodology/approach In this paper, the authors have proposed a fault prediction framework to investigate fault prediction in SOS using metrics of web services. The effectiveness of the model has been explored by applying six ML techniques, namely, Naïve Bayes, Artificial Neural Networks (ANN), Adaptive Boosting (AdaBoost), decision tree, Random Forests and Support Vector Machine (SVM), along with five feature selection techniques to extract the essential metrics. The authors have used accuracy, precision, recall, f-measure and area under the receiver operating characteristic curve values as performance measures. Findings The experimental results show that the proposed system can classify the fault-proneness of web services, whether the service is faulty or non-faulty, as a binary-valued output automatically and effectively. Research limitations/implications One possible threat to internal validity in the study is the unknown effects of undiscovered faults. Specifically, the authors have injected possible faults into the classes using Java C3.0 tool and only fixed faults are injected into the classes. 
However, considering the Java C3.0 community of development, testing and use, the authors can generalize that the undiscovered faults should be few and have less impact on the results presented in this study, and that the results may be limited to the investigated complexity metrics and the used ML techniques. Originality/value In the literature, only few studies have been observed to directly concentrate on metrics-based fault-proneness prediction of SOS using ML techniques. However, most of the contributions are regarding the fault prediction of the general systems rather than SOS. A majority of them have considered reliability, changeability, maintainability using a logging/history-based approach and mathematical modeling rather than fault prediction in SOS using metrics. Thus, the authors have extended the above contributions further by applying supervised ML techniques over web services metrics and measured their capability by employing fault injection methods.
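
Three of the performance measures listed in the abstract above (precision, recall and f-measure) follow directly from the counts of true positives, false positives and false negatives. The sketch below is a minimal illustration in plain Python, not the authors' implementation: the function name and the toy labels are hypothetical, "faulty" is assumed to be the positive class, and the f-measure is taken to be the balanced F1 score.

```python
def precision_recall_f1(y_true, y_pred, positive="faulty"):
    """Precision = TP/(TP+FP), recall = TP/(TP+FN), F1 = harmonic mean of the two."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical labels for five web services (illustrative only).
y_true = ["faulty", "faulty", "faulty", "non-faulty", "non-faulty"]
y_pred = ["faulty", "faulty", "non-faulty", "faulty", "non-faulty"]
print(precision_recall_f1(y_true, y_pred))
```

Reporting precision and recall alongside accuracy matters when faulty services are rare, because a classifier that labels everything non-faulty can still score a high accuracy.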


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Umair Bin Yousaf ◽  
Khalil Jebran ◽  
Man Wang

Purpose The purpose of this study is to explore whether different board diversity attributes (corporate governance aspect) can be used to predict financial distress. This study also aims to identify what type of prediction models are more applicable to capture board diversity along with conventional predictors. Design/methodology/approach This study used Chinese A-listed companies during 2007–2016. Board diversity dimensions of gender, age, education, expertise and independence are categorized into three broad categories; relation-oriented diversity (age and gender), task-oriented diversity (expertise and education) and structural diversity (independence). The data is divided into test and validation sets. Six statistical and machine learning models that included logistic regression, dynamic hazard, K-nearest neighbor, random forest (RF), bagging and boosting were compared on Type I errors, Type II errors, accuracy and area under the curve. Findings The results indicate that board diversity attributes can significantly predict the financial distress of firms. Overall, the machine learning models perform better and the best model in terms of Type I error and accuracy is RF. Practical implications This study not only highlights symptoms but also causes of financial distress, which are deeply rooted in weak corporate governance. The result of the study can be used in future credit risk assessment by incorporating board diversity attributes. The study has implications for academicians, practitioners and nomination committees. Originality/value To the best of the authors’ knowledge, this study is the first to comprehensively investigate how different attributes of diversity can predict financial distress in Chinese firms. Further, this study also explores, which financial distress prediction models can show better predictive power.
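
The comparison criteria named in the abstract above (Type I error, Type II error and area under the curve) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the study's code: label 1 is assumed to mean "distressed", Type I error is taken in the statistical sense of a false positive rate and Type II as a false negative rate (some distress-prediction papers use the opposite convention), and the scores, the 0.5 threshold and the function names are hypothetical.

```python
def auc(y_true, scores):
    """Area under the ROC curve as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def type_i_type_ii(y_true, y_pred):
    """Return (false positive rate, false negative rate) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = len(y_true) - negatives
    return fp / negatives, fn / positives


# Hypothetical distress labels and model scores (illustrative only).
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

print(auc(y_true, scores), type_i_type_ii(y_true, y_pred))
```

The AUC is independent of the threshold, while both error rates depend on it, which is one reason a study may rank models differently depending on which criterion it emphasizes.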

