Performance analysis of machine learning algorithms trained on biased data

2021
Author(s): Renata Sendreti Broder, Lilian Berton

The use of Artificial Intelligence and Machine Learning algorithms in everyday life is now common in several areas, bringing many possibilities and benefits to society. However, as learning algorithms are given room to make decisions, the range of related ethical issues expands with them. There are many complaints about Machine Learning applications that exhibit some kind of bias, disadvantaging or favoring a particular group, with the possibility of causing harm to real people. The present work aims to shed light on the existence of such biases by analyzing and comparing the behavior of different learning algorithms – namely Decision Tree, MLP, Naive Bayes, Random Forest, Logistic Regression and SVM – when trained on biased data. We employed the bias-mitigation pre-processing algorithms provided by IBM's AI Fairness 360 framework.
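As a concrete illustration, the sketch below applies Reweighing – one of AI Fairness 360's pre-processing mitigators; the abstract does not specify which ones were used – to the Adult benchmark bundled with the framework, with 'sex' assumed as the protected attribute, before training one of the compared classifiers on the reweighted data:

```python
# Hedged sketch: bias mitigation via AIF360's Reweighing pre-processor,
# then training one of the compared classifiers on the reweighted data.
# The dataset and protected attribute are illustrative assumptions.
from aif360.datasets import AdultDataset
from aif360.algorithms.preprocessing import Reweighing
from aif360.metrics import BinaryLabelDatasetMetric
from sklearn.linear_model import LogisticRegression

privileged = [{'sex': 1}]
unprivileged = [{'sex': 0}]

# Requires the Adult raw data files installed per AIF360's instructions.
data = AdultDataset()
train, test = data.split([0.7], shuffle=True)

# Quantify bias before mitigation (difference in positive-outcome rates).
metric = BinaryLabelDatasetMetric(train, unprivileged_groups=unprivileged,
                                  privileged_groups=privileged)
print("mean difference before mitigation:", metric.mean_difference())

# Reweighing assigns instance weights that balance outcomes across groups.
rw = Reweighing(unprivileged_groups=unprivileged,
                privileged_groups=privileged)
train_rw = rw.fit_transform(train)

# Any sklearn classifier that accepts sample_weight can use the weights.
clf = LogisticRegression(max_iter=1000)
clf.fit(train_rw.features, train_rw.labels.ravel(),
        sample_weight=train_rw.instance_weights)
```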

2019, Vol 20 (1)
Author(s): Zenghua Ren, Yudan Hu, Ling Xu

Abstract Background The differential diagnosis of tuberculous pleural effusion (TPE) is challenging. In recent years, artificial intelligence (AI) and machine learning algorithms have been used to an increasing extent in disease diagnosis due to the high level of efficiency, objectivity, and accuracy that they offer. Methods Data samples from 192 patients with TPE, 54 patients with parapneumonic pleural effusion (PPE), and 197 patients with malignant pleural effusion (MPE) were retrospectively collected. Based on 28 different features obtained via statistical analysis, TPE diagnostic models using four machine learning algorithms (MLAs) – logistic regression, k-nearest neighbors (KNN), support vector machine (SVM) and random forest (RF) – were established and their respective diagnostic performances were calculated. The diagnostic performance of each of the four algorithmic models was compared with that of pleural fluid adenosine deaminase (pfADA). Based on the 12 features with the most significant impact on the accuracy of the RF model, a new RF model was designed for clinical application. To demonstrate its external validity, a prospective study was conducted and the diagnostic performance of the RF model was calculated. Results The sensitivity and specificity of each of the four TPE diagnostic models were as follows: logistic regression – 80.5 and 84.8%; KNN – 78.6 and 86.6%; SVM – 83.2 and 85.9%; and RF – 89.1 and 93.6%. The sensitivity and specificity of pfADA were 85.4 and 84.1%, respectively, at the best cut-off value of 17.5 U/L. RF was the superior method among the four MLAs, and was also superior to pfADA. The newly designed RF model (based on 12 of the 28 features) exhibited acceptable performance for the diagnosis of TPE, with a sensitivity and specificity of 90.6 and 92.3%, respectively. In the prospective study, its sensitivity and specificity were 100.0 and 90.0%, respectively. Conclusions Establishing a model for the diagnosis of TPE using RF resulted in a more effective, economical, and faster diagnostic method, which could enable clinicians to diagnose and treat TPE more effectively.
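A minimal sketch of the RF workflow described above – train on all 28 features, keep the 12 most important, retrain, and report sensitivity and specificity. The authors' clinical dataset and hyper-parameters are not public, so synthetic stand-in data and default-ish settings are assumed throughout:

```python
# Illustrative RF pipeline: full model on 28 features, then a slimmer
# model on the 12 most important features. All data below is synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

# Stand-in for the 443 clinical samples (192 TPE vs. 251 PPE/MPE).
X, y = make_classification(n_samples=443, n_features=28,
                           n_informative=12, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

full_rf = RandomForestClassifier(n_estimators=500, random_state=0)
full_rf.fit(X_tr, y_tr)

# Rank features by impurity-based importance; keep the top 12.
top12 = np.argsort(full_rf.feature_importances_)[::-1][:12]

slim_rf = RandomForestClassifier(n_estimators=500, random_state=0)
slim_rf.fit(X_tr[:, top12], y_tr)

tn, fp, fn, tp = confusion_matrix(y_te, slim_rf.predict(X_te[:, top12])).ravel()
print("sensitivity:", tp / (tp + fn), "specificity:", tn / (tn + fp))
```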


Author(s): M. A. Fesenko, G. V. Golovaneva, A. V. Miskevich

A new model, «Prognosis of men's reproductive function disorders», was developed. Machine learning algorithms (artificial intelligence) were used for this purpose, and the model has high prognostic accuracy. The aim of applying the model is to prioritize diagnostic and preventive measures so as to minimize complications of reproductive system diseases and preserve workers' health and efficiency.


2020, Vol 237 (12), pp. 1430-1437
Author(s): Achim Langenbucher, Nóra Szentmáry, Jascha Wendelstein, Peter Hoffmann

Abstract Background and Purpose In the last decade, artificial intelligence and machine learning algorithms have become increasingly established for the screening and detection of diseases and pathologies, as well as for describing interactions between measures where classical methods are too complex or fail. The purpose of this paper is to model the measured postoperative position of an intraocular lens implant after cataract surgery, based on preoperatively assessed biometric effect sizes, using machine learning techniques. Patients and Methods In this study, we enrolled 249 eyes of patients who underwent elective cataract surgery at Augenklinik Castrop-Rauxel. Eyes were measured preoperatively with the IOLMaster 700 (Carl Zeiss Meditec), as well as preoperatively and postoperatively with the Casia 2 OCT (Tomey). Based on the preoperative effect sizes (axial length, corneal thickness, internal anterior chamber depth, thickness of the crystalline lens, mean corneal radius and corneal diameter), a selection of 17 machine learning algorithms was tested for prediction performance in calculating the internal anterior chamber depth (AQD_post) and the axial position of the equatorial plane of the lens in the pseudophakic eye (LEQ_post). Results The 17 machine learning algorithms (from 4 families) varied in root mean squared/mean absolute prediction error between 0.187/0.139 mm and 0.255/0.204 mm (AQD_post) and between 0.183/0.135 mm and 0.253/0.206 mm (LEQ_post), using 5-fold cross-validation. The Gaussian process regression model using an exponential kernel showed the best performance in terms of root mean squared error for prediction of AQD_post and LEQ_post. If the entire dataset is used (without splitting into training and validation data), a simple multivariate linear regression model yielded a root mean squared prediction error for AQD_post/LEQ_post of 0.188/0.187 mm, vs. 0.166/0.159 mm for the best-performing Gaussian process regression model. Conclusion In this paper we have shown the principles of supervised machine learning applied to prediction of the measured physical postoperative axial position of intraocular lenses. Based on our limited data pool and the algorithms used in our setting, the benefit of machine learning algorithms appears limited compared to a standard multivariate regression model.
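For illustration, a Gaussian process regression with an exponential kernel can be cross-validated as below. In scikit-learn the exponential kernel corresponds to a Matern kernel with nu = 0.5; the synthetic data stands in for the six preoperative predictors and the AQD_post target, and the pipeline details are assumptions rather than the authors' exact setup:

```python
# Sketch: Gaussian process regression with an exponential kernel,
# scored by RMSE under 5-fold cross-validation. Data is synthetic.
from sklearn.datasets import make_regression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for 249 eyes with 6 preoperative biometric predictors.
X, y = make_regression(n_samples=249, n_features=6, noise=0.1,
                       random_state=0)

# Matern with nu=0.5 is the exponential kernel; WhiteKernel models noise.
kernel = Matern(nu=0.5) + WhiteKernel()
gpr = make_pipeline(StandardScaler(),
                    GaussianProcessRegressor(kernel=kernel, normalize_y=True))

rmse = -cross_val_score(gpr, X, y, cv=5,
                        scoring="neg_root_mean_squared_error")
print("5-fold RMSE: %.3f +/- %.3f" % (rmse.mean(), rmse.std()))
```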


2021, Vol 10 (2), pp. 205846012199029
Author(s): Rani Ahmad

Background The scope and productivity of artificial intelligence applications in health science and medicine, particularly in medical imaging, are rapidly progressing, driven by relatively recent developments in big data and deep learning and by increasingly powerful computer algorithms. Accordingly, there are a number of opportunities and challenges for the radiological community. Purpose To provide a review of the challenges and barriers experienced in diagnostic radiology on the basis of the key clinical applications of machine learning techniques. Material and Methods Studies published in 2010–2019 that report on the efficacy of machine learning models were selected. A single contingency table was selected for each study to report the highest accuracy of radiology professionals and machine learning algorithms, and a meta-analysis of the studies was conducted based on these contingency tables. Results The specificity of the deep learning models ranged from 39% to 100%, whereas sensitivity ranged from 85% to 100%. The pooled sensitivity and specificity were 89% and 85% for the deep learning algorithms for detecting abnormalities, compared to 75% and 91% for radiology experts, respectively. The pooled specificity and sensitivity for the comparison between radiology professionals and deep learning algorithms were 91% and 81% for the deep learning models and 85% and 73% for the radiology professionals (p < 0.000), respectively. The pooled sensitivity of detection was 82% for health-care professionals and 83% for deep learning algorithms (p < 0.005). Conclusion Radiomic information extracted from images through machine learning programs may capture features that are not discernible through visual examination and thus may improve the prognostic and diagnostic value of data sets.
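The pooling step can be illustrated with a toy calculation over per-study 2x2 contingency tables. The numbers below are invented, and the paper's actual pooling presumably used a formal meta-analytic model rather than this simple aggregation:

```python
# Toy illustration: pooled sensitivity/specificity from per-study
# 2x2 contingency tables. All figures are made up for demonstration.
def pooled_sens_spec(tables):
    """tables: iterable of (tp, fp, fn, tn) tuples, one per study."""
    tp = sum(t[0] for t in tables)
    fp = sum(t[1] for t in tables)
    fn = sum(t[2] for t in tables)
    tn = sum(t[3] for t in tables)
    return tp / (tp + fn), tn / (tn + fp)

studies = [(45, 5, 8, 92), (60, 12, 4, 110), (30, 3, 6, 70)]  # invented
sens, spec = pooled_sens_spec(studies)
print(f"pooled sensitivity {sens:.2f}, pooled specificity {spec:.2f}")
```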


Author(s): Joel Weijia Lai, Candice Ke En Ang, U. Rajendra Acharya, Kang Hao Cheong

Artificial Intelligence in healthcare employs machine learning algorithms to emulate human cognition in the analysis of complicated or large sets of data. Specifically, artificial intelligence draws on the ability of computer algorithms and software, within allowable thresholds, to reach deterministic approximate conclusions. In comparison to traditional technologies in healthcare, artificial intelligence enhances the process of data analysis without the need for human input, producing nearly equally reliable, well-defined output. Schizophrenia is a chronic mental health condition that affects millions worldwide, with impairment in thinking and behaviour that may be significantly disabling to daily living. Multiple artificial intelligence and machine learning algorithms have been utilized to analyze the different components of schizophrenia, such as prediction of disease and assessment of current prevention methods, in the hope of assisting with diagnosis and providing viable options for the individuals affected. In this paper, we review the progress of the use of artificial intelligence in schizophrenia.


2020, Vol 20 (1)
Author(s): Matthijs Blankers, Louk F. M. van der Post, Jack J. M. Dekker

Abstract Background Accurate models for predicting whether patients on the verge of a psychiatric crisis need hospitalization are lacking, and machine learning methods may help improve the accuracy of psychiatric hospitalization prediction models. In this paper we evaluate the accuracy of ten machine learning algorithms, including the generalized linear model (GLM/logistic regression), for predicting psychiatric hospitalization in the first 12 months after a psychiatric crisis care contact. We also evaluate an ensemble model to optimize accuracy, and we explore individual predictors of hospitalization. Methods Data from 2084 patients included in the longitudinal Amsterdam Study of Acute Psychiatry with at least one reported psychiatric crisis care contact were included. The target variable for the prediction models was whether the patient was hospitalized in the 12 months following inclusion. The predictive power of 39 variables related to patients' socio-demographics, clinical characteristics and previous mental health care contacts was evaluated. The accuracy and area under the receiver operating characteristic curve (AUC) of the machine learning algorithms were compared, and we also estimated the relative importance of each predictor variable. The best and least performing algorithms were compared with GLM/logistic regression using net reclassification improvement analysis, and the five best performing algorithms were combined in an ensemble model using stacking. Results All models performed above chance level. We found Gradient Boosting to be the best performing algorithm (AUC = 0.774) and K-Nearest Neighbors the least performing (AUC = 0.702). The performance of GLM/logistic regression (AUC = 0.76) was slightly above average among the tested algorithms. In a net reclassification improvement analysis, Gradient Boosting outperformed GLM/logistic regression by 2.9% and K-Nearest Neighbors by 11.3%; GLM/logistic regression outperformed K-Nearest Neighbors by 8.7%. Nine of the top-10 most important predictor variables were related to previous mental health care use. Conclusions Gradient Boosting led to the highest predictive accuracy and AUC, while GLM/logistic regression performed average among the tested algorithms. Although statistically significant, the differences between the machine learning algorithms were in most cases modest. The results show that a predictive accuracy similar to that of the best performing model can be achieved by combining multiple algorithms in an ensemble model.
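A minimal sketch of such a stacked ensemble, with base learners drawn from algorithms named in the study and synthetic data standing in for the 39 predictors (the study's exact model configurations are not reproduced here):

```python
# Sketch of a stacking ensemble: several base learners combined by a
# logistic-regression meta-learner. All data below is synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Stand-in for 2084 patients with 39 predictor variables.
X, y = make_classification(n_samples=2084, n_features=39, random_state=0)

estimators = [
    ("gb", GradientBoostingClassifier(random_state=0)),
    ("rf", RandomForestClassifier(random_state=0)),
    ("knn", KNeighborsClassifier()),
]
# The meta-learner combines the base models' out-of-fold predictions.
stack = StackingClassifier(estimators=estimators,
                           final_estimator=LogisticRegression(max_iter=1000),
                           cv=5)

auc = cross_val_score(stack, X, y, cv=5, scoring="roc_auc")
print("ensemble AUC: %.3f" % auc.mean())
```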


2019
Author(s): Matthijs Blankers, Louk F. M. van der Post, Jack J. M. Dekker

Abstract Background: It is difficult to accurately predict whether a patient on the verge of a potential psychiatric crisis will need to be hospitalized. Machine learning may help improve the accuracy of psychiatric hospitalization prediction models. In this paper we evaluate and compare the accuracy of ten machine learning algorithms, including the commonly used generalized linear model (GLM/logistic regression), for predicting psychiatric hospitalization in the first 12 months after a psychiatric crisis care contact, and we explore the most important predictor variables of hospitalization. Methods: Data from 2,084 patients with at least one reported psychiatric crisis care contact included in the longitudinal Amsterdam Study of Acute Psychiatry were used. The accuracy and area under the receiver operating characteristic curve (AUC) of the machine learning algorithms were compared. We also estimated the relative importance of each predictor variable. The best and least performing algorithms were compared with GLM/logistic regression using net reclassification improvement analysis. The target variable for the prediction models was whether or not the patient was hospitalized in the 12 months following inclusion in the study. The 39 predictor variables were related to patients' socio-demographics, clinical characteristics and previous mental health care contacts. Results: We found Gradient Boosting to perform best (AUC = 0.774) and K-Nearest Neighbors to perform worst (AUC = 0.702). The performance of GLM/logistic regression (AUC = 0.76) was above average among the tested algorithms. In a net reclassification improvement analysis, Gradient Boosting outperformed GLM/logistic regression and K-Nearest Neighbors, and GLM outperformed K-Nearest Neighbors, although the differences between Gradient Boosting and GLM/logistic regression were small. Nine of the top-10 most important predictor variables were related to previous mental health care use. Conclusions: Gradient Boosting led to the highest predictive accuracy and AUC, while GLM/logistic regression performed average among the tested algorithms. Although statistically significant, the differences between the machine learning algorithms were modest. Future studies may consider combining multiple algorithms in an ensemble model for optimal performance and to mitigate the risk of choosing suboptimal algorithms.
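For readers unfamiliar with the net reclassification improvement analysis used in both studies above, a category-free NRI can be computed as in the sketch below; the probabilities and outcomes are invented for illustration and are not the study's data:

```python
# Minimal category-free NRI: compares two risk models by counting
# upward/downward reclassification moves among events and non-events.
import numpy as np

def continuous_nri(p_new, p_old, y):
    """p_new/p_old: predicted probabilities; y: observed 0/1 outcomes."""
    p_new, p_old, y = map(np.asarray, (p_new, p_old, y))
    up, down = p_new > p_old, p_new < p_old
    events, nonevents = y == 1, y == 0
    nri_events = ((up & events).sum() - (down & events).sum()) / events.sum()
    nri_nonevents = ((down & nonevents).sum()
                     - (up & nonevents).sum()) / nonevents.sum()
    return nri_events + nri_nonevents

y = np.array([1, 1, 0, 0, 1, 0])                      # invented outcomes
old = np.array([0.55, 0.40, 0.45, 0.30, 0.60, 0.50])  # invented model A
new = np.array([0.70, 0.55, 0.35, 0.25, 0.58, 0.40])  # invented model B
print("NRI: %.2f" % continuous_nri(new, old, y))
```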


2020, Vol 5 (19), pp. 32-35
Author(s): Anand Vijay, Kailash Patidar, Manoj Yadav, Rishi Kushwah

In this paper, an analytical survey of the role of machine learning algorithms in intrusion detection is presented and discussed. The paper covers the analytical aspects of developing an efficient intrusion detection system (IDS) and reviews the related work in terms of the computational methods used, namely data mining, artificial intelligence and machine learning, discussed along with the relevant attack parameters and attack types. The paper also elaborates on the impact of different attacks and the handling mechanisms reported in previous papers.


2021, Vol 9
Author(s): Huanhuan Zhao, Xiaoyu Zhang, Yang Xu, Lisheng Gao, Zuchang Ma, ...

Hypertension is a widespread chronic disease. Risk prediction of hypertension is an intervention that contributes to the early prevention and management of hypertension. The implementation of such an intervention requires an effective and easy-to-implement hypertension risk prediction model. This study evaluated and compared the performance of four machine learning algorithms in predicting the risk of hypertension based on easy-to-collect risk factors. A dataset of 29,700 samples collected through physical examinations was used for model training and testing. First, we identified easy-to-collect risk factors of hypertension through univariate logistic regression analysis. Then, based on the selected features, 10-fold cross-validation was utilized to optimize four models – random forest (RF), CatBoost, MLP neural network and logistic regression (LR) – to find the best hyper-parameters on the training set. Finally, the performance of the models was evaluated by AUC, accuracy, sensitivity and specificity on the test set. The experimental results showed that the RF model outperformed the other three models and achieved an AUC of 0.92, an accuracy of 0.82, a sensitivity of 0.83 and a specificity of 0.81. In addition, Body Mass Index (BMI), age, family history and waist circumference (WC) are the four primary risk factors of hypertension. These findings show that it is feasible to use machine learning algorithms, especially RF, to predict hypertension risk without clinical or genetic data. The technique can provide a non-invasive and economical way to prevent and manage hypertension in a large population.
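The tuning-and-evaluation loop described above can be sketched as follows; the feature count, hyper-parameter grid and train/test split are assumptions, since the study's exact settings are not reproduced here:

```python
# Sketch: 10-fold CV grid search over RF hyper-parameters, then
# AUC/accuracy/sensitivity/specificity on a held-out test set.
# All data and grid values below are synthetic assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import confusion_matrix, roc_auc_score

# Stand-in for the 29,700 physical-examination samples.
X, y = make_classification(n_samples=29700, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)

grid = {"n_estimators": [100, 300], "max_depth": [None, 10, 20]}
search = GridSearchCV(RandomForestClassifier(random_state=0), grid,
                      cv=10, scoring="roc_auc", n_jobs=-1)
search.fit(X_tr, y_tr)

best = search.best_estimator_
tn, fp, fn, tp = confusion_matrix(y_te, best.predict(X_te)).ravel()
print("AUC:", roc_auc_score(y_te, best.predict_proba(X_te)[:, 1]))
print("accuracy:", (tp + tn) / (tp + tn + fp + fn))
print("sensitivity:", tp / (tp + fn), "specificity:", tn / (tn + fp))
```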

