A feature-based hybrid recommender system for risk prediction: Machine learning approach (Preprint)

2020 ◽  
Author(s):  
Uzair Bhatti

BACKGROUND In the era of health informatics, the exponential growth of information generated by health information systems and healthcare organizations demands expert and intelligent recommendation systems. Recommendation has become one of the most valuable tools because it reduces problems such as information overload when selecting and suggesting doctors, hospitals, medicines, diagnoses, etc., according to patients’ interests. OBJECTIVE Hybrid filtering is one of the most popular recommendation approaches, but its major limitations are selectivity and data integrity issues. Most existing recommendation systems and risk prediction algorithms focus on a single domain; cross-domain hybrid filtering, on the other hand, is able to alleviate the selectivity and data integrity problems to a greater extent. METHODS We propose a novel algorithm for recommendation and a predictive model that uses KNN together with machine learning and artificial intelligence (AI) techniques. We identify the factors that directly impact diseases and propose an approach for predicting the correct diagnosis of different diseases. We constructed a series of models with good reliability for predicting different surgery complications and identified several novel clinical associations. We propose a novel algorithm, pr-KNN, that uses KNN for the prediction and recommendation of diseases. RESULTS We also compared the performance of our algorithm with other machine learning algorithms and found that it performed better, with predictive accuracy improving by 3.61%. CONCLUSIONS The potential to directly integrate these predictive tools into EHRs may enable personalized medicine and decision-making at the point of care, for patient counseling, and as a teaching tool. CLINICALTRIAL The dataset for the patient trials is attached.
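The abstract does not specify pr-KNN's internals, so the following is only a minimal Python sketch of plain KNN-based disease prediction on synthetic data with scikit-learn; the feature and label definitions are hypothetical placeholders.

```python
# Minimal sketch of a KNN-based disease risk predictor (illustrative only;
# the paper's pr-KNN algorithm is not specified in the abstract).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))                     # hypothetical patient features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)      # hypothetical disease label

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Scale features so that distance-based KNN is not dominated by any single feature.
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=15))
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```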

Author(s):  
Brian Ayers ◽  
Tuomas Sandholm ◽  
Igor Gosev ◽  
Sunil Prasad ◽  
Arman Kilic

Background: This study investigates the use of modern machine learning (ML) techniques to improve prediction of survival after orthotopic heart transplantation (OHT). Methods: This was a retrospective study of adult patients undergoing primary, isolated OHT between 2000 and 2019, as identified in the United Network for Organ Sharing (UNOS) registry. The primary outcome was one-year post-transplant survival. Patients were randomly divided into training (80%) and validation (20%) sets. Dimensionality reduction and data re-sampling were employed during training. Multiple machine learning algorithms were combined into a final ensemble ML model. Discriminatory capability was assessed using the area under the receiver operating characteristic curve (AUROC), net reclassification index (NRI), and decision curve analysis (DCA). Results: A total of 33,657 OHT patients were evaluated. One-year mortality was 11% (n=3,738). In the validation cohort, the AUROC of singular logistic regression was 0.649 (95% CI 0.628-0.670) compared to 0.691 (95% CI 0.671-0.711) with random forest, 0.691 (95% CI 0.671-0.712) with deep neural network, and 0.653 (95% CI 0.632-0.674) with Adaboost. A final ensemble ML model was created that demonstrated the greatest improvement in AUROC: 0.764 (95% CI 0.745-0.782) (p<0.001). The ensemble ML model improved predictive performance by 72.9% ± 3.8% (p<0.001) as assessed by NRI compared to logistic regression. DCA showed the final ensemble method improved risk prediction across the entire spectrum of predicted risk as compared to all other models (p<0.001). Conclusions: Modern ML techniques can improve risk prediction in OHT compared to traditional approaches. This may have important implications in patient selection, programmatic evaluation, allocation policy, and patient counseling and prognostication.
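As a rough illustration of the kind of ensemble described above, the following Python sketch stacks the named base learners (random forest, neural network, AdaBoost) under a logistic-regression meta-learner and reports AUROC on a 20% validation split. The synthetic data, class imbalance, and hyperparameters are assumptions, since the UNOS variables and the actual ensemble weighting are not reproducible from the abstract.

```python
# Illustrative stacking ensemble over the model families named in the abstract.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, StackingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

# Synthetic stand-in with roughly 11% positive (one-year mortality) rate.
X, y = make_classification(n_samples=5000, n_features=30, weights=[0.89], random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)

ensemble = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=300, random_state=0)),
        ("nn", MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)),
        ("ada", AdaBoostClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)
ensemble.fit(X_train, y_train)
auc = roc_auc_score(y_val, ensemble.predict_proba(X_val)[:, 1])
print(f"ensemble AUROC: {auc:.3f}")
```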


2021 ◽  
Author(s):  
Xiaotong Zhu ◽  
Jinhui Jeanne Huang

Remote sensing monitoring offers wide coverage, rapid acquisition, and low cost for long-term dynamic monitoring of the water environment. With the flourishing of artificial intelligence, machine learning has enabled remote sensing inversion of seawater quality to achieve higher prediction accuracy. However, due to the physicochemical properties of the water quality parameters, the performance of algorithms differs considerably. In order to improve the predictive accuracy of seawater quality parameters, we proposed a technical framework to identify the optimal machine learning algorithms using Sentinel-2 satellite and in-situ seawater sample data. In the study, we selected three algorithms, i.e. support vector regression (SVR), XGBoost and deep learning (DL), and four seawater quality parameters, i.e. dissolved oxygen (DO), total dissolved solids (TDS), turbidity (TUR) and chlorophyll-a (Chla). The results show that SVR is the most precise algorithm for inverting DO (R² = 0.81). XGBoost has the best accuracy for Chla and TUR inversion (R² = 0.75 and 0.78, respectively), while DL performs best for TDS (R² = 0.789). Overall, this research provides theoretical support for high-precision remote sensing inversion of offshore seawater quality parameters based on machine learning.
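A hedged sketch of the algorithm comparison described above, using synthetic reflectance-like features and R² as the metric; SVR and an MLP stand-in for the deep learning model come from scikit-learn, while XGBoost requires the separate xgboost package. The band values and target are placeholders, not Sentinel-2 or in-situ data.

```python
# Illustrative comparison of the three regressor families on synthetic data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.neural_network import MLPRegressor   # stand-in for the deep learning model
from sklearn.metrics import r2_score
from xgboost import XGBRegressor                  # extra dependency: pip install xgboost

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 6))                   # hypothetical band reflectances
y = 2 * X[:, 0] - X[:, 2] + rng.normal(0, 0.1, 500)    # hypothetical water-quality parameter

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
models = {
    "SVR": SVR(kernel="rbf", C=10.0),
    "XGBoost": XGBRegressor(n_estimators=300, max_depth=4, random_state=0),
    "MLP (DL stand-in)": MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, "R2:", round(r2_score(y_te, model.predict(X_te)), 3))
```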


2019 ◽  
Author(s):  
Donald Salami ◽  
Carla Alexandra Sousa ◽  
Maria do Rosário Oliveira Martins ◽  
César Capinha

ABSTRACT The geographical spread of dengue is a global public health concern. It is largely mediated by the importation of dengue from endemic to non-endemic areas via the increasing connectivity of the global air transport network. The dynamic nature and intrinsic heterogeneity of the air transport network make it challenging to predict dengue importation. Here, we explore the capabilities of state-of-the-art machine learning algorithms to predict dengue importation. We trained four machine learning classification algorithms using six years of historical dengue importation data for 21 countries in Europe, together with connectivity indices mediating importation and air transport network centrality measures. Predictive performance of the classifiers was evaluated using the area under the receiver operating characteristic curve, sensitivity, and specificity. Finally, we applied practical model-agnostic methods to provide an in-depth explanation of our optimal model’s predictions on a global and local scale. Our best performing model achieved high predictive accuracy, with an area under the receiver operating characteristic curve of 0.94 and a maximized sensitivity score of 0.88. The predictor variables identified as most important were the source country’s dengue incidence rate, population size, and volume of air passengers. Network centrality measures, describing the positioning of European countries within the air travel network, were also influential to the predictions. We demonstrated the high predictive performance of a machine learning model in predicting dengue importation and the utility of model-agnostic methods to offer a comprehensive understanding of the reasons behind the predictions. Similar approaches can be utilized in the development of an operational early warning surveillance system for dengue importation.
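The workflow above can be summarized in a small sketch: train a classifier, score it with AUC, sensitivity, and specificity, then apply a model-agnostic explanation (permutation importance here, as a stand-in for the explanation methods used in the study). The data and the chosen classifier are assumptions for illustration only.

```python
# Train/score a classifier, then explain it with a model-agnostic method.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score, confusion_matrix
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=2000, n_features=10, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)

clf = GradientBoostingClassifier(random_state=1).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]
tn, fp, fn, tp = confusion_matrix(y_te, (proba > 0.5).astype(int)).ravel()
print("AUC:", round(roc_auc_score(y_te, proba), 3),
      "sensitivity:", round(tp / (tp + fn), 3),
      "specificity:", round(tn / (tn + fp), 3))

# Global, model-agnostic explanation: how much does shuffling each feature hurt AUC?
imp = permutation_importance(clf, X_te, y_te, scoring="roc_auc", n_repeats=10, random_state=1)
print("most important feature index:", imp.importances_mean.argmax())
```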


Author(s):  
Gandhali Malve ◽  
Lajree Lohar ◽  
Tanay Malviya ◽  
Shirish Sabnis

Today the amount of information on the internet grows very rapidly, and people need tools to find and access appropriate information. One such tool is the recommendation system. Recommendation systems help users navigate quickly and find the information they need. Many of us find it difficult to decide which movie to watch, so we decided to build a recommender system that better judges which movies we are more likely to love. In this project we use machine learning algorithms to recommend movies to users based on genres and user ratings. A recommendation system attempts to predict the preference or rating that a user would give to an item.
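A minimal content-based sketch of this idea: represent each movie by its genre text, vectorize with TF-IDF, and recommend the most similar titles by cosine similarity. The titles and genres below are hypothetical placeholders, and a full system would also fold in user ratings (for example via collaborative filtering).

```python
# Content-based movie recommendation from genre text (illustrative data).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

movies = {
    "Movie A": "action adventure sci-fi",
    "Movie B": "romance drama",
    "Movie C": "action thriller",
    "Movie D": "drama romance comedy",
}
titles = list(movies)
tfidf = TfidfVectorizer()
matrix = tfidf.fit_transform(movies.values())
sim = cosine_similarity(matrix)

def recommend(title, k=2):
    """Return the k movies whose genre profile is most similar to `title`."""
    idx = titles.index(title)
    ranked = sim[idx].argsort()[::-1]
    return [titles[i] for i in ranked if i != idx][:k]

print(recommend("Movie A"))
```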


2019 ◽  
Author(s):  
Lei Zhang ◽  
Xianwen Shang ◽  
Subhashaan Sreedharan ◽  
Xixi Yan ◽  
Jianbin Liu ◽  
...  

BACKGROUND Previous conventional models for the prediction of diabetes could be updated by incorporating the increasing amount of health data available and new risk prediction methodology. OBJECTIVE We aimed to develop a substantially improved diabetes risk prediction model using sophisticated machine-learning algorithms based on a large retrospective population cohort of over 230,000 people who were enrolled in the study during 2006-2017. METHODS We collected demographic, medical, behavioral, and incidence data for type 2 diabetes mellitus (T2DM) in 236,684 diabetes-free participants recruited from the 45 and Up Study. We predicted and compared the risk of diabetes onset in these participants at 3, 5, 7, and 10 years based on three machine-learning approaches and the conventional regression model. RESULTS Overall, 6.05% (14,313/236,684) of the participants developed T2DM during an average 8.8-year follow-up period. The 10-year diabetes incidence in men was 8.30% (8.08%-8.49%), which was significantly higher (odds ratio 1.37, 95% CI 1.32-1.41) than that in women at 6.20% (6.00%-6.40%). The incidence of T2DM was doubled in individuals with obesity (men: 17.78% [17.05%-18.43%]; women: 14.59% [13.99%-15.17%]) compared with that of nonobese individuals. The gradient boosting machine model showed the best performance among the four models (area under the curve of 79% in 3-year prediction and 75% in 10-year prediction). All machine-learning models predicted BMI as the most significant factor contributing to diabetes onset, which explained 12%-50% of the variance in the prediction of diabetes. The model predicted that if BMI in obese and overweight participants could be hypothetically reduced to a healthy range, the 10-year probability of diabetes onset would be significantly reduced from 8.3% to 2.8% (P < .001). CONCLUSIONS A one-time self-reported survey can accurately predict the risk of diabetes using a machine-learning approach. Achieving a healthy BMI can significantly reduce the risk of developing T2DM.
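As an illustrative sketch of the gradient-boosting approach and the BMI-importance finding, the following code trains a gradient boosting classifier on synthetic survey-like variables and prints AUC and feature importances; the variables, effect sizes, and outcome model are invented stand-ins, since the 45 and Up Study data are not public.

```python
# Gradient boosting on synthetic survey-style features, with feature importances.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 20000
df = pd.DataFrame({
    "bmi": rng.normal(27, 5, n),
    "age": rng.integers(45, 90, n),
    "physical_activity": rng.normal(0, 1, n),
})
# Hypothetical outcome in which BMI carries most of the signal.
logit = 0.15 * (df["bmi"] - 27) + 0.02 * (df["age"] - 60) - 0.3 * df["physical_activity"] - 3
y = rng.uniform(size=n) < 1 / (1 + np.exp(-logit))

X_tr, X_te, y_tr, y_te = train_test_split(df, y, test_size=0.2, random_state=0)
gbm = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
print("AUC:", round(roc_auc_score(y_te, gbm.predict_proba(X_te)[:, 1]), 3))
print(dict(zip(df.columns, gbm.feature_importances_.round(3))))
```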


2021 ◽  
Vol 9 ◽  
Author(s):  
Huanhuan Zhao ◽  
Xiaoyu Zhang ◽  
Yang Xu ◽  
Lisheng Gao ◽  
Zuchang Ma ◽  
...  

Hypertension is a widespread chronic disease. Risk prediction of hypertension is an intervention that contributes to the early prevention and management of hypertension. The implementation of such an intervention requires an effective and easy-to-implement hypertension risk prediction model. This study evaluated and compared the performance of four machine learning algorithms on predicting the risk of hypertension based on easy-to-collect risk factors. A dataset of 29,700 samples collected through a physical examination was used for model training and testing. Firstly, we identified easy-to-collect risk factors of hypertension through univariate logistic regression analysis. Then, based on the selected features, 10-fold cross-validation was utilized to optimize four models, random forest (RF), CatBoost, MLP neural network and logistic regression (LR), to find the best hyper-parameters on the training set. Finally, the performance of the models was evaluated by AUC, accuracy, sensitivity and specificity on the test set. The experimental results showed that the RF model outperformed the other three models and achieved an AUC of 0.92, an accuracy of 0.82, a sensitivity of 0.83 and a specificity of 0.81. In addition, Body Mass Index (BMI), age, family history and waist circumference (WC) are the four primary risk factors of hypertension. These findings reveal that it is feasible to use machine learning algorithms, especially RF, to predict hypertension risk without clinical or genetic data. The technique can provide a non-invasive and economical way for the prevention and management of hypertension in a large population.
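A hedged sketch of the tuning and evaluation protocol described above, shown for the random forest only: a 10-fold cross-validated grid search on the training set, then AUC, accuracy, sensitivity, and specificity on a held-out test set. The features and grid values are illustrative, not those of the study.

```python
# 10-fold CV hyperparameter search for a random forest, then held-out evaluation.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, accuracy_score, confusion_matrix

X, y = make_classification(n_samples=5000, n_features=12, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [100, 300], "max_depth": [6, None]},
    cv=10, scoring="roc_auc",
)
grid.fit(X_tr, y_tr)
best = grid.best_estimator_

pred = best.predict(X_te)
tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
print("AUC:", round(roc_auc_score(y_te, best.predict_proba(X_te)[:, 1]), 3),
      "accuracy:", round(accuracy_score(y_te, pred), 3),
      "sensitivity:", round(tp / (tp + fn), 3),
      "specificity:", round(tn / (tn + fp), 3))
```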


2021 ◽  
Author(s):  
Kate Bentley ◽  
Kelly Zuromski ◽  
Rebecca Fortgang ◽  
Emily Madsen ◽  
Daniel Kessler ◽  
...  

Background: Interest in developing machine learning algorithms that use electronic health record data to predict patients’ risk of suicidal behavior has recently proliferated. Whether and how such models might be implemented and useful in clinical practice, however, remains unknown. In order to ultimately make automated suicide risk prediction algorithms useful in practice, and thus better prevent patient suicides, it is critical to partner with key stakeholders (including the frontline providers who will be using such tools) at each stage of the implementation process. Objective: The aim of this focus group study was to inform ongoing and future efforts to deploy suicide risk prediction models in clinical practice. The specific goals were to better understand hospital providers’ current practices for assessing and managing suicide risk; determine providers’ perspectives on using automated suicide risk prediction algorithms; and identify barriers, facilitators, recommendations, and factors to consider for initiatives in this area. Methods: We conducted 10 two-hour focus groups with a total of 40 providers from psychiatry, internal medicine and primary care, emergency medicine, and obstetrics and gynecology departments within an urban academic medical center. Audio recordings of open-ended group discussions were transcribed and coded for relevant and recurrent themes by two independent study staff members. All coded text was reviewed and discrepancies resolved in consensus meetings with doctoral-level staff. Results: Though most providers reported using standardized suicide risk assessment tools in their clinical practices, existing tools were commonly described as unhelpful and providers indicated dissatisfaction with current suicide risk assessment methods. Overall, providers’ general attitudes toward the practical use of automated suicide risk prediction models and corresponding clinical decision support tools were positive. Providers were especially interested in the potential to identify high-risk patients who might be missed by traditional screening methods. Some expressed skepticism about the potential usefulness of these models in routine care; specific barriers included concerns about liability, alert fatigue, and increased demand on the healthcare system. Key facilitators included presenting specific patient-level features contributing to risk scores, emphasizing changes in risk over time, and developing systematic clinical workflows and provider trainings. Participants also recommended considering risk-prediction windows, timing of alerts, who will have access to model predictions, and variability across treatment settings. Conclusions: Providers were dissatisfied with current suicide risk assessment methods and open to the use of a machine learning-based risk prediction system to inform clinical decision-making. They also raised multiple concerns about potential barriers to the usefulness of this approach and suggested several possible facilitators. Future efforts in this area will benefit from incorporating systematic qualitative feedback from providers, patients, administrators, and payers on the use of new methods in routine care, especially given the complex, sensitive, and unfortunately still stigmatized nature of suicide risk.


2020 ◽  
Author(s):  
Xueyan Li ◽  
Genshan Ma ◽  
Xiaobo Qian ◽  
Yamou Wu ◽  
Xiaochen Huang ◽  
...  

Abstract Background: We aimed to assess the performance of machine learning algorithms for the prediction of risk factors of postoperative ileus (POI) in patients who underwent laparoscopic colorectal surgery for malignant lesions. Methods: We conducted analyses in a retrospective observational study with a total of 637 patients at Suzhou Hospital of Nanjing Medical University. Four machine learning algorithms (logistic regression, decision tree, random forest, gradient boosting decision tree) were considered to predict risk factors of POI. The total cases were randomly divided into training and testing data sets with a ratio of 8:2. The performance of each model was evaluated by the area under the receiver operating characteristic curve (AUC), precision, recall and F1-score. Results: The morbidity of POI in this study was 19.15% (122/637). The gradient boosting decision tree reached the highest AUC (0.76) and was the best model for POI risk prediction. In addition, the importance matrix of the gradient boosting decision tree showed that the five most important variables were time to first passage of flatus, opioids during POD3, duration of surgery, height and weight. Conclusions: The gradient boosting decision tree was the optimal model to predict the risk of POI in patients who underwent laparoscopic colorectal surgery for malignant lesions. The results of our study could be useful for clinical guidelines in POI risk prediction.
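The evaluation described above can be sketched as follows: a gradient boosting decision tree trained on an 8:2 split and reported with AUC, precision, recall, and F1. The synthetic data only mimics the cohort size and POI rate; the clinical variables are placeholders.

```python
# Gradient boosting decision tree on an 8:2 split, scored with AUC/precision/recall/F1.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score, precision_recall_fscore_support

# 637 synthetic "patients" with roughly a 19% positive (POI) rate.
X, y = make_classification(n_samples=637, n_features=15, weights=[0.81], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)

gbdt = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
pred = gbdt.predict(X_te)
prec, rec, f1, _ = precision_recall_fscore_support(y_te, pred, average="binary")
print("AUC:", round(roc_auc_score(y_te, gbdt.predict_proba(X_te)[:, 1]), 3),
      "precision:", round(prec, 3), "recall:", round(rec, 3), "F1:", round(f1, 3))
```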


2018 ◽  
Vol 7 (4.6) ◽  
pp. 108
Author(s):  
Priyadarshini Chatterjee ◽  
Ch. Mamatha ◽  
T. Jagadeeswari ◽  
Katha Chandra Shekhar

Every hundredth cancer case we come across is a breast cancer case, and it is becoming very common in women of all ages. Correct detection of these lesions in the breast is very important, and the goal is to make a correct diagnosis with minimal human intervention. Not all breast masses are harmful, but if the cases are not handled properly, they might create panic among people. Human detection without machine intervention is not one hundred percent accurate. If machines can be trained deeply, they can do the same detection work with much greater accuracy. Bayesian methods have a vast area of application in the field of medical image processing as well as in machine learning. This paper intends to use Bayesian probabilistic methods in image segmentation as well as in machine learning; in image processing, machine learning is applied for pattern recognition, and there are various machine learning algorithms that can classify an image. In the proposed system, we first segment the image using a Bayesian method. We then apply a machine learning algorithm to the segmented parts of the image to diagnose the mass or the growth.
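A minimal sketch of the proposed two-stage idea: a two-component Gaussian mixture (a simple Bayesian model) segments each image into background and a brighter mass region, and a classifier is then trained on crude features of that region. The images, labels, and features below are synthetic assumptions; a real pipeline would use mammograms and richer descriptors.

```python
# Bayesian (Gaussian mixture) segmentation followed by classification of region features.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def segment_and_featurize(image):
    """Segment into background/mass with a 2-component Gaussian mixture and
    return crude features of the brighter (mass) component."""
    gmm = GaussianMixture(n_components=2, random_state=0).fit(image.reshape(-1, 1))
    labels = gmm.predict(image.reshape(-1, 1)).reshape(image.shape)
    mass = labels == gmm.means_.argmax()          # pixels assigned to the brighter component
    return [mass.mean(), image[mass].mean(), image[mass].std()]

def make_image(bright):
    """Synthetic 32x32 'scan' with a square region of the given extra brightness."""
    img = rng.normal(0.2, 0.05, (32, 32))
    img[10:20, 10:20] += bright
    return img

brights = rng.uniform(0.3, 0.9, 200)
X = [segment_and_featurize(make_image(b)) for b in brights]
y = (brights > 0.6).astype(int)                   # hypothetical rule: brighter masses labeled positive

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print("holdout accuracy (synthetic data):", clf.score(X_te, y_te))
```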


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Peter Appiahene ◽  
Yaw Marfo Missah ◽  
Ussiph Najim

The financial crisis that hit Ghana from 2015 to 2018 raised various issues with respect to the efficiency of banks and the safety of depositors’ funds in the banking industry. As part of measures to improve the banking sector and also restore customers’ confidence, efficiency and performance analysis in the banking industry has become a hot issue, because stakeholders have to detect the underlying causes of inefficiencies within the banking industry. Nonparametric methods such as Data Envelopment Analysis (DEA) have been suggested in the literature as a good measure of banks’ efficiency and performance. Machine learning algorithms have also been viewed as a good tool for estimating various nonparametric and nonlinear problems. This paper combines DEA with three machine learning approaches to evaluate bank efficiency and performance using 444 Ghanaian bank branches as Decision Making Units (DMUs). The results were compared with the corresponding efficiency ratings obtained from the DEA. Finally, the prediction accuracies of the three machine learning models were compared. The results suggested that the decision tree (DT) with its C5.0 algorithm provided the best predictive model: it had 100% accuracy in predicting the 134-branch holdout sample (30% of the banks) with a P value of 0.00. The DT was followed closely by the random forest algorithm, with a predictive accuracy of 98.5% and a P value of 0.00, and finally the neural network (86.6% accuracy) with a P value of 0.66. The study concluded that banks in Ghana can use the results of this study to predict their respective efficiencies. All experiments were performed within a simulation environment and conducted in RStudio using R code.
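A sketch of the prediction stage described above: given DEA-derived efficiency labels, three classifiers (decision tree, random forest, neural network) are trained and compared on a roughly 30% holdout. The paper works in R (with C5.0), so this Python/scikit-learn version with synthetic branch features is only an assumed analogue.

```python
# Compare three classifiers at predicting (synthetic) DEA efficiency classes.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=444, n_features=8, random_state=0)  # 444 branches (DMUs)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, random_state=0)

models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=300, random_state=0),
    "neural network": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, "holdout accuracy:", round(accuracy_score(y_te, model.predict(X_te)), 3))
```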

