An Efficient Link Prediction Model Using Supervised Machine Learning

Criminal network activities are usually secret and stealthy, which makes criminal network analysis (CNA) difficult because complete datasets are lacking. Data collected on criminal activities in these networks tends to be incomplete and inconsistent, which shows up structurally in the criminal network as missing nodes (actors) and links (relationships). Criminal networks are commonly analyzed using social network analysis (SNA) models. Most machine learning techniques that rely on SNA metrics to build hidden or missing link prediction models use supervised learning. However, supervised learning usually requires a large dataset to train the link prediction model to an optimum performance level. This research therefore explores the application of deep reinforcement learning (DRL) to developing a hidden link prediction model for criminal networks from the reconstruction of a corrupted criminal network dataset. The experiment conducted on the model indicates that the dataset generated by the DRL model through self-play or self-simulation can be used to train the link prediction model. The DRL link prediction model performs better than a conventional supervised machine learning technique, such as the gradient boosting machine (GBM), trained with a relatively smaller domain dataset.
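To make the supervised baseline described above concrete, the following is a minimal sketch of how labelled training rows for link prediction might be derived from a network with deliberately hidden links. It is not the paper's method: the toy edge list, the helper names (`hide_links`, `pair_features`, `training_rows`), and the choice of common-neighbour count and Jaccard coefficient as SNA-style features are all assumptions made for illustration.

```python
import itertools
import random


def hide_links(edges, fraction, seed=0):
    """Randomly remove a fraction of edges to mimic a corrupted network."""
    rng = random.Random(seed)
    edges = sorted(edges)
    n_hidden = max(1, int(len(edges) * fraction))
    hidden = set(rng.sample(edges, n_hidden))
    observed = [e for e in edges if e not in hidden]
    return observed, hidden


def neighbours(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def pair_features(adj, u, v):
    """Common-neighbour count and Jaccard coefficient for a node pair."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    common = len(nu & nv)
    union = len(nu | nv)
    jaccard = common / union if union else 0.0
    return common, jaccard


def training_rows(nodes, observed, hidden):
    """One row per unobserved pair: label 1 if the pair is a hidden link."""
    adj = neighbours(observed)
    observed_set = set(observed)
    rows = []
    for u, v in itertools.combinations(sorted(nodes), 2):
        if (u, v) in observed_set:
            continue
        label = 1 if (u, v) in hidden else 0
        rows.append(((u, v), pair_features(adj, u, v), label))
    return rows


# Hypothetical toy network; each edge is stored as (smaller, larger) node name.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
nodes = {n for e in edges for n in e}
observed, hidden = hide_links(edges, fraction=0.3, seed=1)
rows = training_rows(nodes, observed, hidden)
for pair, feats, label in rows:
    print(pair, feats, label)
```

Rows of this shape could be passed to any supervised classifier, a GBM included. With a small network only a handful of positive rows exist, which illustrates the data scarcity the abstract raises.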

Comparative Analysis of Supervised Machine Learning Algorithms for GIS-Based Crop Selection Prediction Model

Lecture Notes in Networks and Systems - Computing and Network Sustainability ◽

10.1007/978-981-13-7150-9_33 ◽

2019 ◽

pp. 309-314 ◽

Cited By ~ 1

Author(s):

Preetam Tamsekar ◽

Nilesh Deshmukh ◽

Parag Bhalchandra ◽

Govind Kulkarni ◽

Kailas Hambarde ◽

...

Keyword(s):

Machine Learning ◽

Comparative Analysis ◽

Prediction Model ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Crop Selection

Supervised Machine Learning Applied to Link Prediction in Bipartite Social Networks

2010 International Conference on Advances in Social Networks Analysis and Mining ◽

10.1109/asonam.2010.87 ◽

2010 ◽

Cited By ~ 57

Author(s):

Nesserine Benchettara ◽

Rushed Kanawati ◽

Céline Rouveirol

Keyword(s):

Machine Learning ◽

Social Networks ◽

Link Prediction ◽

Supervised Machine Learning

Student Performance Prediction Model based on Supervised Machine Learning Algorithms

IOP Conference Series Materials Science and Engineering ◽

10.1088/1757-899x/928/3/032019 ◽

2020 ◽

Vol 928 ◽

pp. 032019

Author(s):

Ali Salah Hashim ◽

Wid Akeel Awadh ◽

Alaa Khalaf Hamoud

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Student Performance ◽

Performance Prediction ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Model Based

A Supervised Machine Learning Link Prediction Approach for Tag Recommendation

Online Communities and Social Computing - Lecture Notes in Computer Science ◽

10.1007/978-3-642-21796-8_36 ◽

2011 ◽

pp. 336-344 ◽

Cited By ~ 3

Author(s):

Manisha Pujari ◽

Rushed Kanawati

Keyword(s):

Machine Learning ◽

Link Prediction ◽

Supervised Machine Learning ◽

Tag Recommendation ◽

Prediction Approach

Prediction Model for Bollywood Movie Success: A Comparative Analysis of Performance of Supervised Machine Learning Algorithms

The Review of Socionetwork Strategies ◽

10.1007/s12626-019-00040-6 ◽

2019 ◽

Vol 14 (1) ◽

pp. 1-17

Author(s):

Hemraj Verma ◽

Garima Verma

Keyword(s):

Machine Learning ◽

Comparative Analysis ◽

Prediction Model ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Analysis Of Performance ◽

Movie Success

Non-Invasive Prediction Model to Detect Sepsis using Supervised Machine Learning Algorithms

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.e1012.0285s20 ◽

2020 ◽

Vol 8 (5S) ◽

pp. 50-52

Keyword(s):

Machine Learning ◽

Developing Countries ◽

Prediction Model ◽

Tissue Damage ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Blood Samples ◽

Non Invasive ◽

Life Threatening

Sepsis is a life-threatening disease that causes tissue damage and organ failure and results in the death of millions of people. It is among the highest-risk diseases identified globally. A large proportion of these deaths occur in developing countries because hospitals are inaccessible or resources are lacking. Blood samples are taken to confirm sepsis, but this requires a laboratory and is time-consuming. The aim of this study is to develop a practical, non-invasive sepsis prediction model using supervised machine learning algorithms. For this retrospective analysis, we used data available from the PhysioNet database.

Completeness of reporting of clinical prediction models developed using supervised machine learning: A systematic review

10.1101/2021.06.28.21259089 ◽

2021 ◽

Author(s):

Constanza L Andaur Navarro ◽

Johanna AA Damen ◽

Toshihiko Takada ◽

Steven WJ Nijman ◽

Paula Dhiman ◽

...

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Prediction Models ◽

External Validation ◽

Supervised Machine Learning ◽

Model Specification ◽

Essential Information ◽

Model Studies ◽

Completeness Of Reporting ◽

Complete Reporting

Objective: While many studies have consistently found incomplete reporting of regression-based prediction model studies, evidence is lacking for machine learning-based prediction model studies. Our aim is to systematically review the adherence of Machine Learning (ML)-based prediction model studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Statement. Study design and setting: We included articles reporting on development or external validation of a multivariable prediction model (either diagnostic or prognostic) developed using supervised ML for individualized predictions across all medical fields (PROSPERO, CRD42019161764). We searched PubMed from 1 January 2018 to 31 December 2019. Data extraction was performed using the 22-item checklist for reporting of prediction model studies (www.TRIPOD-statement.org). We measured the overall adherence per article and per TRIPOD item. Results: Our search identified 24 814 articles, of which 152 articles were included: 94 (61.8%) prognostic and 58 (38.2%) diagnostic prediction model studies. Overall, articles adhered to a median of 38.7% (IQR 31.0-46.4) of TRIPOD items. No articles fully adhered to complete reporting of the abstract and very few reported the flow of participants (3.9%, 95% CI 1.8 to 8.3), appropriate title (4.6%, 95% CI 2.2 to 9.2), blinding of predictors (4.6%, 95% CI 2.2 to 9.2), model specification (5.2%, 95% CI 2.4 to 10.8), and model's predictive performance (5.9%, 95% CI 3.1 to 10.9). There was often complete reporting of source of data (98.0%, 95% CI 94.4 to 99.3) and interpretation of the results (94.7%, 95% CI 90.0 to 97.3). Conclusion: Similar to studies using conventional statistical techniques, the completeness of reporting is poor. Essential information to decide to use the model (i.e. model specification and its performance) is rarely reported. However, some items and sub-items of TRIPOD might be less suitable for ML-based prediction model studies and thus, TRIPOD requires extensions. Overall, there is an urgent need to improve the reporting quality and usability of research to avoid research waste.
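As a worked check on how adherence percentages like those above can carry 95% confidence intervals, here is a small sketch of the Wilson score interval. The abstract does not say which interval method the authors used, so Wilson is an assumption. The count of 6 articles is also inferred (6 of 152 is the only whole number that rounds to 3.9%) and is not stated in the source.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Assumed count: 6 of the 152 included articles (inferred from the reported 3.9%).
lo, hi = wilson_interval(6, 152)
print(f"{6 / 152:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```

Under these assumptions the interval comes out at roughly 1.8% to 8.3%, which is consistent with the figures reported for the flow of participants.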

Entropy-Based Time Window Features Extraction for Machine Learning to Predict Acute Kidney Injury in ICU

Applied Sciences ◽

10.3390/app11146364 ◽

2021 ◽

Vol 11 (14) ◽

pp. 6364

Author(s):

Chun-Te Huang ◽

Rong-Ching Chang ◽

Yi-Lu Tsai ◽

Kai-Chih Pai ◽

Tsai-Jung Wang ◽

...

Keyword(s):

Machine Learning ◽

Acute Kidney Injury ◽

Missing Data ◽

Prediction Model ◽

Kidney Injury ◽

Machine Learning Algorithms ◽

Data Availability ◽

Supervised Machine Learning ◽

Data Set ◽

Clinical Scenarios

Acute kidney injury (AKI) refers to a rapid decline of kidney function and is manifested by decreasing urine output or an abnormal blood test (elevated serum creatinine). Electronic health records (EHRs) are fundamental for clinicians and machine learning algorithms to predict the clinical outcome of patients in the intensive care unit (ICU). Early prediction of AKI could automatically warn clinicians to review the possible risk factors and act in advance to prevent it. However, the enormous amount of patient data usually forms a relatively incomplete data set, which is very challenging for the supervised machine learning process. In this paper, we propose an entropy-based feature engineering framework for vital signs based on their frequency of records. In particular, we address the missing at random (MAR) and missing not at random (MNAR) types of missing data according to different clinical scenarios. To assess its applicability, we used it to establish a prediction model for future AKI in ICU patients using 4278 ICU admissions from a tertiary hospital. Our results show that the proposed entropy-based features are feasible for use in the AKI prediction model and that its performance improves as data availability increases. In addition, we study the performance of the AKI prediction model by comparing different time gaps and feature windows with the proposed vital sign entropy features. This work could be used as guidance for feature window selection and missing data processing during the development of a prediction model in the ICU.
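The abstract does not define its entropy feature, so the following is only one plausible reading, offered as a sketch: the Shannon entropy of how a vital sign's recordings are distributed across equal-width bins inside a feature window. The function name, the binning scheme, the toy timestamps, and the choice to return 0.0 for an empty window are all assumptions.

```python
import math
from collections import Counter


def recording_entropy(timestamps, window_start, window_end, n_bins):
    """Shannon entropy (bits) of how recordings spread over equal time bins."""
    width = (window_end - window_start) / n_bins
    counts = Counter()
    for t in timestamps:
        if window_start <= t < window_end:
            # Clamp guards against floating-point rounding at the upper edge.
            bin_index = min(int((t - window_start) // width), n_bins - 1)
            counts[bin_index] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0  # placeholder choice for a window with no recordings
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical recording times (hours since admission) for one vital sign.
regular = [0.5, 1.5, 2.5, 3.5]
bursty = [0.1, 0.2, 0.3]
print(recording_entropy(regular, 0, 4, n_bins=4))
print(recording_entropy(bursty, 0, 4, n_bins=4))
```

In this reading, evenly spaced recordings give high entropy and a single burst gives zero, so the feature summarizes how often and how regularly a vital sign was recorded without imputing missing values.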

Gene function finding through cross-organism ensemble learning

BioData Mining ◽

10.1186/s13040-021-00239-w ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Gianluca Moro ◽

Marco Masseroli

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Ensemble Learning ◽

Gene Function ◽

Web Application ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Biological Information ◽

Ensemble Prediction ◽

Learning Method

Background: Structured biological information about genes and proteins is a valuable resource to improve discovery and understanding of complex biological processes via machine learning algorithms. Gene Ontology (GO) controlled annotations describe, in a structured form, features and functions of genes and proteins of many organisms. However, such valuable annotations are not always reliable and sometimes are incomplete, especially for rarely studied organisms. Here, we present GeFF (Gene Function Finder), a novel cross-organism ensemble learning method able to reliably predict new GO annotations of a target organism from GO annotations of another source organism evolutionarily related and better studied. Results: Using a supervised method, GeFF predicts unknown annotations from random perturbations of existing annotations. The perturbation consists in randomly deleting a fraction of known annotations in order to produce a reduced annotation set. The key idea is to train a supervised machine learning algorithm with the reduced annotation set to predict, namely to rebuild, the original annotations. The resulting prediction model, in addition to accurately rebuilding the original known annotations for an organism from their perturbed version, also effectively predicts new unknown annotations for the organism. Moreover, the prediction model is also able to discover new unknown annotations in different target organisms without retraining. We combined our novel method with different ensemble learning approaches and compared them to each other and to an equivalent single model technique. We tested the method with five different organisms using their GO annotations: Homo sapiens, Mus musculus, Bos taurus, Gallus gallus and Dictyostelium discoideum. The outcomes demonstrate the effectiveness of the cross-organism ensemble approach, which can be customized with a trade-off between the desired number of predicted new annotations and their precision. A Web application to browse both input annotations used and predicted ones, choosing the ensemble prediction method to use, is publicly available at http://tiny.cc/geff/. Conclusions: Our novel cross-organism ensemble learning method provides reliable predicted novel gene annotations, i.e., functions, ranked according to an associated likelihood value. They are very valuable both to speed the annotation curation, focusing it on the prioritized new annotations predicted, and to complement known annotations available.
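The perturbation idea described in the abstract can be sketched as follows. This is not GeFF's implementation: the gene and term identifiers, the helper names, and the flat per-gene 0/1 encoding are hypothetical, chosen only to show how inputs drawn from a reduced annotation set pair with targets drawn from the original set.

```python
import random


def perturb_annotations(annotations, fraction, seed=0):
    """Randomly delete a fraction of known (gene, term) annotations."""
    rng = random.Random(seed)
    known = sorted(annotations)
    n_delete = int(len(known) * fraction)
    deleted = set(rng.sample(known, n_delete))
    reduced = set(known) - deleted
    return reduced, deleted


def training_pairs(genes, terms, original, reduced):
    """Inputs come from the reduced set; targets come from the original set."""
    rows = []
    for g in genes:
        x = [1 if (g, t) in reduced else 0 for t in terms]
        y = [1 if (g, t) in original else 0 for t in terms]
        rows.append((g, x, y))
    return rows


# Hypothetical toy data: three genes, four annotation terms.
genes = ["g1", "g2", "g3"]
terms = ["t1", "t2", "t3", "t4"]
original = {("g1", "t1"), ("g1", "t2"), ("g2", "t2"),
            ("g2", "t3"), ("g3", "t1"), ("g3", "t4")}
reduced, deleted = perturb_annotations(original, fraction=0.34, seed=42)
for gene, x, y in training_pairs(genes, terms, original, reduced):
    print(gene, x, y)
```

A supervised model trained to map each `x` to its `y` learns to restore deleted annotations. Annotations it predicts beyond the original set would then be the candidate new annotations, in the sense the abstract describes.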
