Applying Machine Learning for Healthcare: A Case Study on Cervical Pain Assessment with Motion Capture

Given the exponential availability of data in health centers and the massive sensorization that is expected, there is an increasing need to manage and analyze these data in an effective way. For this purpose, data mining (DM) and machine learning (ML) techniques would be helpful. However, due to the specific characteristics of the field of healthcare, a suitable DM and ML methodology adapted to these particularities is required. The applied methodology must structure the different stages needed for data-driven healthcare, from the acquisition of raw data to decision-making by clinicians, considering the specific requirements of this field. In this paper, we focus on a case study of cervical assessment, where the goal is to predict the potential presence of cervical pain in patients affected with whiplash diseases, which is important for example in insurance-related investigations. By analyzing in detail this case study in a real scenario, we show how taking care of those particularities enables the generation of reliable predictive models in the field of healthcare. Using a database of 302 samples, we have generated several predictive models, including logistic regression, support vector machines, k-nearest neighbors, gradient boosting, decision trees, random forest, and neural network algorithms. The results show that it is possible to reliably predict the presence of cervical pain (accuracy, precision, and recall above 90%). We expect that the procedure proposed to apply ML techniques in the field of healthcare will help technologists, researchers, and clinicians to create more objective systems that provide support to objectify the diagnosis, improve test treatment efficacy, and save resources.

Download Full-text

Transformational machine learning: Learning how to learn from many related scientific problems

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.2108013118 ◽

2021 ◽

Vol 118 (49) ◽

pp. e2108013118

Author(s):

Ivan Olier ◽

Oghenejokpeme I. Orhobor ◽

Tirtharaj Dash ◽

Andy M. Davis ◽

Larisa N. Soldatova ◽

...

Keyword(s):

Machine Learning ◽

Drug Design ◽

Predictive Performance ◽

Gradient Boosting ◽

Support Vector ◽

Algorithm Selection ◽

K Nearest Neighbors ◽

Vector Machines ◽

Almost All ◽

Insight Into

Almost all machine learning (ML) is based on representing examples using intrinsic features. When there are multiple related ML problems (tasks), it is possible to transform these features into extrinsic features by first training ML models on other tasks and letting them each make predictions for each example of the new task, yielding a novel representation. We call this transformational ML (TML). TML is very closely related to, and synergistic with, transfer learning, multitask learning, and stacking. TML is applicable to improving any nonlinear ML method. We tested TML using the most important classes of nonlinear ML: random forests, gradient boosting machines, support vector machines, k-nearest neighbors, and neural networks. To ensure the generality and robustness of the evaluation, we utilized thousands of ML problems from three scientific domains: drug design, predicting gene expression, and ML algorithm selection. We found that TML significantly improved the predictive performance of all the ML methods in all the domains (4 to 50% average improvements) and that TML features generally outperformed intrinsic features. Use of TML also enhances scientific understanding through explainable ML. In drug design, we found that TML provided insight into drug target specificity, the relationships between drugs, and the relationships between target proteins. TML leads to an ecosystem-based approach to ML, where new tasks, examples, predictions, and so on synergistically interact to improve performance. To contribute to this ecosystem, all our data, code, and our ∼50,000 ML models have been fully annotated with metadata, linked, and openly published using Findability, Accessibility, Interoperability, and Reusability principles (∼100 Gbytes).

Download Full-text

Applications of Machine Learning Predictive Models in the Chronic Disease Diagnosis

Journal of Personalized Medicine ◽

10.3390/jpm10020021 ◽

2020 ◽

Vol 10 (2) ◽

pp. 21 ◽

Cited By ~ 4

Author(s):

Gopi Battineni ◽

Getu Gamo Sagaro ◽

Nalini Chinatalapudi ◽

Francesco Amenta

Keyword(s):

Machine Learning ◽

Chronic Diseases ◽

Predictive Models ◽

Disease Diagnosis ◽

Primary Diagnosis ◽

Support Vector ◽

Advantages And Disadvantages ◽

Vector Machines ◽

Applications Of Machine Learning ◽

Near Future

This paper reviews applications of machine learning (ML) predictive models in the diagnosis of chronic diseases. Chronic diseases (CDs) are responsible for a major portion of global health costs. Patients who suffer from these diseases need lifelong treatment. Nowadays, predictive models are frequently applied in the diagnosis and forecasting of these diseases. In this study, we reviewed the state-of-the-art approaches that encompass ML models in the primary diagnosis of CD. This analysis covers 453 papers published between 2015 and 2019, and our document search was conducted from PubMed (Medline), and Cumulative Index to Nursing and Allied Health Literature (CINAHL) libraries. Ultimately, 22 studies were selected to present all modeling methods in a precise way that explains CD diagnosis and usage models of individual pathologies with associated strengths and limitations. Our outcomes suggest that there are no standard methods to determine the best approach in real-time clinical practice since each method has its advantages and disadvantages. Among the methods considered, support vector machines (SVM), logistic regression (LR), clustering were the most commonly used. These models are highly applicable in classification, and diagnosis of CD and are expected to become more important in medical practice in the near future.

Download Full-text

Detection of Loss Zones while Drilling Using Different Machine Learning Techniques

Journal of Energy Resources Technology ◽

10.1115/1.4051553 ◽

2021 ◽

pp. 1-29

Author(s):

Ahmed Alsaihati ◽

Mahmoud Abughaban ◽

Salaheldin Elkatatny ◽

Abdulazeez Abdulraheem

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Random Forests ◽

Nearest Neighbors ◽

Machine Learning Techniques ◽

Support Vector ◽

K Nearest Neighbors ◽

Learning Techniques ◽

Vector Machines ◽

Testing Set

Abstract Fluid loss into formations is a common operational issue that is frequently encountered when drilling across naturally or induced fractured formations. This could pose significant operational risks, such as well-control, stuck pipe, and wellbore instability, which, in turn, lead to an increase of well time and cost. This research aims to use and evaluate different machine learning techniques, namely: support vector machines, random forests, and K-nearest neighbors in detecting loss circulation occurrences while drilling using solely drilling surface parameters. Actual field data of seven wells, which had suffered partial or severe loss circulation, were used to build predictive models, while Well-8 was used to compare the performance of the developed models. Different performance metrics were used to evaluate the performance of the developed models. Recall, precision, and F1-score measures were used to evaluate the ability of the developed model to detect loss circulation occurrences. The results showed the K-nearest neighbors classifier achieved a high F1-score of 0.912 in detecting loss circulation occurrence in the testing set, while the random forests was the second-best classifier with almost the same F1-score of 0.910. The support vector machines achieved an F1-score of 0.83 in predicting the loss circulation occurrence in the testing set. The K-nearest neighbors outperformed other models in detecting the loss circulation occurrences in Well-8 with an F1-score of 0.80. The main contribution of this research as compared to previous studies is that it identifies losses events based on real-time measurements of the active pit volume.

Download Full-text

Machine Learning Application for Gas Lift Performance and Well Integrity

10.2118/205134-ms ◽

2021 ◽

Author(s):

Mostafa Sa'eed Yakoot ◽

Adel Mohamed Salem Ragab ◽

Omar Mahmoud

Keyword(s):

Machine Learning ◽

Business Performance ◽

Risk Level ◽

Gradient Boosting ◽

Support Vector ◽

Risk Category ◽

K Nearest Neighbors ◽

Failure Data ◽

Well Integrity ◽

Gas Lift

Abstract Constructing and maintaining integrity for different types of wells requires accurate assessment of posed risk level, especially when one barrier element or group of barriers fails. Risk assessment and well integrity (WI) categorization is conducted typically using traditional spreadsheets and in-house software that contain their own inherent errors. This is mainly because they are subjected to the understanding and the interpretation of the assigned team to WI data. Because of these limitations, industrial practices involve the collection and analysis of failure data to estimate risk level through certain established probability/likelihood matrices. However, those matrices have become less efficient due to the possible bias in failure data and consequent misleading assessment. The main objective of this work is to utilize machine learning (ML) algorithms to develop a powerful model and predict WI risk category of gas-lifted wells. ML algorithms implemented in this study are; logistic regression, decision trees, random forest, support vector machines, k-nearest neighbors, and gradient boosting algorithms. In addition, those algorithms are used to develop physical equation to predict risk category. Three thousand WI and gas-lift datasets were collected, preprocessed, and fed into the ML model. The newly developed model can predict well risk level and provide a unique methodology to convert associated failure risk of each element in the well envelope into tangible value. This shows the total potential risk and hence the status of well-barrier integrity overall. The implementation of ML can enhance brownfield asset operations, reduce intervention costs, better control WI through the field, improve business performance, and optimize production.

Download Full-text

Machine learning predictive models for mineral prospectivity: An evaluation of neural networks, random forest, regression trees and support vector machines

Ore Geology Reviews ◽

10.1016/j.oregeorev.2015.01.001 ◽

2015 ◽

Vol 71 ◽

pp. 804-818 ◽

Cited By ~ 226

Author(s):

V. Rodriguez-Galiano ◽

M. Sanchez-Castillo ◽

M. Chica-Olmo ◽

M. Chica-Rivas

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Support Vector Machines ◽

Random Forest ◽

Predictive Models ◽

Regression Trees ◽

Support Vector ◽

Random Forest Regression ◽

Vector Machines ◽

Mineral Prospectivity

Download Full-text

Evaluating machine learning techniques for archaeological lithic sourcing: a case study of flint in Britain

Scientific Reports ◽

10.1038/s41598-021-87834-3 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Tom Elliot ◽

Robert Morse ◽

Duane Smythe ◽

Ashley Norris

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Random Forest ◽

Objective Evaluation ◽

Classification Performance ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Techniques ◽

Vector Machines

AbstractIt is 50 years since Sieveking et al. published their pioneering research in Nature on the geochemical analysis of artefacts from Neolithic flint mines in southern Britain. In the decades since, geochemical techniques to source stone artefacts have flourished globally, with a renaissance in recent years from new instrumentation, data analysis, and machine learning techniques. Despite the interest over these latter approaches, there has been variation in the quality with which these methods have been applied. Using the case study of flint artefacts and geological samples from England, we present a robust and objective evaluation of three popular techniques, Random Forest, K-Nearest-Neighbour, and Support Vector Machines, and present a pipeline for their appropriate use. When evaluated correctly, the results establish high model classification performance, with Random Forest leading with an average accuracy of 85% (measured through F1 Scores), and with Support Vector Machines following closely. The methodology developed in this paper demonstrates the potential to significantly improve on previous approaches, particularly in removing bias, and providing greater means of evaluation than previously utilised.

Download Full-text

PREDICTION OF FATIGUE CRACK GROWTH DIAGRAMS BY METHODS OF MACHINE LEARNING UNDER CONSTANT AMPLITUDE LOADING

Acta Metallurgica Slovaca ◽

10.36547/ams.26.1.346 ◽

2020 ◽

Vol 26 (1) ◽

pp. 31-33

Author(s):

Oleh Yasniy ◽

Iryna Didych ◽

Yuri Lapusta

Keyword(s):

Machine Learning ◽

Fatigue Crack ◽

Fatigue Crack Growth ◽

Crack Growth ◽

Support Vector ◽

Structural Elements ◽

Constant Amplitude ◽

K Nearest Neighbors ◽

Vector Machines ◽

Stress Ratios

Important structural elements are often under the action of constant amplitude loading. Increasing their lifetime is an actual task and of great economic importance. To evaluate the lifetime of structural elements, it is necessary to be able to predict the fatigue crack growth rate (FCG). This task can be effectively solved by methods of machine learning, in particular by neural networks, boosted trees, support-vector machines, and k -nearest neighbors. The aim of the present work was to build the fatigue crack growth diagrams of steel 0.45% C subjected to constant amplitude loading at stress ratios R = 0, and R = –1 by the methods of machine learning. The obtained results are in good agreement with the experimental data.

Download Full-text

APTITUDE Framework for Learning Data Classification Based on Machine Learning

International Journal of Circuits, Systems and Signal Processing ◽

10.46300/9106.2020.14.51 ◽

2020 ◽

Vol 14 ◽

Keyword(s):

Machine Learning ◽

Data Classification ◽

Nearest Neighbors ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbors ◽

Course Content ◽

Applied Model ◽

Vector Machines ◽

Learning Data

Learning analytics refers to the machine learning to provide predictions of learner success and prescriptions to learners and teachers. The main goal of paper is to proposed APTITUDE framework for learning data classification in order to achieve an adaptation and recommendations a course content or flow of course activities. This framework has applied model for student learning prediction based on machine learning. The five machine learning algorithms are used to provide learning data classification: random forest, Naïve Bayes, k-nearest neighbors, logistic regression and support vector machines

Download Full-text

Human Papillomavirus Targeted Immunotherapy Outcome Prediction Using Machine Learning

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.37197 ◽

2021 ◽

Vol 9 (VII) ◽

pp. 3598-3611

Author(s):

Vidya Moni

Keyword(s):

Machine Learning ◽

Human Papillomavirus ◽

Outcome Prediction ◽

Performance Comparison ◽

Gradient Boosting ◽

Support Vector ◽

Machine Learning Classification ◽

Nearest Neighbours ◽

Vector Machines ◽

Modern Machine

Warts caused by the Human Papillomavirus (HPV) is a highly contagious disease, and affects several million people across the globe every year, in the form of small lesions on the skin, commonly known as warts. Warts can be treated effectively with several methods, the most effective being Immunotherapy and Cryotherapy. Our research is focused on the performance comparison of modern Machine Learning classification techniques to predict the outcome (positive or negative) of Immunotherapy treatment given to a patient, by using patient data as input features to our classifiers. The precision, recall, f-measure and accuracy were used to compare the performance of the various classifiers considered in this study. We considered Logistic Regression, ZeroR, AdaBoost, K-Nearest Neighbours (KNN), Support Vector Machines (SVM), Gradient Boosting, Repeated Incremental Pruning to Produce Error Reduction (RIPPER), Decision Trees and Random Forests. The ZeroR classifier was used as a baseline to provide us with insights into the skewed nature of the data, so as to enable us to better understand the comparison in performance of the various classifiers.

Download Full-text

Machine Learning Based Toxicity Prediction: From Chemical Structural Description to Transcriptome Analysis

International Journal of Molecular Sciences ◽

10.3390/ijms19082358 ◽

2018 ◽

Vol 19 (8) ◽

pp. 2358 ◽

Cited By ~ 26

Author(s):

Yunyi Wu ◽

Guanyu Wang

Keyword(s):

Machine Learning ◽

Language Processing ◽

Learning Algorithm ◽

Input Parameter ◽

Support Vector ◽

Structural Description ◽

Toxicity Prediction ◽

K Nearest Neighbors ◽

Vector Machines ◽

The Cost

Toxicity prediction is very important to public health. Among its many applications, toxicity prediction is essential to reduce the cost and labor of a drug’s preclinical and clinical trials, because a lot of drug evaluations (cellular, animal, and clinical) can be spared due to the predicted toxicity. In the era of Big Data and artificial intelligence, toxicity prediction can benefit from machine learning, which has been widely used in many fields such as natural language processing, speech recognition, image recognition, computational chemistry, and bioinformatics, with excellent performance. In this article, we review machine learning methods that have been applied to toxicity prediction, including deep learning, random forests, k-nearest neighbors, and support vector machines. We also discuss the input parameter to the machine learning algorithm, especially its shift from chemical structural description only to that combined with human transcriptome data analysis, which can greatly enhance prediction accuracy.

Download Full-text