Combined In-silico and Machine Learning Approaches Toward Predicting Arrhythmic Risk in Post-infarction Patients

Frontiers in Physiology ◽

10.3389/fphys.2021.745349 ◽

2021 ◽

Vol 12 ◽

Author(s):

Mary M. Maleckar ◽

Lena Myklebust ◽

Julie Uv ◽

Per Magne Florvaag ◽

Vilde Strøm ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

In Silico ◽

Data Augmentation ◽

Nearest Neighbors ◽

Patient Data ◽

Patient Specific ◽

Support Vector ◽

Geometric Features ◽

K Nearest Neighbors

Background: Remodeling due to myocardial infarction (MI) significantly increases patient arrhythmic risk. Simulations using patient-specific models have shown promise in predicting personalized risk for arrhythmia. However, these are computationally intensive and time-consuming, hindering translation to clinical practice. Classical machine learning (ML) algorithms (such as K-nearest neighbors, Gaussian support vector machines, and decision trees), as well as neural network techniques shown to increase prediction accuracy, can be used to predict the occurrence of arrhythmia, as predicted by simulations, based solely on infarct and ventricular geometry. We present an initial combined image-based, patient-specific in silico and machine learning methodology to assess risk for dangerous arrhythmia in post-infarct patients. Furthermore, we aim to demonstrate that simulation-supported data augmentation improves prediction models, combining patient data, computational simulation, and advanced statistical modeling to improve overall accuracy for arrhythmia risk assessment.

Methods: MRI-based computational models were constructed from 30 patients 5 days post-MI (the “baseline” population). In order to assess the utility of biophysical model-supported data augmentation for improving arrhythmia prediction, we augmented the virtual baseline patient population. Each patient's ventricular and ischemic geometry in the baseline population was used to create a subfamily of geometric models, resulting in an expanded set of patient models (the “augmented” population). Arrhythmia induction was attempted via programmed stimulation at 17 sites for each virtual patient, corresponding to AHA LV segments, and the simulation outcomes, “arrhythmia” or “no-arrhythmia,” were used as ground truth for subsequent statistical prediction (machine learning, ML) models. For each patient geometric model, we measured selected features (the myocardial volume and ischemic volume, as well as the segment-specific myocardial volume and ischemia percentage) and used them as input to ML algorithms. For classical ML techniques (ML), we trained k-nearest neighbors, support vector machine, logistic regression, xgboost, and decision tree models to predict the simulation outcome from these geometric features alone. To explore neural network ML techniques, we trained both three- and four-hidden-layer multilayer perceptron feed-forward neural networks (NN), again predicting simulation outcomes from these geometric features alone. ML and NN models were trained on 70% of randomly selected segments, and the remaining 30% was used for validation, for both baseline and augmented populations.

Results: Stimulation in the baseline population (30 patient models) resulted in reentry in 21.8% of sites tested; in the augmented population (129 total patient models) reentry occurred in 13.0% of sites tested. ML and NN models ranged in mean accuracy from 0.83 to 0.86 for the baseline population, improving to 0.88 to 0.89 in all cases for the augmented population.

Conclusion: Machine learning techniques, combined with patient-specific, image-based computational simulations, can provide key clinical insights with high accuracy rapidly and efficiently. In the case of sparse or missing patient data, simulation-supported data augmentation can be employed to further improve predictive results for patient benefit. This work paves the way for using data-driven simulations for prediction of dangerous arrhythmia in MI patients.
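As a rough illustration of the 70%/30% train/validation protocol described in this abstract, the sketch below splits a set of per-segment feature rows and scores a plain k-nearest-neighbors classifier. It is a minimal sketch, not the authors' pipeline: the data are synthetic, the two features and the labelling rule are invented for illustration, and the helper names (`train_test_split`, `knn_predict`, `accuracy`) are hypothetical.

```python
import math
import random
from collections import Counter

def train_test_split(rows, labels, train_fraction=0.7, seed=0):
    """Shuffle indices, then split into training and validation parts."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_fraction * len(idx)))
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of validation rows whose predicted label matches the truth."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_y)

# Synthetic stand-in data (NOT from the study). Each row is
# (segment myocardial volume, segment ischemia percentage).
rng = random.Random(1)
rows, labels = [], []
for _ in range(100):
    volume = rng.uniform(5.0, 15.0)
    ischemia = rng.uniform(0.0, 100.0)
    rows.append((volume, ischemia))
    labels.append(1 if ischemia > 60.0 else 0)  # toy rule, assumption only

tr_x, tr_y, te_x, te_y = train_test_split(rows, labels)
print(len(tr_x), len(te_x))
print(accuracy(tr_x, tr_y, te_x, te_y))
```

In a real setting the features would normally be standardized before computing distances, since volume and percentage live on different scales.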

Analisis Perbandingan Algoritma SVM, KNN, dan CNN untuk Klasifikasi Citra Cuaca (Comparative Analysis of SVM, KNN, and CNN Algorithms for Weather Image Classification)

Jurnal Teknologi Informasi dan Ilmu Komputer ◽

10.25126/jtiik.2021824553 ◽

2021 ◽

Vol 8 (2) ◽

pp. 311

Author(s):

Mohammad Farid Naufal

Keyword(s):

Neural Network ◽

Machine Learning ◽

Computer Vision ◽

Support Vector Machine ◽

Convolutional Neural Network ◽

Cross Validation ◽

Nearest Neighbors ◽

Support Vector ◽

Classification Algorithms ◽

K Nearest Neighbors

Weather is an important factor considered in many kinds of decision making. Manual weather classification by humans is time-consuming and inconsistent. Computer vision is a branch of science that computers use to recognize or classify images. This can help develop autonomous machines that do not depend on an internet connection and can perform their own calculations in real time. There are several popular image classification algorithms, namely K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Convolutional Neural Network (CNN). KNN and SVM are machine learning classification algorithms, while CNN is a deep neural network classification algorithm. This study aims to compare the performance of these three algorithms so that the performance gap between them is known. The experiments use 5-fold cross-validation. Several parameters are used to configure the KNN, SVM, and CNN algorithms. In the experiments, CNN achieved the best performance, with an accuracy of 0.942, precision of 0.943, recall of 0.942, and F1 score of 0.942.
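The abstract's "5 cross validation" is read here as 5-fold cross-validation, which is an assumption. Under that reading, the data are partitioned into five folds and each fold serves once as the test set while the other four are used for training. The sketch below only generates the index partitions; the function name `k_fold_indices` is hypothetical, and the paper's own tooling is not stated.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Fold sizes differ by at most one sample when n_samples is not
    divisible by k.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

folds = list(k_fold_indices(12, k=5))
for train, test in folds:
    # Each sample index appears in exactly one test fold.
    print(len(train), test)
```

The folds here are contiguous, so the sample order should be shuffled beforehand if the data are sorted by class. The reported metrics would then be averaged over the five test folds.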

A Machine Learning-based System for Financial Fraud Detection

10.5753/eniac.2021.18250 ◽

2021 ◽

Author(s):

João Paulo A. Andrade ◽

Leonardo S. Paulucio ◽

Thiago M. Paixão ◽

Rodrigo F. Berriel ◽

Teresa Cristina Janes Carneiro ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Random Forest ◽

Nearest Neighbors ◽

Financial Data ◽

Support Vector ◽

Financial Fraud ◽

K Nearest Neighbors ◽

Governmental Agencies ◽

A Company

Companies created for money laundering or as a means for tax evasion are harmful to the country's economy and society. This problem is usually tackled by governmental agencies by having officials pore over companies' financial data and single out those that exhibit fraudulent behavior. Such work tends to be slow-paced and tedious. This paper proposes a machine learning-based system capable of classifying whether a company is likely to be involved in fraud or not. Based on financial and tax data from various companies, four different classifiers – k-Nearest Neighbors, Random Forest, Support Vector Machine (SVM), and a Neural Network – were trained and then used to indicate fraud. The best-performing model, the Random Forest, achieved a macro-averaged F1-score of 92.98%.
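For readers unfamiliar with the metric, a macro-averaged F1-score computes F1 separately for each class and then takes the unweighted mean, so a rare class such as fraud counts as much as the majority class. The sketch below shows that computation on a tiny invented label set; the labels and the function name `macro_f1` are illustrative assumptions, not data or code from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)

# Invented example labels (NOT from the paper).
y_true = ["fraud", "fraud", "legit", "legit", "legit", "legit"]
y_pred = ["fraud", "legit", "legit", "legit", "legit", "fraud"]
print(macro_f1(y_true, y_pred))  # 0.625: mean of F1=0.5 (fraud) and F1=0.75 (legit)
```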

The influence of knowledge governance and boundary-spanning search on innovation performance

Modern Physics Letters B ◽

10.1142/s0217984920503261 ◽

2020 ◽

Vol 34 (29) ◽

pp. 2050326

Author(s):

Ning Cao ◽

Jianjun Wang

Keyword(s):

Neural Network ◽

Detrimental Effect ◽

Nearest Neighbors ◽

Boundary Spanning ◽

Support Vector ◽

K Nearest Neighbors ◽

Machine Learning Methods ◽

High Knowledge ◽

Knowledge Governance ◽

Exploratory Innovation

The realization of exploratory innovation is a complex and nonlinear evolutionary problem. Existing works point out that it is closely related to knowledge governance and boundary-spanning search. However, the intricate relationship among them still lacks an exact quantitative explanation. Motivated by this, we use four machine learning methods, namely linear regression (LR), neural network (NN), support vector machine (SVM), and k-nearest neighbors (KNN), to explore how boundary-spanning search combined with knowledge governance influences innovation. Results show that SVM has the highest stability and goodness of fit. The SVM results show that the combination of low knowledge governance and high boundary-spanning search boosts innovation most efficiently, while high knowledge governance combined with low boundary-spanning search has the most detrimental effect on innovation. Our results reveal that enhancing boundary-spanning search is essential and beneficial to innovation.

Data Mining Techniques for Identification and Classification of Various Diseases in Plants

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b1110.1292s19 ◽

2019 ◽

Vol 9 (2S) ◽

pp. 676-680

Keyword(s):

Neural Network ◽

Data Mining ◽

Nearest Neighbors ◽

Crop Productivity ◽

Vital Role ◽

Support Vector ◽

Data Sets ◽

K Nearest Neighbors ◽

Data Mining Techniques

Data mining is currently used in various applications and plays a vital role in the research community. This paper describes data mining techniques for the preprocessing and classification of various diseases in plants. Because different plants have different diseases, each has different data sets and different objectives for knowledge discovery. Data mining techniques applied to plants help in the segmentation and classification of diseased plants, avoid oral inspection, and help increase crop productivity. This paper presents various classification techniques, such as K-Nearest Neighbors, Support Vector Machine, Principal Component Analysis, and Neural Network. Among these techniques, the neural network is effective for disease detection in plants.

An Efficient and Fast Model Reduced Kernel KNN for Human Activity Recognition

Journal of Advanced Transportation ◽

10.1155/2021/2026895 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Zongying Liu ◽

Shaoxi Li ◽

Jiangling Hao ◽

Jingfeng Hu ◽

Mingyang Pan

Keyword(s):

Neural Network ◽

Activity Recognition ◽

Human Activity ◽

Processing Time ◽

Kernel Method ◽

Nearest Neighbors ◽

Human Activity Recognition ◽

Support Vector ◽

K Nearest Neighbors ◽

Proposed Model

With the accumulation of data and the development of artificial intelligence, human activity recognition has attracted much attention from researchers. Many classic machine learning algorithms, such as artificial neural networks, feed-forward neural networks, K-nearest neighbors, and support vector machines, achieve good performance in detecting human activity. However, these algorithms have their own limitations, and their prediction accuracy still has room for improvement. In this study, we focus on K-nearest neighbors (KNN) and address its limitations. First, a kernel method is employed in KNN, which transforms the input features into high-dimensional features. The proposed KNN with kernel (K-KNN) improves classification accuracy. Second, a novel reduced kernel method is proposed and used in K-KNN; the resulting model is named Reduced Kernel KNN (RK-KNN). It reduces the processing time and enhances the classification performance. Moreover, this study proposes an approach for defining the number of neighbors K, which reduces the parameter dependency problem. In the experiments, the proposed RK-KNN obtains the best performance on benchmark and human activity datasets compared with other models, showing strong classification ability in human activity recognition. Its accuracy on human activity data is 91.60% for HAPT and 92.67% for Smartphone. On average, compared with conventional KNN, RK-KNN increases accuracy by 1.82% and decreases the standard deviation by 0.27. The gap in processing time between KNN and RK-KNN across all datasets is only 1.26 seconds.
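One common way to use a kernel inside KNN is to measure distance in the kernel-induced feature space, where the squared distance between two points is k(x, x) - 2 k(x, y) + k(y, y). The sketch below applies that identity with an RBF kernel. It illustrates the general kernel-KNN idea only: the abstract does not say which kernel the authors use, and the reduced-kernel step of RK-KNN is not reproduced here. The kernel choice, `gamma` value, sample points, and function names are all assumptions.

```python
import math
from collections import Counter

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative choice."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def kernel_distance_sq(x, y, kernel=rbf_kernel):
    """Squared distance between x and y in the kernel's feature space."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)

def kernel_knn_predict(train_x, train_y, query, k=3, kernel=rbf_kernel):
    """Majority vote among the k training points closest in feature space."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda p: kernel_distance_sq(p[0], query, kernel))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

# Invented two-feature samples (NOT from the HAPT or Smartphone datasets).
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
           (2.0, 2.0), (2.1, 1.9), (1.9, 2.2)]
train_y = ["sit", "sit", "sit", "walk", "walk", "walk"]

print(kernel_knn_predict(train_x, train_y, (0.15, 0.1)))
print(kernel_knn_predict(train_x, train_y, (2.0, 2.1)))
```

With an RBF kernel the feature-space distance grows monotonically with the Euclidean distance, so the neighbor ranking matches plain KNN. Passing a different kernel function, for example a polynomial one, can change which neighbors are selected.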

Analisis Perbandingan Algoritma Klasifikasi Citra Chest X-ray Untuk Deteksi Covid-19 (Comparative Analysis of Chest X-ray Image Classification Algorithms for Covid-19 Detection)

Teknika ◽

10.34148/teknika.v10i2.331 ◽

2021 ◽

Vol 10 (2) ◽

pp. 96-103

Author(s):

Mohammad Farid Naufal ◽

Selvia Ferdiana Kusuma ◽

Kevin Christian Tanus ◽

Raynaldy Valentino Sukiwun ◽

Joseph Kristiano ◽

...

Keyword(s):

Neural Network ◽

Support Vector Machine ◽

Cross Validation ◽

Nearest Neighbor ◽

Nearest Neighbors ◽

Support Vector ◽

K Nearest Neighbor ◽

K Nearest Neighbors ◽

X Ray ◽

Chest X Ray

The global Covid-19 pandemic that emerged at the end of 2019 has become a major problem for every country in the world. Covid-19 is a virus that attacks the lungs and can cause death. Many Covid-19 patients have been treated in hospitals, so chest X-ray images of the lungs of infected patients are available. Many studies have already classified chest X-ray images using a Convolutional Neural Network (CNN) to distinguish healthy lungs, lungs infected with Covid-19, and other lung diseases, but no study has yet compared the performance of CNN with classical machine learning algorithms such as Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) to determine the gap in performance and in required execution time. This study aims to compare the performance and execution time of the K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and CNN classification algorithms for detecting Covid-19 from chest X-ray images. Based on tests using 5-fold cross-validation, CNN had the best average performance, with an accuracy of 0.9591, precision of 0.9592, recall of 0.9591, and F1 score of 0.959, at an average execution time of 3102.562 seconds.

PigLeg: prediction of swine phenotype using machine learning

PeerJ ◽

10.7717/peerj.8764 ◽

2020 ◽

Vol 8 ◽

pp. e8764 ◽

Cited By ~ 2

Author(s):

Siroj Bakoev ◽

Lyubov Getmantseva ◽

Maria Kolosova ◽

Olga Kostyunina ◽

Duane R. Chartier ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Algorithms ◽

Average Daily Gain ◽

Nearest Neighbors ◽

The State ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbors ◽

Leg Weakness

Industrial pig farming is associated with negative technological pressure on the bodies of pigs. Leg weakness and lameness are the sources of significant economic loss in raising pigs. Therefore, it is important to identify the predictors of limb condition. This work presents assessments of the state of limbs using indicators of growth and meat characteristics of pigs based on machine learning algorithms. We have evaluated and compared the accuracy of prediction for nine ML classification algorithms (Random Forest, K-Nearest Neighbors, Artificial Neural Networks, C50Tree, Support Vector Machines, Naive Bayes, Generalized Linear Models, Boost, and Linear Discriminant Analysis) and have identified the Random Forest and K-Nearest Neighbors as the best-performing algorithms for predicting pig leg weakness using a small set of simple measurements that can be taken at an early stage of animal development. Measurements of Muscle Thickness, Back Fat amount, and Average Daily Gain were found to be significant predictors of the conformation of pig limbs. Our work demonstrates the utility and relative ease of using machine learning algorithms to assess the state of limbs in pigs based on growth rate and meat characteristics.

Detection of Loss Zones while Drilling Using Different Machine Learning Techniques

Journal of Energy Resources Technology ◽

10.1115/1.4051553 ◽

2021 ◽

pp. 1-29

Author(s):

Ahmed Alsaihati ◽

Mahmoud Abughaban ◽

Salaheldin Elkatatny ◽

Abdulazeez Abdulraheem

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Random Forests ◽

Nearest Neighbors ◽

Machine Learning Techniques ◽

Support Vector ◽

K Nearest Neighbors ◽

Learning Techniques ◽

Vector Machines ◽

Testing Set

Fluid loss into formations is a common operational issue that is frequently encountered when drilling across naturally or induced fractured formations. This could pose significant operational risks, such as well-control, stuck pipe, and wellbore instability, which, in turn, lead to an increase in well time and cost. This research aims to use and evaluate different machine learning techniques, namely support vector machines, random forests, and K-nearest neighbors, in detecting loss circulation occurrences while drilling using solely drilling surface parameters. Actual field data from seven wells, which had suffered partial or severe loss circulation, were used to build predictive models, while Well-8 was used to compare the performance of the developed models. Recall, precision, and F1-score measures were used to evaluate the ability of the developed models to detect loss circulation occurrences. The results showed that the K-nearest neighbors classifier achieved a high F1-score of 0.912 in detecting loss circulation occurrences in the testing set, while random forests was the second-best classifier with almost the same F1-score of 0.910. The support vector machines achieved an F1-score of 0.83 in predicting loss circulation occurrences in the testing set. The K-nearest neighbors classifier outperformed the other models in detecting loss circulation occurrences in Well-8, with an F1-score of 0.80. The main contribution of this research compared with previous studies is that it identifies loss events based on real-time measurements of the active pit volume.

Machine Learning Approach to Dysphonia Detection

Applied Sciences ◽

10.3390/app8101927 ◽

2018 ◽

Vol 8 (10) ◽

pp. 1927 ◽

Cited By ~ 1

Author(s):

Zuzana Dankovičová ◽

Dávid Sovák ◽

Peter Drotár ◽

Liberios Vokorokos

Keyword(s):

Machine Learning ◽

State Of The Art ◽

Nearest Neighbors ◽

Classification Model ◽

Support Vector ◽

Learning Approach ◽

K Nearest Neighbors ◽

Machine Learning Methods ◽

Machine Learning Approach ◽

Speech Features

This paper addresses the processing of speech data and their utilization in a decision support system. The main aim of this work is to utilize machine learning methods to recognize pathological speech, particularly dysphonia. We extracted 1560 speech features and used these to train the classification model. As classifiers, three state-of-the-art methods were used: K-nearest neighbors, random forests, and support vector machine. We analyzed the performance of the classifiers with and without gender taken into account. The experimental results showed that it is possible to recognize pathological speech with a classification accuracy as high as 91.3%.

Using Item Response Theory for Explainable Machine Learning in Predicting Mortality in the Intensive Care Unit: Case-Based Approach

Journal of Medical Internet Research ◽

10.2196/20268 ◽

2020 ◽

Vol 22 (9) ◽

pp. e20268

Author(s):

Adrienne Kline ◽

Theresa Kline ◽

Zahra Shakeri Hossein Abad ◽

Joon Lee

Keyword(s):

Neural Network ◽

Machine Learning ◽

Intensive Care Unit ◽

Logistic Regression ◽

Intensive Care ◽

Item Response ◽

Nearest Neighbors ◽

K Nearest Neighbors ◽

Linear Discriminant ◽

Case Based

Background: Supervised machine learning (ML) is being featured in the health care literature with study results frequently reported using metrics such as accuracy, sensitivity, specificity, recall, or F1 score. Although each metric provides a different perspective on the performance, they remain to be overall measures for the whole sample, discounting the uniqueness of each case or patient. Intuitively, we know that all cases are not equal, but the present evaluative approaches do not take case difficulty into account.

Objective: A more case-based, comprehensive approach is warranted to assess supervised ML outcomes and forms the rationale for this study. This study aims to demonstrate how the item response theory (IRT) can be used to stratify the data based on how difficult each case is to classify, independent of the outcome measure of interest (eg, accuracy). This stratification allows the evaluation of ML classifiers to take the form of a distribution rather than a single scalar value.

Methods: Two large, public intensive care unit data sets, Medical Information Mart for Intensive Care III and electronic intensive care unit, were used to showcase this method in predicting mortality. For each data set, a balanced sample (n=8078 and n=21,940, respectively) and an imbalanced sample (n=12,117 and n=32,910, respectively) were drawn. A 2-parameter logistic model was used to provide scores for each case. Several ML algorithms were used in the demonstration to classify cases based on their health-related features: logistic regression, linear discriminant analysis, K-nearest neighbors, decision tree, naive Bayes, and a neural network. Generalized linear mixed model analyses were used to assess the effects of case difficulty strata, ML algorithm, and the interaction between them in predicting accuracy.

Results: The results showed significant effects (P<.001) for case difficulty strata, ML algorithm, and their interaction in predicting accuracy and illustrated that all classifiers performed better with easier-to-classify cases and that overall the neural network performed best. Significant interactions suggest that cases that fall in the most arduous strata should be handled by logistic regression, linear discriminant analysis, decision tree, or neural network but not by naive Bayes or K-nearest neighbors. Conventional metrics for ML classification have been reported for methodological comparison.

Conclusions: This demonstration shows that using the IRT is a viable method for understanding the data that are provided to ML algorithms, independent of outcome measures, and highlights how well classifiers differentiate cases of varying difficulty. This method explains which features are indicative of healthy states and why. It enables end users to tailor the classifier that is appropriate to the difficulty level of the patient for personalized medicine.
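For reference, the 2-parameter logistic (2PL) model mentioned in the Methods has the following standard textbook form. The symbols are the conventional IRT ones and are not taken from the paper; how the study maps cases and classifiers onto items and respondents is not stated in the abstract.

```latex
% Standard 2PL item response function (textbook form, symbols assumed):
%   \theta_j : latent trait of respondent j
%   a_i      : discrimination parameter of item i
%   b_i      : difficulty parameter of item i
P(X_{ij} = 1 \mid \theta_j) = \frac{1}{1 + \exp\bigl(-a_i(\theta_j - b_i)\bigr)}
```

When \theta_j equals b_i the probability is 0.5, and a larger a_i makes the curve steeper around that point, which is what allows the fitted parameters to serve as a per-item difficulty score.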
