A Smart Device for the Prediction of Epileptic Seizures using Machine Learning Algorithms

More than 65 million people live with epilepsy. The unpredictable nature of epileptic seizures drastically increases the risk of injury, especially during daily activities such as walking or driving. The purpose of this project is to develop an accurate device that uses raw EEG data to predict epileptic seizures and alert patients to an oncoming seizure in time to escape dangerous situations. Features were extracted from the raw EEG data by applying the Fast Fourier Transform and computing the average power spectral density of different brain waves. These features were used as the input dataset for the machine learning algorithms. Each model was tested on new, unseen data using metrics such as accuracy, precision, recall, and F1 score. The highest-performing algorithm, Random Forest (RF), produced a prediction accuracy of 99.0% and a precision of 99.3%. Channel importance was calculated for the RF algorithm, and this analysis reduced the number of channels from 22 to 7 without a significant loss in the performance metrics. Using the RF algorithm, an embedded program was developed to run on a portable, low-power hardware device to predict the onset of a seizure. The hardware includes a BeagleBone Black microcontroller running open-source software and a Bluetooth transmitter-receiver that transmits the prediction to smartphones. Reducing the number of EEG channels to 7 makes the system more convenient for a future wearable device. Hardware that can predict epileptic seizures can save many patients from potentially dangerous situations such as driving or swimming, and it can improve their quality of life by removing uncertainty from their daily lives.
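
The feature-extraction step described above (a Fourier transform followed by the average spectral power per brain-wave band) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the band edges in `BANDS`, the power scaling, the sampling rate, and the synthetic test signal are all assumptions, and a naive DFT stands in for the FFT (it yields the same spectrum, only more slowly).

```python
import cmath
import math

# Conventional EEG band edges in Hz (an assumption; the abstract does not list them).
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def power_spectrum(signal, fs):
    """Return (frequencies, power) of the one-sided spectrum via a naive DFT."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)  # illustrative scaling, not a calibrated PSD
    return freqs, power


def band_features(signal, fs):
    """Average spectral power inside each band: one feature per band."""
    freqs, power = power_spectrum(signal, fs)
    features = {}
    for name, (low, high) in BANDS.items():
        values = [p for f, p in zip(freqs, power) if low <= f < high]
        features[name] = sum(values) / len(values) if values else 0.0
    return features


# Synthetic one-second "channel": a pure 10 Hz sine sampled at 128 Hz.
fs = 128
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
features = band_features(signal, fs)
dominant = max(features, key=features.get)  # band holding the most average power
```

Applied per channel, a function like `band_features` would yield one row of inputs for a classifier; the 10 Hz test tone lands in the alpha band, as expected.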

Identifying the Main Risk Factors for Cardiovascular Diseases Prediction Using Machine Learning Algorithms

Mathematics ◽

10.3390/math9202537 ◽

2021 ◽

Vol 9 (20) ◽

pp. 2537

Author(s):

Luis Rolando Guarneros-Nolasco ◽

Nancy Aracely Cruz-Ramos ◽

Giner Alor-Hernández ◽

Lisbeth Rodríguez-Mazahua ◽

José Luis Sánchez-Cervantes

Keyword(s):

Machine Learning ◽

Cardiovascular Diseases ◽

Performance Metrics ◽

Learning Algorithms ◽

Predictive Performance ◽

Machine Learning Algorithms ◽

Algorithm Performance ◽

Body Regions ◽

Risks Factors ◽

Fold Cross Validation

Cardiovascular diseases (CVDs) are a leading cause of death globally. In CVDs, the heart is unable to deliver enough blood to other body regions. Because effective and accurate diagnosis of CVDs is essential for CVD prevention and treatment, machine learning (ML) techniques can be effectively and reliably used to distinguish patients suffering from a CVD from those without any heart condition. Namely, machine learning algorithms (MLAs) play a key role in the diagnosis of CVDs through predictive models that allow us to identify the main risk factors influencing CVD development. In this study, we analyze the performance of ten MLAs on two datasets for CVD prediction and two for CVD diagnosis. Algorithm performance is analyzed on the top-two and top-four dataset attributes/features with respect to five performance metrics (accuracy, precision, recall, F1-score, and ROC-AUC) using the train-test split technique and k-fold cross-validation. Our study identifies the top-two and top-four attributes in the CVD datasets by analyzing the accuracy metric, in order to determine which attributes are best for predicting and diagnosing CVD. As our main findings, the ten ML classifiers exhibited appropriate classification and predictive performance in terms of accuracy with the top-two attributes, and three main attributes were identified for the diagnosis and prediction of CVDs such as arrhythmia and tachycardia; hence, the classifiers can be successfully implemented to improve current CVD diagnosis efforts and help patients around the world, especially in regions where medical staff is lacking.
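
As a rough illustration of the k-fold cross-validation mentioned above, the sketch below partitions sample indices into k folds so that every sample is used for testing exactly once. The function name `k_fold_indices` and the sample counts are hypothetical, and no shuffling or stratification is applied here, although real evaluations usually add both.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample lands in exactly one test fold."""
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Ten samples split into three folds of sizes 4, 3, and 3.
folds = list(k_fold_indices(10, 3))
test_sizes = [len(test_idx) for _, test_idx in folds]
```

A model would be trained on each `train_idx` and scored on the matching `test_idx`, with the per-fold scores averaged at the end.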

Haar Wavelet Pyramid-Based Melanoma Skin Cancer Identification With Ensemble of Machine Learning Algorithms

International Journal of Healthcare Information Systems and Informatics ◽

10.4018/ijhisi.20211001.oa24 ◽

2021 ◽

Vol 16 (4) ◽

pp. 1-15

Author(s):

Sudeep D. Thepade ◽

Gaurav Ramnani

Keyword(s):

Machine Learning ◽

Skin Cancer ◽

Health Informatics ◽

Performance Metrics ◽

Learning Algorithms ◽

Haar Wavelet ◽

Machine Learning Algorithms ◽

Computer Assisted ◽

Marginal Improvement ◽

Wavelet Pyramid

Melanoma is a deadly type of skin cancer. Early detection of melanoma significantly improves the patient’s chances of survival. Detecting melanoma at an early stage demands expert doctors, and the scarcity of such experts is a major issue for healthcare systems globally. Computer-assisted diagnostics may prove helpful in this case. This paper proposes a health informatics system for melanoma identification using machine learning with dermoscopy skin images. In the proposed method, features of the dermoscopy skin images are extracted at various levels of the Haar wavelet pyramid. These features are used to train machine learning algorithms and ensembles for melanoma identification. Considering higher levels of the Haar wavelet pyramid helps speed up the identification process. Performance gradually improves from Haar wavelet pyramid level 4x4 to 16x16 and shows only marginal improvement beyond that. The ensembles of machine learning algorithms show a boost in performance metrics compared with individual machine learning algorithms.
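
One plausible reading of the Haar wavelet pyramid features is sketched below: each level keeps only the low-frequency (approximation) band, obtained here by averaging 2x2 pixel blocks, until a small grid remains that can be flattened into a feature vector. The averaging normalization, the function names, and the toy 8x8 image are assumptions; the paper's exact procedure is not given in the abstract.

```python
def haar_approximation(image):
    """One Haar level: keep the low-low band by averaging each 2x2 block."""
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def haar_pyramid(image, min_size):
    """Return approximations from the input down to min_size x min_size.

    Assumes a square image whose side is a power of two.
    """
    levels = [image]
    while len(levels[-1]) > min_size:
        levels.append(haar_approximation(levels[-1]))
    return levels


# Toy 8x8 grayscale "image" with pixel values 0..63.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = haar_pyramid(image, 2)
sizes = [len(level) for level in pyramid]
feature_vector = [value for row in pyramid[-1] for value in row]
```

Stopping the pyramid at a coarser or finer grid trades feature-vector length against detail, which fits the abstract's observation that smaller grids speed up identification.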

Automated Performance Metrics and Machine Learning Algorithms to Measure Surgeon Performance and Anticipate Clinical Outcomes in Robotic Surgery

JAMA Surgery ◽

10.1001/jamasurg.2018.1512 ◽

2018 ◽

Vol 153 (8) ◽

pp. 770 ◽

Cited By ~ 27

Author(s):

Andrew J. Hung ◽

Jian Chen ◽

Inderbir S. Gill

Keyword(s):

Machine Learning ◽

Robotic Surgery ◽

Clinical Outcomes ◽

Performance Metrics ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Surgeon Performance

Systematic analysis of machine learning algorithms on EEG data for brain state intelligence

2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm.2015.7359788 ◽

2015 ◽

Cited By ~ 7

Author(s):

Alexander Chan ◽

Christopher E. Early ◽

Sishir Subedi ◽

Yuezhe Li ◽

Hong Lin

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Brain State ◽

Systematic Analysis ◽

Eeg Data

Towards Predicting Student’s Dropout in University Courses Using Different Machine Learning Techniques

Applied Sciences ◽

10.3390/app11073130 ◽

2021 ◽

Vol 11 (7) ◽

pp. 3130

Author(s):

Janka Kabathova ◽

Martin Drlik

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Machine Learning Algorithms ◽

Classification Model ◽

Machine Learning Techniques ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Unseen Data ◽

E Learning ◽

The Impact

Early and precise prediction of student dropout based on available educational data is a widespread research topic in the learning analytics field. Despite the amount of research already carried out, progress is not significant, and this holds at all educational data levels. Even though various features have already been researched, it remains an open question which features are appropriate for different machine learning classifiers applied to the typically scarce set of educational data at the e-learning course level. Therefore, the main goals of this research are to emphasize the importance of the data understanding and data gathering phases, stress the limitations of the available educational datasets, compare the performance of several machine learning classifiers, and show that even a limited set of features available to teachers in an e-learning course can predict student dropout with sufficient accuracy if the performance metrics are thoroughly considered. Data collected over four academic years were analyzed. The features selected in this study proved to be applicable in predicting course completers and non-completers. The prediction accuracy varied between 77% and 93% on unseen data from the next academic year. In addition to the frequently used performance metrics, the homogeneity of the machine learning classifiers was compared to offset the impact of the limited dataset size on the high values obtained for the performance metrics. The results showed that several machine learning algorithms could be successfully applied to a scarce educational dataset. At the same time, classification performance metrics should be thoroughly considered before deciding to deploy the best-performing classification model to predict potential dropout cases and design beneficial intervention mechanisms.

A multipurpose machine learning approach to predict COVID-19 negative prognosis in São Paulo, Brazil

Scientific Reports ◽

10.1038/s41598-021-82885-y ◽

2021 ◽

Vol 11 (1) ◽

Cited By ~ 2

Author(s):

Fernando Timoteo Fernandes ◽

Tiago Almeida de Oliveira ◽

Cristiane Esteves Teixeira ◽

Andre Filipe de Moraes Batista ◽

Gabriel Dalla Costa ◽

...

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Sao Paulo ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

São Paulo ◽

C Reactive Protein ◽

Reactive Protein ◽

Unseen Data ◽

Extreme Gradient Boosting

The new coronavirus disease (COVID-19) is a challenge for clinical decision-making and the effective allocation of healthcare resources. An accurate prognostic assessment is necessary to improve survival of patients, especially in developing countries. This study proposes to predict the risk of developing critical conditions in COVID-19 patients by training multipurpose algorithms. We followed a total of 1040 patients with a positive RT-PCR diagnosis for COVID-19 from a large hospital from São Paulo, Brazil, from March to June 2020, of which 288 (28%) presented a severe prognosis, i.e. Intensive Care Unit (ICU) admission, use of mechanical ventilation or death. We used routinely-collected laboratory, clinical and demographic data to train five machine learning algorithms (artificial neural networks, extra trees, random forests, catboost, and extreme gradient boosting). We used a random sample of 70% of patients to train the algorithms and 30% were left for performance assessment, simulating new unseen data. In order to assess if the algorithms could capture general severe prognostic patterns, each model was trained by combining two out of three outcomes to predict the other. All algorithms presented very high predictive performance (average AUROC of 0.92, sensitivity of 0.92, and specificity of 0.82). The three most important variables for the multipurpose algorithms were ratio of lymphocyte per C-reactive protein, C-reactive protein and Braden Scale. The results highlight the possibility that machine learning algorithms are able to predict unspecific negative COVID-19 outcomes from routinely-collected data.
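
The AUROC figure quoted above has a direct probabilistic reading: it is the chance that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the labels and scores are made-up illustrative values, not data from the study.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out patients: 1 = severe prognosis, 0 = not severe.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
area = auroc(labels, scores)  # 11 of the 12 positive/negative pairs are ranked correctly
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; library implementations typically use a sort-based computation instead.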

Performance Evaluation of Different Machine Learning Classification Algorithms for Disease Diagnosis

International Journal of E-Health and Medical Communications ◽

10.4018/ijehmc.20211101.oa5 ◽

2021 ◽

Vol 12 (6) ◽

pp. 1-28

Author(s):

Munder Abdulatef Al-Hashem ◽

Ali Mohammad Alqudah ◽

Qasem Qananwah

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Performance Metrics ◽

Confusion Matrix ◽

Learning Algorithms ◽

Disease Diagnosis ◽

Machine Learning Algorithms ◽

Classification Algorithms ◽

K Nearest Neighbor ◽

Machine Learning Classification

Knowledge extraction in the healthcare field is a very challenging task because of problems such as noise and imbalanced datasets. Such datasets are obtained from clinical studies, where uncertainty and variability are common. Lately, a large number of machine learning algorithms have been considered and evaluated to check their validity for use in the medical field. Usually, the classification algorithms are compared against medical experts who specialize in diagnosing certain diseases, and performance metrics provide an effective methodological evaluation of the classifiers. The performance metrics include accuracy, sensitivity, and specificity, which are derived from the confusion matrix of each algorithm. We used eight different well-known machine learning algorithms and evaluated their performance on six different medical datasets. Based on the experimental results, we conclude that the XGBoost and K-Nearest Neighbor classifiers performed best overall across the datasets used and can be used for diagnosing various diseases.
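
To make the link between the confusion matrix and the named metrics concrete, here is a small sketch for a binary classifier. The label vectors are invented for illustration, and the function names are hypothetical rather than taken from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means 'disease present'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Ten hypothetical patients: four with the disease, six without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = summarize(y_true, y_pred)
```

On imbalanced datasets, which the abstract highlights, sensitivity and specificity are worth reporting alongside accuracy because accuracy alone can look high while one class is mostly misclassified.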

Classifying lymphoma and tuberculosis case reports using machine learning algorithms

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v10i5.3132 ◽

2021 ◽

Vol 10 (5) ◽

pp. 2857-2865

Author(s):

Moanda Diana Pholo ◽

Yskandar Hamam ◽

Abdel Baset Khalaf ◽

Chunling Du

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Performance Metrics ◽

Case Reports ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Tuberculosis Case ◽

Starting Point

Available literature reports several lymphoma cases misdiagnosed as tuberculosis, especially in countries with a heavy TB burden. This frequent misdiagnosis is due to the fact that the two diseases can present with similar symptoms. The present study therefore aims to analyse and explore TB as well as lymphoma case reports using Natural Language Processing tools and evaluate the use of machine learning to differentiate between the two diseases. As a starting point in the study, case reports were collected for each disease using web scraping. Natural language processing tools and text clustering were then used to explore the created dataset. Finally, six machine learning algorithms were trained and tested on the collected data, which contained 765 lymphoma and 546 tuberculosis case reports. Each method was evaluated using various performance metrics. The results indicated that the multi-layer perceptron model achieved the best accuracy (93.1%), recall (91.9%) and precision score (93.7%), thus outperforming other algorithms in terms of correctly classifying the different case reports.

Identifying the Main Risk Factors for CVD Prediction Using Machine Learning Algorithms

10.20944/preprints202108.0471.v1 ◽

2021 ◽

Author(s):

Luis Rolando Guarneros-Nolasco ◽

Nancy Aracely Cruz-Ramos ◽

Giner Alor-Hernández ◽

Lisbeth Rodríguez-Mazahua ◽

José Luis Sánchez-Cervantes

Keyword(s):

Machine Learning ◽

Cross Validation ◽

Performance Metrics ◽

Learning Algorithms ◽

Predictive Performance ◽

Machine Learning Algorithms ◽

Algorithm Performance ◽

Body Regions ◽

Risks Factors ◽

Fold Cross Validation

CVDs are a leading cause of death globally. In CVDs, the heart is unable to deliver enough blood to other body regions. Since effective and accurate diagnosis of CVDs is essential for CVD prevention and treatment, machine learning (ML) techniques can be effectively and reliably used to distinguish patients suffering from a CVD from those without any heart condition. Namely, machine learning algorithms (MLAs) play a key role in the diagnosis of CVDs through predictive models that allow us to identify the main risk factors influencing CVD development. In this study, we analyze the performance of ten MLAs on two datasets for CVD prediction and two for CVD diagnosis. Algorithm performance is analyzed on the top-two and top-four dataset attributes/features with respect to five performance metrics (accuracy, precision, recall, F1-score, and ROC-AUC) using the train-test split technique and k-fold cross-validation. Our study identifies the top-two and top-four attributes from each CVD diagnosis/prediction dataset. As our main findings, the ten MLAs exhibited appropriate diagnosis and predictive performance; hence, they can be successfully implemented to improve current CVD diagnosis efforts and help patients around the world, especially in regions where medical staff is lacking.

Developing machine learning algorithms for meteorological temperature and humidity forecasting at Terengganu state in Malaysia

Scientific Reports ◽

10.1038/s41598-021-96872-w ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Marwah Sattar Hanoon ◽

Ali Najah Ahmed ◽

Nur’atiah Zaini ◽

Arif Razzaq ◽

Pavitra Kumar ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Artificial Neural Network ◽

Air Temperature ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Unseen Data ◽

Temperature And Humidity ◽

Artificial Neural ◽

Artificial Neural Network Ann

Accurately predicting meteorological parameters such as air temperature and humidity plays a crucial role in air quality management. This study proposes different machine learning algorithms, namely Gradient Boosting Tree (G.B.T.), Random Forest (R.F.), Linear Regression (LR), and different artificial neural network (ANN) architectures (multi-layered perceptron and radial basis function), for the prediction of air temperature (T) and relative humidity (Rh). Daily data over 24 years for the Kuala Terengganu station were obtained from the Malaysia Meteorological Department. Results showed that MLP-NN performed well relative to the other models in predicting daily T and Rh, with R of 0.7132 and 0.633, respectively. In monthly prediction of T, the MLP-NN model also provided a standard deviation closer to the actual value and can be used to predict monthly T with R of 0.8462. In monthly prediction of Rh, by contrast, the RBF-NN model's efficiency was higher than that of the other models, with R of 0.7113. To validate the performance of the two trained ANN architectures, MLP-NN and RBF-NN, both were applied to an unseen data set of observations from the region. The results indicated that either ANN architecture has good potential to predict daily and monthly T and Rh values within an acceptable range of accuracy.
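
The R values reported above are presumably Pearson correlation coefficients between observed and predicted series; that reading is an assumption, since the abstract does not define R. Under that assumption, the score can be computed as in the sketch below, where the temperature values are invented for illustration.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical daily air temperatures in degrees Celsius.
observed = [26.0, 27.5, 28.0, 29.5, 30.0]
predicted = [26.5, 27.0, 28.5, 29.0, 30.5]
r = pearson_r(observed, predicted)
```

Note that R measures how well predictions track the trend of the observations, not how large the errors are, so it is usually reported together with an error measure.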
