Feature engineering coupled machine learning algorithms for epileptic seizure forecasting from intracranial EEGs

2017 ◽  
Author(s):  
Rishav Kumar ◽  
Rishi Raj Singh Jhelumi ◽  
Achintye Madhav Singh ◽  
Prasoon Kumar

Abstract: Epilepsy is one of the major neurological disorders, affecting nearly 1 percent of the global population. The brunt is borne largely by underdeveloped and developing countries, where treatment of epileptic conditions is expensive. Further, the lack of reliable methods for forecasting seizure onset in drug-resistant patients, or in patients who are not candidates for surgery, affects their psychological well-being and restricts their daily activities. Forecasting is usually performed by human experts, which leaves a wide margin for human bias and error. Therefore, in the current work, we evaluated the efficiency of several machine learning algorithms at automatically identifying the preictal patterns that precede epileptic seizures in intracranial EEG signals. The robustness of the machine learning algorithms was tested after the data set was pre-processed using carefully chosen feature engineering strategies, viz. denoised Fourier transforms as well as cross-correlation across electrodes in the time and frequency domains. Extensive experiments were carried out to determine the best combination of feature engineering techniques and machine learning algorithms. The best combination achieved an AUC (area under the receiver operating characteristic curve) of 0.7685 on random test samples. The suggested approach was fairly good at predicting seizures in random samples and can therefore be used for seizure forecasting in patients for whom medication or surgery is ineffective. More broadly, our strategy offers a robust method for forecasting brain disorders from EEGs.
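
As a rough illustration of the kind of pipeline this abstract describes, the sketch below extracts low-frequency Fourier features and inter-electrode cross-correlations from EEG clips and scores a classifier by AUC. It is a minimal sketch, not the authors' implementation: the clip shapes, the number of FFT bins kept, the random-forest classifier, and the synthetic placeholder data are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def fft_features(clip, keep=48):
    # Log-magnitude of the low-frequency FFT bins per electrode;
    # discarding high-frequency bins acts as simple denoising.
    spec = np.abs(np.fft.rfft(clip, axis=1))[:, 1:keep + 1]
    return np.log1p(spec).ravel()

def xcorr_features(clip):
    # Upper triangle of the electrode-by-electrode correlation matrix
    # (time-domain cross-correlation at zero lag).
    c = np.corrcoef(clip)
    return c[np.triu_indices_from(c, k=1)]

def featurize(clip):
    return np.concatenate([fft_features(clip), xcorr_features(clip)])

# clips: (n_clips, n_electrodes, n_samples); labels: 1 = preictal, 0 = interictal.
# Random placeholder data here; a real run would load annotated iEEG clips.
rng = np.random.default_rng(0)
clips = rng.standard_normal((200, 16, 4000))
labels = rng.integers(0, 2, 200)

X = np.stack([featurize(c) for c in clips])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```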

2020 ◽  
Vol 17 (9) ◽  
pp. 4294-4298
Author(s):  
B. R. Sunil Kumar ◽  
B. S. Siddhartha ◽  
S. N. Shwetha ◽  
K. Arpitha

This paper surveys distinct machine learning algorithms and explores their features. The primary advantage of machine learning is that an algorithm can learn what to do from data rather than being explicitly programmed. The paper reviews the concept of machine learning and its algorithms, which can be used in applications such as health care, sentiment analysis, and many more. Programmers are sometimes unsure which algorithm to apply to their application; this paper provides guidance on selecting an algorithm based on how accurately it fits the problem. Given the collected data, an algorithm can be chosen by weighing its pros and cons. A base model is then developed, trained, and tested on the data set; the trained model is ready for prediction and can be deployed where feasible.
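
A minimal sketch of the develop-train-test-deploy workflow the abstract outlines, assuming scikit-learn; the candidate algorithms and the toy dataset are illustrative choices, not ones named by the paper.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Candidate algorithms; pick the one that fits the data most accurately.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=5000),
    "k_nearest_neighbors": KNeighborsClassifier(),
    "random_forest": RandomForestClassifier(random_state=0),
}
scores = {name: cross_val_score(model, X_tr, y_tr, cv=5).mean()
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(scores)

# Develop, train and test the base model; the trained model is then
# ready for prediction and can be deployed.
model = candidates[best].fit(X_tr, y_tr)
print(best, "test accuracy:", model.score(X_te, y_te))
```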


2021 ◽  
Author(s):  
Marc Raphael ◽  
Michael Robitaille ◽  
Jeff Byers ◽  
Joseph Christodoulides

Abstract Machine learning algorithms hold the promise of greatly improving live cell image analysis by (1) analyzing far more imagery than can be achieved by more traditional manual approaches and (2) eliminating the subjective nature of researchers and diagnosticians selecting the cells or cell features to be included in the analyzed data set. Currently, however, even the most sophisticated model-based or machine learning algorithms require user supervision, meaning the subjectivity problem is not removed but rather incorporated into the algorithm's initial training steps and then repeatedly applied to the imagery. To address this roadblock, we have developed a self-supervised machine learning algorithm that recursively trains itself directly on the live cell imagery data, thus providing objective segmentation and quantification. The approach incorporates an optical flow algorithm component to self-label cell and background pixels for training, followed by the extraction of additional feature vectors for the automated generation of a cell/background classification model. Because it is self-trained, the software has no user-adjustable parameters and does not require curated training imagery. The algorithm was applied to automatically segment cells from their background for a variety of cell types and five commonly used imaging modalities: fluorescence, phase contrast, differential interference contrast (DIC), transmitted light, and interference reflection microscopy (IRM). The approach is broadly applicable in that it enables completely automated cell segmentation for long-term live cell phenotyping applications, regardless of the input imagery's optical modality, magnification or cell type.
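
The sketch below illustrates the general self-labeling idea described here: dense optical flow between consecutive frames provisionally labels moving pixels as cell and static pixels as background, and a pixel classifier is trained on those self-labels. It assumes OpenCV and scikit-learn, uses synthetic frames, and is only a loose approximation of the authors' recursive, parameter-free pipeline; the thresholds and features are hypothetical.

```python
import cv2
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def self_labels(prev_frame, next_frame, hi=0.75, lo=0.25):
    # Dense optical flow between consecutive frames; pixels that move are
    # provisionally self-labeled "cell", clearly static pixels "background".
    flow = cv2.calcOpticalFlowFarneback(prev_frame, next_frame, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2)
    cell = mag > np.quantile(mag, hi)
    background = mag < np.quantile(mag, lo)
    return cell, background

def pixel_features(frame):
    # Per-pixel feature vectors: raw intensity, local mean, gradient magnitude.
    blur = cv2.GaussianBlur(frame, (7, 7), 0)
    gx = cv2.Sobel(frame, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(frame, cv2.CV_32F, 0, 1)
    return np.stack([frame, blur, np.hypot(gx, gy)], axis=-1).reshape(-1, 3)

# Two consecutive frames of live-cell imagery (synthetic placeholders here).
rng = np.random.default_rng(0)
f0 = (rng.random((256, 256)) * 255).astype(np.uint8)
f1 = np.roll(f0, 2, axis=1)  # crude stand-in for cell motion between frames

cell, background = self_labels(f0, f1)
keep = cell.ravel() | background.ravel()   # only confidently self-labeled pixels
X = pixel_features(f1)
y = cell.ravel().astype(int)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X[keep], y[keep])
mask = clf.predict(X).reshape(f1.shape)    # cell/background segmentation
```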


2021 ◽  
Author(s):  
Omar Alfarisi ◽  
Zeyar Aung ◽  
Mohamed Sassi

Choosing the optimal machine learning algorithm for a given problem is not easy. To help future researchers, this paper describes how we identified the best-performing algorithm among a set of candidates. We built a synthetic data set and performed supervised machine learning runs with five different algorithms. For predicting heterogeneity, we identified Random Forest as the best of these algorithms.
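
A minimal sketch of an experiment in this spirit: a synthetic classification data set and five supervised algorithms compared by cross-validation. The abstract names only Random Forest, so the other four candidates here are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic data set standing in for the heterogeneity-prediction task.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           random_state=0)

algorithms = {
    "random_forest": RandomForestClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "svm": SVC(),
    "knn": KNeighborsClassifier(),
    "logistic_regression": LogisticRegression(max_iter=5000),
}
for name, model in algorithms.items():
    print(f"{name}: {cross_val_score(model, X, y, cv=5).mean():.3f}")
```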


2022 ◽  
pp. 21-28
Author(s):  
Dijana Oreški

The ability to generate data has never been as powerful as it is today, when some three quintillion bytes of data are generated daily. In the field of machine learning, a large number of algorithms have been developed that can be used for intelligent data analysis and for solving predictive and descriptive problems in different domains. These algorithms perform differently on different problems: if one algorithm works better on one dataset, the same algorithm may work worse on another, because each dataset differs in its local and global characteristics. It is therefore imperative to know the intrinsic behavior of algorithms on different types of datasets and to choose the right algorithm for the problem at hand. To address this, the paper makes a scientific contribution to the meta-learning field by proposing a framework for identifying the specific characteristics of datasets in two social-science domains, education and business, and by developing meta-models based on ranking algorithms, calculating correlations of ranks, developing a multi-criteria model, a two-component index, and prediction based on machine learning algorithms. Each of the meta-models serves as the basis for a version of an intelligent system. Applying such a framework should include a comparative analysis of a large number of machine learning algorithms on a large number of datasets from the social sciences.
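
To make the ranking-and-rank-correlation step concrete, the sketch below ranks a few algorithms on several datasets and computes Spearman correlations between the per-dataset rankings; a high correlation suggests datasets that share characteristics, so the same algorithm choice transfers. The accuracy numbers and algorithm names are hypothetical, and this is only one ingredient of the proposed framework.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical accuracy of four algorithms (rows) on five datasets (columns),
# e.g. from cross-validated runs on education and business datasets.
algorithms = ["decision_tree", "neural_net", "svm", "naive_bayes"]
accuracy = np.array([
    [0.81, 0.74, 0.90, 0.66, 0.71],
    [0.85, 0.70, 0.88, 0.72, 0.69],
    [0.83, 0.77, 0.85, 0.75, 0.70],
    [0.78, 0.72, 0.80, 0.64, 0.73],
])

# Rank the algorithms on each dataset (rank 1 = best accuracy).
ranks = accuracy.shape[0] - accuracy.argsort(axis=0).argsort(axis=0)

# Rank correlation between pairs of datasets.
for i in range(ranks.shape[1]):
    for j in range(i + 1, ranks.shape[1]):
        rho, _ = spearmanr(ranks[:, i], ranks[:, j])
        print(f"datasets {i} vs {j}: Spearman rho = {rho:.2f}")
```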


2021 ◽  
Author(s):  
Aayushi Rathore ◽  
Anu Saini ◽  
Navjot Kaur ◽  
Aparna Singh ◽  
Ojasvi Dutta ◽  
...  

Abstract: Sepsis is a severe infectious disease with high mortality. It occurs when chemicals released into the bloodstream to fight an infection trigger inflammation throughout the body, which can cause a cascade of changes that damage multiple organ systems and lead them to fail, even resulting in death. To reduce the possibility of sepsis or infection, antiseptics are used; the process is known as antisepsis. Antiseptic peptides (ASPs) show properties similar to anti-gram-negative peptides, anti-gram-positive peptides, and many more. Machine learning algorithms are useful for screening and identifying therapeutic peptides, providing an initial filter, or building confidence, before time-consuming and laborious experimental approaches are used. In this study, various machine learning algorithms, including Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbour (KNN), and Logistic Regression (LR), were evaluated for the prediction of ASPs. Moreover, the characteristic physicochemical features of ASPs were explored for use in machine learning. Both manual and automatic feature selection methodologies were employed to achieve the best performance of the machine learning algorithms. Five-fold cross-validation and independent data set validation showed RF to be the best model for the prediction of ASPs. Our RF model achieved an accuracy of 97% and a Matthews correlation coefficient (MCC) of 0.93, indicating a robust and good model. To our knowledge, this is the first attempt to build a machine learning classifier for the prediction of ASPs.
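
A minimal sketch, assuming scikit-learn, of the evaluation style described: simple physicochemical descriptors (here just amino-acid composition and length, a small subset of what such studies use) feed a Random Forest assessed by 5-fold cross-validation with accuracy and MCC. The peptides and labels are toy placeholders, not the study's data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, matthews_corrcoef
from sklearn.model_selection import cross_val_predict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def features(peptide):
    # Simple physicochemical descriptors: amino-acid composition plus length.
    comp = [peptide.count(a) / len(peptide) for a in AMINO_ACIDS]
    return comp + [len(peptide)]

# Toy peptides and labels (1 = antiseptic peptide); real work would use a
# curated positive/negative dataset.
peptides = ["KWKLFKKIGAVLKVL", "GIGKFLHSAKKFGKAFVGEIMNS",
            "ACDEFGHIK", "LLNQELLLNPTHQIYPVA"] * 25
labels = np.array([1, 1, 0, 0] * 25)

X = np.array([features(p) for p in peptides])
pred = cross_val_predict(RandomForestClassifier(random_state=0), X, labels, cv=5)
print("accuracy:", accuracy_score(labels, pred))
print("MCC:", matthews_corrcoef(labels, pred))
```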


2019 ◽  
Vol 40 (Supplement_1) ◽  
Author(s):  
J.-M Gregoire ◽  
N Subramanian ◽  
D Papazian ◽  
H Bersini

Abstract Background: Forecasting atrial fibrillation (AF) a few minutes before its onset has been studied, mainly on the basis of heart rate variability parameters derived from 24-hour ECG Holter monitorings. However, these studies have shown conflicting, non-clinically applicable results. Machine learning algorithms have since proven their ability to anticipate events, so forecasting AF before its onset should be (re)assessed using machine learning techniques. Reliable forecasting could improve the results of preventive pacing in patients with cardiac electronic implanted devices (CEIDs). Purpose: To forecast an oncoming AF episode in individual patients using machine learning techniques, and to evaluate whether the onset of an AF episode can be forecasted on longer time frames. Methods: The totality of the raw data of a set of 10,484 ECG Holter monitorings was retrospectively analyzed, and all AF episodes were annotated. The onset of each AF episode was determined with a precision of 5 ms. Only AF events lasting longer than 30 seconds were taken into consideration. Of all patients in the dataset, 140 presented paroxysmal AF (286 recorded AF episodes). Only RR intervals were used to predict the presence of AF. Two different types of machine learning algorithm, with different computational power requirements, were developed: a "dynamic" deep recurrent neural network (RNN) and a "static" decision tree with AdaBoost (boosting trees), more suitable for embedded devices. These algorithms were trained on one set of patients (around 90%) and tested on the remaining patients (around 10%). Results: The performance figures are summarized in the table below. Both algorithms can be tuned to increase specificity (at a loss of sensitivity) or vice versa, depending on the objective.

Performance of forecasting algorithms (AUC = area under the ROC curve):

RR intervals before an AF event | Boosting trees AUC | RNN AUC
30–1 | 97.1% | 98.77%
60–31 | 97.5% | 99.1%
90–61 | 96.9% | 99.1%
120–91 | 98.2% | 98.9%

Conclusion: This retrospective study shows that AF can be forecasted at the individual level with high predictive power using machine learning algorithms, with little drop-off in predictive value over the studied distances (1–120 RR intervals before a potential AF episode). We believe that embedding our new algorithm(s) in CEIDs could open the way to innovative therapies that significantly decrease AF burden in selected implanted patients.
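
The sketch below illustrates the "static" branch of the approach in spirit: summary features of RR-interval windows feed an AdaBoost ensemble of decision trees, scored by AUC. The window length, the features, and the synthetic RR data are assumptions; the authors' actual feature set and patient-wise evaluation are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def window_features(rr):
    # Summary features of a 30-beat RR-interval window (heart rate variability).
    diffs = np.diff(rr)
    return [rr.mean(), rr.std(), np.sqrt((diffs ** 2).mean()),  # RMSSD
            (np.abs(diffs) > 0.05).mean()]                      # pNN50-like

# Synthetic stand-in: windows of 30 RR intervals (in seconds); label 1 if the
# window immediately precedes an AF episode, 0 otherwise.
rng = np.random.default_rng(0)
pre_af = rng.normal(0.7, 0.12, (300, 30))   # more irregular rhythm before AF
normal = rng.normal(0.8, 0.04, (300, 30))
X = np.array([window_features(w) for w in np.vstack([pre_af, normal])])
y = np.array([1] * 300 + [0] * 300)

# Patient-wise splits are essential in practice; a plain split is used here.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = AdaBoostClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```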


Author(s):  
John Yearwood ◽  
Adil Bagirov ◽  
Andrei V. Kelarev

The applications of machine learning algorithms to the analysis of data sets of DNA sequences are very important. The present chapter is devoted to the experimental investigation of several machine learning algorithms applied to a JLA data set consisting of DNA sequences derived from non-coding segments in the junction of the large single copy region and inverted repeat A of the chloroplast genome in Eucalyptus, collected by Australian biologists. Data sets of this sort represent a new situation, where sophisticated alignment scores have to be used as a measure of similarity. The alignment scores do not satisfy the properties of the Minkowski metric, and new machine learning approaches have to be investigated. The authors' experiments show that machine learning algorithms based on local alignment scores achieve very good agreement with known biological classes for this data set. A new machine learning algorithm based on graph partitioning performed best for clustering of the JLA data set, and the authors' novel k-committees algorithm produced the most accurate results for classification. Two new examples of synthetic data sets demonstrate that the k-committees algorithm can outperform both the Nearest Neighbour and k-medoids algorithms simultaneously.
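
Because alignment scores are similarities that violate the Minkowski-metric axioms, one workable pattern is to convert them into a dissimilarity matrix and hand that to a nearest-neighbour classifier directly. The sketch below does this with a basic Smith-Waterman local alignment score; the scoring parameters and toy sequences are illustrative, and this is not the chapter's k-committees algorithm.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    # Smith-Waterman local alignment score (a similarity, not a metric).
    H = np.zeros((len(a) + 1, len(b) + 1))
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i, j] = max(0, H[i - 1, j - 1] + s,
                          H[i - 1, j] + gap, H[i, j - 1] + gap)
    return H.max()

# Toy DNA sequences with two classes; the JLA data itself is not public here.
seqs = ["ACGTACGTGG", "ACGTACGTCC", "TTGACCAGTA",
        "TTGACCAGTT", "ACGTACTTGG", "TTGACCGGTA"]
labels = [0, 0, 1, 1, 0, 1]

# Convert pairwise similarity to dissimilarity and give kNN a precomputed
# matrix instead of a vector space.
S = np.array([[local_alignment_score(x, y) for y in seqs] for x in seqs])
D = S.max() - S
knn = KNeighborsClassifier(n_neighbors=3, metric="precomputed").fit(D, labels)
print(knn.predict(D))  # in practice, rows would be test-vs-train distances
```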


2021 ◽  
Vol 15 (1) ◽  
pp. 90-104
Author(s):  
Vibha Patel ◽  
Jaishree Tailor ◽  
Amit Ganatra

Objective: Epilepsy is a chronic disease that requires exceptional attention; the unpredictability of seizures makes it worse for a person suffering from epilepsy. Methods: Meeting the challenge of predicting seizures with modern machine learning algorithms and computing resources would be a boon to people with epilepsy and their caregivers. Researchers have shown great interest in the task of epileptic seizure prediction for a few decades. However, the results obtained so far lack clinical applicability because of high false-positive ratios. The lack of standard practices in the field of epileptic seizure prediction makes it challenging for newcomers to follow the research, and the chances of reproducing a result are negligible due to the unavailability of implementation-environment details, standard datasets, and evaluation parameters. Results: This work presents the essential components required for the prediction of epileptic seizures, including the basics of epilepsy, its treatment, and the need for seizure prediction algorithms. It also gives a detailed comparative analysis of the datasets used by different researchers, the tools and technologies used, different machine learning algorithm considerations, and evaluation parameters. Conclusion: The main goal of this paper is to synthesize different methodologies to create a broad view of the state of the art in seizure prediction.


Author(s):  
Supun Nakandala ◽  
Marta M. Jankowska ◽  
Fatima Tuz-Zahra ◽  
John Bellettiere ◽  
Jordan A. Carlson ◽  
...  

Background: Machine learning has been used for classification of physical behavior bouts from hip-worn accelerometers; however, this research has been limited due to the challenges of directly observing and coding human behavior “in the wild.” Deep learning algorithms, such as convolutional neural networks (CNNs), may offer better representation of data than other machine learning algorithms without the need for engineered features and may be better suited to dealing with free-living data. The purpose of this study was to develop a modeling pipeline for evaluation of a CNN model on a free-living data set and compare CNN inputs and results with the commonly used machine learning random forest and logistic regression algorithms. Method: Twenty-eight free-living women wore an ActiGraph GT3X+ accelerometer on their right hip for 7 days. A concurrently worn thigh-mounted activPAL device captured ground truth activity labels. The authors evaluated logistic regression, random forest, and CNN models for classifying sitting, standing, and stepping bouts. The authors also assessed the benefit of performing feature engineering for this task. Results: The CNN classifier performed best (average balanced accuracy for bout classification of sitting, standing, and stepping was 84%) compared with the other methods (56% for logistic regression and 76% for random forest), even without performing any feature engineering. Conclusion: Using the recent advancements in deep neural networks, the authors showed that a CNN model can outperform other methods even without feature engineering. This has important implications for both the model’s ability to deal with the complexity of free-living data and its potential transferability to new populations.
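
A minimal PyTorch sketch of the core idea: a small 1D CNN classifies raw tri-axial accelerometer windows into sitting/standing/stepping without engineered features, scored by balanced accuracy. The architecture, window length, and synthetic data are assumptions, not the authors' model.

```python
import numpy as np
import torch
from sklearn.metrics import balanced_accuracy_score
from torch import nn

# Windows of tri-axial hip accelerometer data: (batch, 3 axes, 300 samples),
# classified as sitting / standing / stepping. Data here is synthetic.
rng = np.random.default_rng(0)
X = torch.tensor(rng.standard_normal((512, 3, 300)), dtype=torch.float32)
y = torch.tensor(rng.integers(0, 3, 512))

model = nn.Sequential(                     # small 1D CNN over raw windows
    nn.Conv1d(3, 16, kernel_size=9), nn.ReLU(), nn.MaxPool1d(4),
    nn.Conv1d(16, 32, kernel_size=9), nn.ReLU(), nn.AdaptiveAvgPool1d(1),
    nn.Flatten(), nn.Linear(32, 3),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):                     # minimal full-batch training loop
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

with torch.no_grad():
    pred = model(X).argmax(dim=1)
print("balanced accuracy:", balanced_accuracy_score(y.numpy(), pred.numpy()))
```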

