Towards Interpretable Machine Learning in EEG Analysis

2021 ◽  
Author(s):  
Maged Mortaga ◽  
Alexander Brenner ◽  
Ekaterina Kutafina

In this paper a machine learning model for automatic detection of abnormalities in electroencephalography (EEG) is dissected into parts, so that the influence of each part on the classification accuracy score can be examined. The most successful setup, several shallow artificial neural networks aggregated via voting, achieves an accuracy of 81%. Stepwise simplification of the model shows the expected decrease in accuracy, but a naive model that thresholds a single extracted feature (relative wavelet energy) still achieves 75%, which remains well above the random-guess baseline of 54%. These results suggest that it is feasible to build a simple classification model whose accuracy is close to state-of-the-art research while remaining fully interpretable.
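The naive single-feature model described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sub-band names, the assumption that a larger feature value indicates the abnormal class, and the exhaustive threshold search are all assumptions made for the example.

```python
def relative_energy(band_energies, band):
    """Share of the total energy carried by one sub-band (0.0 if the total is zero)."""
    total = sum(band_energies.values())
    return band_energies[band] / total if total > 0 else 0.0


def classify(value, threshold):
    """Assumption: values above the threshold are labelled abnormal (1), others normal (0)."""
    return 1 if value > threshold else 0


def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def best_threshold(values, labels):
    """Try every observed feature value as a threshold and keep the most accurate one."""
    best_t, best_acc = None, -1.0
    for t in sorted(set(values)):
        acc = accuracy(labels, [classify(v, t) for v in values])
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc
```

Because the whole model is one number compared against one threshold, the decision for any recording can be read off directly, which is the sense in which such a model is fully interpretable.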

As Artificial Intelligence penetrates all aspects of human life, more and more questions about ethical practices and fair use arise, which has motivated the research community to look inside Artificial Intelligence/Machine Learning models and develop methods to interpret them. Interpretability can not only help with the ethical questions but also provide insight into the workings of these machine learning models, which will become crucial for building trust and understanding how a model makes decisions. Furthermore, in many machine learning applications, interpretability is the primary value that they offer. In practice, however, many developers select models based on the accuracy score and disregard the level of interpretability of the model, which can be problematic because the predictions of many high-accuracy models are not easily explainable. In this paper, we introduce the concept of Machine Learning Model Interpretability, Interpretable Machine Learning, and the methods used for interpretation and explanation.


2020 ◽  
pp. 1-2
Author(s):  
Zhang- sensen

Mild cognitive impairment (MCI) is a condition between healthy aging and Alzheimer's disease (AD). At present, brain network analysis based on machine learning methods can help diagnose MCI. In this paper, the brain network is divided into several subnets based on the shortest path, and the feature vectors of each subnet are extracted and classified. To make full use of the subnet information, this paper adopts an integrated classification model. Each base classification model predicts the class of one subnet, and the classification results of all subnets are combined to give the classification result of the brain network. To verify the effectiveness of this method, a brain network of 66 people was constructed and a comparative experiment was carried out. The experimental results show that the classification accuracy of the integrated classification model proposed in this paper is 19% higher than that of SVM, which effectively improves the classification accuracy.
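The integration step can be illustrated with a short sketch. The abstract does not state how the subnet results are combined, so majority voting is assumed here; the labels "MCI" and "HC" and the callable base models are hypothetical stand-ins.

```python
from collections import Counter


def vote(subnet_predictions):
    """Return the most frequent label; ties go to the alphabetically smallest label."""
    counts = Counter(subnet_predictions)
    top = max(counts.values())
    return min(label for label, count in counts.items() if count == top)


def classify_network(subnet_features, base_models):
    """Apply one base model per subnet, then combine the subnet labels by voting."""
    predictions = [
        model(features) for model, features in zip(base_models, subnet_features)
    ]
    return vote(predictions)
```

Each base model only ever sees the feature vector of its own subnet, so the final label reflects agreement across subnets rather than a single global classifier.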


An individual's lifestyle directly influences their overall health. Stress is a significant contributor to conditions such as depression, heart attack, and mental illness. The WHO says “Globally, more than 264 million people of all ages suffer from depression.” [8]. The report also says that most of the time people are stressed because of their work, and that 10.7% of people suffer from stress, anxiety, or depression disorders [8]. There are different ways to detect stress, for example smart watches, chest belts, and specialized equipment. Our principal objective is to detect stress in real time using the sensors of smart watches. Different kinds of sensor data can be used to detect stress, such as PPG, GSR, HRV, ECG, and temperature, and smart watches collect a wide range of data through their various sensors. The gathered data are fed to several machine learning methods, such as linear regression, SVM, KNN, and decision trees. These techniques differ in accuracy, so we compare them and choose the best machine learning model. This paper presents several analyses that compare the accuracy obtained from the data of different sensors. It also examines whether a single sensor or a combination of sensors, such as HRV, ECG, GSR, and PPG, yields the better accuracy score for stress detection.


Author(s):  
Xianping Du ◽  
Onur Bilgen ◽  
Hongyi Xu

Machine learning for classification has been used widely in engineering design, for example, in feasible domain recognition and hidden pattern discovery. Training an accurate machine learning model requires a large dataset; however, high computational or experimental costs are major obstacles to obtaining a large dataset for real-world problems. One possible solution is to generate a large pseudo dataset with surrogate models, which are established with a smaller set of real training data. However, it is not well understood whether the pseudo dataset benefits the classification model by providing more information or degrades the machine learning performance because of the prediction errors and uncertainties introduced by the surrogate model. This paper presents a preliminary investigation of this research question. A classification-and-regression-tree model is employed to recognize the design subspaces to support design decision-making. It is implemented on the geometric design of a vehicle energy-absorbing structure based on finite element simulations. Based on a small set of real-world data obtained by simulations, a surrogate model based on Gaussian process regression is employed to generate pseudo datasets for training. The results showed that the tree-based method could help recognize feasible design domains efficiently. Furthermore, the additional information provided by the surrogate model enhances the accuracy of classification. One important conclusion is that the accuracy of the surrogate model determines the quality of the pseudo dataset and, hence, the improvements in the machine learning model.


2020 ◽  
Author(s):  
Charalambos Themistocleous ◽  
Bronte Ficek ◽  
Kimberly Webster ◽  
Dirk-Bart den Ouden ◽  
Argye E. Hillis ◽  
...  

Background: The classification of patients with Primary Progressive Aphasia (PPA) into variants is time-consuming, costly, and requires combined expertise by clinical neurologists, neuropsychologists, speech pathologists, and radiologists. Objective: The aim of the present study is to determine whether acoustic and linguistic variables provide accurate classification of PPA patients into one of three variants: nonfluent PPA, semantic PPA, and logopenic PPA. Methods: In this paper, we present a machine learning model based on Deep Neural Networks (DNN) for the subtyping of patients with PPA into three main variants, using combined acoustic and linguistic information elicited automatically via acoustic and linguistic analysis. The performance of the DNN was compared to the classification accuracy of Random Forests, Support Vector Machines, and Decision Trees, as well as expert clinicians’ classifications. Results: The DNN model outperformed the other machine learning models with 80% classification accuracy, providing reliable subtyping of patients with PPA into variants, and it even outperformed auditory classification of patients into variants by clinicians. Conclusions: We show that the combined speech and language markers from connected speech productions provide information about symptoms and variant subtyping in PPA. The end-to-end automated machine learning approach we present can enable clinicians and researchers to provide an easy, quick, and inexpensive classification of patients with PPA.


2022 ◽  
Vol 9 (1) ◽  
pp. 0-0

This article investigates the impact of data-complexity and team-specific characteristics on machine learning competition scores. Data from five real-world binary classification competitions hosted on Kaggle.com were analyzed. The data-complexity characteristics were measured in four aspects: standard measures, sparsity measures, class imbalance measures, and feature-based measures. The results showed that the higher the level of the data-complexity characteristics, the lower the predictive ability of the machine learning model. Our empirical evidence revealed that the imbalance ratio of the target variable was the most important factor and exhibited a nonlinear relationship with the model’s predictive ability: the imbalance ratio adversely affected predictive performance once it reached a certain level. However, mixed results were found for the impact on team performance of team-specific characteristics, measured by team size, team expertise, and the number of submissions. For high-performing teams, these factors had no impact on team score.
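For readers unfamiliar with the term, one common definition of the imbalance ratio is the size of the largest class divided by the size of the smallest. The article does not state which definition it uses, so the sketch below is an assumption for illustration only.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (1.0 means balanced)."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes to compute an imbalance ratio")
    return max(counts.values()) / min(counts.values())
```

For a binary target with 90 negative and 10 positive examples, this definition gives a ratio of 9.0.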


Author(s):  
Tsehay Admassu Assegie

Machine learning approaches have become widely applicable in disease diagnosis and prediction because of the accuracy and precision of machine learning models in disease prediction. However, different machine learning models achieve different accuracy and precision. Selecting the model that yields the best accuracy and precision in disease prediction is an open research problem. In this study, we propose a machine learning model for liver disease prediction using the Support Vector Machine (SVM) and K-Nearest Neighbors (KNN) learning algorithms, and we evaluate the accuracy and precision of the models on the Indian liver disease data repository. The analysis of the results showed 82.90% accuracy for SVM and 72.64% accuracy for the KNN algorithm. Based on the accuracy scores on the experimental test results, SVM performs better than the KNN algorithm on liver disease prediction.
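The two metrics compared in this abstract can be computed from the confusion counts of a binary classifier. The sketch below uses the standard definitions; the choice of 1 as the positive (disease) label is an assumption, and no data from the study is reproduced.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred, positive=1):
    """Share of all predictions that are correct."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return (tp + tn) / (tp + fp + tn + fn)


def precision(y_true, y_pred, positive=1):
    """Share of positive predictions that are correct (0.0 if none were made)."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0
```

Accuracy and precision can rank two models differently, which is why reporting both matters when false positives carry a clinical cost.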


Author(s):  
Muhammad Irfan ◽  
Setio Basuki ◽  
Yufis Azhar

The maternal mortality rate (MMR) reported in Indonesia's intercensal population survey (SUPAS) was considered high. For pregnancy risk detection, the public health center (puskesmas) applies a Poedji Rochjati screening card (KSPR) comprising 20 features. In addition to the KSPR, pregnancy risk monitoring is assisted by a pregnancy control card. Because the two cards differ in their number of features, it is necessary to reconcile them. Our objectives are to determine the most influential features, to explore the links among features on the KSPR and pregnancy control cards, and to build a machine learning model for predicting pregnancy risk. For the first objective, we use correlation-based feature selection (CFS) and the C5.0 algorithm. The second objective was addressed by taking the union of the features produced by the two techniques. In the machine learning experiment on these features, the XGBoost algorithm achieved the highest accuracy at 94%, followed by the random forest, Naïve Bayes, and k-nearest neighbor algorithms at 87%, 66%, and 60%, respectively. Interpretability is provided with SHAP and LIME to give more insight into the classification model. In conclusion, the features common to the two interpretation approaches confirmed that Cesar was dominant in determining pregnancy risk.
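The union operation over the two feature-selection outputs is simple to sketch. The feature names in the usage note are hypothetical placeholders, not features from the KSPR or the pregnancy control card.

```python
def union_features(cfs_selected, c50_selected):
    """Merge two selected-feature lists, keeping first-seen order and dropping repeats."""
    merged = []
    for feature in list(cfs_selected) + list(c50_selected):
        if feature not in merged:
            merged.append(feature)
    return merged
```

For example, `union_features(["age", "parity"], ["parity", "blood_pressure"])` keeps every feature chosen by either technique and lists `parity` only once.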


Electronics ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 219 ◽  
Author(s):  
Sweta Bhattacharya ◽  
Siva Rama Krishnan S ◽  
Praveen Kumar Reddy Maddikunta ◽  
Rajesh Kaluri ◽  
Saurabh Singh ◽  
...  

The enormous popularity of the internet across all spheres of human life has introduced various risks of malicious attacks in the network. Activities performed over the network can proliferate effortlessly, which has led to the emergence of intrusion detection systems. The patterns of the attacks are also dynamic, which necessitates efficient classification and prediction of cyber attacks. In this paper we propose a hybrid principal component analysis (PCA)-firefly based machine learning model to classify intrusion detection system (IDS) datasets. The dataset used in the study was collected from Kaggle. The model first performs One-Hot encoding to transform the IDS datasets. The hybrid PCA-firefly algorithm is then used for dimensionality reduction. The XGBoost algorithm is applied to the reduced dataset for classification. The model is evaluated comprehensively against state-of-the-art machine learning approaches to justify the superiority of the proposed approach. The experimental results confirm that the proposed model performs better than the existing machine learning models.
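The first stage of the pipeline, One-Hot encoding of a categorical column, can be sketched in a few lines. This is an illustrative assumption of how such a column might be encoded, not the authors' code; the protocol values in the usage note are hypothetical, and the PCA-firefly and XGBoost stages are not shown.

```python
def one_hot_encode(values):
    """Map each category to a 0/1 indicator row; columns follow sorted category order."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows
```

Calling `one_hot_encode(["tcp", "udp", "tcp", "icmp"])` yields one indicator column per distinct value. Encoding this way widens the dataset, which is one reason a dimensionality-reduction step follows it.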

