Harvesting Brain Signal using Machine Learning Methods

Author(s):  
Kevin Matsuno ◽  
Vidya Nandikolla

Abstract Brain-computer interface (BCI) systems are developed in biomedical fields to improve quality of life. This work presents the development of a six-class BCI controller to operate a semi-autonomous robotic arm. The controller uses the following tasks: imagined left/right hand squeeze, imagined left/right foot tap, rest, and one physical task, jaw clench. To design the controller, the locations of active electrodes are verified and an appropriate machine learning algorithm is determined. Three subjects, aged 22 to 27, participated in five sessions of motor imagery experiments to record their brainwaves. These recordings were analyzed using event-related potential plots and topographical maps to determine the active electrodes. BCILAB was used to train two-, three-, five-, and six-class BCI controllers using the linear discriminant analysis (LDA) and relevance vector machine (RVM) machine learning methods. The subjects' data were used to compare the two methods' performance in terms of error rate percentage. While the two-class BCI controller showed the same accuracy for both methods, the three- and five-class BCI controllers showed higher accuracy for the RVM approach than for the LDA approach. For the five-class controller, the error rate was 33.3% for LDA and 29.2% for RVM. The six-class BCI controller had an error rate of 34.5% for both LDA and RVM. Although these values are the same, RVM was chosen as the preferred machine learning algorithm based on the trend seen in the three- and five-class controller performances.
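To put the quoted error rates in context, the short sketch below compares them with the error rate of uniform random guessing. It assumes balanced classes, which the abstract does not state, and the function name `chance_error_rate` is illustrative only.

```python
def chance_error_rate(n_classes: int) -> float:
    """Error rate of uniform random guessing, assuming balanced classes."""
    return 1.0 - 1.0 / n_classes


# Error rates (in percent) quoted in the abstract above.
reported = {
    ("LDA", 5): 33.3,
    ("RVM", 5): 29.2,
    ("LDA", 6): 34.5,
    ("RVM", 6): 34.5,
}

for (method, n_classes), error_pct in reported.items():
    chance_pct = 100.0 * chance_error_rate(n_classes)
    margin = chance_pct - error_pct
    print(f"{method}, {n_classes} classes: error {error_pct:.1f}% "
          f"vs chance {chance_pct:.1f}% (margin {margin:.1f} points)")
```

Under the balanced-class assumption, guessing would give an error rate of 80% for five classes and about 83.3% for six, so all of the reported controllers perform well above chance.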

Author(s):  
G. Pilania ◽  
P. V. Balachandran ◽  
J. E. Gubernatis ◽  
T. Lookman

We explored the use of machine learning methods for classifying whether a particular ABO3 chemistry forms a perovskite or non-perovskite structured solid. Starting with three sets of feature pairs (the tolerance and octahedral factors, the A and B ionic radii relative to the radius of O, and the bond valence distances between the A and B ions from the O atoms), we used machine learning to create a hyper-dimensional partial dependency structure plot using all three feature pairs or any two of them. Doing so increased the accuracy of our predictions by 2–3 percentage points over using any one pair. We also added the Mendeleev numbers of the A and B atoms to this set of feature pairs. Doing this and using the capabilities of our machine learning algorithm, the gradient tree boosting classifier, enabled us to generate a new type of structure plot that has the simplicity of one based on using just the Mendeleev numbers, but with the added advantages of having a higher accuracy and providing a measure of likelihood of the predicted structure.
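The first feature pair can be illustrated with a small calculation. The sketch below assumes the usual Goldschmidt definition of the tolerance factor and the ratio r_B / r_O for the octahedral factor; the abstract does not give the formulas, and the SrTiO3 radii are illustrative values, not data from the study.

```python
import math


def tolerance_factor(r_a: float, r_b: float, r_o: float) -> float:
    """Goldschmidt tolerance factor: (r_A + r_O) / (sqrt(2) * (r_B + r_O))."""
    return (r_a + r_o) / (math.sqrt(2) * (r_b + r_o))


def octahedral_factor(r_b: float, r_o: float) -> float:
    """Octahedral factor: ratio of the B-site radius to the oxygen radius."""
    return r_b / r_o


# Illustrative ionic radii in angstroms for SrTiO3 (assumed example values).
R_SR, R_TI, R_O = 1.44, 0.605, 1.40

t = tolerance_factor(R_SR, R_TI, R_O)
mu = octahedral_factor(R_TI, R_O)
print(f"tolerance factor t = {t:.3f}, octahedral factor mu = {mu:.3f}")
```

A classifier such as the one described would take pairs like (t, mu) as input features alongside the other descriptors.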


2020 ◽  
pp. 5-18
Author(s):  
N. N. Kiselyova ◽  
V. A. Dudarev ◽  
V. V. Ryazanov ◽  
O. V. Sen’ko ◽  
...  

New chalcospinels of the most common compositions were predicted: A(I)B(III)C(IV)X4 (X is S or Se) and A(II)B(III)C(III)S4 (A, B, and C are various chemical elements). They are promising candidates in the search for new materials for magneto-optical memory elements, sensors, and anodes in sodium-ion batteries. The values of the crystal lattice parameter “a” were estimated. Only the property values of the chemical elements were used for prediction. Chalcospinels not yet obtained were predicted using machine learning programs that are part of the information-analytical system developed by the authors (various ensembles of algorithms: binary decision trees, the linear machine, the search for logical regularities of classes, the support vector machine, the Fisher linear discriminant, k-nearest neighbors, and training of a multilayer perceptron and a neural network). The lattice parameter values of the new chalcospinels were estimated using an extensive family of regression methods from the scikit-learn package for Python, together with multilevel machine learning methods proposed by the authors. According to cross-validation, the prediction accuracy for new chalcospinels is not lower than 80%, and the prediction accuracy for their lattice parameter (measured by the mean absolute error in leave-one-out cross-validation) is ± 0.1 Å. The effectiveness of multilevel machine learning methods for predicting the physical properties of substances was demonstrated.
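The leave-one-out estimate of the mean absolute error mentioned above can be sketched as follows. This is a minimal illustration with a one-dimensional nearest-neighbor regressor and made-up numbers; the descriptor values, targets, and function names are hypothetical and are not the authors' models or data.

```python
def nearest_neighbor_predict(train_x, train_y, x):
    """Predict the target of the training point closest to x (1-NN regression)."""
    j = min(range(len(train_x)), key=lambda k: abs(train_x[k] - x))
    return train_y[j]


def leave_one_out_mae(xs, ys, predict):
    """Hold out each sample in turn, predict it from the rest, average |error|."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        errors.append(abs(predict(train_x, train_y, xs[i]) - ys[i]))
    return sum(errors) / len(errors)


# Made-up descriptor values and target values, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [10.0, 10.2, 10.4, 10.6]

mae = leave_one_out_mae(xs, ys, nearest_neighbor_predict)
print(f"leave-one-out MAE = {mae:.2f}")
```

Any regressor with the same `predict(train_x, train_y, x)` shape could be swapped in, which is how several regression methods could be compared on the same footing.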


2021 ◽  
Author(s):  
Quentin Lenouvel ◽  
Vincent Génot ◽  
Philippe Garnier ◽  
Benoit Lavraud ◽  
Sergio Toledo

The understanding of the physical processes of magnetic reconnection has improved considerably thanks to data from the Magnetospheric Multiscale mission (MMS). However, much work remains to better characterize the core of the reconnection process: the electron diffusion region (EDR). We previously developed a machine learning algorithm to automatically detect EDR candidates, in order to extend the list of events identified in the literature. However, identifying the parameters that are most relevant for describing EDRs is complex, all the more so because some of the small-scale plasma and field parameters show limitations in certain configurations, such as low particle densities or large guide fields. In this study, we perform a statistical analysis of previously reported dayside EDRs as well as newly reported EDR candidates found using machine learning methods. We also show different single- and multi-spacecraft parameters that can be used to better identify dayside EDRs in time series of MMS data recorded at the magnetopause. Finally, we present an analysis of the link between the guide field and the strength of the energy conversion around each EDR.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Cindy Feng ◽  
George Kephart ◽  
Elizabeth Juarez-Colunga

Abstract Background: Coronavirus disease (COVID-19) presents an unprecedented threat to global health. Accurately predicting the mortality risk among infected individuals is crucial for prioritizing medical care and mitigating the healthcare system’s burden. The present study aimed to assess the accuracy of machine learning methods in predicting COVID-19 mortality risk. Methods: We compared the performance of classification tree, random forest (RF), extreme gradient boosting (XGBoost), logistic regression, generalized additive model (GAM) and linear discriminant analysis (LDA) in predicting the mortality risk among 49,216 COVID-19 positive cases in Toronto, Canada, reported from March 1 to December 10, 2020. We used repeated split-sample validation and k-steps-ahead forecasting validation. Predictive models were estimated using training samples, and the predictive accuracy of the methods on the testing samples was assessed using the area under the receiver operating characteristic curve (AUC), Brier’s score, calibration intercept and calibration slope. Results: We found that XGBoost is highly discriminative, with an AUC of 0.9669, and outperforms conventional tree-based methods, i.e., classification tree or RF, for predicting COVID-19 mortality risk. Regression-based methods (logistic, GAM and LASSO) performed comparably to XGBoost, with slightly lower AUCs and higher Brier’s scores. Conclusions: XGBoost offers superior performance over conventional tree-based methods and a minor improvement over regression-based methods for predicting COVID-19 mortality risk in the study population.
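Two of the evaluation measures named above, the AUC and Brier's score, can be computed directly from predicted probabilities. The sketch below uses their standard definitions on a tiny made-up example; the labels and probabilities are hypothetical and unrelated to the Toronto data.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up outcomes (1 = event) and predicted risks, for illustration only.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)
print(f"AUC = {auc:.2f}, Brier score = {brier:.4f}")
```

A higher AUC indicates better discrimination, while a lower Brier score indicates probabilities closer to the observed outcomes, which is why the abstract reports "slightly lower AUCs and higher Brier's scores" as marginally worse performance.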


2021 ◽  
Vol 8 ◽  
Author(s):  
Si Yang ◽  
Chenxi Li ◽  
Yang Mei ◽  
Wen Liu ◽  
Rong Liu ◽  
...  

Different geographical origins can lead to great variance in coffee quality, taste, and commercial value. Hence, verifying the authenticity of the origin of coffee beans is of great importance for producers and consumers worldwide. In this study, terahertz (THz) spectroscopy combined with machine learning methods was investigated as a fast and non-destructive way to classify the geographic origin of coffee beans. Popular machine learning methods, including convolutional neural network (CNN), linear discriminant analysis (LDA), and support vector machine (SVM), were compared to obtain the best model. The curse of dimensionality makes it difficult for some classification methods to train effective models. Thus, principal component analysis (PCA) and genetic algorithm (GA) were applied for LDA and SVM to create a smaller set of features. The first nine principal components (PCs), extracted by PCA with a cumulative contribution rate of 99.9%, and 21 variables selected by GA were the inputs of the LDA and SVM models. The results demonstrate that excellent classification (90% accuracy on the prediction set) could be achieved using the CNN method. The results also indicate that variable selection is an important step in creating an accurate and robust discrimination model. The performance of the LDA and SVM algorithms could be improved with spectral features extracted by PCA and GA. The GA-SVM achieved 75% accuracy on the prediction set, while the SVM and PCA-SVM achieved 50% and 65% accuracy, respectively. These results demonstrate that THz spectroscopy, together with machine learning methods, is an effective and satisfactory approach for classifying the geographical origins of coffee beans, suggesting ways to tap the potential of deep learning for authenticating agricultural products while expanding the application of THz spectroscopy.
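Choosing the number of principal components by cumulative contribution rate, as described above, amounts to a running sum over the sorted eigenvalues of the covariance matrix. The sketch below shows that selection step only; the eigenvalues and the function name are made up for illustration and do not come from the THz spectra.

```python
from itertools import accumulate


def components_for_threshold(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative share of the
    total variance reaches the given threshold (a fraction between 0 and 1)."""
    total = sum(eigenvalues)
    ordered = sorted(eigenvalues, reverse=True)
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= threshold:
            return k
    return len(ordered)


# Made-up eigenvalues, for illustration only.
eigenvalues = [6.0, 3.0, 0.9, 0.1]

n_components = components_for_threshold(eigenvalues, 0.95)
print(f"components needed for 95% cumulative contribution: {n_components}")
```

With the study's threshold of 99.9%, the same procedure applied to the real spectra is what would yield the nine components mentioned in the abstract.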


2020 ◽  
Vol 4 (s1) ◽  
pp. 135-135
Author(s):  
James Keoni Morris ◽  
Josh L. Gowin ◽  
Melanie L. Schwandt ◽  
Nancy Diazgranados ◽  
Vijay A. Ramchandani

OBJECTIVES/GOALS: To test whether a machine learning algorithm could predict a person’s capacity to binge drink and to explore which measures might be important for identifying individuals at risk for high-intensity binge drinking behaviors. METHODS/STUDY POPULATION: The sample included 1177 (474 female) non-treatment-seeking drinkers (age: 18-91 years), who were assigned to a group based on their heaviest drinking day reported in a 90-Day Alcohol Timeline Followback questionnaire. The groups were Non-Bingers (female: 12 drinks, male:>15 drinks). The sample was divided into a training sample (N = 884) and a testing sample (N = 293). A machine learning algorithm called random forest was then used to generate a predictive model based on measures of substance use, personality traits, and trauma. The model was applied to the testing sample to determine accuracy. RESULTS/ANTICIPATED RESULTS: The first model correctly assigned 190 out of 293 subjects, giving it a total error rate of 0.35, with the lowest rates for non-binge (0.19) and high-intensity (0.18), while medium-intensity had the highest error rate (0.86). The most important variables for the accuracy of the model included: total score on the Alcohol Use Disorder Identification Test, first five sub-score of the Self-Reported Effects of Alcohol, Compulsive Drinking subscale, and presence of a current psychiatric diagnosis. As a follow-up analysis, we built and tested another random forest model without the use of drinking dependence measures. This model had a total error rate of 0.39, and introduced other important variables such as smoking behaviors, perceived stress, IQ, and number of negative life events. DISCUSSION/SIGNIFICANCE OF IMPACT: Our study showed that it was possible for a machine learning algorithm to predict binge drinking intensity better than chance. Drinking patterns were the most robust predictors, and stress, IQ, and psychiatric diagnoses were also useful in predicting binge drinking intensity.
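The total error rate quoted above follows directly from the counts in the abstract (190 correct out of 293), and per-group error rates come from the rows of a confusion matrix. In the sketch below the total uses the reported counts, while the confusion matrix is entirely made up to show the calculation; the study's actual matrix is not given.

```python
def total_error_rate(n_correct: int, n_total: int) -> float:
    """Fraction of samples that were assigned to the wrong group."""
    return 1.0 - n_correct / n_total


def per_class_error_rates(confusion):
    """Error rate for each true class; rows are true classes, columns predicted."""
    return [1.0 - row[i] / sum(row) for i, row in enumerate(confusion)]


# Counts reported in the abstract: 190 of 293 test subjects assigned correctly.
overall = total_error_rate(190, 293)
print(f"total error rate = {overall:.2f}")

# Hypothetical 3-class confusion matrix, for illustration only.
example_confusion = [
    [8, 2, 0],
    [3, 1, 3],
    [0, 2, 8],
]
class_errors = per_class_error_rates(example_confusion)
print("per-class error rates:", [f"{e:.2f}" for e in class_errors])
```

A pattern like the one in the abstract, with low error for the extreme groups and high error for the middle group, appears when the intermediate class is mostly absorbed by its neighbors.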


2020 ◽  
Vol 21 (17) ◽  
pp. 6364
Author(s):  
Huayu Zhang ◽  
Edwin O. W. Bredewold ◽  
Dianne Vreeken ◽  
Jacques. M. G. J. Duijs ◽  
Hetty C. de Boer ◽  
...  

Atherosclerosis is the underlying pathology in a major part of cardiovascular disease, the leading cause of mortality in developed countries. The infiltration of monocytes into the vessel walls of large arteries is a key denominator of atherogenesis, making monocytes accountable for the development of atherosclerosis. With the development of high-throughput transcriptome profiling platforms and cytometric methods for circulating cells, it is now feasible to study in depth the predicted functional change of circulating monocytes, as reflected by changes in gene expression in certain pathways, and to correlate these changes with disease outcome. Neuroimmune guidance cues comprise a group of circulating and cell membrane-associated signaling proteins that are progressively involved in monocyte functions. Here, we employed the CIRCULATING CELLS study cohort to classify cardiovascular disease patients and healthy individuals in relation to their expression of neuroimmune guidance cues in circulating monocytes. To cope with the complexity of human datasets, which feature noisy data, nonlinearity and multidimensionality, we assessed various machine-learning methods. Of these, the linear discriminant analysis, Naïve Bayesian model and stochastic gradient boost model yielded perfect or near-perfect sensitivity and specificity and revealed that expression levels of the neuroimmune guidance cues SEMA6B, SEMA6D and EPHA2 in circulating monocytes were of predictive value for cardiovascular disease outcome.
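Sensitivity and specificity, the two measures used to judge the classifiers above, are simple ratios of confusion-matrix counts. The sketch below uses their standard definitions; the counts are hypothetical and are not taken from the CIRCULATING CELLS cohort.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual patients that the classifier labels as patients."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of healthy individuals that the classifier labels as healthy."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, for illustration only.
tp, fn, tn, fp = 18, 2, 27, 3

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A "perfect" result in this sense means both ratios equal 1.0, that is, no false negatives and no false positives on the evaluated samples.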


2020 ◽  
Vol 4 (1) ◽  
pp. 1-6
Author(s):  
Irzal Ahmad Sabilla ◽  
Chastine Fatichah

Vegetables such as tomatoes and chilies are ingredients for flavoring. Both of these ingredients are processed into sauces and seasonings to accompany people's staple food. In supermarkets, these vegetables can be found easily, but many people do not understand how to choose the type and quality of chilies and tomatoes. This study discusses the classification of cayenne, curly, green, and red chilies and of tomatoes in good and bad condition using machine learning and contrast enhancement techniques. The machine learning methods used are Support Vector Machine (SVM), K-Nearest Neighbor (K-NN), Linear Discriminant Analysis (LDA), and Random Forest (RF). The best method is determined from the test results based on accuracy. In addition to accuracy, this study also measures computation speed so that the efficiency of the methods can be assessed.
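Measuring accuracy and computation time together, as the study describes, can be sketched with a timer around the prediction step. The example below uses a tiny nearest-neighbor classifier on made-up two-dimensional feature vectors; the data, labels, and function names are hypothetical, and real image features would be much higher dimensional.

```python
import time


def nearest_neighbor_label(train, point):
    """Return the label of the training sample closest to point (1-NN)."""
    best = min(
        train,
        key=lambda item: (item[0][0] - point[0]) ** 2 + (item[0][1] - point[1]) ** 2,
    )
    return best[1]


def evaluate(classify, train, test):
    """Return (accuracy, elapsed seconds) for a classifier on a test set."""
    start = time.perf_counter()
    predictions = [classify(train, features) for features, _ in test]
    elapsed = time.perf_counter() - start
    correct = sum(pred == label for pred, (_, label) in zip(predictions, test))
    return correct / len(test), elapsed


# Made-up feature vectors and condition labels, for illustration only.
train = [((0.0, 0.0), "good"), ((0.2, 0.1), "good"),
         ((1.0, 1.0), "bad"), ((0.9, 1.1), "bad")]
test = [((0.1, 0.0), "good"), ((1.0, 0.9), "bad")]

accuracy, seconds = evaluate(nearest_neighbor_label, train, test)
print(f"accuracy = {accuracy:.2f}, prediction time = {seconds:.6f} s")
```

Passing different classifier functions to `evaluate` gives a like-for-like comparison of accuracy and speed across methods.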

