An approach of re-organizing input dataset to enhance the quality of emotion recognition using the bio-signals dataset of MIT

2021 ◽  
Vol 10 (6) ◽  
pp. 3220-3227
Author(s):  
Van-Dung Pham ◽  
Thanh-Long Cung

The purpose of this paper is to propose an approach of re-organizing input data to recognize emotion based on short signal segments and to increase the quality of emotion recognition using physiological signals. MIT's long physiological signal set was divided into two new datasets, with shorter and overlapped segments. Three different classification methods (support vector machine, random forest, and multilayer perceptron) were implemented to identify eight emotional states based on statistical features of each segment in these two datasets. By re-organizing the input dataset, the quality of recognition results was enhanced. The random forest shows the best classification result among the three implemented classification methods, with an accuracy of 97.72% for eight emotional states on the overlapped dataset. This approach shows that, by re-organizing the input dataset, high recognition accuracy can be achieved without the use of EEG and ECG signals.
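
The re-organization described above can be illustrated with a minimal sketch. The window length, step size, and feature set below are assumptions chosen for illustration; the abstract does not state the values the authors used. The helper names `segment` and `features` are hypothetical.

```python
from statistics import mean, pstdev

def segment(signal, length, step):
    """Split a sequence into windows of `length` samples, advancing by `step`.

    step < length gives overlapped segments; step == length gives disjoint ones.
    """
    return [signal[i:i + length]
            for i in range(0, len(signal) - length + 1, step)]

def features(window):
    """Simple per-segment statistics (an illustrative feature set)."""
    return {"mean": mean(window), "std": pstdev(window),
            "min": min(window), "max": max(window)}

signal = list(range(10))                          # toy stand-in for one channel
disjoint = segment(signal, length=4, step=4)      # non-overlapping segments
overlapped = segment(signal, length=4, step=2)    # 50% overlap
rows = [features(w) for w in overlapped]          # one feature row per segment
```

With the same signal, the overlapped variant yields more training segments than the disjoint one, which is the property the re-organized datasets rely on.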

Author(s):  
Chenguang Li ◽  
Hongjun Yang ◽  
Long Cheng

As a relatively new physiological signal of the brain, functional near-infrared spectroscopy (fNIRS) is being used more and more in the brain–computer interface field, especially in the task of motor imagery. However, the classification accuracy based on this signal is relatively low. To improve the accuracy of classification, this paper proposes a new experimental paradigm and only uses fNIRS signals to complete the classification task of six subjects. Notably, the experiment is carried out in a non-laboratory environment, and movements of motion imagination are properly designed. While the subjects are imagining the motions, they also subvocalize the movements to prevent distraction. Therefore, according to the motor area theory of the cerebral cortex, the positions of the fNIRS probes have been slightly adjusted compared with other methods. Next, the signals are classified by nine classification methods, and the different features and classification methods are compared. The results show that under this new experimental paradigm, classification accuracies of 89.12% and 88.47% can be achieved using the support vector machine method and the random forest method, respectively, which shows that the paradigm is effective. Finally, by selecting the five channels with the largest variance after empirical mode decomposition of the original signal, similar classification results can be achieved.
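
The variance-based channel selection mentioned at the end of the abstract can be sketched as follows. This is only a rough illustration: the empirical mode decomposition step is omitted, the channel names and values are invented, and `top_variance_channels` is a hypothetical helper.

```python
from statistics import pvariance

def top_variance_channels(channels, k):
    """Return the names of the k channels with the largest variance."""
    ranked = sorted(channels,
                    key=lambda name: pvariance(channels[name]),
                    reverse=True)
    return ranked[:k]

# Toy signals; in the paper the ranking is applied after EMD, which is
# not reproduced here.
channels = {
    "ch1": [0.0, 0.1, 0.0, 0.1],
    "ch2": [0.0, 1.0, 0.0, 1.0],
    "ch3": [0.0, 0.5, 0.0, 0.5],
}
selected = top_variance_channels(channels, k=2)
```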


Foods ◽  
2021 ◽  
Vol 10 (6) ◽  
pp. 1411
Author(s):  
José Luis P. Calle ◽  
Marta Ferreiro-González ◽  
Ana Ruiz-Rodríguez ◽  
Gerardo F. Barbero ◽  
José Á. Álvarez ◽  
...  

Sherry wine vinegar is a Spanish gourmet product under Protected Designation of Origin (PDO). Before a vinegar can be labeled as Sherry vinegar, the product must meet certain requirements as established by its PDO, which, in this case, means that it has been produced following the traditional solera and criadera ageing system. The quality of the vinegar is determined by many factors such as the raw material, the acetification process or the aging system. For this reason, mainly producers, but also consumers, would benefit from the employment of effective analytical tools that allow precisely determining the origin and quality of vinegar. In the present study, a total of 48 Sherry vinegar samples manufactured from three different starting wines (Palomino Fino, Moscatel, and Pedro Ximénez wine) were analyzed by Fourier-transform infrared (FT-IR) spectroscopy. The spectroscopic data were combined with unsupervised exploratory techniques such as hierarchical cluster analysis (HCA) and principal component analysis (PCA), as well as other nonparametric supervised techniques, namely, support vector machine (SVM) and random forest (RF), for the characterization of the samples. The HCA and PCA results present a clear grouping trend of the vinegar samples according to their raw materials. SVM in combination with leave-one-out cross-validation (LOOCV) successfully classified 100% of the samples, according to the type of wine used for their production. The RF method allowed selecting the most important variables to develop the characteristic fingerprint (“spectralprint”) of the vinegar samples according to their starting wine. Furthermore, the RF model reached 100% accuracy for both LOOCV and out-of-bag (OOB) sets.
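
Leave-one-out cross-validation, as used above, holds out each sample once and trains on the remainder. The sketch below shows the procedure only; it swaps in a 1-nearest-neighbour classifier for the SVM and RF models of the study, and the two-dimensional samples and labels "A" and "B" are invented.

```python
def nearest_label(train, query):
    """1-nearest-neighbour prediction by squared Euclidean distance."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda item: dist(item[0], query))[1]

def loocv_accuracy(samples):
    """Hold out each sample once, fit on the rest, score the held-out one."""
    hits = 0
    for i, (x, y) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        hits += nearest_label(train, x) == y
    return hits / len(samples)

# Toy (features, label) pairs forming two well-separated groups.
samples = [
    ((1.0, 1.1), "A"), ((1.2, 0.9), "A"), ((0.9, 1.0), "A"),
    ((3.0, 3.1), "B"), ((3.2, 2.9), "B"), ((2.9, 3.0), "B"),
]
accuracy = loocv_accuracy(samples)
```

LOOCV suits small sample sets such as the 48 vinegars here, because every sample is used for both training and testing without ever being scored by a model that has seen it.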


Water ◽  
2021 ◽  
Vol 13 (18) ◽  
pp. 2457
Author(s):  
Manel Naloufi ◽  
Françoise S. Lucas ◽  
Sami Souihi ◽  
Pierre Servais ◽  
Aurélie Janne ◽  
...  

Exposure to contaminated water during aquatic recreational activities can lead to gastrointestinal diseases. In order to decrease the exposure risk, the fecal indicator bacteria Escherichia coli is routinely monitored, which is time-consuming, labor-intensive, and costly. To assist the stakeholders in the daily management of bathing sites, models have been developed to predict the microbiological quality. However, model performances are highly dependent on the quality of the input data, which are usually scarce. In our study, we proposed a conceptual framework for selecting the most suitable model and enriching the training dataset. This framework was successfully applied to the prediction of Escherichia coli concentrations in the Marne River (Paris Area, France). We compared the performance of six machine learning (ML)-based models: K-nearest neighbors, Decision Tree, Support Vector Machines, Bagging, Random Forest, and Adaptive boosting. Based on several statistical metrics, the Random Forest model presented the best accuracy compared to the other models. However, 53.2 ± 3.5% of the predicted E. coli densities were inaccurately estimated according to the mean absolute percentage error (MAPE). Four parameters (temperature, conductivity, 24 h cumulative rainfall on the day before sampling, and the river flow) were identified as key variables to be monitored for optimization of the ML model. The set of values to be optimized will feed an alert system for monitoring the microbiological quality of the water through a combined strategy of in situ manual sampling and the deployment of a network of sensors. Based on these results, we propose a guideline for ML model selection and sampling optimization.
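
For reference, the mean absolute percentage error can be computed as in the sketch below. The numbers are invented, and this shows the standard definition of MAPE rather than the exact evaluation protocol of the study.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, in percent.

    Observed values must be non-zero, since each error is divided by them.
    """
    errors = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(errors) / len(errors)

observed = [100.0, 200.0, 400.0]     # toy concentrations
predicted = [110.0, 180.0, 400.0]    # toy model output
score = mape(observed, predicted)
```

Because each error is scaled by the observed value, MAPE weights a miss on a low concentration more heavily than the same absolute miss on a high one.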


2022 ◽  
Vol 355 ◽  
pp. 03008
Author(s):  
Yang Zhang ◽  
Lei Zhang ◽  
Yabin Ma ◽  
Jinsen Guan ◽  
Zhaoxia Liu ◽  
...  

In this study, an electronic nose composed of seven kinds of metal oxide semiconductor sensors was developed to distinguish the milk source (the dairy farm the milk comes from) and estimate the milk fat and protein content, in order to identify the authenticity and evaluate the quality of milk. The electronic nose is low-cost, non-destructive testing equipment. (1) To identify milk sources, the electronic nose odor characteristics of milk are combined with its component characteristics. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are used for dimensionality reduction, and three machine learning algorithms, Logistic Regression (LR), Support Vector Machine (SVM), and Random Forest (RF), are used to build milk source (dairy farm) identification models, whose classification performance is evaluated and compared. The experimental results show that the SVM-LDA model based on the electronic nose odor characteristics classifies better than the other single-feature models, with a test-set accuracy of 91.5%. The RF-LDA and SVM-LDA models based on the fusion of the two feature types perform best, with an accuracy as high as 96%. (2) Three algorithms, Gradient Boosting Decision Tree (GBDT), Extreme Gradient Boosting (XGBoost), and Random Forest (RF), are used to build models that estimate milk fat and protein rates from the electronic nose odor data. The results show that the RF model has the best estimation performance (R² = 0.9399 for milk fat; R² = 0.9301 for milk protein). This indicates that the proposed method can improve the estimation accuracy of milk fat and protein, which provides a technical basis for predicting the quality of dairy products.
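
The R² values reported above are coefficients of determination. A minimal sketch of the usual definition follows; the observed and predicted values are invented, and whether the study used exactly this variant is an assumption.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

observed = [3.0, 3.5, 4.0, 4.5]     # toy milk fat percentages
predicted = [3.1, 3.4, 4.1, 4.4]    # toy model estimates
r2 = r_squared(observed, predicted)
```

A value of 1 means the estimates match the measurements exactly, and a value near 0 means the model does no better than predicting the mean.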


2021 ◽  
Vol 4 (2(112)) ◽  
pp. 58-72
Author(s):  
Chingiz Kenshimov ◽  
Zholdas Buribayev ◽  
Yedilkhan Amirgaliyev ◽  
Aisulyu Ataniyazova ◽  
Askhat Aitimov

In the course of our research work, the American, Russian and Turkish sign languages were analyzed. A program for recognizing the Kazakh dactylic sign language using machine learning methods was implemented. A dataset of 5000 images was formed for each gesture, and gesture recognition algorithms such as Random Forest, Support Vector Machine, and Extreme Gradient Boosting were applied, while two data types were combined into one database, which caused a change in the architecture of the system as a whole. The quality of the algorithms was also evaluated. The research was carried out because scientific work on developing a system for recognizing the Kazakh dactyl sign language is currently insufficient for a complete representation of the language. The Kazakh language has specific letters, and because of the peculiarities of its spelling, problems arise when developing recognition systems for the Kazakh sign language. The results showed that the Support Vector Machine and Extreme Gradient Boosting algorithms are superior in real-time performance, but the Random Forest algorithm has high recognition accuracy. The accuracy of the classification algorithms was 98.86% for Random Forest, 98.68% for Support Vector Machine and 98.54% for Extreme Gradient Boosting. The classical algorithms also scored highly in the quality evaluation. The practical significance of this work lies in the fact that scientific research in the field of gesture recognition with the updated alphabet of the Kazakh language has not yet been conducted, and the results can be used by other researchers to conduct further research related to the recognition of the Kazakh dactyl sign language, as well as by researchers engaged in the development of the international sign language.


2021 ◽  
Vol 7 (2) ◽  
pp. 863-866
Author(s):  
Yedukondala Rao Veeranki ◽  
Nagarajan Ganapathy ◽  
Ramakrishnan Swaminathan

In this work, the feasibility of time-frequency methods, namely short-time Fourier transform, Choi-Williams distribution, and smoothed pseudo-Wigner-Ville distribution, in the classification of happy and sad emotional states using electrodermal activity signals has been explored. For this, the annotated happy and sad signals are obtained from an online public database and decomposed into phasic components. The time-frequency analysis has been performed on the phasic components using three different methods. Four statistical features, namely mean, variance, kurtosis, and skewness, are extracted from each method. Four classifiers, namely logistic regression, Naive Bayes, random forest, and support vector machine, have been used for the classification. The combination of the smoothed pseudo-Wigner-Ville distribution and random forest yields the highest F-measure of 68.74% for classifying happy and sad emotional states. Thus, it appears that the suggested technique could be helpful in the diagnosis of clinical conditions linked to happy and sad emotional states.
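
The four statistical features named above can be computed from central moments, as in this sketch. It assumes population (biased) moments and non-excess kurtosis, and it operates on a plain list of values rather than on a time-frequency representation; the abstract does not specify these details.

```python
def moments(values):
    """Mean, variance, skewness, and kurtosis from population moments.

    Requires non-zero variance. Kurtosis is the non-excess (Pearson) form,
    for which a normal distribution scores 3.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {"mean": mean, "variance": var,
            "skewness": m3 / var ** 1.5, "kurtosis": m4 / var ** 2}

feats = moments([1.0, 2.0, 3.0, 4.0, 5.0])   # toy, symmetric sample
```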


2018 ◽  
Vol 8 (9) ◽  
pp. 1757-1762
Author(s):  
Jie Zhang ◽  
Licai Yang ◽  
Zhonghua Su ◽  
Xueqin Mao ◽  
Kan Luo ◽  
...  

Background: Noise is unavoidable in the physiological signal measurement system. Poor quality signals can affect the results of analysis and hinder the subsequent clinical diagnosis. Thus, it is necessary to perform signal quality assessment before interpreting the signal. Objective: In this work, we describe a method combining support vector machine (SVM) and multi-feature fusion for assessing the signal quality of pulsatile waveforms, concentrating on the photoplethysmogram (PPG). Methods: PPG signals from 53 healthy volunteers were recorded, each 5 min long. Signal quality in each heart beat was manually annotated by a clinical expert, and then the signal quality in each 5 s episode was automatically calculated according to the results from each beat segment, resulting in a total of 13,294 5-s PPG segments. Then an SVM was trained to classify clean/noisy PPG recordings by inputting a set of twelve signal quality features. Further experiments were carried out to verify the proposed SVM-based signal quality classifier method. Results: An average accuracy of 87.90%, a sensitivity of 88.10% and a specificity of 87.66% were found on the 10-fold cross validation. Conclusions: The signal quality of PPGs can be accurately classified by using the proposed method.
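
Accuracy, sensitivity, and specificity all follow from the counts of a two-class confusion matrix. The sketch below treats "clean" as the positive class, which is an assumption (the abstract does not say which class is positive), and uses invented labels.

```python
def binary_metrics(truth, predicted, positive="clean"):
    """Accuracy, sensitivity (recall of positives), and specificity."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {"accuracy": (tp + tn) / len(pairs),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp)}

# Toy annotations for five segments.
truth = ["clean", "clean", "clean", "noisy", "noisy"]
predicted = ["clean", "clean", "noisy", "noisy", "clean"]
metrics = binary_metrics(truth, predicted)
```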


2020 ◽  
Vol 10 (12) ◽  
pp. 4382
Author(s):  
Daniela Alexandra Embus ◽  
Andres Julián Castillo ◽  
Fulvio Yesid Vivas ◽  
Oscar Mauricio Caicedo ◽  
Armando Ordóñez

Network selection plays a pivotal role in ensuring efficient handover management. Some existing approaches for network selection may use one criterion, such as RSSI (Received Signal Strength Indicator) or SINR (Signal-to-Interference-plus-Noise Ratio). However, these approaches are reactive and may lead to incorrect decisions due to the limited information. Other multi-criteria-based approaches use techniques, such as statistical mathematics, heuristic methods, and neural networks, to optimize the network selection. However, these approaches have shortcomings related to their computational complexity and to unnecessary and frequent handovers. This paper introduces NetSel-RF, a multi-criteria model, based on supervised learning, for network selection in WiFi networks. Here, we describe the created dataset, the data preparation and the evaluation of diverse supervised learning techniques (Random Forest, Support Vector Machine, Adaptive Random Forest, Hoeffding Adaptive Tree, and Hoeffding Tree techniques). Our evaluation results show that Random Forest outperforms other algorithms in terms of its accuracy and Matthews correlation coefficient. Additionally, NetSel-RF performs better than the Signal Strong First approach and behaves similarly to the Analytic Hierarchy Process–Technique for Order Preferences by Similarity to the Ideal Solution (AHP-TOPSIS) approach in terms of the number of handovers and throughput drops. Unlike the latter, NetSel-RF is proactive and therefore is more efficient regarding Quality of Service (QoS) and Quality of Experience (QoE) since the end-devices perform the handover before the network link quality degrades.
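
The Matthews correlation coefficient used in the evaluation can be sketched for the two-class case as follows. The confusion-matrix counts are invented, and the paper's task may be multi-class, in which case a generalized form would apply.

```python
from math import sqrt

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

value = mcc(tp=40, tn=45, fp=5, fn=10)   # toy counts
```

Unlike accuracy, the coefficient stays near 0 for a classifier that always predicts the majority class, which makes it useful when the classes are imbalanced.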


2021 ◽  
Vol 13 (11) ◽  
pp. 2039
Author(s):  
Joon Jin Song ◽  
Melissa Innerst ◽  
Kyuhee Shin ◽  
Bo-Young Ye ◽  
Minho Kim ◽  
...  

Estimating precipitation area is important for weather forecasting as well as real-time applications. This paper aims to develop an analytical framework for efficient precipitation area estimation using S-band dual-polarization radar measurements. Several types of factors, such as types of sensors, thresholds, and models, are considered and compared to form a data set. After building the appropriate data set, this paper presents a rigorous comparison of statistical (logistic regression and linear discriminant analysis) and machine learning (decision tree, support vector machine, and random forest) classification methods. To achieve better performance, spatial classification, which incorporates the latitude and longitude of the observation location, is considered and compared with non-spatial classification. The data used in this study were collected by rain detectors and present weather sensors in a network of automated weather systems (AWS), and by an S-band dual-polarimetric weather radar, during ten different rainfall events of varying lengths. The mean squared prediction error (MSPE) from leave-one-out cross validation (LOOCV) is computed to assess the performance of the methods. Of the methods, the decision tree and random forest methods result in the lowest MSPE, and spatial classification outperforms non-spatial classification. In particular, machine-learning-based spatial classification methods accurately estimate the precipitation area in the northern areas of the study region.
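
The comparison of spatial and non-spatial classification by LOOCV MSPE can be sketched as below. A 1-nearest-neighbour predictor stands in for the models of the study, and every value is invented so that rain occurrence depends on location; this illustrates the evaluation procedure and does not reproduce the reported results.

```python
def loocv_mspe(features, labels):
    """Leave-one-out MSPE of a 1-nearest-neighbour predictor (stand-in model)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    total = 0.0
    for i, (x, y) in enumerate(zip(features, labels)):
        others = [j for j in range(len(features)) if j != i]
        nearest = min(others, key=lambda j: dist(features[j], x))
        total += (y - labels[nearest]) ** 2
    return total / len(features)

# Toy rows: (radar variable, latitude, longitude); 1 = rain, 0 = no rain.
points = [(0.50, 0.0, 0.0), (0.62, 0.1, 0.0), (0.55, 0.0, 0.1),
          (0.52, 1.0, 1.0), (0.60, 1.1, 1.0), (0.57, 1.0, 1.1)]
rain = [1, 1, 1, 0, 0, 0]

non_spatial = loocv_mspe([p[:1] for p in points], rain)  # radar variable only
spatial = loocv_mspe(points, rain)                       # plus coordinates
```

For 0/1 labels and 0/1 predictions, the MSPE equals the misclassification rate, so a lower value directly means fewer held-out stations classified wrongly.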

