Ultra-Short Window Length and Feature Importance Analysis for Cognitive Load Detection from Wearable Sensors

Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 613
Author(s):  
Jaakko Tervonen ◽  
Kati Pettersson ◽  
Jani Mäntyjärvi

Human cognitive capabilities are under constant pressure in the modern information society. Cognitive load detection would be beneficial in several applications of human–computer interaction, including attention management and user interface adaptation. However, current research into accurate and real-time biosignal-based cognitive load detection lacks an understanding of the optimal and minimal window length in data segmentation, which would allow for more timely, continuous state detection. This study presents a comparative analysis of ultra-short (30 s or less) window lengths in cognitive load detection with a wearable device. Heart rate, heart rate variability, galvanic skin response, and skin temperature features are extracted at six different window lengths and used to train an Extreme Gradient Boosting classifier to distinguish between cognitive load and rest. A 25 s window showed the highest accuracy (67.6%), which is similar to earlier studies using the same dataset. Overall, model accuracy tended to decrease as the window length decreased, and the lowest performance (60.0%) was observed with a 5 s window. The contribution of different physiological features to the classification performance and the most useful features that react in short windows are also discussed. The analysis provides a promising basis for future real-time applications with wearable sensors.
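
The segmentation step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the 4 Hz sampling rate, the synthetic signal, the three per-window features, and the specific set of six window lengths (5 s to 30 s in 5 s steps) are all assumptions made for the example, and the classifier itself is not shown.

```python
from statistics import mean, pstdev


def segment(signal, fs_hz, window_s):
    """Split a 1-D signal into consecutive non-overlapping windows of window_s seconds."""
    size = int(round(fs_hz * window_s))
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    # Drop the trailing partial window so every segment has the same length.
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def window_features(window):
    """Hypothetical per-window features: mean, standard deviation, and range."""
    return {
        "mean": mean(window),
        "std": pstdev(window),
        "range": max(window) - min(window),
    }


# Synthetic 60 s signal sampled at 4 Hz (assumed rate, illustration only).
fs_hz = 4
signal = [0.1 * (i % 8) for i in range(60 * fs_hz)]

# Assumed window lengths; the abstract only states that six lengths of 30 s or less were used.
for window_s in (5, 10, 15, 20, 25, 30):
    windows = segment(signal, fs_hz, window_s)
    features = [window_features(w) for w in windows]
    print(window_s, len(features))
```

Shorter windows yield more training rows from the same recording but give each feature fewer samples to summarize, which is one plausible reason accuracy drops as the window shrinks.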

2019 ◽  
Vol 9 (22) ◽  
pp. 4833 ◽  
Author(s):  
Ardo Allik ◽  
Kristjan Pilt ◽  
Deniss Karai ◽  
Ivo Fridolin ◽  
Mairo Leier ◽  
...  

The aim of this study was to develop an optimized physical activity classifier for real-time wearable systems, with the focus on reducing the requirements on device power consumption and memory buffer. The classification parameters evaluated in this study were the sampling frequency of the acceleration signal, the window length of the classification fragment, and the number of classification features, found with different feature selection methods. For parameter evaluation, a decision tree classifier was created based on the acceleration signals recorded during tests in which 25 healthy test subjects performed various physical activities. The overall average F1-score achieved in this study was about 0.90. Similar F1-scores were achieved with the evaluated window lengths of 5 s (0.92 ± 0.02) and 3 s (0.91 ± 0.02), while classification performance with a 1 s window was lower (0.87 ± 0.02). The tested sampling frequencies of 50 Hz, 25 Hz, and 13 Hz gave similar results for most classified activity types, with the exception of outdoor cycling, where differences were significant. Forward sequential feature selection reduced the number of features from an initial 110 to about 12 without lowering the classification performance. The results of this study have been used for developing more efficient real-time physical activity classifiers.
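
Forward sequential feature selection, as mentioned above, can be sketched as a greedy loop. The sketch below is illustrative only: the feature names and the toy scoring function are hypothetical stand-ins for a cross-validated F1-score, and the stopping rule (stop when no candidate improves the score) is one common choice rather than the one confirmed by the study.

```python
def forward_select(candidates, score, max_features=None):
    """Greedy forward selection: repeatedly add the feature that most improves score.

    candidates: list of feature names
    score: callable taking a list of feature names and returning a number (higher is better)
    """
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining and (max_features is None or len(selected) < max_features):
        # Score each remaining candidate when added to the current subset.
        trial_scores = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trial_scores, key=lambda pair: pair[0])
        if top_score <= best:
            break  # no candidate improves the score, so stop
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


# Toy scorer (hypothetical): each feature has a fixed usefulness and every
# added feature costs a small penalty. A real scorer would train and validate a model.
usefulness = {"mean_x": 0.30, "std_y": 0.25, "energy_z": 0.20, "noise_a": 0.005, "noise_b": 0.0}


def toy_score(features):
    return sum(usefulness[f] for f in features) - 0.01 * len(features)


subset, value = forward_select(list(usefulness), toy_score)
print(subset, round(value, 3))
```

With this toy scorer the loop keeps the three informative features and stops before adding either noise feature, mirroring how a large feature set can shrink without losing performance.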


Author(s):  
Chuyuan Wang ◽  
Linxuan Zhang ◽  
Chongdang Liu

To deal with a dynamic production environment in which processing times fluctuate frequently, a robotic cell needs an efficient scheduling strategy that meets real-time requirements. This paper proposes an adaptive scheduling method based on a pattern classification algorithm to guide the online scheduling process. The method obtains the scheduling knowledge of the manufacturing system from production data and establishes an adaptive scheduler, which can adjust the scheduling rules according to the current production status. The main difficulty in establishing the scheduler is choosing the essential attributes. To address the low performance and low efficiency of embedded feature selection, the adaptive scheduler is built with an Extreme Gradient Boosting (XGBoost) model, and an improved hybrid optimization algorithm that integrates the Gini impurity of the XGBoost model into Particle Swarm Optimization (PSO) is employed to find the optimal subset of features. Results on a simulated robotic cell system show that the proposed PSO-XGBoost algorithm outperforms existing pattern classification algorithms and that the newly learned adaptive model improves on the basic dispatching rules. At the same time, it meets the demands of real-time scheduling.


Protein-protein interactions (PPIs) play a significant role in biological functions such as cell metabolism, immune response, and signal transduction. Hot spots are a small fraction of interface residues that provide substantial binding energy in PPIs. Identifying hot spots is therefore important for discovering and analyzing molecular medicines and diseases. The current strategy, alanine scanning, is not suitable for large-scale applications because the technique is costly and tedious. Existing computational methods show poor classification performance and prediction accuracy, and they are concerned with the topological structure and gene expression of hub proteins. The proposed system focuses on hot spots of hub proteins, eliminating redundant and highly correlated features using the Pearson correlation coefficient and Support Vector Machine based feature elimination. Extreme Gradient Boosting and LightGBM algorithms are used to combine a set of weak classifiers into a strong classifier. The proposed system shows better accuracy than the existing computational methods. The model can also be used to predict accurate molecular inhibitors for specific PPIs.
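
As a rough illustration of removing highly correlated features with the Pearson correlation coefficient, consider the sketch below. The 0.9 threshold, the keep-first ordering rule, and the feature columns are assumptions for the example; the abstract does not state these details, and the SVM-based elimination step is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation; treated as 0 here
    return cov / sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.9):
    """Keep features in order, skipping any whose |r| with an already kept feature exceeds threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Made-up feature columns for illustration only.
columns = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.1, 4.0, 6.2, 7.9, 10.0],  # nearly a multiple of f1
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(drop_correlated(columns))
```

Here `f2` is dropped because it is almost perfectly correlated with `f1`, while `f3` is kept.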


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Tianjun Li ◽  
Long Chen ◽  
Min Gan

Background: Mass spectra are usually acquired from Liquid Chromatography-Mass Spectrometry (LC-MS) analysis in isotope-labeled proteomics experiments. In such experiments, the mass profiles of labeled (heavy) and unlabeled (light) peptide pairs are represented by isotope clusters (2D or 3D) that provide valuable information about the studied biological samples in different conditions. The core task of quality control in a quantitative LC-MS experiment is to filter out low-quality peptides with questionable profiles. Classification approaches are commonly used for this problem. However, the data imbalance problem is often ignored or mishandled in previous quality control methods. In this study, we introduce a quality control framework based on the extreme gradient boosting machine (XGBoost) and carefully address the imbalanced data problem within it. Results: In the XGBoost-based framework, we suggest applying the synthetic minority over-sampling technique (SMOTE) to re-balance the data and using the balanced data to train the boosted trees as the classifier. The classifier is then applied to other data for peptide quality assessment. Experimental results show that our proposed framework significantly increases the reliability of peptide heavy-light ratio estimation. Conclusions: Our results indicate that this framework is a powerful method for peptide quality assessment. For feature extraction, the extracted ion chromatogram (XIC) based features contribute to the peptide quality assessment. For the imbalanced data problem, SMOTE brings much better classification performance. Finally, XGBoost is capable of peptide quality control. Overall, our proposed framework provides reliable results for further proteomics studies.
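
The core idea of SMOTE, creating synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours, can be sketched as below. This is a simplified variant for illustration, not the reference algorithm or the authors' code: the base sample is chosen at random rather than by iterating over every minority sample, and the data, `k`, and seed are made up.

```python
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by interpolating between minority samples and their neighbours."""
    rng = random.Random(seed)
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding the base itself).
        others = [m for m in minority if m is not base]
        neighbours = sorted(others, key=lambda m: sq_dist(base, m))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy imbalanced data: 8 majority samples and 3 minority samples.
majority = [(float(i), float(i % 3)) for i in range(8)]
minority = [(10.0, 10.0), (11.0, 10.5), (10.5, 11.5)]

new_points = smote(minority, n_synthetic=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

The re-balanced set would then be used to train the classifier; only the training data should be oversampled, so that evaluation data still reflects the original class distribution.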


2021 ◽  
Vol 9 ◽  
Author(s):  
Apeksha Shah ◽  
Swati Ahirrao ◽  
Sharnil Pandya ◽  
Ketan Kotecha ◽  
Suresh Rathod

Cardiovascular disease (CVD) is considered one of the most prevalent diseases in the world today. Predicting CVDs, such as cardiac arrest, is a difficult task in healthcare. The healthcare industry has a vast collection of datasets for analysis and prediction purposes. However, predictions made on these publicly available datasets may be erroneous. To make predictions accurate, real-time data need to be collected. This study collected real-time data using sensors and stored it on a cloud computing platform, such as Google Firebase. The acquired data is then classified using six machine-learning algorithms: Artificial Neural Network (ANN), Random Forest Classifier (RFC), Gradient Boost Extreme Gradient Boosting (XGBoost) classifier, Support Vector Machine (SVM), Naïve Bayes (NB), and Decision Tree (DT). Furthermore, we present two novel approaches in this study: gender-based risk classification and age-wise risk classification. These approaches use Kaplan-Meier and Cox regression survival analysis methodologies for risk detection and classification. They also assist health experts in identifying the risk probability and predicting the 10-year risk score. The proposed system is an economical alternative to the existing system due to its low cost. The results show an enhanced level of performance, with an overall accuracy of 98% using DT on our collected dataset for cardiac risk prediction. We also introduce two risk classification models, by gender and by age, to estimate survival probability. The outcome of the proposed model shows accurate probability in both classes.
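
For readers unfamiliar with the Kaplan-Meier method named above, a minimal sketch of the estimator follows. It assumes the standard product-limit definition and uses made-up follow-up times and event flags; it is not the study's data or code, and the Cox regression step is not shown.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with at least one observed event.

    durations: follow-up time per subject
    events: 1 if the event was observed at that time, 0 if the subject was censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        observed = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
    return curve


# Made-up follow-up times (arbitrary units) and event indicators.
durations = [2, 3, 3, 5, 8, 8]
events = [1, 1, 0, 1, 0, 1]

for time, prob in kaplan_meier(durations, events):
    print(time, round(prob, 3))
```

Computing one such curve per group (for example by gender or age band) is the usual way to compare survival probability between classes.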


2020 ◽  
Vol 10 (19) ◽  
pp. 6681 ◽  
Author(s):  
Zhizhen Liu ◽  
Hong Chen ◽  
Xiaoke Sun ◽  
Hengrui Chen

The development of the intelligent transport system has created conditions for solving the supply–demand imbalance of public transportation services. For example, forecasting the demand for online taxi-hailing could help to rebalance taxi resources. In this research, we introduce a method to forecast real-time online taxi-hailing demand. First, we analyze the relation between taxi demand and online taxi-hailing demand. Next, we propose six models containing different information, based on a backpropagation neural network (BPNN) and extreme gradient boosting (XGB), to forecast online taxi-hailing demand. Finally, we present a real-time online taxi-hailing demand forecasting model that considers the projected taxi demand (“PTX”). The results indicate that including more information leads to better prediction performance: including the projected taxi demand reduces MAPE from 0.190 to 0.183 and RMSE from 23.921 to 21.050, and increases R² from 0.845 to 0.853. The analysis indicates the demand regularity of online taxi-hailing and taxis, and the experiment achieves real-time prediction of online taxi-hailing demand by considering the projected taxi demand. The proposed method can help to schedule online taxi-hailing resources in advance.
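
The three error measures reported above can be computed as in the sketch below. It assumes the standard definitions, with MAPE expressed as a fraction (which matches the magnitude of the reported 0.190); the demand values are invented for illustration.

```python
from math import sqrt


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.19 means 19%)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented demand counts per time slot, for illustration only.
actual = [100.0, 120.0, 80.0, 150.0]
predicted = [110.0, 115.0, 90.0, 140.0]

print(round(mape(actual, predicted), 4))
print(round(rmse(actual, predicted), 4))
print(round(r_squared(actual, predicted), 4))
```

Lower MAPE and RMSE and a higher R² all indicate a better fit, which is the direction of change reported when the projected taxi demand is included.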


2021 ◽  
Vol 2138 (1) ◽  
pp. 012009
Author(s):  
Huimin Zhang ◽  
Renshuang Ding ◽  
Qi Zhang ◽  
Mingxing Fang ◽  
Guanghua Zhang ◽  
...  

Given the subjectivity and lack of real-time capability of disease scoring systems and invasive parameters in evaluating the development of acute respiratory distress syndrome (ARDS), this paper proposes an ARDS severity recognition model based on extreme gradient boosting (XGBoost) that uses noninvasive parameters. First, the physiological parameters of patients were extracted from the MIMIC-III database for statistical analysis, and outliers and unbalanced samples were handled with the interquartile range and the synthetic minority oversampling technique. Then, the Pearson correlation coefficient and random forest were used as a hybrid feature selection method to score the noninvasive parameters comprehensively, yielding the essential parameters for identifying the disease. Finally, XGBoost was combined with grid search cross-validation to determine the best hyper-parameters of the model and achieve accurate classification of disease severity. The experimental results show that the model’s area under the curve (AUC) is as high as 0.98 and its accuracy is 0.90; the total score of blood oxygen saturation (SpO2) is 0.625, so it could be used as an essential parameter for evaluating the severity of ARDS. Compared with traditional methods, this model has clear advantages in real-time capability and accuracy and could provide more accurate diagnosis and treatment suggestions for medical staff.
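
Interquartile-range outlier handling, as mentioned above, is often implemented along the following lines. This is a sketch under assumptions: the conventional 1.5 × IQR fences and the default quartile method of Python's `statistics.quantiles` are used, the readings are invented, and the paper may define quartiles or handle outliers differently (for example by capping rather than removing).

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences at k times the IQR beyond the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Drop values that fall outside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if lower <= v <= upper]


# Invented SpO2-like readings (%), with one implausible value.
readings = [94, 95, 96, 97, 98, 99, 60]

print(iqr_bounds(readings))
print(remove_outliers(readings))
```

In this example the fences fall at 88 and 104, so the reading of 60 is discarded and the other six are kept.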


2020 ◽  
Author(s):  
Ravindra Kumar Singh ◽  
Harsh Kumar Verma

The extensive use of social media polarity analysis calls for real-time analytics and runtime outcomes on dashboards. In data analytics, only 30% of the time is consumed in the modeling and evaluation stages, and 70% is consumed in data engineering tasks. Many machine learning algorithms achieve desirable prediction outcomes, but they do not handle data and its transformation (the so-called data engineering tasks), and reducing the time these tasks take remains challenging. This paper addresses these challenges by presenting a parallel, scalable, effective, responsive, and fault-tolerant framework that performs end-to-end data analytics tasks in both real-time and batch-processing modes. An experimental analysis on Twitter posts supports these claims and demonstrates the benefits of parallelism in the data processing units. The research highlights the importance of processing the URLs mentioned and the images embedded in posts, along with the post content, to boost prediction efficiency. Furthermore, it compares naive Bayes, support vector machines, extreme gradient boosting, and long short-term memory (LSTM) machine learning techniques for sentiment analysis on Twitter posts and concludes that LSTM is the most effective technique in this regard.

