Physical Activity Classification with Wearable Sensor Data in an eCoach Recommendation System (Preprint)

2021 ◽  
Author(s):  
Ayan Chatterjee

Leading a sedentary lifestyle may cause numerous health problems. Therefore, changing a sedentary lifestyle should be given priority to avoid severe damage. Research in eHealth can provide methods to enrich personal healthcare with Information and Communication Technologies (ICTs). An eCoach system may allow people to manage a healthy lifestyle with health state monitoring and personalized recommendations. Using machine learning (ML) techniques, this study investigated the possibility of classifying daily physical activity for adults into the following classes: sedentary, low active, active, highly active, and rigorous active. The daily total step count and the total daily minutes of sedentary time, low physical activity (LPA), medium physical activity (MPA), and vigorous physical activity (VPA) served as input for the classification models. We first used publicly available Fitbit data to build the classification models. Second, using the transfer learning approach, we re-used the five best-performing models on a real dataset collected from the MOX2-5 wearable medical-grade activity sensor. For the public Fitbit datasets, we found that the ensemble ExtraTreesClassifier with an estimator value of 150 outperformed other classifiers with a mean accuracy score of 99.72% for a single feature, and the support vector classifier (SVC) with a “linear” kernel outperformed other classifiers with a mean accuracy score of 99.14% for five features. To demonstrate the practical usefulness of the classifiers, we conceptualized how the classifier model can be used in an eCoach prototype system to attain personalized activity goals (e.g., stay active for the entire week). After transfer learning, K-Nearest-Neighbor (KNN) outperformed the other four classifiers for a single feature, and SVC with a “linear” kernel outperformed the other four classifiers for multiple features.
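The activity goal mentioned in this abstract (for example, staying active for the entire week) could be checked against predicted daily classes roughly as follows. This is a minimal sketch, not the authors' eCoach logic: the class order is taken from the abstract, while the function name `weekly_goal_met`, the default goal level, and the rule that every day must reach at least the goal class are assumptions made for illustration.

```python
# Activity classes in ascending order, as listed in the abstract.
ACTIVITY_LEVELS = ["sedentary", "low active", "active", "highly active", "rigorous active"]
RANK = {name: i for i, name in enumerate(ACTIVITY_LEVELS)}


def weekly_goal_met(daily_classes, goal="active"):
    """Return True if all seven predicted daily classes reach at least `goal`.

    Assumption: the goal counts as met only when every day of the week is
    classified at or above the goal level; the source does not state this rule.
    """
    if len(daily_classes) != 7:
        raise ValueError("expected one predicted class per day of the week")
    return all(RANK[c] >= RANK[goal] for c in daily_classes)


# Hypothetical week of classifier outputs.
week = ["active", "highly active", "active", "active",
        "rigorous active", "active", "active"]
print(weekly_goal_met(week))
```

A single day classified below the goal level makes the check fail, which is one simple way to trigger a recommendation.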

Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1668
Author(s):  
Zongming Dai ◽  
Kai Hu ◽  
Jie Xie ◽  
Shengyu Shen ◽  
Jie Zheng ◽  
...  

Traditional co-word networks do not discriminate keywords of researcher interest from general keywords. Co-word networks are therefore often too general to provide knowledge of interest to domain experts. Inspired by recent work that uses an automatic method to identify questions of interest to researchers, such as “problems” and “solutions”, we try to answer a similar question, “what sensors can be used for what kind of applications”, which is of great interest in sensor-related fields. By generalizing the specific questions as “questions of interest”, we built a knowledge network that considers researcher interest, called the bipartite network of interest (BNOI). Unlike co-word approaches, which use accurate keywords from a list, BNOI uses classification models to find possible entities of interest. A total of nine feature extraction methods, including N-grams, Word2Vec, BERT, etc., were used to extract features to train the classification models, including naïve Bayes (NB), support vector machines (SVM) and logistic regression (LR). In addition, a multi-feature fusion strategy and a voting principle (VP) method were applied to combine the capabilities of the features and the classification models. Using the abstract text data of 350 remote sensing articles, features were extracted and the models trained. The experimental results show that, after removing the biased words and using ten-fold cross-validation, the F-measures for “sensors” and “applications” are 93.2% and 85.5%, respectively. It is thus demonstrated that researcher questions of interest can be better answered by the BNOI constructed from classification results than by the traditional co-word network approach.
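The abstract does not say how its voting principle combines the classifiers. One common choice is a simple majority vote over per-model predictions, sketched below. The function name `majority_vote` and the example labels are hypothetical, and majority voting is an assumed stand-in for the authors' VP method.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists (all the same length) by majority vote.

    Ties are resolved in favour of the label seen first among the models,
    which is how Counter.most_common orders equal counts.
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical predictions from three models for three candidate terms.
nb = ["sensor", "other", "application"]
svm = ["sensor", "sensor", "application"]
lr = ["other", "sensor", "application"]

combined = majority_vote([nb, svm, lr])
print(combined)
```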


Author(s):  
Daniel Febrian Sengkey ◽  
Agustinus Jacobus ◽  
Fabian Johanes Manoppo

Support vector machine (SVM) is a known method for supervised learning in sentiment analysis, and there are many studies about the use of SVM in classifying the sentiments in lecturer evaluation. SVM has various parameters that can be tuned and kernels that can be chosen to improve the classifier accuracy. However, not all options have been explored. Therefore, in this study we compared four SVM kernels (radial, linear, polynomial, and sigmoid) to discover how each kernel influences the accuracy of the classifier. To make a proper assessment, we used our labeled dataset of students’ evaluations of the lecturer. The dataset was split into one part for training the classifier and another for testing the model. In addition, we used several different training:testing ratios. The split ratios ranged from 0.5 to 0.95 in increments of 0.05. Because the dataset was split randomly, the splitting, training and testing process was repeated 1,000 times for each kernel and split ratio. Therefore, at the end of the experiment, we obtained 40,000 accuracy values. We then applied statistical methods to see whether the differences are significant. Based on the statistical test, we found that in this particular case the linear kernel has significantly higher accuracy than the other kernels. However, there is a tradeoff: the results become more varied as a higher proportion of the data is used for training.
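The size of the experiment follows directly from the design: 4 kernels, 10 split ratios and 1,000 repetitions. The sketch below enumerates that grid and shows one way to draw a random split. The helper `random_split`, the seed and the reading of the split ratio as the training fraction are assumptions for illustration; the classifier training itself is omitted.

```python
import random
from itertools import product

KERNELS = ["radial", "linear", "polynomial", "sigmoid"]
# 0.50, 0.55, ..., 0.95; built from integers to avoid floating-point drift.
SPLIT_RATIOS = [pct / 100 for pct in range(50, 100, 5)]
REPETITIONS = 1000


def random_split(items, train_fraction, rng):
    """Shuffle a copy of `items` and cut it into training and testing parts."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# One accuracy value would be recorded per (kernel, ratio, repetition) triple.
total_runs = sum(1 for _ in product(KERNELS, SPLIT_RATIOS, range(REPETITIONS)))
print(total_runs)  # 4 kernels x 10 ratios x 1,000 repetitions = 40000

rng = random.Random(0)  # fixed seed only so the illustration is repeatable
train, test = random_split(range(200), 0.8, rng)
print(len(train), len(test))
```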


2021 ◽  
Author(s):  
Clemens Heistracher ◽  
Anahid Jalali ◽  
Indu Strobl ◽  
Axel Suendermann ◽  
Sebastian Meixner ◽  
...  

An increasing number of industrial assets are equipped with IoT sensor platforms and the industry now expects data-driven maintenance strategies with minimal deployment costs. However, gathering labeled training data for supervised tasks such as anomaly detection is costly and often difficult to implement in operational environments. Therefore, this work aims to design and implement a solution that reduces the required amount of data for training anomaly classification models on time series sensor data and thereby brings down the overall deployment effort of IoT anomaly detection sensors. We set up several in-lab experiments using three peristaltic pumps and investigated approaches for transferring trained anomaly detection models across assets of the same type. Our experiments achieved promising effectiveness and provide initial evidence that transfer learning could be a suitable strategy for using pretrained anomaly classification models across industrial assets of the same type with minimal prior labeling and training effort. This work could serve as a starting point for more general, pretrained sensor data embeddings, applicable to a wide range of assets.


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1692 ◽  
Author(s):  
Iván Silva ◽  
José Eugenio Naranjo

Identifying driving styles using classification models with in-vehicle data can provide automated feedback to drivers on their driving behavior, particularly if they are driving safely. Although several classification models have been developed for this purpose, there is no consensus on which classifier performs better at identifying driving styles. Therefore, more research is needed to evaluate classification models by comparing performance metrics. In this paper, a data-driven machine-learning methodology for classifying driving styles is introduced. This methodology is grounded in well-established machine-learning (ML) methods and literature related to driving-styles research. The methodology is illustrated through a study involving data collected from 50 drivers from two different cities in a naturalistic setting. Five features were extracted from the raw data. Fifteen experts were involved in the data labeling to derive the ground truth of the dataset. The dataset fed five different models (Support Vector Machines (SVM), Artificial Neural Networks (ANN), fuzzy logic, k-Nearest Neighbor (kNN), and Random Forests (RF)). These models were evaluated in terms of a set of performance metrics and statistical tests. The experimental results from performance metrics showed that SVM outperformed the other four models, achieving an average accuracy of 0.96, F1-Score of 0.9595, Area Under the Curve (AUC) of 0.9730, and Kappa of 0.9375. In addition, Wilcoxon tests indicated that ANN predicts differently to the other four models. These promising results demonstrate that the proposed methodology may support researchers in making informed decisions about which ML model performs better for driving-styles classification.
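Two of the reported metrics, accuracy and Cohen's kappa, can be computed from true and predicted labels as in the sketch below. The function names and the toy driving-style labels are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e).

    Undefined (division by zero) when chance agreement p_e equals 1,
    i.e. when both label lists contain a single identical class.
    """
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical labels for four trips.
y_true = ["calm", "calm", "aggressive", "aggressive"]
y_pred = ["calm", "calm", "aggressive", "calm"]
print(accuracy(y_true, y_pred), cohen_kappa(y_true, y_pred))
```

Kappa is lower than accuracy here because part of the raw agreement would be expected by chance given the label frequencies.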


Sensors ◽  
2019 ◽  
Vol 19 (20) ◽  
pp. 4509 ◽  
Author(s):  
Alok Kumar Chowdhury ◽  
Dian Tjondronegoro ◽  
Vinod Chandran ◽  
Jinglan Zhang ◽  
Stewart G. Trost

This study examined the feasibility of a non-laboratory approach that uses machine learning on multimodal sensor data to predict relative physical activity (PA) intensity. A total of 22 participants completed up to 7 PA sessions, where each session comprised 5 trials (sitting and standing, comfortable walk, brisk walk, jogging, running). Participants wore a wrist-strapped sensor that recorded heart-rate (HR), electrodermal activity (Eda) and skin temperature (Temp). After each trial, participants provided ratings of perceived exertion (RPE). Three classifiers, including random forest (RF), neural network (NN) and support vector machine (SVM), were applied independently on each feature set to predict relative PA intensity as low (RPE ≤ 11), moderate (RPE 12–14), or high (RPE ≥ 15). Then, both feature fusion and decision fusion of all combinations of sensor modalities were carried out to investigate the best combination. Among the single modality feature sets, HR provided the best performance. The combination of modalities using feature fusion provided a small improvement in performance. Decision fusion did not improve performance over HR features alone. A machine learning approach using features from HR provided acceptable predictions of relative PA intensity. Adding features from other sensing modalities did not significantly improve performance.
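The three target classes are defined by RPE cut-offs given in the abstract, which can be written as a small mapping function. The function name `rpe_to_intensity` is hypothetical; the thresholds are those stated above.

```python
def rpe_to_intensity(rpe):
    """Map a rating of perceived exertion to a relative PA intensity class.

    Thresholds from the abstract: low (RPE <= 11), moderate (RPE 12-14),
    high (RPE >= 15).
    """
    if rpe <= 11:
        return "low"
    if rpe <= 14:
        return "moderate"
    return "high"


# Hypothetical ratings collected after five trials.
ratings = [7, 11, 12, 14, 15]
print([rpe_to_intensity(r) for r in ratings])
```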


2022 ◽  
Vol 2022 ◽  
pp. 1-10
Author(s):  
Chunhua Zhao ◽  
Zhangwen Lin ◽  
Jinling Tan ◽  
Hengxing Hu ◽  
Qian Li

To address the difficulty of acquiring wear particle data for large-modulus gear teeth and the scarcity of training data, an integrated model, LCNNE, based on transfer learning is proposed in this paper. First, the wear particles are diagnosed and classified by combining a new joint loss function with two pretrained models, VGG19 and GoogLeNet. Subsequently, the wear particles in gearbox lubricating oil are chosen as the experimental object for comparison. Comparison with the experimental results of four other models verifies the model's superiority in wear particle identification and classification. Taking the five models as feature extractors and support vector machines as classifiers, the experimental results and comparative analysis reveal that the LCNNE model is better than the other four models because its feature expression ability is stronger.


Foods ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 861
Author(s):  
Lemonia-Christina Fengou ◽  
Alexandra Lianou ◽  
Panagiotis Tsakanikas ◽  
Fady Mohareb ◽  
George-John E. Nychas

Minced meat is a food commodity vulnerable to adulteration because species- and/or tissue-specific morphological characteristics cannot be easily identified. Hence, the economically motivated adulteration of minced meat is rather likely to be practiced. The objective of this work was to assess the potential of spectroscopy-based sensors in detecting fraudulent minced meat substitution, specifically of (i) beef with bovine offal and (ii) pork with chicken (and vice versa), both in fresh and frozen-thawed samples. For each case, meat pieces were minced and mixed so that different levels of adulteration with a 25% increment were achieved, while two categories of pure meat were also considered. From each level of adulteration, six different samples were prepared. In total, 120 samples were subjected to visible (Vis) and fluorescence (Fluo) spectra and multispectral image (MSI) acquisition. Support Vector Machine classification models were developed and evaluated. The MSI-based models outperformed the ones based on the other sensors, with accuracy scores varying from 87% to 100%. The Vis-based models followed in terms of accuracy, with attained scores varying from 57% to 97%, while the lowest performance was demonstrated by the Fluo-based models. Overall, spectroscopic data hold considerable potential for the detection and quantification of minced meat adulteration, which, however, appears to be sensor-specific.
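One reading of the sampling design that is consistent with the reported total of 120 samples is two substitution cases, two storage states, five adulteration levels (0% to 100% in 25% steps, the two ends being the pure meats) and six samples per level. The sketch below enumerates that design. The breakdown is an interpretation of the abstract, not a statement of the authors' protocol, and all names are hypothetical.

```python
from itertools import product

CASES = ["beef + bovine offal", "pork + chicken"]
STATES = ["fresh", "frozen-thawed"]
LEVELS = list(range(0, 101, 25))  # 0, 25, 50, 75, 100 percent adulterant
SAMPLES_PER_LEVEL = 6

# One tuple per prepared sample: (case, state, adulteration level, replicate).
design = [
    (case, state, level, replicate)
    for case, state, level in product(CASES, STATES, LEVELS)
    for replicate in range(1, SAMPLES_PER_LEVEL + 1)
]
print(len(design))  # 2 cases x 2 states x 5 levels x 6 samples = 120
```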


Sensors ◽  
2020 ◽  
Vol 20 (14) ◽  
pp. 3863 ◽  
Author(s):  
Christian Post ◽  
Christian Rietz ◽  
Wolfgang Büscher ◽  
Ute Müller

The aim of this study was to develop classification models for mastitis and lameness treatments in Holstein dairy cows as the target variables, based on continuous data from herd management software and using modern machine learning methods. Data were collected over a period of 40 months from a total of 167 different cows, with daily individual sensor information containing milking parameters, pedometer activity, feed and water intake, and body weight (in the form of differently aggregated data) as well as the entered treatment data. To identify the most important predictors for mastitis and lameness treatments, respectively, Random Forest feature importance, Pearson’s correlation and sequential forward feature selection were applied. With the selected predictors, various machine learning models such as Logistic Regression (LR), Support Vector Machine (SVM), K-nearest neighbors (KNN), Gaussian Naïve Bayes (GNB), Extra Trees Classifier (ET) and different ensemble methods such as Random Forest (RF) were trained. Their performance was compared using the receiver operator characteristic (ROC) area-under-curve (AUC), as well as sensitivity, block sensitivity and specificity. In addition, sampling methods were compared: over- and undersampling as compensation for the expected unbalanced training data had a high impact on the ratio of sensitivity and specificity in the classification of the test data, but with regard to AUC, random oversampling and SMOTE (Synthetic Minority Over-sampling Technique) even showed significantly lower values than non-sampled data. The best model, ET, obtained a mean AUC of 0.79 for mastitis and 0.71 for lameness, based on testing data from practical conditions, and is recommended by us for this type of data, although GNB, LR and RF were only marginally worse.
We recommend the use of these models as a benchmark for similar self-learning classification tasks. The classification models presented here retain their interpretability with the ability to present feature importances to the farmer in contrast to the “black box” models of Deep Learning methods.
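Random oversampling, one of the sampling methods compared in this study, duplicates minority-class examples until the classes are balanced. The sketch below is a plain-Python illustration with a hypothetical function name and toy data. It is not the implementation used by the authors, and it would normally be applied to the training portion only.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes
    have as many examples as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    target = max(len(xs) for xs in by_class.values())

    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_samples.extend(xs + extra)
        out_labels.extend([y] * target)
    return out_samples, out_labels


# Hypothetical imbalanced training set: 8 untreated cow-days, 2 treated.
X = list(range(10))
y = ["untreated"] * 8 + ["treated"] * 2
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Because the added rows are copies, the balanced set contains no new information, which fits the abstract's observation that oversampling shifted the sensitivity/specificity ratio without improving AUC.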


1994 ◽  
Vol 72 (01) ◽  
pp. 058-064 ◽  
Author(s):  
Goya Wannamethee ◽  
A Gerald Shaper

Summary: The relationship between haematocrit and cardiovascular risk factors, particularly blood pressure and blood lipids, has been examined in detail in a large prospective study of 7735 middle-aged men drawn from general practices in 24 British towns. The analyses are restricted to the 5494 men free of any evidence of ischaemic heart disease at screening. Smoking, body mass index, physical activity, alcohol intake and lung function (FEV1) were factors strongly associated with haematocrit levels independent of each other. Age showed a significant but small independent association with haematocrit. Non-manual workers had slightly higher haematocrit levels than manual workers; this difference increased considerably and became significant after adjustment for the other risk factors. Diabetics showed significantly lower levels of haematocrit than non-diabetics. In the univariate analysis, haematocrit was significantly associated with total serum protein (r = 0.18), cholesterol (r = 0.16), triglyceride (r = 0.15), diastolic blood pressure (r = 0.17) and heart rate (r = 0.14); all at p < 0.0001. A weaker but significant association was seen with systolic blood pressure (r = 0.09, p < 0.001). These relationships remained significant even after adjustment for age, smoking, body mass index, physical activity, alcohol intake, lung function, presence of diabetes, social class and for each of the other biological variables; the relationship with systolic blood pressure was considerably weakened. No association was seen with blood glucose and HDL-cholesterol. This study has shown significant associations between several lifestyle characteristics and the haematocrit and supports the findings of a significant relationship between the haematocrit and blood lipids and blood pressure.
It emphasises the role of the haematocrit in assessing the risk of ischaemic heart disease and stroke in individuals, and the need to take haematocrit levels into account in determining the importance of other cardiovascular risk factors.

