Physical Activity Classification with Wearable Sensor Data in an eCoach Recommendation System (Preprint)

Bipartite Network of Interest (BNOI): Extending Co-Word Network with Interest of Researchers Using Sensor Data and Corresponding Applications as an Example

Sensors ◽

10.3390/s21051668 ◽

2021 ◽

Vol 21 (5) ◽

pp. 1668

Author(s):

Zongming Dai ◽

Kai Hu ◽

Jie Xie ◽

Shengyu Shen ◽

Jie Zheng ◽

...

Keyword(s):

Feature Fusion ◽

Extraction Methods ◽

Knowledge Network ◽

Sensor Data ◽

Support Vector ◽

Bipartite Network ◽

Classification Models ◽

Text Data ◽

Domain Experts ◽

Problems And Solutions

Traditional co-word networks do not discriminate keywords of researcher interest from general keywords. Co-word networks are therefore often too general to provide knowledge of interest to domain experts. Inspired by recent work that uses an automatic method to identify questions of interest to researchers, such as “problems” and “solutions”, we try to answer a similar question, “what sensors can be used for what kind of applications”, which is of great interest in sensor-related fields. By generalizing the specific questions as “questions of interest”, we built a knowledge network considering researcher interest, called bipartite network of interest (BNOI). Different from co-word approaches using accurate keywords from a list, BNOI uses classification models to find possible entities of interest. A total of nine feature extraction methods, including N-grams, Word2Vec, BERT, etc., were used to extract features to train the classification models, including naïve Bayes (NB), support vector machines (SVM) and logistic regression (LR). In addition, a multi-feature fusion strategy and a voting principle (VP) method are applied to combine the capabilities of the features and the classification models. Using the abstract text data of 350 remote sensing articles, features are extracted and the models trained. The experimental results show that after removing the biased words and using the ten-fold cross-validation method, the F-measures of “sensors” and “applications” are 93.2% and 85.5%, respectively. It is thus demonstrated that researcher questions of interest can be better answered by the constructed BNOI based on classification results, compared with the traditional co-word network approach.
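
The ten-fold cross-validation and F-measure mentioned in this abstract can be illustrated with a minimal sketch. The function names `k_fold_indices` and `f_measure` are hypothetical, the folds are taken in order without shuffling, and the counts in the example are made up; none of this is the authors' code.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, unshuffled folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def f_measure(tp, fp, fn):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# 350 abstracts split into ten folds gives 35 test items per fold.
folds = list(k_fold_indices(350, 10))
test_sizes = [len(test) for _, test in folds]

# Illustrative counts only: 8 true positives, 2 false positives, 2 false negatives.
example_f1 = f_measure(tp=8, fp=2, fn=2)
```

In practice the samples would usually be shuffled (or stratified by class) before the folds are cut, and the F-measure would be averaged over the ten test folds.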

Effects of kernels and the proportion of training data on the accuracy of SVM sentiment analysis in lecturer evaluation

IAES International Journal of Artificial Intelligence (IJ-AI) ◽

10.11591/ijai.v9.i4.pp734-743 ◽

2020 ◽

Vol 9 (4) ◽

pp. 734

Author(s):

Daniel Febrian Sengkey ◽

Agustinus Jacobus ◽

Fabian Johanes Manoppo

Keyword(s):

Support Vector Machine ◽

Sentiment Analysis ◽

Statistical Methods ◽

Statistical Test ◽

The Other ◽

Training Data ◽

Support Vector ◽

Linear Kernel ◽

Linear Polynomial ◽

Accuracy Data

Support vector machine (SVM) is a well-known method for supervised learning in sentiment analysis, and many studies have used SVM to classify sentiments in lecturer evaluation. SVM has various parameters that can be tuned and kernels that can be chosen to improve the classifier accuracy. However, not all options have been explored. Therefore, in this study we compared four SVM kernels (radial, linear, polynomial, and sigmoid) to discover how each kernel influences the accuracy of the classifier. To make a proper assessment, we used our labeled dataset of students’ evaluations of the lecturer. The dataset was split into two parts, one for training the classifier and another for testing the model. In addition, we used several different training:testing ratios. The split ratios range from 0.5 to 0.95 in increments of 0.05. Because the dataset was split randomly, the splitting-training-testing process was repeated 1,000 times for each kernel and splitting ratio. Therefore, at the end of the experiment, we obtained 40,000 accuracy values. We then applied statistical methods to see whether the differences are significant. Based on the statistical test, we found that in this particular case, the linear kernel has significantly higher accuracy than the other kernels. However, there is a tradeoff: the results become more varied as a higher proportion of the data is used for training.
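
The figure of 40,000 accuracy values follows from the experimental grid described in this abstract. The sketch below only reproduces that arithmetic; the variable names are illustrative, and treating each ratio as the training proportion is an assumption.

```python
from itertools import product

KERNELS = ["radial", "linear", "polynomial", "sigmoid"]

# Split ratios 0.50, 0.55, ..., 0.95, built from integers to avoid float drift.
# Assumption: each ratio is the share of the data used for training.
RATIOS = [pct / 100 for pct in range(50, 100, 5)]

REPETITIONS = 1000  # random split-train-test repetitions per (kernel, ratio) pair

# One accuracy value is recorded per repetition of each (kernel, ratio) pair.
grid = list(product(KERNELS, RATIOS))
total_runs = len(grid) * REPETITIONS

print(len(RATIOS), len(grid), total_runs)
```

Four kernels times ten ratios gives 40 configurations, and 1,000 repetitions of each gives the 40,000 values reported.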

Transfer Learning Strategies for Anomaly Detection in IoT Vibration Data

10.36227/techrxiv.16844521.v1 ◽

2021 ◽

Author(s):

Clemens Heistracher ◽

Anahid Jalali ◽

Indu Strobl ◽

Axel Suendermann ◽

Sebastian Meixner ◽

...

Keyword(s):

Anomaly Detection ◽

Learning Strategies ◽

Transfer Learning ◽

Training Data ◽

Sensor Data ◽

Classification Models ◽

Vibration Data ◽

Starting Point ◽

Wide Range ◽

Anomaly Classification

An increasing number of industrial assets are equipped with IoT sensor platforms and the industry now expects data-driven maintenance strategies with minimal deployment costs. However, gathering labeled training data for supervised tasks such as anomaly detection is costly and often difficult to implement in operational environments. Therefore, this work aims to design and implement a solution that reduces the required amount of data for training anomaly classification models on time series sensor data and thereby brings down the overall deployment effort of IoT anomaly detection sensors. We set up several in-lab experiments using three peristaltic pumps and investigated approaches for transferring trained anomaly detection models across assets of the same type. Our experiments achieved promising effectiveness and provide initial evidence that transfer learning could be a suitable strategy for using pretrained anomaly classification models across industrial assets of the same type with minimal prior labeling and training effort. This work could serve as a starting point for more general, pretrained sensor data embeddings, applicable to a wide range of assets.

A Systematic Methodology to Evaluate Prediction Models for Driving Style Classification

Sensors ◽

10.3390/s20061692 ◽

2020 ◽

Vol 20 (6) ◽

pp. 1692 ◽

Cited By ~ 6

Author(s):

Iván Silva ◽

José Eugenio Naranjo

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Performance Metrics ◽

Prediction Models ◽

Statistical Tests ◽

Area Under The Curve ◽

The Other ◽

Support Vector ◽

Classification Models ◽

K Nearest Neighbor

Identifying driving styles using classification models with in-vehicle data can provide automated feedback to drivers on their driving behavior, particularly on whether they are driving safely. Although several classification models have been developed for this purpose, there is no consensus on which classifier performs better at identifying driving styles. Therefore, more research is needed to evaluate classification models by comparing performance metrics. In this paper, a data-driven machine-learning methodology for classifying driving styles is introduced. This methodology is grounded in well-established machine-learning (ML) methods and literature related to driving-styles research. The methodology is illustrated through a study involving data collected from 50 drivers from two different cities in a naturalistic setting. Five features were extracted from the raw data. Fifteen experts were involved in the data labeling to derive the ground truth of the dataset. The dataset was fed to five different models (Support Vector Machines (SVM), Artificial Neural Networks (ANN), fuzzy logic, k-Nearest Neighbor (kNN), and Random Forests (RF)). These models were evaluated in terms of a set of performance metrics and statistical tests. The experimental results from performance metrics showed that SVM outperformed the other four models, achieving an average accuracy of 0.96, F1-Score of 0.9595, Area Under the Curve (AUC) of 0.9730, and Kappa of 0.9375. In addition, Wilcoxon tests indicated that ANN predicts differently from the other four models. These promising results demonstrate that the proposed methodology may support researchers in making informed decisions about which ML model performs better for driving-styles classification.
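
Two of the metrics listed in this abstract, accuracy and Cohen's kappa, can be computed from paired label lists as in the sketch below. The function names and the "calm"/"aggressive" labels are hypothetical examples, not the study's classes or data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical labels for four trips.
y_true = ["calm", "calm", "aggressive", "aggressive"]
y_pred = ["calm", "calm", "aggressive", "calm"]

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
```

Here three of four predictions agree (accuracy 0.75), chance agreement is 0.5, and kappa is therefore 0.5, which shows why kappa is usually lower than raw accuracy.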

Prediction of Relative Physical Activity Intensity Using Multimodal Sensing of Physiological Data

Sensors ◽

10.3390/s19204509 ◽

2019 ◽

Vol 19 (20) ◽

pp. 4509 ◽

Cited By ~ 2

Author(s):

Alok Kumar Chowdhury ◽

Dian Tjondronegoro ◽

Vinod Chandran ◽

Jinglan Zhang ◽

Stewart G. Trost

Keyword(s):

Physical Activity ◽

Machine Learning ◽

Perceived Exertion ◽

Feature Fusion ◽

Electrodermal Activity ◽

Decision Fusion ◽

Sensor Data ◽

Physiological Data ◽

Support Vector ◽

Improve Performance

This study examined the feasibility of a non-laboratory approach that uses machine learning on multimodal sensor data to predict relative physical activity (PA) intensity. A total of 22 participants completed up to 7 PA sessions, where each session comprised 5 trials (sitting and standing, comfortable walk, brisk walk, jogging, running). Participants wore a wrist-strapped sensor that recorded heart rate (HR), electrodermal activity (EDA) and skin temperature (Temp). After each trial, participants provided ratings of perceived exertion (RPE). Three classifiers, namely random forest (RF), neural network (NN) and support vector machine (SVM), were applied independently on each feature set to predict relative PA intensity as low (RPE ≤ 11), moderate (RPE 12–14), or high (RPE ≥ 15). Then, both feature fusion and decision fusion of all combinations of sensor modalities were carried out to investigate the best combination. Among the single-modality feature sets, HR provided the best performance. The combination of modalities using feature fusion provided a small improvement in performance. Decision fusion did not improve performance over HR features alone. A machine learning approach using features from HR provided acceptable predictions of relative PA intensity. Adding features from other sensing modalities did not significantly improve performance.
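
The RPE thresholds given in this abstract map directly onto three class labels, and decision fusion combines per-modality predictions into one. The sketch below assumes integer RPE ratings and uses a simple majority vote as the fusion rule; the abstract does not state which fusion rule the authors used, and the function names are hypothetical.

```python
from collections import Counter


def rpe_to_intensity(rpe):
    """Map a rating of perceived exertion to the three classes in the abstract."""
    if rpe <= 11:
        return "low"
    if rpe <= 14:
        return "moderate"
    return "high"


def majority_vote(predictions):
    """Decision fusion by majority vote (an assumed rule, not the paper's).

    Ties go to the label that appears first in the input list.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    return Counter(predictions).most_common(1)[0][0]


# Boundary ratings for the three classes.
labels = [rpe_to_intensity(r) for r in (11, 12, 14, 15)]

# Hypothetical per-modality predictions (HR, EDA, Temp) for one trial.
fused = majority_vote(["moderate", "low", "moderate"])
```

Feature fusion, by contrast, would concatenate the HR, EDA and Temp feature vectors before training a single classifier.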

A New Transfer Learning Ensemble Model with New Training Methods for Gear Wear Particle Recognition

Shock and Vibration ◽

10.1155/2022/3696091 ◽

2022 ◽

Vol 2022 ◽

pp. 1-10

Author(s):

Chunhua Zhao ◽

Zhangwen Lin ◽

Jinling Tan ◽

Hengxing Hu ◽

Qian Li

Keyword(s):

Transfer Learning ◽

Wear Particle ◽

Experimental Results ◽

The Other ◽

Particle Identification ◽

Lubricating Oil ◽

Support Vector ◽

Training Methods ◽

Wear Particles ◽

Feature Expression

To address the difficulty of acquiring wear particle data from large-modulus gear teeth and the scarcity of training datasets, this paper proposes an integrated model, LCNNE, based on transfer learning. First, the wear particles are diagnosed and classified by connecting a new joint loss function and two pretrained models, VGG19 and GoogLeNet. Subsequently, the wear particles in gearbox lubricating oil are chosen as the experimental object for comparison. Comparison with the experimental results of four other models verifies the model's superiority in wear particle identification and classification. Taking the five models as feature extractors and support vector machines as classifiers, the experimental results and comparative analysis reveal that the LCNNE model outperforms the other four models because its feature expression ability is stronger.

Detection of Meat Adulteration Using Spectroscopy-Based Sensors

Foods ◽

10.3390/foods10040861 ◽

2021 ◽

Vol 10 (4) ◽

pp. 861

Author(s):

Lemonia-Christina Fengou ◽

Alexandra Lianou ◽

Panagiotis Tsakanikas ◽

Fady Mohareb ◽

George-John E. Nychas

Keyword(s):

Spectroscopic Data ◽

Morphological Characteristics ◽

The Other ◽

Support Vector ◽

Classification Models ◽

Considerable Potential ◽

Minced Meat ◽

Food Commodity ◽

Different Levels ◽

Detection And Quantification

Minced meat is a food commodity vulnerable to adulteration because species- and/or tissue-specific morphological characteristics cannot be easily identified. Hence, economically motivated adulteration of minced meat is rather likely to be practiced. The objective of this work was to assess the potential of spectroscopy-based sensors in detecting fraudulent minced meat substitution, specifically of (i) beef with bovine offal and (ii) pork with chicken (and vice versa), both in fresh and frozen-thawed samples. For each case, meat pieces were minced and mixed so that different levels of adulteration with a 25% increment were achieved, while two categories of pure meat were also considered. From each level of adulteration, six different samples were prepared. In total, 120 samples were subjected to visible (Vis) and fluorescence (Fluo) spectra and multispectral image (MSI) acquisition. Support Vector Machine classification models were developed and evaluated. The MSI-based models outperformed the ones based on the other sensors, with accuracy scores varying from 87% to 100%. The Vis-based models followed in terms of accuracy, with attained scores varying from 57% to 97%, while the lowest performance was demonstrated by the Fluo-based models. Overall, spectroscopic data hold considerable potential for the detection and quantification of minced meat adulteration, which, however, appears to be sensor-specific.
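
The sample count in this abstract can be reproduced from the stated design. The breakdown below is one reading of the text, not a description confirmed by the authors: it assumes five levels per case (0%, 25%, 50%, 75%, 100%, where 0% and 100% are the two pure categories), six samples per level, two substitution cases, and two storage states.

```python
# Adulteration levels in 25% steps; 0 and 100 are read as the two pure-meat categories.
LEVELS_PERCENT = list(range(0, 101, 25))

SAMPLES_PER_LEVEL = 6
CASES = ["beef / bovine offal", "pork / chicken"]
STATES = ["fresh", "frozen-thawed"]

samples_per_case = len(LEVELS_PERCENT) * SAMPLES_PER_LEVEL
total_samples = samples_per_case * len(CASES) * len(STATES)

print(LEVELS_PERCENT, samples_per_case, total_samples)
```

Under these assumptions each case contributes 30 samples, and the four case-by-state combinations give the 120 samples reported.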

Using Sensor Data to Detect Lameness and Mastitis Treatment Events in Dairy Cows: A Comparison of Classification Models

Sensors ◽

10.3390/s20143863 ◽

2020 ◽

Vol 20 (14) ◽

pp. 3863 ◽

Cited By ~ 1

Author(s):

Christian Post ◽

Christian Rietz ◽

Wolfgang Büscher ◽

Ute Müller

Keyword(s):

Machine Learning ◽

Random Forest ◽

Dairy Cows ◽

Sensitivity And Specificity ◽

Training Data ◽

Sensor Data ◽

Support Vector ◽

Classification Models ◽

Sampled Data ◽

Learning Methods

The aim of this study was to develop classification models for mastitis and lameness treatments in Holstein dairy cows as the target variables based on continuous data from herd management software with modern machine learning methods. Data was collected over a period of 40 months from a total of 167 different cows with daily individual sensor information containing milking parameters, pedometer activity, feed and water intake, and body weight (in the form of differently aggregated data) as well as the entered treatment data. To identify the most important predictors for mastitis and lameness treatments, respectively, Random Forest feature importance, Pearson’s correlation and sequential forward feature selection were applied. With the selected predictors, various machine learning models such as Logistic Regression (LR), Support Vector Machine (SVM), K-nearest neighbors (KNN), Gaussian Naïve Bayes (GNB), Extra Trees Classifier (ET) and different ensemble methods such as Random Forest (RF) were trained. Their performance was compared using the receiver operating characteristic (ROC) area under the curve (AUC), as well as sensitivity, block sensitivity and specificity. In addition, sampling methods were compared. Over- and undersampling as compensation for the expected unbalanced training data had a high impact on the ratio of sensitivity and specificity in the classification of the test data, but with regard to AUC, random oversampling and SMOTE (Synthetic Minority Over-sampling Technique) even showed significantly lower values than with non-sampled data. The best model, ET, obtained a mean AUC of 0.79 for mastitis and 0.71 for lameness, based on testing data from practical conditions, and is recommended by us for this type of data, although GNB, LR and RF were only marginally worse. We recommend the use of these models as a benchmark for similar self-learning classification tasks. The classification models presented here retain their interpretability, with the ability to present feature importances to the farmer, in contrast to the “black box” models of Deep Learning methods.
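
Random oversampling, one of the sampling methods compared in this abstract, and the sensitivity and specificity measures can be sketched as follows. The function names, the fixed seed and the toy labels are illustrative assumptions; this is not the study's pipeline, and SMOTE (which synthesizes new points rather than duplicating existing ones) is not shown.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == label]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(label)
    return out_samples, out_labels


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a binary task."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)


# Toy imbalanced training set: 8 days without treatment (0), 2 with treatment (1).
features = list(range(10))
labels = [0] * 8 + [1] * 2
balanced_features, balanced_labels = random_oversample(features, labels)
balanced_counts = Counter(balanced_labels)

sens, spec = sensitivity_specificity([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
```

Resampling should be applied to the training split only, so that the test data keep their natural class balance.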

Haematocrit: Relationships with Blood Lipids, Blood Pressure and other Cardiovascular Risk Factors

Thrombosis and Haemostasis ◽

10.1055/s-0038-1648811 ◽

1994 ◽

Vol 72 (01) ◽

pp. 058-064 ◽

Cited By ~ 14

Author(s):

Goya Wannamethee ◽

A Gerald Shaper

Keyword(s):

Physical Activity ◽

Risk Factors ◽

Blood Pressure ◽

Body Mass Index ◽

Cardiovascular Risk ◽

Cardiovascular Risk Factors ◽

Alcohol Intake ◽

Blood Lipids ◽

The Other ◽

The Relationship

Summary: The relationship between haematocrit and cardiovascular risk factors, particularly blood pressure and blood lipids, has been examined in detail in a large prospective study of 7735 middle-aged men drawn from general practices in 24 British towns. The analyses are restricted to the 5494 men free of any evidence of ischaemic heart disease at screening. Smoking, body mass index, physical activity, alcohol intake and lung function (FEV1) were factors strongly associated with haematocrit levels independent of each other. Age showed a significant but small independent association with haematocrit. Non-manual workers had slightly higher haematocrit levels than manual workers; this difference increased considerably and became significant after adjustment for the other risk factors. Diabetics showed significantly lower levels of haematocrit than non-diabetics. In the univariate analysis, haematocrit was significantly associated with total serum protein (r = 0.18), cholesterol (r = 0.16), triglyceride (r = 0.15), diastolic blood pressure (r = 0.17) and heart rate (r = 0.14); all at p < 0.0001. A weaker but significant association was seen with systolic blood pressure (r = 0.09, p < 0.001). These relationships remained significant even after adjustment for age, smoking, body mass index, physical activity, alcohol intake, lung function, presence of diabetes, social class and for each of the other biological variables; the relationship with systolic blood pressure was considerably weakened. No association was seen with blood glucose and HDL-cholesterol. This study has shown significant associations between several lifestyle characteristics and the haematocrit and supports the findings of a significant relationship between the haematocrit and blood lipids and blood pressure. It emphasises the role of the haematocrit in assessing the risk of ischaemic heart disease and stroke in individuals, and the need to take haematocrit levels into account in determining the importance of other cardiovascular risk factors.
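
The r values quoted in this summary are correlation coefficients. A minimal sketch of the Pearson coefficient is given below, assuming that is the statistic used (the summary does not name it); `pearson_r` is a hypothetical helper and the inputs are made-up numbers, not study data.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


# Made-up values that are perfectly linearly related.
r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])

# Share of variance explained by a correlation of 0.18.
explained = 0.18 ** 2
```

A coefficient of 0.18 corresponds to roughly 3% of shared variance, so an association can be highly significant in a sample of several thousand men while still being weak in magnitude.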
