An Ensemble Method for Missing Data of Environmental Sensor Considering Univariate and Multivariate Characteristics

Sensors, 2021, Vol 21 (22), pp. 7595
Author(s): Chanyoung Choi, Haewoong Jung, Jaehyuk Cho

With rapid urbanization, awareness of environmental pollution is growing and, accordingly, interest in environmental sensors that measure atmospheric and indoor air quality is increasing. Because these IoT-based environmental sensors are sensitive and reliability is important for them, it is essential to deal with missing values, which are one cause of reliability problems. Two characteristics can be used to impute missing values in environmental sensor data: the time dependency of a single variable and the correlation among multiple variables. However, existing imputation methods use only one of these characteristics, and no reported method uses both. In this work, we introduce a new ensemble imputation method that reflects both. First, the situations in which missing values frequently occur were divided into four cases, and experimental data were generated for each: communication error (aperiodic, periodic) and sensor error (rapid change, measurement range). To compare the existing methods with the proposed method, five widely used univariate imputation methods and five widely used multivariate imputation methods were each used as single models to predict missing values in the four cases. The values predicted by the single models were then passed to the ensemble method. Weighted averaging and stacking were used as the ensemble methods to derive the final predicted values that replace the missing values. Finally, the predicted values were compared with the original data using the mean absolute error (MAE) and the root mean square error (RMSE). The proposed ensemble method generally performed better than the single methods. In addition, it simultaneously considers the correlation between variables and time dependence, both of which must be considered for environmental sensors. As a result, the proposed ensemble technique can contribute to replacing the missing values generated by environmental sensors, which can help increase the reliability of environmental sensor data.
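The weighted-average variant described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two single models (linear interpolation over time as the univariate imputer, a least-squares line on one correlated variable as the multivariate imputer), the series names `pm10` and `pm25`, the data values, and the validation errors used as weights are all assumptions chosen for illustration.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def interpolate(series):
    """Univariate imputer: linear interpolation between observed neighbours.
    Assumes every gap has an observed value on both sides."""
    out = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for i, v in enumerate(series):
        if v is None:
            left = max(k for k in known if k < i)
            right = min(k for k in known if k > i)
            frac = (i - left) / (right - left)
            out[i] = series[left] + frac * (series[right] - series[left])
    return out

def regress(target, other):
    """Multivariate imputer: fit target ~ a * other + b on observed pairs."""
    pairs = [(x, y) for x, y in zip(other, target) if y is not None]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    a = sum((x - mx) * (y - my) for x, y in pairs) / sxx
    b = my - a * mx
    return [y if y is not None else a * x + b for x, y in zip(other, target)]

def weighted_average(preds, errors):
    """Combine predictions with weights proportional to 1 / validation error."""
    inv = [1.0 / e for e in errors]
    total = sum(inv)
    return [sum(w * p[i] for w, p in zip(inv, preds)) / total
            for i in range(len(preds[0]))]

pm10 = [10.0, 12.0, None, 16.0, 18.0]  # hypothetical target series with one gap
pm25 = [5.0, 6.0, 7.0, 8.0, 9.0]       # hypothetical correlated, complete series
uni = interpolate(pm10)
multi = regress(pm10, pm25)
# errors: hypothetical validation MAEs of the two single models
filled = weighted_average([uni, multi], errors=[0.5, 1.0])
```

A model with a lower validation error receives a larger weight, so the ensemble leans on whichever characteristic (time dependency or cross-variable correlation) predicts better for a given missing-data case. The `mae` and `rmse` helpers correspond to the two evaluation metrics named in the abstract.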

Author(s): Tyler F. Rooks, Andrea S. Dargie, Valeta Carol Chancey

A shortcoming of using environmental sensors for the surveillance of potentially concussive events is substantial uncertainty regarding whether the event was caused by head acceleration (“head impacts”) or sensor motion (with no head acceleration). The goal of the present study is to develop a machine learning model to classify environmental sensor data obtained in the field and evaluate the performance of the model against the performance of the proprietary classification algorithm used by the environmental sensor. Data were collected from Soldiers attending sparring sessions conducted under a U.S. Army Combatives School course. Data from one sparring session were used to train a decision tree classification algorithm to identify good and bad signals. Data from the remaining sparring sessions were kept as an external validation set. The performance of the proprietary algorithm used by the sensor was also compared to the trained algorithm performance. The trained decision tree was able to correctly classify 95% of events for internal cross-validation and 88% of events for the external validation set. Comparatively, the proprietary algorithm was only able to correctly classify 61% of the events. In general, the trained algorithm was better able to predict when a signal was good or bad compared to the proprietary algorithm. The present study shows it is possible to train a decision tree algorithm using environmental sensor data collected in the field.
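The train-on-one-session, validate-on-the-rest workflow in this abstract can be sketched with a small decision tree built on Gini impurity. Everything concrete here is assumed for illustration: the two features (peak acceleration and pulse duration), the data values, and the depth limit are hypothetical and do not come from the study.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (weighted impurity, feature index, threshold) of the best split."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build(rows, labels, depth=2):
    """Grow a tree; a leaf is a label, an inner node is (feature, threshold, low, high)."""
    split = best_split(rows, labels) if depth > 0 and gini(labels) > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build([r for r, _ in left], [y for _, y in left], depth - 1),
            build([r for r, _ in right], [y for _, y in right], depth - 1))

def predict(tree, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, tuple):
        f, t, low, high = tree
        tree = low if row[f] <= t else high
    return tree

# Hypothetical events: (peak acceleration in g, pulse duration in ms)
train_x = [(12.0, 8.0), (15.0, 10.0), (30.0, 9.0), (14.0, 1.0), (40.0, 0.5), (22.0, 1.5)]
train_y = ["good", "good", "good", "bad", "bad", "bad"]
tree = build(train_x, train_y)

# Hypothetical events from a separate session, used as external validation
valid_x = [(18.0, 7.0), (35.0, 0.8)]
valid_y = ["good", "bad"]
accuracy = sum(predict(tree, x) == y for x, y in zip(valid_x, valid_y)) / len(valid_y)
```

Keeping the validation sessions out of training is what makes the external accuracy figure a fair comparison against the sensor's proprietary classifier.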


Author(s): Gerald Bloom, Hayley MacGregor

Rapid development has brought significant economic and health benefits, but it has also exposed populations to new health risks. Public health as a scientific discipline and major government responsibility developed during the nineteenth century to help mitigate these risks. Public health actions need to take into account large inequalities in the benefits and harms associated with development between countries, between social groups, and between generations. This is especially important in the present context of very rapid change. It is important to acknowledge the global nature of the challenges people face and the need to involve countries with different cultures and historical legacies in arriving at consensus on an ethical basis for global cooperation in addressing these challenges. This chapter provides an analysis of these issues, using examples on the management of health risks associated with global development and rapid urbanization and on the emergence of organisms that are resistant to antibiotics.


Sensors, 2018, Vol 18 (9), pp. 2884
Author(s): Xiaobo Chen, Cheng Chen, Yingfeng Cai, Hai Wang, Qiaolin Ye

The problem of missing values (MVs) in traffic sensor data analysis is universal in current intelligent transportation systems for various reasons, such as sensor malfunction and transmission failure. Accurate imputation of MVs is the foundation of subsequent data analysis tasks, since most analysis algorithms need complete data as input. In this work, a novel MV imputation approach termed kernel sparse representation with elastic net regularization (KSR-EN) is developed for reconstructing MVs to facilitate analysis of traffic sensor data. The idea is to represent each sample as a linear combination of other samples, exploiting the inherent spatiotemporal correlation and the periodicity of daily traffic flow. To select a few correlated samples and make full use of the valuable information, a combination of the l1-norm and the l2-norm is employed to penalize the combination coefficients. Moreover, the linear representation among samples is extended to a nonlinear representation by mapping the input data space into a high-dimensional feature space, which further enhances the recovery performance of the proposed approach. An efficient iterative algorithm is developed for solving the KSR-EN model. The proposed method is verified on both an artificially simulated dataset and a public road network traffic sensor dataset. The results demonstrate the effectiveness of the proposed approach in terms of MV imputation.
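The core idea, representing a sample as a combination of other samples under a combined l1 and l2 penalty, can be sketched as follows. This is only the linear case solved by plain coordinate descent; it omits the kernel mapping and is not the paper's iterative algorithm. The function names, penalty values, and traffic profiles are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam (the proximal step for the l1 penalty)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def elastic_net(columns, target, l1, l2, sweeps=200):
    """Coordinate descent for
    0.5 * ||target - sum_j w[j] * columns[j]||^2 + l1 * ||w||_1 + 0.5 * l2 * ||w||^2."""
    w = [0.0] * len(columns)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            rho = 0.0
            for i, x in enumerate(col):
                # fit from all other samples, excluding sample j's contribution
                others = sum(w[k] * columns[k][i] for k in range(len(columns)) if k != j)
                rho += x * (target[i] - others)
            z = sum(x * x for x in col)
            w[j] = soft_threshold(rho, l1) / (z + l2)
    return w

def impute(incomplete, complete_samples, l1=0.01, l2=0.01):
    """Fit weights on the observed entries, then reuse them for the missing ones."""
    seen = [i for i, v in enumerate(incomplete) if v is not None]
    columns = [[s[i] for i in seen] for s in complete_samples]
    target = [incomplete[i] for i in seen]
    w = elastic_net(columns, target, l1, l2)
    return [v if v is not None else sum(wj * s[i] for wj, s in zip(w, complete_samples))
            for i, v in enumerate(incomplete)]

# Hypothetical traffic-flow profiles sampled at four times of day
day1 = [10.0, 20.0, 30.0, 40.0]
day2 = [30.0, 20.0, 20.0, 10.0]
today = [20.0, 20.0, None, 25.0]  # observed entries equal 0.5 * day1 + 0.5 * day2
filled = impute(today, [day1, day2])
```

The l1 term drives the weights of irrelevant samples to exactly zero, while the l2 term keeps the solution stable when several samples are strongly correlated, which is the usual motivation for combining the two penalties.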


Author(s): Sherong Zhang, Ting Liu, Chao Wang

Building safety assessment based on data from a single sensor suffers from low reliability and high uncertainty. Therefore, this paper proposes a novel multi-source sensor data fusion method based on Improved Dempster–Shafer (D-S) evidence theory and a Back Propagation Neural Network (BPNN). Before data fusion, the improved self-support function is adopted to preprocess the original data. The process of data fusion is divided into three steps. First, the features of sensor data of the same kind are extracted by the adaptive weighted average method and used as the input to the BPNN. Then, the BPNN is trained and its output is used as the basic probability assignment (BPA) of D-S evidence theory. Finally, the Bhattacharyya Distance (BD) is introduced to improve D-S evidence theory in two respects, evidence distance and conflict factors, and multi-source data fusion is realized by the D-S synthesis rules. In practical application, a three-level information fusion framework covering the data level, the feature level, and the decision level is proposed, and the safety status of buildings is evaluated using multi-source sensor data. The results show that, compared with the fusion result of the traditional D-S evidence theory, the algorithm improves the accuracy of the overall safety state assessment of the building and reduces the MSE from 0.18 to 0.01%.
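The final fusion step relies on the D-S synthesis rule. The sketch below implements the classical Dempster rule of combination only; it does not include the Bhattacharyya-distance improvement described in the abstract, and the two basic probability assignments are made-up numbers standing in for BPNN outputs.

```python
def combine(m1, m2):
    """Classical Dempster rule: multiply masses, pool them on set intersections,
    and renormalise by the non-conflicting mass."""
    combined = {}
    conflict = 0.0
    for a, pa in m1.items():
        for b, pb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + pa * pb
            else:
                conflict += pa * pb  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

SAFE = frozenset({"safe"})
UNSAFE = frozenset({"unsafe"})
EITHER = SAFE | UNSAFE  # mass left uncommitted between the two hypotheses

# Hypothetical BPAs from two sensor groups
m_strain = {SAFE: 0.6, UNSAFE: 0.1, EITHER: 0.3}
m_tilt = {SAFE: 0.7, UNSAFE: 0.2, EITHER: 0.1}
fused = combine(m_strain, m_tilt)
```

With these inputs the conflicting mass is 0.19, and the fused belief in `SAFE` rises to 0.69 / 0.81, higher than either source alone. When the conflict approaches 1 the normalisation becomes unstable, which is the weakness that improved variants of the rule aim to address.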


Author(s): Alejandro Llaves, Oscar Corcho, Peter Taylor, Kerry Taylor

This paper presents a generic approach to integrate environmental sensor data efficiently, allowing the detection of relevant situations and events in near real-time through continuous querying. Data variety is addressed with the use of the Semantic Sensor Network ontology for observation data modelling, and semantic annotations for environmental phenomena. Data velocity is handled by distributing sensor data messaging and serving observations as RDF graphs on query demand. The stream processing engine presented in the paper, morph-streams++, provides adapters for different data formats and distributed processing of streams in a cluster. An evaluation of different configurations for parallelization and semantic annotation parameters proves that the described approach reduces the average latency of message processing in some cases.


Electronics, 2019, Vol 8 (8), pp. 859
Author(s): Jingjing Gu, Zhiteng Dong, Cai Zhang, Xiaojiang Du, Mohsen Guizani

Applying a parachute-deployed Wireless Sensor Network (WSN) to monitor high-altitude space is a promising solution because of its effectiveness and cost. However, both the high deviation of the data and the rapid change of various environmental factors (air pressure, temperature, wind speed, etc.) pose a great challenge. We address this challenge with data compensation in dynamic stress measurements of parachutes during the working stage. Specifically, we construct a neural-network-based data compensation model that corrects the deviation by taking into account a variety of environmental parameters, and name it Data Compensation based on Back Propagation Neural Network (DC-BPNN). Then, to improve the speed and accuracy of training the DC-BPNN, we propose a novel Adaptive Artificial Bee Colony (AABC) algorithm. We also address the stability of its solution by deriving a stability bound. Finally, to verify real-world performance, we conduct a set of experiments with an airdropped WSN.

