Enhanced Application of Principal Component Analysis in Machine Learning for Imputation of Missing Traffic Data

2019 ◽  
Vol 9 (10) ◽  
pp. 2149 ◽  
Author(s):  
Yoon-Young Choi ◽  
Heeseung Shon ◽  
Young-Ji Byon ◽  
Dong-Kyu Kim ◽  
Seungmo Kang

Missing value imputation approaches have been widely used to support and maintain the quality of traffic data. Although spatiotemporal dependency-based approaches can improve imputation performance for large and continuous missing patterns, additionally considering traffic states can lead to more reliable results. To improve imputation performance further, a section-based approach is also needed. This study proposes a novel approach that identifies the traffic states of the different spots that make up a road section, termed a section-based traffic state (SBTS), and determines the spatiotemporal dependencies customized for each SBTS for missing value imputation. A principal component analysis (PCA) was employed, and angles obtained from the first principal component were used to identify the SBTSs. This pre-processing was combined with a support vector machine to develop the imputation model. Segmenting the SBTSs using the angles and considering the spatiotemporal dependency of each state allowed the proposed approach to outperform other existing models.
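The abstract does not say how the angle of the first principal component is computed or which variables it is taken over. As a minimal sketch only, assuming two measurements per observation (the names `xs` and `ys` are hypothetical, as is the function name), the orientation of the first principal axis of a 2-D point cloud can be obtained in closed form from the sample covariance:

```python
import math

def first_pc_angle(xs, ys):
    """Return the angle, in degrees, between the x-axis and the first
    principal component of the 2-D points (xs[i], ys[i]).

    Assumption: this is a generic 2-D PCA orientation, not necessarily
    the angle definition used in the paper.
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    syy = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    # Direction of the eigenvector with the largest eigenvalue.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.degrees(theta)

# Points lying along y = x have a first principal axis at about 45 degrees.
angle = first_pc_angle([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0])
```

Under this assumption, observations could be grouped into states by thresholding the returned angle; the thresholds themselves would have to come from the paper.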

2011 ◽  
Vol 460-461 ◽  
pp. 716-723
Author(s):  
Dan Ma ◽  
Zhan Qing Chen ◽  
Ji Da Huang ◽  
Hao Jin Lv

The quality of the miner lamp power supply (MLPS) affects the performance of the miner lamp, and the safety performance and quality of the miner lamp are closely related to production safety in coal mines. The factors that affect the quality of the power supply are screened through principal component analysis (PCA). After training on the principal components extracted by PCA, a measurement model for the MLPS is set up based on a support vector machine (SVM), with the Gaussian function selected as the SVM kernel function for simulation. The test results indicate that the measurement model based on PCA-SVM can be used for detection of the MLPS, which better ensures the quality and reliability of the MLPS.


Energies ◽  
2019 ◽  
Vol 12 (1) ◽  
pp. 196 ◽  
Author(s):  
Lihui Zhang ◽  
Riletu Ge ◽  
Jianxue Chai

China’s energy consumption issues are closely associated with global climate issues, and the scale of energy consumption, peak energy consumption, and consumption investment are all the focus of national attention. To forecast China’s energy consumption accurately, this article first selected GDP, population, industrial structure and energy consumption structure, energy intensity, total imports and exports, fixed asset investment, energy efficiency, urbanization, the level of consumption, and fixed investment in the energy industry as a preliminary set of factors. Second, we corrected the traditional principal component analysis (PCA) algorithm from the perspective of eliminating “bad points”, judging a “bad point” sample based on signal reconstruction ideas. On this basis, we put forward a robust principal component analysis (RPCA) algorithm and chose the first five principal components as the main factors affecting energy consumption, including GDP, population, industrial structure and energy consumption structure, and urbanization. Then, we applied the Tabu search (TS) algorithm to the least squares support vector machine (LSSVM) optimized by the particle swarm optimization (PSO) algorithm to forecast China’s energy consumption. We collected data from 1996 to 2010 as a training set and from 2010 to 2016 as the test set. For easy comparison, the sample data were input into the LSSVM algorithm and the PSO-LSSVM algorithm at the same time. We used statistical indicators including the goodness-of-fit determination coefficient (R2), the root mean square error (RMSE), and the mean radial error (MRE) to compare the training results of the three forecasting models, which demonstrated that the proposed TS-PSO-LSSVM forecasting model had higher prediction accuracy, better generalization ability, and higher training speed. Finally, the TS-PSO-LSSVM forecasting model was applied to forecast the energy consumption of China from 2017 to 2030.
According to the predictions, China’s energy consumption shows a gradually increasing trend from 2017 to 2030 and will break through 6000 million tons in 2030. However, the growth rate is gradually slowing, and China’s energy consumption economy will shift to a state of diminishing returns around 2026, which suggests that China should put more emphasis on the field of energy investment.
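Two of the indicators named above, R2 and RMSE, have standard definitions, sketched below for illustration. The function names and the sample numbers are hypothetical and are not taken from the paper; MRE is left out because the abstract does not define it.

```python
import math

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up values, used only to show the calls.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
fit = r_squared(actual, predicted)
error = rmse(actual, predicted)
```

A model that fits better gives an R2 closer to 1 and an RMSE closer to 0, which is how indicators of this kind are typically used to rank competing forecasting models.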


Author(s):  
Hongjuan Yao ◽  
Xiaoqiang Zhao ◽  
Wei Li ◽  
Yongyong Hui

Batch processes generally have varying dynamic characteristics that cause low fault detection rates and high false alarm rates, so monitoring batch processes is both necessary and urgent. This paper proposes a global enhanced multiple neighborhoods preserving embedding-based fault detection strategy for dynamic batch processes. Firstly, the angle neighbor is defined and selected to compensate for the fact that the distance neighbor alone expresses the spatial similarity of samples insufficiently, and the time neighbor is introduced to describe the time correlations between samples. These three types of neighbors can fully characterize the similarity of the samples in time and space. Secondly, considering the minimum reconstruction error and the order information of the three types of neighbors, an enhanced objective function is constructed to prevent the loss of order information when neighborhood preserving embedding (NPE) calculates the reconstruction weights. Furthermore, the enhanced objective function and a global objective function are combined to extract both global and local features, to describe process dynamics, and to visualize process data in a low-dimensional space. Finally, a monitoring index based on support vector data description is constructed to eliminate the adverse effects of non-Gaussian data on monitoring performance. The advantages of the proposed method over principal component analysis, neighborhood preserving embedding, dynamic principal component analysis, and time NPE are demonstrated by a numerical example and a penicillin fermentation process simulation.

