An Efficient Method to Improve the Clustering Performance using Hybrid Robust Principal Component Analysis-Spectral biclustering in Rainfall Patterns Identification

Author(s):  
Shazlyn Milleana Shaharudin ◽  
Shuhaida Ismail ◽  
Siti Mariana Che Mat Nor ◽  
Norhaiza Ahmad

In this study, a hybrid RPCA-spectral biclustering model is proposed for identifying rainfall patterns in Peninsular Malaysia. The model combines Robust Principal Component Analysis (RPCA) with biclustering to overcome the skewness present in the Peninsular Malaysia rainfall data. Compared with classical PCA, Robust PCA is more resilient to outliers because it assesses every observation and downweights those that deviate from the data center. Meanwhile, two-way clustering can cluster simultaneously along both dimensions of the data and exhibits higher correlation than one-way cluster analysis. The experimental results showed that the best cumulative percentage of variation lies between 65% and 70% for both Robust and classical PCA. The number of clusters increased from six disjoint clusters with Robust PCA-kMeans to eight disjoint clusters with the proposed model. Further analysis shows that the proposed model has smaller variation, with a value of 0.0034 compared to 0.030 for the Robust PCA-kMeans model. Given the small variation in its clustering result, the analysis indicates that the proposed RPCA-spectral biclustering model is well suited to identifying rainfall patterns in Peninsular Malaysia.
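The cumulative-percentage criterion mentioned in this abstract can be illustrated with a minimal sketch. It assumes the eigenvalues of the (robust or classical) correlation matrix are already available; the function name `components_for_variance` and the example eigenvalues are hypothetical and not taken from the paper.

```python
def components_for_variance(eigenvalues, target=0.70):
    """Return the smallest number of components whose cumulative
    share of total variance reaches `target` (a fraction, e.g. 0.70)."""
    total = sum(eigenvalues)
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    running = 0.0
    for k, eigenvalue in enumerate(ordered, start=1):
        running += eigenvalue
        if running / total >= target:
            return k
    return len(ordered)


# Hypothetical eigenvalues, for illustration only.
example_eigenvalues = [4.0, 2.0, 1.5, 1.0, 0.8, 0.7]
n_at_65 = components_for_variance(example_eigenvalues, target=0.65)
n_at_70 = components_for_variance(example_eigenvalues, target=0.70)
```

With these made-up eigenvalues, both the 65% and the 70% threshold keep the same number of components, which is one reason a range of thresholds can lead to the same reduced representation.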

Author(s):  
Shazlyn Milleana Shaharudin ◽  
Norhaiza Ahmad ◽  
Siti Mariana Che Mat Nor

This paper presents a modified correlation in principal component analysis (PCA) for selecting the number of clusters when identifying rainfall patterns. Clustering guided by PCA is widely used for high-dimensional data, especially for identifying the spatial distribution patterns of daily torrential rainfall. A common method of identifying rainfall patterns in climatological studies applies PCA to a T-mode Pearson correlation matrix to extract the relative variance retained. However, rainfall data in Peninsular Malaysia are positive-valued and skewed toward higher values. Consequently, PCA based on Pearson correlation can distort the cluster partitions and produce extremely uneven clusters in a high-dimensional space. To resolve the unbalanced clusters caused by the skewed nature of the data, this study employs a robust dimension reduction method in PCA. It introduces a robust measure in PCA that uses Tukey’s biweight correlation to downweight observations, together with the optimal breakdown point for obtaining the number of PCA components. The results show a substantial improvement of the robust PCA over PCA based on Pearson correlation in terms of the average number of clusters obtained, and indicate a 70% cumulative percentage of variance at a breakdown point of 0.4.
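The downweighting idea behind Tukey's biweight can be sketched as follows. This is a minimal illustration of the standard biweight weight function, not the authors' implementation: the tuning constant `c = 9.0` is an assumption borrowed from the usual biweight midvariance convention, the abstract does not state how the constant relates to the 0.4 breakdown point, and the rainfall values are invented.

```python
from statistics import median


def biweight_weights(values, c=9.0):
    """Tukey biweight weights: (1 - u^2)^2 when |u| < 1, otherwise 0,
    where u = (x - median) / (c * MAD). The constant c is an assumption."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # No spread to scale by; treat all points equally (a simplification).
        return [1.0] * len(values)
    weights = []
    for v in values:
        u = (v - center) / (c * mad)
        weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
    return weights


# Invented daily rainfall amounts (mm) with one extreme value.
rainfall = [2.0, 3.0, 4.0, 5.0, 6.0, 120.0]
weights = biweight_weights(rainfall)
```

Observations near the median keep a weight close to 1, while the extreme value falls outside the cutoff and receives a weight of 0, so it no longer dominates a correlation computed from the weighted deviations.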


Author(s):  
S.M. Shaharudin ◽  
N. Ahmad ◽  
N.H. Zainuddin ◽  
N.S. Mohamed

A robust dimension reduction method in Principal Component Analysis (PCA) was used to rectify the issue of unbalanced clusters in rainfall patterns caused by the skewed nature of rainfall data. A robust measure in PCA using Tukey’s biweight correlation to downweight observations is introduced, and the optimum breakdown point for extracting the number of components in PCA with this approach is proposed. A simulated data matrix that mimicked the real data set was used to determine an appropriate breakdown point for robust PCA and to compare the performance of both approaches. The simulated data indicated that a breakdown point at a 70% cumulative percentage of variance gave a good balance in extracting the number of components. The results showed a substantial improvement with the robust PCA over PCA based on Pearson correlation in terms of the average number of clusters obtained and the cluster quality.


Sensors ◽  
2021 ◽  
Vol 21 (10) ◽  
pp. 3527
Author(s):  
Melanija Vezočnik ◽  
Roman Kamnik ◽  
Matjaz B. Juric

Inertial sensor-based step length estimation has become increasingly important with the emergence of pedestrian-dead-reckoning-based (PDR-based) indoor positioning. So far, many refined step length estimation models have been proposed to overcome the inaccuracy in estimating distance walked. Both the kinematics associated with the human body during walking and actual step lengths are rarely used in their derivation. Our paper presents a new step length estimation model that utilizes acceleration magnitude. To the best of our knowledge, we are the first to employ principal component analysis (PCA) to characterize the experimental data for the derivation of the model. These data were collected from anatomical landmarks on the human body during walking using a highly accurate optical measurement system. We evaluated the performance of the proposed model for four typical smartphone positions for long-term human walking and obtained promising results: the proposed model outperformed all acceleration-based models selected for the comparison producing an overall mean absolute stride length estimation error of 6.44 cm. The proposed model was also least affected by walking speed and smartphone position among acceleration-based models and is unaffected by smartphone orientation. Therefore, the proposed model can be used in the PDR-based indoor positioning with an important advantage that no special care regarding orientation is needed in attaching the smartphone to a particular body segment. All the sensory data acquired by smartphones that we utilized for evaluation are publicly available and include more than 10 h of walking measurements.
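Why a model built on acceleration magnitude is insensitive to how the phone is oriented can be shown with a short sketch. It does not reproduce the authors' step length model, which the abstract does not specify; the helper names and the sample reading are hypothetical.

```python
import math


def acceleration_magnitude(ax, ay, az):
    """Euclidean norm of a three-axis accelerometer reading."""
    return math.sqrt(ax * ax + ay * ay + az * az)


def rotate_about_z(ax, ay, az, angle_rad):
    """Rotate the reading about the z axis, mimicking a phone that is
    held at a different orientation."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * ax - s * ay, s * ax + c * ay, az)


# Hypothetical reading in m/s^2.
sample = (1.2, -0.4, 9.6)
rotated = rotate_about_z(*sample, math.radians(35))

original_magnitude = acceleration_magnitude(*sample)
rotated_magnitude = acceleration_magnitude(*rotated)
```

The individual axis values change under rotation, but the magnitude does not, because a rotation preserves vector length.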


Energies ◽  
2019 ◽  
Vol 12 (1) ◽  
pp. 196 ◽  
Author(s):  
Lihui Zhang ◽  
Riletu Ge ◽  
Jianxue Chai

China’s energy consumption issues are closely associated with global climate issues, and the scale of energy consumption, peak energy consumption, and consumption investment are all the focus of national attention. To forecast China’s energy consumption accurately, this article first selected GDP, population, industrial structure and energy consumption structure, energy intensity, total imports and exports, fixed asset investment, energy efficiency, urbanization, the level of consumption, and fixed investment in the energy industry as a preliminary set of factors. Second, we corrected the traditional principal component analysis (PCA) algorithm from the perspective of eliminating “bad points”, judging whether a sample is a “bad point” based on signal reconstruction ideas. On this basis, we put forward a robust principal component analysis (RPCA) algorithm and chose the first five principal components as the main factors affecting energy consumption, including GDP, population, industrial structure and energy consumption structure, and urbanization. Then, we applied the Tabu search (TS) algorithm to the least squares support vector machine (LSSVM) optimized by the particle swarm optimization (PSO) algorithm to forecast China’s energy consumption. We used data from 1996 to 2010 as the training set and from 2010 to 2016 as the test set. For comparison, the sample data were also input into the LSSVM algorithm and the PSO-LSSVM algorithm. We used statistical indicators including the goodness-of-fit coefficient of determination (R2), the root mean square error (RMSE), and the mean radial error (MRE) to compare the training results of the three forecasting models, which demonstrated that the proposed TS-PSO-LSSVM forecasting model had higher prediction accuracy, better generalization ability, and higher training speed. Finally, the TS-PSO-LSSVM forecasting model was applied to forecast the energy consumption of China from 2017 to 2030.
According to the predictions, China’s energy consumption increases gradually from 2017 to 2030 and will exceed 6000 million tons in 2030. However, the growth rate is gradually slowing, and China’s energy consumption economy will shift to a state of diminishing returns around 2026, which suggests that China should put more emphasis on the field of energy investment.
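Two of the indicators named in this abstract, RMSE and the coefficient of determination, follow standard definitions and can be sketched as below. The function names and the numbers are illustrative assumptions and are not results from the paper; MRE is left out because the abstract does not define it.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values, for illustration only.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]

model_rmse = rmse(actual, predicted)
model_r2 = r_squared(actual, predicted)
```

A lower RMSE and an R2 closer to 1 both indicate a closer fit, which is how competing forecasting models such as LSSVM, PSO-LSSVM, and TS-PSO-LSSVM would typically be ranked on the same data.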


2020 ◽  
Vol 5 (5) ◽  
Author(s):  
Isabel Scherl ◽  
Benjamin Strom ◽  
Jessica K. Shang ◽  
Owen Williams ◽  
Brian L. Polagye ◽  
...  
