A new singular spectrum analysis approach for processing incomplete time series polluted by multiplicative noise

Author(s):  
Yunzhong Shen ◽  
Fengwei Wang ◽  
Qiujie Chen

Because time series are often incomplete, missing data are usually interpolated before singular spectrum analysis (SSA) is applied. We develop a new SSA for processing incomplete time series based on the property that an original time series can be reproduced from its principal components, which are then estimated using a minimum-norm criterion. When an incomplete time series is polluted by multiplicative noise, we first convert the multiplicative noise to additive noise by multiplying by the signal estimate of the time series, and then process the time series with weighted SSA; because the converted additive noise is heterogeneous, the weight factor is determined according to the variance of the additive noise. The proposed SSA approach is applied to real incomplete time series data of suspended-sediment concentration from San Francisco Bay and compared with the traditional SSA and homomorphic log-transformation SSA approaches. The first 10 principal components derived by the proposed approach capture more of the total variance, with less fitting error, than those derived by either of the other two approaches. Furthermore, the results from simulation cases confirm that the proposed SSA outperforms both the traditional and homomorphic log-transformation SSA approaches.
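The conversion from multiplicative to additive noise described above can be illustrated with a small sketch. It assumes the common model y_i = s_i (1 + e_i), where e_i is zero-mean with variance sigma^2, so that the additive noise s_i e_i has variance sigma^2 s_i^2 and a natural weight is its inverse. The model, the function name `additive_noise_weights`, and the parameters `sigma` and `eps` are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def additive_noise_weights(signal_estimate, sigma=1.0, eps=1e-12):
    """Inverse-variance weights for the converted additive noise.

    Assumed model (not from the paper): y = s * (1 + e), Var(e) = sigma**2,
    so the additive noise s * e has variance (sigma * s)**2.
    """
    s = np.asarray(signal_estimate, dtype=float)
    variance = (sigma * s) ** 2
    # Guard against division by zero where the signal estimate vanishes.
    return 1.0 / np.maximum(variance, eps)

# Larger signal values carry larger additive noise, hence smaller weights.
weights = additive_noise_weights([1.0, 2.0, 4.0])
print(weights)
```

Because the true signal is unknown, such weights would in practice be recomputed from the current signal estimate, which is why the abstract speaks of multiplying by the signal estimate.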

2014 ◽  
Vol 1 (2) ◽  
pp. 1947-1966
Author(s):  
Y. Shen ◽  
F. Peng ◽  
B. Li

Abstract. Singular spectrum analysis (SSA) is a powerful technique for time series analysis. Based on the property that the original time series can be reproduced from its principal components, this contribution develops an improved SSA (ISSA) for processing incomplete time series; the modified SSA (SSAM) of Schoellhamer (2001) is a special case of it. The approach was evaluated with synthetic and real incomplete time series data of suspended-sediment concentration from San Francisco Bay. The results from the synthetic time series with missing data show that the relative errors of the principal components reconstructed by ISSA are much smaller than those reconstructed by SSAM. Moreover, when missing data make up 60% of the whole time series, the improvements in relative error are up to 19.64, 41.34, 23.27 and 50.30% for the first four principal components, respectively. In addition, both the mean absolute errors and mean root mean squared errors of the time series reconstructed by ISSA are much smaller than those by SSAM; the respective improvements are 34.45 and 33.91% when missing data account for 60%. The results from the real incomplete time series also show that the standard deviation (SD) derived by ISSA is 12.27 mg L−1, smaller than the 13.48 mg L−1 derived by SSAM.
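The property that a series can be reproduced from its principal components can be checked with a minimal sketch of basic SSA on a complete series (embedding, SVD, diagonal averaging). This is textbook SSA, not the ISSA or SSAM algorithms of the abstract; the function name `ssa_components` and the test signal are assumptions for illustration.

```python
import numpy as np

def ssa_components(x, window):
    """Basic SSA: return (components, singular_values).

    components[c] is the series reconstructed from singular triple c,
    so components.sum(axis=0) reproduces x.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    k = n - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    X = np.column_stack([x[j:j + window] for j in range(k)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # counts[t] = number of trajectory-matrix entries that map to x[t].
    counts = np.zeros(n)
    for i in range(window):
        counts[i:i + k] += 1.0
    comps = np.zeros((s.size, n))
    for c in range(s.size):
        Xc = s[c] * np.outer(U[:, c], Vt[c])
        # Diagonal averaging: entry (i, j) contributes to time index i + j.
        for i in range(window):
            comps[c, i:i + k] += Xc[i]
    return comps / counts, s

t = np.arange(120)
series = 0.02 * t + np.sin(2 * np.pi * t / 12)
components, singular_values = ssa_components(series, window=24)
print(np.allclose(components.sum(axis=0), series))
```

Keeping only the leading components gives the smoothed reconstruction; handling missing entries, as ISSA does, requires more than this sketch shows.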


2015 ◽  
Vol 22 (4) ◽  
pp. 371-376 ◽  
Author(s):  
Y. Shen ◽  
F. Peng ◽  
B. Li

Abstract. Singular spectrum analysis (SSA) is a powerful technique for time series analysis. Based on the property that the original time series can be reproduced from its principal components, this contribution develops an improved SSA (ISSA) for processing incomplete time series; the modified SSA (SSAM) of Schoellhamer (2001) is a special case of it. The approach is evaluated with synthetic and real incomplete time series data of suspended-sediment concentration from San Francisco Bay. The results from the synthetic time series with missing data show that the relative errors of the principal components reconstructed by ISSA are much smaller than those reconstructed by SSAM. Moreover, when missing data make up 60 % of the whole time series, the improvements in relative error are up to 19.64, 41.34, 23.27 and 50.30 % for the first four principal components, respectively. Both the mean absolute error and mean root mean squared error of the time series reconstructed by ISSA are also smaller than those by SSAM; the respective improvements are 34.45 and 33.91 % when missing data account for 60 %. The results from the real incomplete time series also show that the standard deviation (SD) derived by ISSA is 12.27 mg L−1, smaller than the 13.48 mg L−1 derived by SSAM.


2018 ◽  
Vol 17 (02) ◽  
pp. 1850017 ◽  
Author(s):  
Mahdi Kalantari ◽  
Masoud Yarmohammadi ◽  
Hossein Hassani ◽  
Emmanuel Sirimal Silva

Missing values in time series data are a well-known and important problem that many researchers have studied extensively in various fields. In this paper, a new nonparametric approach for missing value imputation in time series is proposed. The main novelty of this research is the application of the [Formula: see text] norm-based version of Singular Spectrum Analysis (SSA), namely [Formula: see text]-SSA, which is robust against outliers. The performance of the new imputation method is compared with that of many other established methods by applying them to various real and simulated time series. The results confirm that the SSA-based methods, especially [Formula: see text]-SSA, can provide better imputation than the other methods.


Author(s):  
S.M. Shaharudin ◽  
N. Ahmad ◽  
N.H. Zainuddin

Identifying the local time scale of torrential rainfall patterns through Singular Spectrum Analysis (SSA) is useful for separating the trend and noise components. However, SSA poses two main issues: torrential rainfall time series data have coinciding singular values, and the leading components of the eigenvectors obtained from decomposing the time series matrix are usually assessed by graphical inference, which lacks a specific statistical measure. As a consequence of both issues, the trend extracted by SSA tends to flatten out and does not show any distinct pattern. This problem was approached in two ways. First, an Iterative Oblique SSA (Iterative O-SSA) was presented to adjust the singular values of the data. Second, a measure based on Robust Sparse K-means (RSK-Means) was introduced to group the decomposed eigenvectors. As a result, the trend extracted using the modified SSA appeared to fit the original time series and looked more flexible than the trend extracted by SSA.


Atmosphere ◽  
2018 ◽  
Vol 9 (9) ◽  
pp. 334 ◽  
Author(s):  
Hamid Ghafarian Malamiri ◽  
Iman Rousta ◽  
Haraldur Olafsson ◽  
Hadi Zare ◽  
Hao Zhang

Land surface temperature (LST) is a basic parameter in energy exchange between the land and the atmosphere, and is frequently used in many sciences such as climatology, hydrology, agriculture, and ecology. Time series of satellite LST data usually contain deficient, missing, and unacceptable data caused by the presence of clouds in images, the presence of dust in the atmosphere, and sensor failure. In this study, the singular spectrum analysis (SSA) algorithm was used to resolve the problem of missing and outlier data caused by cloud cover. The study region was an image frame of the Moderate Resolution Imaging Spectroradiometer (MODIS) with horizontal number 22 and vertical number 05 (h22v05), which covers a large part of Iran, Turkmenistan, and the Caspian Sea. MODIS LST products (MOD11A1) for 2015 were used, with approximately 1 km × 1 km spatial resolution and day/night LST data (daily temporal resolution). On average, the data have 36.37% gaps in each pixel profile of 730 day/night LST values. The SSA reconstruction of the LST images gave a root mean square error (RMSE) of 2.95 Kelvin (K) between the original and reconstructed LST time series data in the study region. In general, the findings showed that the SSA algorithm using spatio-temporal interpolation can be effectively used to resolve the problem of missing data caused by cloud cover.
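As a rough illustration of SSA-based gap filling and its RMSE evaluation, the sketch below iteratively replaces missing samples of a single series with a low-rank SSA reconstruction. It is a generic iterative scheme on one synthetic profile, purely temporal, and is not claimed to be the spatio-temporal algorithm used in the study; the function names, window length, rank, and iteration count are assumptions for illustration.

```python
import numpy as np

def hankel_low_rank(x, window, rank):
    """Rank-limited SSA reconstruction of a complete series."""
    n = x.size
    k = n - window + 1
    X = np.column_stack([x[j:j + window] for j in range(k)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Xr = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    out = np.zeros(n)
    counts = np.zeros(n)
    for i in range(window):  # diagonal averaging back to a series
        out[i:i + k] += Xr[i]
        counts[i:i + k] += 1.0
    return out / counts

def ssa_fill_gaps(y, window, rank, n_iter=50):
    """Fill NaN gaps by repeatedly imputing the low-rank reconstruction."""
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)
    filled = np.where(missing, np.nanmean(y), y)  # start from the mean
    for _ in range(n_iter):
        smooth = hankel_low_rank(filled, window, rank)
        filled[missing] = smooth[missing]  # observed samples stay fixed
    return filled

# Synthetic periodic profile with roughly 30% of samples removed at random.
rng = np.random.default_rng(0)
t = np.arange(200)
truth = np.sin(2 * np.pi * t / 25)
gaps = rng.random(t.size) < 0.3
observed = truth.copy()
observed[gaps] = np.nan

filled = ssa_fill_gaps(observed, window=40, rank=2)
rmse = np.sqrt(np.mean((filled[gaps] - truth[gaps]) ** 2))
print("RMSE at gap positions:", rmse)
```

The RMSE here is computed only at the artificially removed positions, where the true values are known; on real LST data, such a score requires withholding valid observations for validation.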


2021 ◽  
Author(s):  
Shu Kaneko ◽  
Katsumi Hattori ◽  
Toru Mogi ◽  
Chie Yoshino

Off the coast of the Boso Peninsula lies a triple junction of the Pacific Plate, the Philippine Sea Plate, and the North American Plate, and the Boso Peninsula is one of the seismically active areas in Japan. The region also contains the epicenter areas of the 1703 Genroku Kanto Earthquake (M8.2) and the 1923 Taisho Kanto Earthquake (M7.9), as well as the Boso Slow Slip, which occurs every 6 years, making it a geologically interesting place. To estimate the subsurface resistivity structure of the whole Boso area, a magnetotelluric (MT) survey with 41 sites (inter-site distance of 7 km) was conducted in 2014-2016, using U43 (12 sites, 1 Hz sampling; Tierra Technica) and MTU-5, 5A, net (41 sites, 15, 150, and 2400 Hz sampling; Phoenix Geophysics). However, the Boso area is greatly affected by leak current from DC-driven trains, factories, and power lines, so the observed data are contaminated by artificial noise. When we applied conventional noise reduction methods (e.g., remote reference (Gamble et al., 1979) and BIRRP (Chave and Thomson, 2004)) in the frequency domain, the obtained MT sounding curve was not ideal. In particular, the phase between the periods of 20 and 400 sec was close to 0 degrees. This suggests that these methods are insufficient to reduce the near-field effect in the Boso data. We therefore developed a new noise reduction method using MSSA (Multi-channel Singular Spectrum Analysis) as a pre-processing step in the time domain.

The procedure is as follows:

(1) Decompose the 6-component data (Hx, Hy, Ex, Ey, Hxr and Hyr; H and E denote the magnetic and electric fields, respectively, x and y indicate the NS and EW components, and r denotes the reference field observed at a quiet station) using MSSA into 6×M principal components (PCs). Here, M is the window length of MSSA.

(2) Check the contribution and period of each PC and eliminate the PCs that correspond to the longer periods of variation. This amounts to detrending the original data.

(3) Apply a second MSSA to the detrended time series data to separate signals and noise shorter than 400 sec.

(4) Calculate the correlation coefficients between H and Hr and between E and Hr for each PC, and select the PCs with higher correlation to reconstruct the time series data for MT analysis.

Then, we perform MT analysis with BIRRP to estimate the apparent resistivity.

As a result, the coherences of H-Hr and E-Hr were improved, and the MT sounding curve became smoother than the results obtained with the conventional noise reduction methods. This indicates the effectiveness of the proposed noise reduction. However, further investigation for different periods and sites will be required.
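Steps (1) and (4) of the procedure can be sketched as follows: a plain MSSA decomposition of multi-channel data, followed by selection of components whose local and reference channels are strongly correlated. This is a simplified illustration under stated assumptions, not the authors' implementation; the function names, the vertical stacking of per-channel trajectory matrices, and the `threshold` value are all hypothetical choices.

```python
import numpy as np

def mssa_components(data, window):
    """MSSA of data with shape (n, channels).

    Returns an array of shape (n_components, n, channels); summing over
    the first axis reproduces the input.
    """
    data = np.asarray(data, dtype=float)
    n, c = data.shape
    k = n - window + 1
    # One trajectory matrix per channel, stacked vertically.
    blocks = [np.column_stack([data[j:j + window, ch] for j in range(k)])
              for ch in range(c)]
    X = np.vstack(blocks)  # shape (channels * window, k)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    counts = np.zeros(n)
    for i in range(window):
        counts[i:i + k] += 1.0
    comps = np.zeros((s.size, n, c))
    for m in range(s.size):
        Xm = s[m] * np.outer(U[:, m], Vt[m])
        for ch in range(c):
            for i in range(window):  # diagonal averaging per channel
                comps[m, i:i + k, ch] += Xm[ch * window + i]
    return comps / counts[None, :, None]

def select_by_reference(comps, local=0, reference=1, threshold=0.8):
    """Indices of components whose local and reference channels correlate."""
    keep = []
    for m in range(comps.shape[0]):
        a, b = comps[m, :, local], comps[m, :, reference]
        if a.std() < 1e-12 or b.std() < 1e-12:
            continue  # skip numerically empty components
        if abs(np.corrcoef(a, b)[0, 1]) >= threshold:
            keep.append(m)
    return keep

# Toy example: a shared signal on both channels plus local-only noise.
rng = np.random.default_rng(1)
t = np.arange(100)
shared = np.sin(2 * np.pi * t / 20)
data = np.column_stack([shared + 0.5 * rng.standard_normal(t.size), shared])
comps = mssa_components(data, window=10)
kept = select_by_reference(comps)
denoised_local = comps[kept, :, 0].sum(axis=0)
```

In this toy setup, channel 0 plays the role of a noisy local field and channel 1 the quiet reference; with six channels, the same selection would be repeated for each H-Hr and E-Hr pair.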


Author(s):  
Hamid Reza Ghafarian Malamiri ◽  
Iman Rousta ◽  
Haraldur Olafsson ◽  
Hadi Zare ◽  
Hao Zhang

Land Surface Temperature (LST) is a basic parameter in energy exchange between the land and the atmosphere and is frequently used in many sciences such as climatology, hydrology, agriculture, and ecology. LST time series usually contain deficient, missing, and unacceptable data caused by the presence of clouds in images, the presence of dust in the atmosphere, and sensor failure. In this study, the Singular Spectrum Analysis (SSA) algorithm was used to resolve the problem of missing and outlier data caused by cloud cover. The study region was an image frame of MODIS with horizontal number 22 and vertical number 05 (h22v05), which covers a large part of Iran, Turkmenistan, and the Caspian Sea. The MODIS LST product (MOD11A1) for 2015 was used, with 1 × 1 km spatial resolution and day/night LST data (daily temporal resolution). The data quality results showed that cloud cover caused 36.37% of missing data in the studied time series of 730 day/night LST images. Further, the SSA reconstruction of the LST images gave a Root Mean Square Error (RMSE) of 2.95 K between the original and reconstructed LST time series in the study region. In general, the findings showed that the SSA algorithm using spatio-temporal interpolation in LST time series can be effectively used to resolve the problem of missing data caused by cloud cover.

