Forecasting of Steam Coal Price Based on Robust Regularized Kernel Regression and Empirical Mode Decomposition

2021 ◽  
Vol 9 ◽  
Author(s):  
Xiangwan Fu ◽  
Mingzhu Tang ◽  
Dongqun Xu ◽  
Jun Yang ◽  
Donglin Chen ◽  
...  

To address the difficulty of modeling the nonlinear relations in steam coal data, this article proposes a forecasting method for the price of steam coal based on robust regularized kernel regression and empirical mode decomposition. A robust regularized kernel regression model is constructed from a polynomial kernel function, a robust loss function, and an L2 regularization term. The polynomial kernel function does not depend on kernel parameters and can mine global patterns in the dataset, which improves the forecasting stability of the kernel model. The method uses the polynomial kernel to map the features into a high-dimensional space, transforming the nonlinear relations in the original feature space into linear relations in the high-dimensional space, where they can be learned by a linear model. The Huber loss function is chosen to reduce the influence of abnormal noise in the dataset on model performance, and the L2 regularization term reduces the risk of overfitting. A combined model based on empirical mode decomposition (EMD) and the autoregressive integrated moving average (ARIMA) model compensates for the error of the robust regularized kernel regression model, making up for the limitations of a single forecasting model. Finally, the proposed model is verified on the steam coal dataset; after evaluation with indices such as RMSE, MAE, and mean absolute percentage error, it achieves the best index values among the compared models.
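The robust regularized kernel regression component described above can be sketched with standard tools: an explicit degree-2 polynomial feature map stands in for the polynomial kernel, and scikit-learn's `HuberRegressor` combines the Huber loss with an L2 penalty (`alpha`). This is a minimal illustration on synthetic data, not the authors' model; the EMD-ARIMA error-compensation stage is omitted, and all data and parameters here are assumptions.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import HuberRegressor
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
# Synthetic stand-in for steam coal features: 200 samples, 3 inputs,
# a quadratic target plus a few gross outliers (the "abnormal noise"
# the Huber loss is meant to guard against).
X = rng.uniform(-1, 1, size=(200, 3))
y = 2.0 + X[:, 0] * X[:, 1] - 0.5 * X[:, 2] ** 2 + 0.05 * rng.normal(size=200)
y[:5] += 10.0  # outlier points

# Degree-2 polynomial feature map (explicit form of the polynomial kernel),
# then Huber loss with an L2 penalty (alpha) on the weights.
model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    HuberRegressor(epsilon=1.35, alpha=1e-3),
)
model.fit(X, y)
# Mean absolute residual on the inlier points only.
residual = np.abs(model.predict(X[5:]) - y[5:]).mean()
```

Because the Huber loss penalizes large residuals only linearly, the five outliers barely shift the fitted surface, so the inlier residual stays close to the noise level.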

Sensors ◽  
2018 ◽  
Vol 18 (7) ◽  
pp. 2325 ◽  
Author(s):  
Yong Lv ◽  
Houzhuang Zhang ◽  
Cancan Yi

As a data-driven multichannel signal processing method, multivariate empirical mode decomposition (MEMD) has attracted much attention for its self-adaptive, multi-scale decomposition of multivariate data. Commonly, a uniform projection scheme on a hypersphere is used to estimate the local mean. However, unbalanced data distributions in high-dimensional space often conflict with uniform sampling, and the performance is sensitive to noise components. Since, in structural health monitoring of key equipment, the vibration signal is commonly acquired by three sensors at different measuring positions, this paper proposes a novel trivariate empirical mode decomposition via convex optimization for rolling bearing condition identification. For the trivariate data matrix, low-rank matrix approximation via convex optimization is first conducted to achieve denoising; notably, a non-convex penalty function is introduced as a regularization term to enhance performance. A non-uniform sampling scheme is then determined by applying singular value decomposition (SVD) to the resulting low-rank trivariate data, and the approach used in the conventional MEMD algorithm is employed to estimate the local mean. Numerical examples of synthetic signals defined by a fault model and real data generated by faulty rolling bearings on an experimental bench demonstrate the effectiveness of the proposed method.
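The low-rank denoising step can be illustrated with plain truncated SVD on a three-channel signal matrix. The paper's actual method solves a convex optimization problem with a non-convex penalty; hard-rank truncation, used below on assumed synthetic data, is only the simplest analogue of that step.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 1024)
# Three sensor channels sharing two oscillatory components: the clean
# 3 x 1024 matrix has exact rank 2.
clean = np.stack([
    np.sin(2 * np.pi * 30 * t) + 0.5 * np.sin(2 * np.pi * 90 * t),
    0.8 * np.sin(2 * np.pi * 30 * t) - 0.3 * np.sin(2 * np.pi * 90 * t),
    -0.6 * np.sin(2 * np.pi * 30 * t) + 0.7 * np.sin(2 * np.pi * 90 * t),
])
noisy = clean + 0.3 * rng.normal(size=clean.shape)

# Keep only the leading singular directions; the discarded third direction
# is pure noise here, so the approximation error to the clean signal drops.
U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
rank = 2
denoised = (U[:, :rank] * s[:rank]) @ Vt[:rank]

err_before = np.linalg.norm(noisy - clean)
err_after = np.linalg.norm(denoised - clean)
```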


2011 ◽  
Vol 187 ◽  
pp. 319-325
Author(s):  
Wen Ming Cao ◽  
Xiong Feng Li ◽  
Li Juan Pu

Biometric Pattern Recognition aim at finding the best coverage of per kind of sample’s distribution in the feature space. This paper employed geometric algebra to determine local continuum (connected) direction and connected path of same kind of target of SAR images of the complex geometrical body in high dimensional space. We researched the property of the GA Neuron of the coverage body in high dimensional space and studied a kind of SAR ATR(SAR automatic target recognition) technique which works with small data amount and result to high recognizing rate. Finally, we verified our algorithm with MSTAR (Moving and Stationary Target Acquisition and Recognition) [1] data set.


2004 ◽  
Vol 16 (8) ◽  
pp. 1705-1719 ◽  
Author(s):  
Kazushi Ikeda

The generalization properties of learning classifiers with a polynomial kernel function are examined. In kernel methods, input vectors are mapped into a high-dimensional feature space where the mapped vectors are linearly separated. It is well known that a linear dichotomy has an average generalization error, or learning curve, proportional to the dimension of the input space and inversely proportional to the number of given examples in the asymptotic limit. However, this does not hold for kernel methods, since the feature vectors lie on a submanifold of the feature space called the input surface. In this letter, we discuss how the asymptotic average generalization error depends on the relationship between the input surface and the true separating hyperplane in the feature space, where the essential dimension of the true separating polynomial, called its class, is important. We show upper bounds in several cases and confirm them using computer simulations.
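The "input surface" idea is concrete for the homogeneous degree-2 polynomial kernel: a minimal check, with illustrative points chosen here, that the kernel value equals an inner product of explicit feature vectors, so 2-dimensional inputs land on a 2-dimensional surface inside the 3-dimensional feature space.

```python
import numpy as np

# For k(x, z) = (x . z)^2 with d = 2 inputs, an explicit feature map is
# phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2): three coordinates, but the image
# of the plane under phi (the input surface) is only a 2-dimensional submanifold.
def phi(x):
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

x = np.array([0.7, -1.2])
z = np.array([1.5, 0.4])

kernel_value = np.dot(x, z) ** 2          # kernel trick: no explicit map
feature_value = np.dot(phi(x), phi(z))    # explicit map, same inner product
```

Expanding the square shows term-by-term agreement: (x1 z1 + x2 z2)^2 = x1^2 z1^2 + 2 x1 x2 z1 z2 + x2^2 z2^2.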


2014 ◽  
Vol 598 ◽  
pp. 432-435
Author(s):  
Shi Jiao Zhu ◽  
Cheng Jian Liu ◽  
Qing Wang

In this paper, a method is proposed that uses motion vectors in Gabor space to recognize smoke regions. A smoke orbit is described as up-forward and has a similar shape in the feature space. The proposed method is assessed by calculating 45-, 90-, and 135-degree upward vectors to determine the possibility of a smoke region. Different images from smoke video scenes were tested, and the method meets the desired expectations. The next step will be an in-depth study of representations in a high-dimensional space.
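The directional test can be sketched as follows: score how many motion vectors fall near the 45-, 90-, and 135-degree upward bins the abstract mentions. This is a hypothetical reconstruction of that one step (the function name, tolerance, and vector convention are all assumptions), not the paper's Gabor-space pipeline.

```python
import numpy as np

def upward_fraction(vectors, tol_deg=22.5):
    # vectors: (N, 2) array of (dx, dy), with dy > 0 meaning upward motion.
    # Returns the fraction of vectors within tol_deg of 45, 90, or 135 degrees.
    angles = np.degrees(np.arctan2(vectors[:, 1], vectors[:, 0]))
    bins = np.array([45.0, 90.0, 135.0])
    near = np.min(np.abs(angles[:, None] - bins[None, :]), axis=1) <= tol_deg
    return near.mean()

up = np.array([[1.0, 1.0], [0.0, 2.0], [-1.0, 1.0]])   # 45, 90, 135 degrees
down = np.array([[1.0, -1.0], [0.0, -2.0]])            # downward motion
frac_up = upward_fraction(up)
frac_down = upward_fraction(down)
```

A region whose vectors score a high upward fraction across frames would be a smoke candidate under this heuristic.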


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Harsh Patel ◽  
David M. Vock ◽  
G. Elisabeta Marai ◽  
Clifton D. Fuller ◽  
Abdallah S. R. Mohamed ◽  
...  

Abstract To improve risk prediction for oropharyngeal cancer (OPC) patients, cluster analysis was applied to radiomic features extracted from pre-treatment computed tomography (CT) scans. 553 OPC patients, randomly split into training (80%) and validation (20%) sets, were classified into 2 or 3 risk groups by applying hierarchical clustering to the co-occurrence matrix obtained from a random survival forest (RSF) trained on 301 radiomic features. The cluster label was included together with other clinical data to train an ensemble of five predictive models (Cox, random forest, RSF, logistic regression, and logistic-elastic net). Ensemble performance was evaluated on the independent test set for both recurrence-free survival (RFS) and overall survival (OS). The Kaplan–Meier curves for OS stratified by cluster label show significant differences for both training and testing (p < 0.0001). Compared with models trained on clinical data only, including the cluster label improves test AUC from 0.62 to 0.79 for OS and from 0.66 to 0.80 for RFS. The extraction of a single feature, namely a cluster label, to represent the high-dimensional radiomic feature space reduces the dimensionality and sparsity of the data. Moreover, including the cluster label improves model performance compared with clinical data alone and offers performance comparable to models that include the raw radiomic features.
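The core clustering step, hierarchical clustering over a forest co-occurrence (proximity) matrix, can be sketched with SciPy. The proximity matrix below is a synthetic stand-in (the block structure and values are assumptions), not an RSF output: entries near 1 mean two patients frequently co-land in the same terminal nodes.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(2)
# Hypothetical 12-patient proximity matrix with two clear risk groups.
within = rng.uniform(0.7, 1.0, size=(6, 6))
cross = rng.uniform(0.0, 0.2, size=(6, 6))
prox = np.block([[within, cross], [cross.T, within]])
prox = (prox + prox.T) / 2
np.fill_diagonal(prox, 1.0)

# Convert proximity to distance, condense, cluster, and cut at k = 2 groups.
dist = 1.0 - prox
Z = linkage(squareform(dist, checks=False), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
```

In the study, the resulting label is then the single feature carried forward into the ensemble survival models alongside the clinical variables.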


2020 ◽  
Vol 9 (1) ◽  
pp. 45
Author(s):  
Maysa Ibrahem Almulla Khalaf ◽  
John Q Gan

This paper proposes two hybrid feature subset selection approaches based on the combination (union or intersection) of supervised and unsupervised filter approaches before using a wrapper, aiming to obtain low-dimensional features with high accuracy and interpretability and low time consumption. Experiments with the proposed hybrid approaches were conducted on seven high-dimensional feature datasets. The classifiers adopted are support vector machine (SVM), linear discriminant analysis (LDA), and K-nearest neighbour (KNN). The experimental results demonstrate the advantages and usefulness of the proposed methods for feature subset selection in high-dimensional space, in terms of both the number of selected features and the time needed to achieve the best classification accuracy.
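The union/intersection idea can be sketched with two common filters standing in for the paper's (the specific filters, k, and data here are assumptions): a supervised ANOVA F-score filter and an unsupervised variance filter, whose selected index sets are then combined before any wrapper stage.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

X, y = make_classification(n_samples=120, n_features=50, n_informative=8,
                           random_state=0)

# Supervised filter: top-k features by ANOVA F-score against the labels.
sup = set(SelectKBest(f_classif, k=15).fit(X, y).get_support(indices=True))
# Unsupervised filter: top-k features by variance (labels not used).
unsup = set(np.argsort(X.var(axis=0))[-15:])

# The two hybrid candidate subsets combined before the wrapper stage.
union_set = sup | unsup
intersection_set = sup & unsup
```

The intersection yields a smaller, more conservative subset (features both filters agree on), while the union keeps anything either filter ranks highly; the wrapper then searches within the chosen subset.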


Author(s):  
Muhammad Amjad

Advances in manifold learning have proven to be of great benefit in reducing the dimensionality of large, complex datasets. Elements of an intricate dataset typically lie in a high-dimensional space, as the number of individual features or independent variables is extensive. However, these elements can often be integrated into a low-dimensional manifold with well-defined parameters. By constructing a low-dimensional manifold embedded in the high-dimensional feature space, the dataset can be simplified for easier interpretation. Despite this dimensionality reduction, the dataset's constituents do not lose information; rather, the reduction filters it, with the aim of elucidating the appropriate knowledge. This paper explores the importance of this method of data analysis, its applications, and its extensions into topological data analysis.
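The standard textbook illustration of this idea, chosen here as an assumption since the paper names no specific algorithm, is the Swiss roll: points sit in 3-dimensional ambient space but lie on an intrinsically 2-dimensional manifold, which a manifold learner such as Isomap can unroll.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

# 500 points on a 2-D manifold embedded in 3-D ambient space.
X, _ = make_swiss_roll(n_samples=500, random_state=0)

# Isomap approximates geodesic distances along the manifold via a
# nearest-neighbor graph, then embeds the points in 2 dimensions.
embedding = Isomap(n_neighbors=10, n_components=2).fit_transform(X)
```

The 2-dimensional embedding preserves the neighborhood structure of the roll, which is exactly the sense in which the reduction "filters" rather than discards information.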


2021 ◽  
Vol 2021 ◽  
pp. 1-22
Author(s):  
Jiancheng Gong ◽  
Xiaoqiang Yang ◽  
Fan Pan ◽  
Wuqiang Liu ◽  
Fuming Zhou

Rotating machinery refers to machinery that performs its functions mainly through rotation, and it is widely used in engineering applications. Bearings and gearboxes play a key role in rotating machinery, and their states directly affect the operating status of the whole machine, so accurate fault detection and diagnosis of bearings, gearboxes, and other key parts is of great significance to normal operation. A new fault feature extraction algorithm for rotating machinery, Improved Multivariate Multiscale Amplitude-Aware Permutation Entropy (ImvMAAPE), is proposed in this paper; it applies an improved coarse-graining method to fault feature extraction from multichannel signals. The algorithm is combined with Uniform Phase Empirical Mode Decomposition (UPEMD) and t-distributed Stochastic Neighbor Embedding (t-SNE) to form a new time-frequency multiscale feature extraction method. First, the multichannel vibration signals are adaptively decomposed into sets of Intrinsic Mode Functions (IMFs) using UPEMD; the IMF components containing the main fault information are then screened by correlation analysis to obtain the reconstructed signals. The ImvMAAPE values of the reconstructed signals are calculated to generate the initial high-dimensional fault features, and t-SNE, with its excellent nonlinear dimensionality reduction performance, is used to reduce the dimensionality of these feature vectors. Finally, the resulting high-quality low-dimensional feature vectors are input to a random forest (RF) classifier to identify the fault types. Experiments verify that this method achieves higher accuracy and robustness than the compared methods.
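The building blocks of multiscale permutation entropy can be sketched in a few lines. This is plain (not amplitude-aware, not multivariate) permutation entropy with standard non-overlapping coarse-graining, shown only to illustrate the kind of feature ImvMAAPE computes; the paper's improved coarse-graining and amplitude weighting are not reproduced here.

```python
import math
from itertools import permutations
import numpy as np

def coarse_grain(x, scale):
    # Non-overlapping averaging used in multiscale entropy analysis.
    n = len(x) // scale
    return x[:n * scale].reshape(n, scale).mean(axis=1)

def permutation_entropy(x, order=3):
    # Plain permutation entropy, normalized to [0, 1]:
    # count ordinal patterns of each window of `order` samples.
    counts = {p: 0 for p in permutations(range(order))}
    for i in range(len(x) - order + 1):
        counts[tuple(np.argsort(x[i:i + order]))] += 1
    total = sum(counts.values())
    probs = [c / total for c in counts.values() if c > 0]
    return -sum(p * math.log(p) for p in probs) / math.log(math.factorial(order))

rng = np.random.default_rng(3)
# White noise is maximally irregular (entropy near 1); a monotone ramp
# produces a single ordinal pattern (entropy 0).
pe_noise = permutation_entropy(coarse_grain(rng.normal(size=2000), 2))
pe_ramp = permutation_entropy(np.arange(200, dtype=float))
```

In the full pipeline, such entropy values computed per channel and per scale form the high-dimensional feature vector that t-SNE then compresses for the RF classifier.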


2021 ◽  
Author(s):  
Harsh Patel ◽  
David Vock ◽  
Elisabeta Marai ◽  
Clifton Fuller ◽  
Abdallah Mohamed ◽  
...  

Abstract OBJECTIVE: To improve risk prediction for oropharyngeal cancer (OPC) patients using cluster analysis on the radiomic features extracted from pre-treatment computed tomography (CT) scans. MATERIALS AND METHODS: OPC patients were classified into 2 or 3 risk groups by applying hierarchical clustering to the co-occurrence matrix obtained from a random survival forest (RSF) trained on 301 radiomic features. The cluster label was included together with other clinical data to train an ensemble of five predictive models (Cox, random forest, RSF, logistic regression, and logistic-elastic net). Ensemble performance was evaluated on an independent test set for both recurrence-free survival (RFS) and overall survival (OS). RESULTS: The Kaplan-Meier curves for OS stratified by cluster label show significant differences for both training (p < 0.0001) and testing (p = 0.005). Including the cluster label outperforms clinical data alone, improving AUC from 0.60 to 0.76 for OS and from 0.63 to 0.75 for RFS. CONCLUSION: The extraction of a single feature, namely a cluster label, to represent the high-dimensional radiomic feature space reduces the dimensionality and sparsity of the data. Moreover, including the cluster label improves model performance compared with clinical data alone and is comparable to the performance of the raw radiomic features.

