υ-Nonparallel parametric margin fuzzy support vector machine

2021 ◽  
pp. 1-17
Author(s):  
Hongmei Ju ◽  
Yafang Zhang ◽  
Ye Zhao

Classification is an important research direction in machine learning. The υ-nonparallel support vector machine (υ-NPSVM) is an important classifier for such problems, widely used because of its structural risk minimization principle, kernel trick, and sparsity. However, υ-NPSVM is affected by sample noise and heteroscedastic noise structure, both of which degrade its performance. In this paper, two improvements are made to the υ-NPSVM model, and a υ-nonparallel parametric margin fuzzy support vector machine (par-υ-FNPSVM) is established. First, to handle noise that may exist in the data set, neighbor information is used to assign a fuzzy membership to each sample, so that the contribution of each sample to the classification is weighted differently. Second, to reduce the effect of heteroscedastic structure, an insensitive loss function is introduced. The advantages of the new model are verified through experiments on UCI machine learning standard data sets. Finally, the Friedman test and the Bonferroni-Dunn test are used to verify the statistical significance of the new model's advantages.
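The abstract does not give the membership formula, so the following is only a minimal sketch of how neighbor information could be turned into a fuzzy membership. The rule used here (share of same-label samples among the k nearest neighbors, with a lower bound `floor`) and the names `knn_fuzzy_membership`, `k`, and `floor` are assumptions for illustration, not the authors' definition.

```python
import math

def knn_fuzzy_membership(points, labels, k=3, floor=0.1):
    """Assumed rule: membership grows with the share of the k nearest
    neighbours that carry the same label as the sample itself."""
    memberships = []
    for i, p in enumerate(points):
        # Distances from sample i to every other sample, nearest first.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        same = sum(1 for j in neighbours if labels[j] == labels[i])
        # Map the same-label share into [floor, 1].
        memberships.append(floor + (1.0 - floor) * same / k)
    return memberships

# Toy data: the last point carries label -1 but sits inside the +1 cluster.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (0.05, 0.05)]
labels = [1, 1, 1, 1, -1, -1, -1, -1]
m = knn_fuzzy_membership(points, labels, k=3)
print(m)
```

Under this assumed rule, the sample that disagrees with all of its neighbors receives the smallest membership, so it would contribute least to a membership-weighted loss.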

2021 ◽  
Vol 40 (1) ◽  
pp. 1457-1470
Author(s):  
Hongmei Ju ◽  
Ye Zhao ◽  
Yafang Zhang

Classification is an important research direction in machine learning. The nonparallel support vector machine (NPSVM) is an important classifier for such problems, widely used because of its structural risk minimization principle, kernel trick, and sparsity. When solving multi-class classification problems, however, NPSVM suffers from sample noise, slow decision speed, and unrecognized regions, all of which degrade its performance. In this paper, two improvements are made to the multi-class NPSVM model, and a directed acyclic graph fuzzy nonparallel support vector machine (DAG-F-NPSVM) model is established. First, to handle noise that may exist in the data set, density information is used to assign a fuzzy membership to each sample, so that the contribution of each sample to the classification is weighted differently. Second, to reduce the decision time and eliminate unrecognized regions, the theory of directed acyclic graphs (DAG) is introduced. The advantages of the new model in classification accuracy and decision speed are verified through experiments on UCI machine learning standard data sets. Finally, the Friedman test and the Bonferroni-Dunn test are used to verify the statistical significance of the new method.
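A minimal sketch of the general DAG decision scheme for multi-class problems may help here. It shows only the standard elimination walk over pairwise classifiers; the function `dag_predict`, the stand-in rule `nearest_centre`, and the toy `centres` are hypothetical and do not reproduce the DAG-F-NPSVM model.

```python
def dag_predict(x, classes, pairwise_decide):
    """Walk a decision DAG: each node compares the first and last remaining
    class and discards the loser, so k classes need k - 1 evaluations."""
    remaining = list(classes)
    evaluations = 0
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        winner = pairwise_decide(x, a, b)
        evaluations += 1
        # Drop whichever of the two classes lost this comparison.
        if winner == a:
            remaining.pop()
        else:
            remaining.pop(0)
    return remaining[0], evaluations

# Hypothetical stand-in for trained pairwise classifiers: pick the class
# whose 1-D centre is closer to x.
centres = {"A": 0.0, "B": 1.0, "C": 2.0, "D": 3.0}

def nearest_centre(x, a, b):
    return a if abs(x - centres[a]) <= abs(x - centres[b]) else b

label, n_evals = dag_predict(2.2, list(centres), nearest_centre)
print(label, n_evals)
```

Because every walk ends at exactly one class, no input is left unassigned, and only k - 1 of the k(k - 1)/2 pairwise classifiers are evaluated per sample.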


2013 ◽  
Vol 438-439 ◽  
pp. 1167-1170
Author(s):  
Xu Chao Shi ◽  
Ying Fei Gao

The compression index is an important soil property that is essential to many geotechnical designs. However, determining the compression index from consolidation tests is relatively time-consuming. Support Vector Machine (SVM) is a method from statistical learning theory based on a structural risk minimization principle that minimizes both error and weight terms. Because the parameters of an SVM model are difficult to choose, a genetic SVM is presented in which the SVM parameters are optimized by a Genetic Algorithm (GA). Taking plasticity index, water content, void ratio, and density of soil as the primary influence factors, a prediction model of the compression index based on the GA-SVM approach was obtained. The results of this study show that the GA-SVM approach has the potential to be a practical tool for predicting the compression index of soil.
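The abstract does not describe the GA configuration, so the following is only a rough sketch of how a genetic algorithm can search a parameter space. The operators (truncation selection, averaging crossover, Gaussian mutation, elitism), the function `genetic_search`, and the stand-in objective `surrogate_fitness` are all assumptions; in a real GA-SVM the fitness would be a validation score of an SVM trained with the candidate parameters.

```python
import random

def genetic_search(fitness, bounds, pop_size=20, generations=30, seed=0):
    """Maximise `fitness` over box `bounds` with a very small GA."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]   # truncation selection
        children = [population[0]]              # elitism: keep the best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Averaging crossover plus Gaussian mutation, clipped to bounds.
            child = [
                max(lo, min(hi, (x + y) / 2 + rng.gauss(0, 0.1 * (hi - lo))))
                for x, y, (lo, hi) in zip(a, b, bounds)
            ]
            children.append(child)
        population = children
    return max(population, key=fitness), history

# Hypothetical stand-in for a cross-validated SVM score, expressed over
# (log10 of a penalty parameter, log10 of a kernel width parameter).
def surrogate_fitness(params):
    log_c, log_gamma = params
    return -((log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2)

bounds = [(-3.0, 3.0), (-5.0, 1.0)]
best, history = genetic_search(surrogate_fitness, bounds)
print(best, history[-1])
```

Elitism is used here so that the best candidate is never lost between generations, which keeps the recorded best fitness from decreasing.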


2021 ◽  
Author(s):  
Qifei Zhao ◽  
Xiaojun Li ◽  
Yunning Cao ◽  
Zhikun Li ◽  
Jixin Fan

Collapsibility of loess is a significant factor affecting engineering construction in loess areas, and testing the collapsibility of loess is costly. In this study, a total of 4,256 loess samples are collected from the north, east, west, and middle regions of Xining. 70% of the samples are used to generate the training data set, and the rest are used to generate the verification data set, so as to construct and validate the machine learning models. The six most important factors are selected from thirteen factors by using Grey Relational analysis and multicollinearity analysis: burial depth, water content, specific gravity of soil particles, void rate, geostatic stress, and plasticity limit. To predict the collapsibility of loess, four machine learning methods are studied and compared: Support Vector Machine (SVM), Random Subspace Based Support Vector Machine (RSSVM), Random Forest (RF), and Naïve Bayes Tree (NBTree). The receiver operating characteristic (ROC) curve indicators, standard error (SD), and 95% confidence interval (CI) are used to verify and compare the models in the different research areas. The results show that the RF model is the most efficient in predicting the collapsibility of loess in Xining; its average AUC is above 80%, so it can be used in engineering practice.
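As a small illustration of the evaluation protocol described above, the sketch below splits 4,256 items 70/30 and computes AUC from scores by pairwise comparison. The helper names `train_test_split` and `roc_auc`, the shuffling seed, and the toy labels and scores are assumptions; the study's actual sampling procedure is not given in the abstract.

```python
import random

def train_test_split(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of the items and cut it at the requested fraction."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

train, test = train_test_split(range(4256))
print(len(train), len(test))

# Toy scores only; these are not results from the study.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(auc)
```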


2019 ◽  
Vol 67 (6) ◽  
pp. 1991-2003 ◽  
Author(s):  
Edyta Puskarczyk

Unconventional oil and gas reservoirs from the lower Palaeozoic basin at the western slope of the East European Craton were taken into account in this study. The aim was to supplement and improve standard well log interpretation using machine learning methods, especially artificial neural networks (ANNs). ANNs were applied to standard well logging data, e.g. P-wave velocity, density, resistivity, neutron porosity, radioactivity, and photoelectric factor. During the calculations, information about lithology or stratigraphy was not taken into account. We apply different methods of classification: cluster analysis, support vector machine (SVM), and an artificial neural network (the Kohonen algorithm). We compare the results and analyse the obtained electrofacies. For the same data set, the results of the SVM algorithm were compared with the results of the Kohonen algorithm; they were very similar and in very good agreement. The Kohonen algorithm was used for pattern recognition and identification of electrofacies, and also for geological interpretation of well log data. As a result of applying the Kohonen algorithm, groups corresponding to the gas-bearing intervals were found. The analysis showed diversification between gas-bearing formations and surrounding beds, and it is also shown that internal diversification is present in gas-saturated beds. It is concluded that ANN is a useful and quick tool for preliminary classification of members and identification of gas-saturated intervals.


Author(s):  
Xihua Li ◽  
Fuqiang Wang ◽  
Xiaohong Chen

Due to the radical change in both the Chinese and the global economic environment, it is essential to develop a practical model to predict financial distress. The support vector machine (SVM), a learning machine based on statistical learning theory that embodies the principle of structural risk minimization instead of empirical risk minimization, is a promising method for such financial distress prediction. However, the performance of a single classifier depends to some extent on the pattern characteristics of the sample, and each single classifier has its own uncertainty. Using ensemble methods to predict financial distress is therefore a rising trend in this field. This research puts forward an SVM ensemble based on the Choquet integral for financial distress prediction, in which the Bagging algorithm is used to generate new training sets. The proposed ensemble method can be expressed as "Choquet + Bagging + SVMs". With real data from Chinese listed companies, an experiment is carried out to compare the performance of single classifiers with the proposed ensemble method. Empirical results indicate that the proposed ensemble of SVMs based on the Choquet integral has higher average accuracy and stability than single SVM classifiers.
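To make the aggregation step concrete, here is a minimal sketch of the discrete Choquet integral applied to the outputs of three classifiers. The fuzzy measure values, the classifier names `svm1` to `svm3`, and the scores are invented for illustration; the abstract does not say how the authors determine the fuzzy measure.

```python
from itertools import combinations

def choquet_integral(scores, measure):
    """Discrete Choquet integral of `scores` (source -> value) with respect
    to `measure` (frozenset of sources -> weight of that coalition)."""
    ordered = sorted(scores, key=scores.get)  # sources by ascending score
    total, previous = 0.0, 0.0
    for i, source in enumerate(ordered):
        # Coalition of all sources scoring at least as much as this one.
        coalition = frozenset(ordered[i:])
        total += (scores[source] - previous) * measure[coalition]
        previous = scores[source]
    return total

# Hypothetical outputs of three bagged SVMs for one company.
scores = {"svm1": 0.9, "svm2": 0.6, "svm3": 0.2}

# An additive measure reduces the integral to a weighted mean.
weights = {"svm1": 0.5, "svm2": 0.3, "svm3": 0.2}
additive = {
    frozenset(c): sum(weights[s] for s in c)
    for r in range(1, len(weights) + 1)
    for c in combinations(weights, r)
}
print(choquet_integral(scores, additive))

# A non-additive measure: svm1 and svm2 are treated as partly redundant.
redundant = dict(additive)
redundant[frozenset({"svm1", "svm2"})] = 0.6
print(choquet_integral(scores, redundant))
```

The second call shows what distinguishes the Choquet integral from a weighted average: the weight of a coalition need not equal the sum of its members' weights, so interaction between classifiers can be expressed.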


2013 ◽  
Vol 2013 ◽  
pp. 1-10 ◽  
Author(s):  
Jianwei Liu ◽  
Shuang Cheng Li ◽  
Xionglin Luo

Support vector machine is an effective classification and regression method that uses machine learning theory to maximize the predictive accuracy while avoiding overfitting of data. L2 regularization has been commonly used. If the training data set contains many noise variables, L1-regularized SVM will provide better performance. However, neither L1 nor L2 is the optimal regularization method when handling a large number of redundant values where only a small number of data points are useful for machine learning. We have therefore proposed an adaptive learning algorithm using the iteratively reweighted p-norm regularization support vector machine for 0 < p ≤ 2. A simulated data set was created to evaluate the algorithm. It was shown that a p value of 0.8 was able to produce a better feature selection rate with high accuracy. Four cancer data sets from public data banks were also used for the evaluation. All four evaluations show that the new adaptive algorithm was able to achieve the optimal prediction error using a p value less than that of the L1 norm. Moreover, we observe that the proposed Lp penalty is more robust to noise variables than the L1 and L2 penalties.
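The reweighting idea can be illustrated with a generic scheme in which the Lp penalty is replaced, at each iteration, by a weighted quadratic penalty built from the current coefficients. This is a common textbook construction and only an assumption about the flavour of the method; the functions `lp_penalty` and `reweighting_weights` and the smoothing constant `eps` are not taken from the paper.

```python
def lp_penalty(beta, p):
    """Sum of |beta_j|**p over all coefficients."""
    return sum(abs(b) ** p for b in beta)

def reweighting_weights(beta, p, eps=1e-8):
    """Smoothed weights w_j = (beta_j**2 + eps)**(p/2 - 1), chosen so that
    sum(w_j * beta_j**2) approximates the Lp penalty at the current beta."""
    return [(b * b + eps) ** (p / 2.0 - 1.0) for b in beta]

beta = [2.0, -0.5, 0.0]
p = 0.8
w = reweighting_weights(beta, p)

# The weighted quadratic surrogate matches the Lp penalty at this point.
surrogate = sum(wj * b * b for wj, b in zip(w, beta))
print(surrogate, lp_penalty(beta, p))

# For p < 2, smaller coefficients get larger weights and are pushed
# harder towards zero in the next quadratic subproblem.
print(w[1] > w[0])
```

With p = 2 every weight equals 1, which recovers the ordinary L2 penalty.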


2014 ◽  
Vol 1030-1032 ◽  
pp. 1814-1817
Author(s):  
Lan Lan Kang ◽  
Wen Liang Cao

Support vector machine is a machine learning method proposed in the early 1990s on the basis of statistical learning theory. It takes the structural risk minimization principle as its theoretical basis: by appropriately selecting a subset of functions and a discriminant function within that subset, it minimizes the actual risk of the learning machine, so that a classifier with a small error on limited training samples also keeps a small error on an independent test set. In this paper, the theory, algorithms, and application status of support vector machines are discussed in detail.


2021 ◽  
pp. 1-17
Author(s):  
Ming-Ai Li ◽  
Ruo-Tu Wang ◽  
Li-Na Wei

BACKGROUND: Motor imagery electroencephalogram (MI-EEG) plays an important role in the field of neurorehabilitation, and the fuzzy support vector machine (FSVM) is one of the most used classifiers. Specifically, a fuzzy c-means (FCM) algorithm is used for membership calculation to deal with classification problems with outliers or noise. However, FCM is sensitive to its initial value and easily falls into local optima. OBJECTIVE: The joint optimization of a genetic algorithm (GA) and FCM is proposed to enhance the robustness of fuzzy memberships to the initial cluster centers, yielding an improved FSVM (GF-FSVM). METHOD: The features of each channel of MI-EEG are extracted by the improved refined composite multivariate multiscale fuzzy entropy and fused to form a feature vector for a trial. Then, GA is employed to optimize the initial cluster centers of FCM, and the fuzzy membership degrees are calculated through an iterative process and further applied to classify two-class MI-EEGs. RESULTS: Extensive experiments are conducted on two publicly available data sets; the average recognition accuracies reach 99.89% and 98.81%, and the corresponding kappa values are 0.9978 and 0.9762, respectively. CONCLUSION: The cluster centers of FCM optimized via GA are almost overlapping, showing great stability, and GF-FSVM obtains higher classification accuracies as well as higher consistency.
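For readers unfamiliar with FCM, the sketch below shows the standard membership formula for one point given fixed cluster centers. It covers only this single step; the GA initialisation and the way memberships enter the FSVM are not described in enough detail in the abstract to reproduce, and the function name `fcm_memberships` and the toy centers are assumptions.

```python
import math

def fcm_memberships(point, centres, m=2.0):
    """Standard fuzzy c-means membership of one point in each cluster:
    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with fuzzifier m > 1."""
    dists = [math.dist(point, c) for c in centres]
    # A point lying exactly on one or more centres belongs fully to them.
    if 0.0 in dists:
        hits = [d == 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]

# Toy example with two hypothetical cluster centres.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
print(u)
```

The memberships of a point always sum to one, and a point far from every center ends up with nearly equal memberships, which is what allows outliers to be down-weighted.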


2017 ◽  
Vol 10 (3) ◽  
pp. 683-690 ◽  
Author(s):  
Kamalpreet Kaur ◽  
O.P. Guptata

Maturity checking has become mandatory for the food industry as well as for farmers, to ensure that fruits and vegetables are ripe and not diseased. However, manual inspection is prone to human error, and unripe fruits and vegetables may decrease production [3]. Thus, this study proposes a tomato classification system for determining the maturity stages of tomato through machine learning, which involves training different algorithms: Decision Tree, Logistic Regression, Gradient Boosting, Random Forest, Support Vector Machine, K-NN, and XG Boost. The system consists of image collection, feature extraction, and training the classifiers on 80% of the total data. The remaining 20% of the data is used for testing. It is concluded from the results that the performance of a classifier depends on the size and kind of features extracted from the data set. The results are obtained in the form of Learning Curve, Confusion Matrix, and Accuracy Score. Of the seven classifiers, Random Forest performs best, with 92.49% accuracy, which is attributed to its capability of handling large data sets. Support Vector Machine shows the lowest accuracy, which is attributed to its difficulty in training on large data sets.
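The confusion matrix and accuracy score mentioned above can be computed as in the following minimal sketch. The stage names and the predictions are invented for illustration and are not data from the study.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in classes] for t in classes]

def accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical maturity stages and test-set predictions.
classes = ["green", "turning", "red"]
y_true = ["green", "green", "turning", "turning", "red", "red", "red", "turning"]
y_pred = ["green", "turning", "turning", "turning", "red", "red", "turning", "turning"]

matrix = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, matrix):
    print(name, row)
print(accuracy(matrix))
```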


Breast cancer is a disease whose incidence among women has increased tremendously in recent years. Mammography is a low-powered X-ray technique for detecting and diagnosing cancer at an early stage. The proposed system addresses two problems: first, detecting tumors as suspicious regions with weak contrast to their background, and second, extracting features that categorize tumors. The classification is done with SVM, a statistical learning method introduced in the early 1990s that has achieved significant results in various fields and has drawn considerable interest in machine learning. Images are classified as benign, malignant, or normal using the SVM classifier. The technique detects tumor regions in mammogram images with an accuracy of more than 80% for linear classification using SVM. The proposed system uses 10-fold cross-validation to obtain a reliable estimate. The Wisconsin breast cancer diagnosis data set is taken from the UCI machine learning repository. Accuracy, sensitivity, specificity, false discovery rate, false omission rate, and the Matthews correlation coefficient are evaluated. The system gives good results in both the training and testing phases, with accuracies of 98.57% and 97.14% using Support Vector Machine and K-Nearest Neighbors, respectively.
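The evaluation measures listed above all follow from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the counts are invented for illustration and are not results from the study, and the function assumes that no denominator is zero.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard measures from true/false positive/negative counts.
    Assumes every denominator below is non-zero."""
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),           # true positive rate
        "specificity": tn / (tn + fp),           # true negative rate
        "false_discovery_rate": fp / (tp + fp),  # wrong among predicted positive
        "false_omission_rate": fn / (fn + tn),   # wrong among predicted negative
        "mcc": (tp * tn - fp * fn) / mcc_den,    # Matthews correlation coefficient
    }

# Hypothetical counts for a malignant-versus-benign test set.
metrics = binary_metrics(tp=40, fp=5, tn=50, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```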

