Sparse coded spatial pyramid matching and multi-kernel integrated SVM for non-linear scene classification

2021 ◽  
Vol 72 (6) ◽  
pp. 374-380
Author(s):  
Bhavinkumar Gajjar ◽  
Hiren Mewada ◽  
Ashwin Patani

Abstract Support vector machine (SVM) techniques and deep learning have been prevalent in object classification for many years. However, deep learning is computation-intensive and can require a long training time. SVM is significantly faster than a convolutional neural network (CNN). However, the SVM has seen limited application to mid-size datasets because it requires proper tuning. Recently, the parameterization of multiple kernels has shown greater flexibility in characterizing a dataset. Therefore, this paper proposes a sparse coded multi-scale approach that reduces training complexity and SVM tuning by using a non-linear fusion of kernels for large-class natural scene classification. The optimum features are obtained by parameterizing the dictionary, the Scale Invariant Feature Transform (SIFT) parameters, and the fusion of multiple kernels. Experiments were conducted on a large dataset to examine the capability of the multi-kernel space to find distinct features for better classification. The proposed approach is found to be more promising than linear multi-kernel SVM approaches, achieving a maximum accuracy of 91.12%.
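The difference between a linear and a non-linear fusion of kernels can be illustrated with a small sketch. The abstract does not state which base kernels or which fusion rule the authors use, so the RBF and polynomial kernels, the product rule, and all parameter values below are assumptions chosen only for illustration.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (<x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree

def linear_fusion(x, y, weight=0.5):
    """Linear multi-kernel fusion: a convex combination of the base kernels."""
    return weight * rbf_kernel(x, y) + (1.0 - weight) * poly_kernel(x, y)

def product_fusion(x, y):
    """One possible non-linear fusion (assumed here): the product of the base kernels."""
    return rbf_kernel(x, y) * poly_kernel(x, y)

def gram_matrix(samples, kernel):
    """Pairwise kernel values for a list of feature vectors."""
    return [[kernel(a, b) for b in samples] for a in samples]

# Toy 2-D feature vectors standing in for pooled image features.
samples = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
gram = gram_matrix(samples, product_fusion)
```

Both the weighted sum and the product of valid kernels are themselves valid kernels, so either Gram matrix could be handed to an SVM implementation that accepts a precomputed kernel, such as scikit-learn's `SVC(kernel="precomputed")`.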

2021 ◽  
Author(s):  
Shuaizhou Hu ◽  
Xinyao Zhang ◽  
Hao-yu Liao ◽  
Xiao Liang ◽  
Minghui Zheng ◽  
...  

Abstract Remanufacturing sites often receive products with different brands, models, conditions, and quality levels. Proper sorting and classification of the waste stream is a primary step in efficiently recovering and handling used products. Correct classification is particularly crucial in future electronic waste (e-waste) management sites equipped with Artificial Intelligence (AI) and robotic technologies. Robots should be equipped with proper algorithms to recognize and classify products with different features and prepare them for assembly and disassembly tasks. In this study, two categories of techniques, Machine Learning (ML) and Deep Learning (DL), are used to classify consumer electronics. The ML models include Naïve Bayes with Bernoulli, Gaussian, and Multinomial distributions, and Support Vector Machine (SVM) algorithms with four kernels: Linear, Radial Basis Function (RBF), Polynomial, and Sigmoid. The DL models include VGG-16, GoogLeNet, Inception-v3, Inception-v4, and ResNet-50. These models are used to classify three laptop brands: Apple, HP, and ThinkPad. First, the Edge Histogram Descriptor (EHD) and Scale Invariant Feature Transform (SIFT) are used to extract features as inputs to the ML models for classification. The DL models use the laptop images directly, without pre-processing for feature extraction. The trained models overfit slightly because of the limited dataset and the complexity of the model parameters. Despite the slight overfitting, the models can identify each brand. The findings show that the DL models outperform the ML models. Among the DL models, GoogLeNet has the highest performance in identifying the laptop brands.


2012 ◽  
Vol 190-191 ◽  
pp. 1099-1103 ◽  
Author(s):  
Jun Guo ◽  
Chang Ren Zhu

In this paper we propose an automatic ship detection method for high-resolution optical satellite images based on neighbor context information. First, a pre-detection of targets gives us candidates. For each candidate, we choose an extended region, called the candidate with neighborhood, which comprises the candidate and its neighboring area. Second, patches of the candidate with neighborhood are sampled on a regular grid, and their SIFT (Scale Invariant Feature Transform) features are extracted. Then the SIFT features of the training images are clustered with the K-means algorithm to form a codebook of the patches. We quantize the patches of the candidate with neighborhood according to this codebook to obtain the visual word representation. Finally, by applying spatial pyramid matching, the candidates are classified with an SVM (support vector machine). Experimental results on a set of images show that our method achieves superior performance.
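The quantization and spatial pyramid steps described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the 2-D descriptors stand in for 128-D SIFT vectors, the function names are hypothetical, and the per-level weights used in standard spatial pyramid matching are omitted for brevity.

```python
def nearest_word(descriptor, codebook):
    """Index of the codeword closest to the descriptor (squared Euclidean distance)."""
    def sq_dist(word):
        return sum((d - w) ** 2 for d, w in zip(descriptor, word))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))

def spatial_pyramid_histogram(patches, codebook, width, height, levels=2):
    """Concatenate per-cell visual-word histograms over the pyramid levels.

    patches: list of (x, y, descriptor) with 0 <= x < width and 0 <= y < height.
    Level l splits the region into 2**l x 2**l cells.
    """
    n_words = len(codebook)
    feature = []
    for level in range(levels):
        cells = 2 ** level
        hist = [0] * (cells * cells * n_words)
        for x, y, descriptor in patches:
            col = min(int(x * cells / width), cells - 1)
            row = min(int(y * cells / height), cells - 1)
            word = nearest_word(descriptor, codebook)
            hist[(row * cells + col) * n_words + word] += 1
        feature.extend(hist)
    return feature

# Toy codebook with two visual words and four grid patches in an 8x8 region.
codebook = [[0.0, 0.0], [1.0, 1.0]]
patches = [
    (1, 1, [0.1, 0.0]),  # top-left cell, closest to word 0
    (6, 1, [0.9, 1.0]),  # top-right cell, closest to word 1
    (1, 6, [0.2, 0.1]),  # bottom-left cell, closest to word 0
    (6, 6, [1.0, 0.8]),  # bottom-right cell, closest to word 1
]
feature = spatial_pyramid_histogram(patches, codebook, width=8, height=8)
```

The resulting vector keeps the coarse layout of visual words, which a plain bag-of-words histogram discards, and it can be fed to an SVM as the candidate's feature.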


2019 ◽  
Vol 8 (2) ◽  
pp. 6053-6057

Telugu is one of the most widely spoken Indian languages in the world. Because of its long heritage, Telugu literature and newspaper publications can be scanned to identify individual words. Identification of Telugu word images poses serious problems owing to the script's complex structure and large set of individual characters. This paper aims to develop a novel methodology to achieve this by using SIFT (Scale Invariant Feature Transform) features of Telugu words and classifying these features using BoVW (bag of visual words). The features are clustered to create a dictionary using k-means clustering. These words are used to create a visual codebook of the word images, and the classification is achieved through an SVM (Support Vector Machine).
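A minimal sketch of the dictionary-building step follows, assuming plain Lloyd's k-means with a deterministic initialization. The 2-D points are toy stand-ins for SIFT descriptors, and the function names are hypothetical rather than taken from the paper.

```python
def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; centroids start at the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the previous centroid when a cluster is empty
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids

def bovw_histogram(descriptors, codebook):
    """Normalized visual-word frequencies for one word image."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(
            range(len(codebook)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(d, codebook[i])),
        )
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts]

# Toy training descriptors forming two well-separated groups.
training = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [10.0, 11.0], [1.0, 0.0], [11.0, 10.0]]
codebook = kmeans(training, k=2)

# Descriptors of one hypothetical word image, encoded against the codebook.
histogram = bovw_histogram([[0.2, 0.1], [10.5, 10.2], [9.8, 10.9]], codebook)
```

Each word image is thus reduced to a fixed-length histogram regardless of how many keypoints it contains, which is what makes it usable as SVM input.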


Today, digital image processing is used in diverse fields. This paper compares the outcomes of two commonly used techniques in image processing operations, namely Speeded Up Robust Features (SURF) points and Scale Invariant Feature Transform (SIFT) points. The study focuses on leaf veins for the identification of plants. An algorithm sequence is used to recognize leaves. SURF and SIFT extraction are applied separately to define and distinguish the limited structures of the documented leaf vein image, and a Support Vector Machine (SVM) is integrated to classify and identify the correct plant. The results show that the SURF algorithm is the faster of the two and is efficient. The results of the study can be extrapolated to authenticate medicinal plants, which is the first step toward standardizing herbs and carrying out research.


Mathematics ◽  
2021 ◽  
Vol 9 (23) ◽  
pp. 3035
Author(s):  
Feiyue Deng ◽  
Yan Bi ◽  
Yongqiang Liu ◽  
Shaopu Yang

Remaining useful life (RUL) prediction of key components is an important factor in making accurate maintenance decisions for mechanical systems. With the rapid development of deep learning (DL) techniques, research on RUL prediction based on data-driven models is increasingly widespread. Compared with conventional convolutional neural networks (CNNs), multi-scale CNNs can extract feature information at different scales and exhibit better performance in RUL prediction. However, existing multi-scale CNNs employ multiple convolution kernels of different sizes to construct the network framework. This approach has two main shortcomings: (1) the convolution operation based on kernels of multiple sizes requires enormous computation and has low operational efficiency, which severely restricts its application in practical engineering; (2) a convolutional layer with a large kernel needs a large number of weight parameters, leading to a dramatic increase in network training time and making it prone to overfitting on small datasets. To address these issues, a multi-scale dilated convolution network (MsDCN) is proposed for RUL prediction in this article. The MsDCN adopts a new multi-scale dilation convolution fusion unit (MsDCFU), in which the multi-scale network framework is composed of convolution operations with different dilation factors. This effectively expands the receptive field (RF) of the convolution kernel without an additional computational burden. Moreover, the MsDCFU employs depthwise separable convolution (DSC) to further improve the operational efficiency of the prognostics model. Finally, the proposed method was validated with accelerated degradation test data of rolling element bearings (REBs). The experimental results demonstrate that the proposed MsDCN has a higher RUL prediction accuracy than some typical CNNs and better operational efficiency than existing multi-scale CNNs based on different convolution kernel sizes.
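The efficiency argument can be checked with simple arithmetic. The sketch below assumes 1-D convolutions, ignores bias terms, and uses illustrative channel counts and dilation factors that are not taken from the paper.

```python
def effective_kernel_size(kernel_size, dilation):
    """Input span covered by a dilated kernel: d * (k - 1) + 1."""
    return dilation * (kernel_size - 1) + 1

def standard_conv_weights(kernel_size, in_channels, out_channels):
    """Weight count of a standard 1-D convolution (bias ignored)."""
    return kernel_size * in_channels * out_channels

def depthwise_separable_weights(kernel_size, in_channels, out_channels):
    """Depthwise weights (k per input channel) plus pointwise 1x1 weights (bias ignored)."""
    return kernel_size * in_channels + in_channels * out_channels

# Three branches with the same size-3 kernel but different dilation factors.
spans = [effective_kernel_size(3, d) for d in (1, 2, 4)]

# Covering a span of 9 with a plain kernel versus a dilated size-3 kernel,
# for an assumed layer with 16 input and 32 output channels.
large_kernel = standard_conv_weights(9, 16, 32)
dilated_kernel = standard_conv_weights(3, 16, 32)
separable_dilated = depthwise_separable_weights(3, 16, 32)
```

Dilation changes only the spacing of the kernel taps, so the weight count stays that of the size-3 kernel while the span grows to 9. Replacing the standard convolution with a depthwise separable one reduces the count further.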


2019 ◽  
Vol 9 (18) ◽  
pp. 3935 ◽  
Author(s):  
Kazushige Okayasu ◽  
Kota Yoshida ◽  
Masataka Fuchida ◽  
Akio Nakamura

This study proposes a vision-based method to classify mosquito species. To investigate the efficiency of the method, we compared two classification methods: the conventional method based on handcrafted features and the deep learning method based on convolutional neural networks. For the conventional method, 12 types of features were adopted for handcrafted feature extraction, and a support vector machine was adopted for classification. For the deep learning method, three types of architectures were adopted for classification. We built a mosquito image dataset of 14,400 images covering three mosquito species. The dataset comprised 12,000 images for training, 1,500 images for testing, and 900 images for validation. Experimental results revealed that the accuracy of the conventional method using the scale-invariant feature transform algorithm was 82.4% at maximum, whereas the accuracy of the deep learning method was 95.5% with a residual network using data augmentation. From the experimental results, deep learning can be considered effective for classifying the mosquito species in the proposed dataset. Furthermore, data augmentation improves the accuracy of mosquito species classification.


2019 ◽  
pp. 1-3
Author(s):  
Anita Kaklotar

Breast cancer is the primary and most common disease found among women. Today, mammography is the most powerful screening technique for early detection of cancer, which increases the chance of successful treatment. To correctly classify mammogram images as benign or malignant, a classifier is needed. With this objective, an attempt is made to analyze different feature extraction techniques and classifiers. In the proposed system, we first preprocess the mammogram images to remove unwanted noise and disturbances. Features are then extracted from the mammogram images using the Gray Level Co-occurrence Matrix (GLCM) and Scale Invariant Feature Transform (SIFT). Finally, the features are classified using classifiers such as HiCARe (Classifier based on High Confidence Association Rule Agreements), Support Vector Machine (SVM), Naïve Bayes, and K-NN. We then test the images and classify them as benign or malignant.


Author(s):  
L. Yang ◽  
L. Shi ◽  
P. Li ◽  
J. Yang ◽  
L. Zhao ◽  
...  

Because of forward scattering and blocking of the radar signal, water, bare soil, and shadow, termed low backscattering objects (LBOs), often present low backscattering intensity in polarimetric synthetic aperture radar (PolSAR) images. Because LBOs give rise to similar backscattering intensities and polarimetric responses, spectral-based classifiers such as the Wishart method are inefficient at LBO classification. Although some polarimetric features have been exploited to relieve this confusion, the backscattering features are still found to be unstable when the system noise floor varies in the range direction. This paper introduces a simple but effective scene classification method based on the Bag of Words (BoW) model and a Support Vector Machine (SVM) to discriminate the LBOs without relying on any polarimetric features. In the proposed approach, square windows are first opened adaptively around the LBOs to determine the scene images, and then Scale-Invariant Feature Transform (SIFT) points are detected in the training and test scenes. The detected SIFT features are clustered using K-means to obtain cluster centers as the visual word list, and scene images are represented using word frequencies. Finally, the SVM is trained and used to predict which kind of LBO a new scene contains. The proposed method is applied to two AIRSAR data sets, at C band and L band, including water, bare soil, and shadow scenes. The experimental results illustrate the effectiveness of the scene-based method in distinguishing LBOs.

