Classification of Fashion Article Images Based on Improved Random Forest and VGG-IE Algorithm

Author(s):  
Jian Liu ◽  
Yuchen Zheng ◽  
Ke Dong ◽  
Haitong Yu ◽  
Jianjun Zhou ◽  
...  

In e-commerce image recommendation systems, the classification of fashion article images does not yet meet practical requirements for accuracy and computation time. Herein, for the first time to our knowledge, we present two distinct image recognition approaches for classifying fashion article images, a random-forest method based on a genetic algorithm (GA-RF) and a Visual Geometry Group-Image Enhancement algorithm (VGG-IE), to address both problems. In GA-RF, the number of segmentation times and the number of decision trees are the key factors affecting the classification results. An improved genetic algorithm is introduced into the parameter optimization of the forest to determine the optimal combination of these two parameters with minimal manual intervention. Finally, we propose six different deep neural network architectures, including VGG-IE, to improve classification accuracy. The VGG-IE algorithm uses batch normalization and seven kinds of training-data augmentation to ease and improve the learning process. We investigate the effectiveness of the proposed methods using the Fashion-MNIST dataset and its 70,000 pictures. Experimental results demonstrate that, in comparison with state-of-the-art algorithms for 10-category image recognition, our VGG algorithm has the shortest computation time when it satisfies a certain classification accuracy, and the VGG-IE approach has the highest classification accuracy.
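The two-parameter search described in this abstract can be illustrated with a minimal genetic-algorithm sketch. Everything specific here is an assumption for illustration: the search ranges, the reading of "segmentation times" as the number of features tried per split, the selection and mutation scheme, and above all `toy_fitness`, which is a stand-in for validation accuracy and does not train a random forest. It is not the authors' improved genetic algorithm.

```python
import random

# Hypothetical search ranges; the abstract does not state the bounds used.
TREE_RANGE = (10, 200)   # number of decision trees
SPLIT_RANGE = (1, 28)    # split-related parameter (assumed: features tried per split)

def toy_fitness(n_trees, n_split):
    """Stand-in for validation accuracy (NOT a real random forest).
    It peaks at an arbitrary point (120, 9) so the search has a target."""
    return 1.0 - ((n_trees - 120) / 200) ** 2 - ((n_split - 9) / 28) ** 2

def clip(value, bounds):
    """Keep a mutated gene inside its allowed range."""
    return max(bounds[0], min(bounds[1], value))

def genetic_search(fitness, pop_size=20, generations=30, mutation_rate=0.3, seed=0):
    rng = random.Random(seed)
    pop = [(rng.randint(*TREE_RANGE), rng.randint(*SPLIT_RANGE))
           for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the fitter half survives unchanged (elitism).
        pop.sort(key=lambda ind: fitness(*ind), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [a[0], b[1]]  # crossover: one gene from each parent
            if rng.random() < mutation_rate:
                child[0] = clip(child[0] + rng.randint(-10, 10), TREE_RANGE)
            if rng.random() < mutation_rate:
                child[1] = clip(child[1] + rng.randint(-2, 2), SPLIT_RANGE)
            children.append(tuple(child))
        pop = parents + children
    return max(pop, key=lambda ind: fitness(*ind))

best = genetic_search(toy_fitness)
print("best (n_trees, n_split):", best)
```

In a real setting, `toy_fitness` would be replaced by a function that trains a forest with the candidate parameters and returns its validation accuracy, which is where nearly all of the computation time would go.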

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Hamideh Soltani ◽  
Zahra Einalou ◽  
Mehrdad Dadgostar ◽  
Keivan Maghooli

Abstract: Brain-computer interface (BCI) systems have been regarded as a new way of communication for humans. In this research, common methods such as the wavelet transform are applied to extract features. A genetic algorithm (GA), as an evolutionary method, is then used to select features. Finally, classification was done using two approaches, a support vector machine (SVM) and a Bayesian method. Five features were selected, and the accuracy of Bayesian classification was measured to be 80% with dimension reduction. Ultimately, the classification accuracy reached 90.4% using the SVM classifier. The results of the study indicate better feature selection and effective dimension reduction of these features, as well as higher classification accuracy in comparison with other studies.
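As a rough illustration of wavelet-based feature extraction, the sketch below computes sub-band energies with a Haar wavelet. The choice of the Haar wavelet, the two decomposition levels, the energy features, and the sample signal are all assumptions made for illustration; the abstract does not say which wavelet or features were used.

```python
import math

def haar_step(signal):
    """One level of the Haar discrete wavelet transform (even-length input)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def subband_energy_features(signal, levels=2):
    """Energy of the detail coefficients at each level, then the energy
    of the final approximation. Returns levels + 1 features."""
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in current))
    return features

# Made-up 8-sample signal standing in for one EEG segment.
segment = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
features = subband_energy_features(segment)
print(features)
```

Because the Haar transform is orthonormal, the feature values sum to the energy of the input segment. A feature-selection step such as the GA mentioned above would then pick a subset of features like these.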


2022 ◽  
Vol 10 (1)

Effective productivity estimates of freshly produced crops are essential for efficient farming, commercial planning, and logistical support. In the past ten years, machine learning (ML) algorithms have been widely used for grading and classification of agricultural products. However, precise and accurate assessment of the maturity level of tomatoes using ML algorithms is still quite challenging, because these algorithms rely on hand-crafted features. Hence, in this paper we propose a deep-learning-based tomato maturity grading system that helps to increase the accuracy and adaptability of maturity grading tasks with less training data. The performance of the proposed system is assessed on real tomato datasets collected from open fields using a Nikon D3500 CCD camera. The proposed approach achieved an average maturity classification accuracy of 99.8%, which appears quite promising in comparison with other state-of-the-art methods.


2019 ◽  
Vol 11 (24) ◽  
pp. 3000 ◽  
Author(s):  
Francisco Alonso-Sarria ◽  
Carmen Valdivieso-Ros ◽  
Francisco Gomariz-Castillo

Supervised land cover classification from remote sensing imagery is based on gathering a set of training areas to characterise each of the classes and to train a predictive model that is then used to predict land cover in the rest of the image. This procedure relies mainly on the assumptions of statistical separability of the classes and the representativeness of the training areas. This paper uses isolation forests, a type of random tree ensemble, to analyse both assumptions and to correct lack of representativeness by digitising new training areas where needed, in order to improve the classification of a set of Landsat-8 images with Random Forest. The results show that the improved set of training areas after the isolation forest analysis is more representative of the whole image and increases classification accuracy. In addition, the distribution of isolation values can be useful to estimate class separability. A class separability parameter that summarises such distributions is proposed. This parameter is more correlated with omission and commission errors than other separability measures such as the Jeffries–Matusita distance.
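For reference, the Jeffries–Matusita distance mentioned above is commonly written as follows for two classes i and j modelled as multivariate Gaussians with means μ and covariance matrices Σ. This is the standard remote-sensing formulation, given here as background and not quoted from the paper; the symbols are assumptions of this note.

```latex
\begin{aligned}
B_{ij} &= \tfrac{1}{8}\,(\mu_i-\mu_j)^{\top}\,\Sigma^{-1}\,(\mu_i-\mu_j)
        + \tfrac{1}{2}\,\ln\frac{|\Sigma|}{\sqrt{|\Sigma_i|\,|\Sigma_j|}},
        \qquad \Sigma = \tfrac{1}{2}\,(\Sigma_i+\Sigma_j) \\
JM_{ij} &= 2\left(1 - e^{-B_{ij}}\right)
\end{aligned}
```

Here B is the Bhattacharyya distance. JM ranges from 0 (identical class distributions) to 2 (fully separable classes).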


Author(s):  
Lalu Zulfikar Muslim ◽  
I Gede Pasek Suta Wijaya ◽  
Fitri Bimantoro

Computer-based classification of fruit quality from image data is highly necessary. It can also support decisions and policies related to business strategy in the industry. In this research, the quality classification of watermelon was carried out using the Weighted K-Means algorithm. Watermelons were divided into three groups, namely fresh, medium, and rotten. The classification process in the system is divided into two stages, namely training and testing. The data input into the system is watermelon image data in YCbCr format. In the training phase, the input data is image data that has already been classified. In the testing/classification phase, the input data is an arbitrary image that has not been classified. The watermelon case study using the weighted k-means algorithm leads to the conclusion that the greater the amount of training data, the longer the computing time needed for training and testing, but also the better the accuracy, precision, and recall of the classification results. In contrast, the larger the value of k, the longer the computing time needed for training and testing, while the accuracy, precision, and recall of the classification results decrease.
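Since the system takes image data in YCbCr format, a pixel-level conversion from RGB may help make the input concrete. The sketch below uses the full-range BT.601 coefficients found in JFIF/JPEG; whether the study used exactly this variant is an assumption, and the function name is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (BT.601 / JFIF).
    Results are rounded and clamped to the 0..255 range."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return tuple(max(0, min(255, round(v))) for v in (y, cb, cr))

white = rgb_to_ycbcr(255, 255, 255)
black = rgb_to_ycbcr(0, 0, 0)
print(white, black)
```

Neutral colours map to Cb = Cr = 128, so the chroma channels isolate colour information from brightness.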


2004 ◽  
Vol 43 (02) ◽  
pp. 192-201 ◽  
Author(s):  
R. E. Abdel-Aal

Summary Objectives: To introduce abductive network classifier committees as an ensemble method for improving classification accuracy in medical diagnosis. While neural networks allow many ways to introduce enough diversity among member models to improve performance when forming a committee, the self-organizing, automatic-stopping nature, and learning approach used by abductive networks are not very conducive for this purpose. We explore ways of overcoming this limitation and demonstrate improved classification on three standard medical datasets. Methods: Two standard 2-class medical datasets (Pima Indians Diabetes and Heart Disease) and a 6-class dataset (Dermatology) were used to investigate ways of training abductive networks with adequate independence, as well as methods of combining their outputs to form a network that improves performance beyond that of single models. Results: Two- or three-member committees of models trained on completely or partially different subsets of training data and using simple output combination methods achieve improvements between 2 and 5 percentage points in the classification accuracy over the best single model developed using the full training set. Conclusions: Varying model complexity alone gives abductive network models that are too correlated to ensure enough diversity for forming a useful committee. Diversity achieved through training member networks on independent subsets of the training data outweighs limitations of the smaller training set for each, resulting in net gain in committee performance. As such models train faster and can be trained in parallel, this can also speed up classifier development.
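The "simple output combination methods" mentioned above can be sketched in a few lines. Both functions below are generic illustrations, averaging of member probabilities and majority voting; the abstract does not specify which combination rules were used, and the member outputs are made-up values.

```python
def average_committee(member_probs):
    """Average per-class probabilities across committee members.
    Returns (predicted class index, averaged probabilities)."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    avg = [sum(m[c] for m in member_probs) / n_members for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__), avg

def majority_vote(member_labels):
    """Return the most frequent label; ties go to the label seen first."""
    counts = {}
    for label in member_labels:
        counts[label] = counts.get(label, 0) + 1
    return max(counts, key=counts.get)

# Made-up outputs of a three-member committee on one 2-class case.
probs = [[0.60, 0.40], [0.30, 0.70], [0.45, 0.55]]
label, avg = average_committee(probs)
vote = majority_vote([0, 1, 1])
print(label, avg, vote)
```

Averaging only helps when the members make different errors, which is why the abstract stresses training them on different data subsets.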


2021 ◽  
Vol 5 (1) ◽  
pp. 187-192
Author(s):  
Yoga Religia ◽  
Agung Nugroho ◽  
Wahyu Hadikristanto

In banking, marketers must be able to reduce lending risk by keeping customers from falling into non-performing loans. One way to reduce this risk is to use data mining techniques. Data mining provides a powerful means of finding meaningful and useful information in large amounts of data through classification. The Random Forest (RF) algorithm is a classification algorithm that can handle class-imbalance problems. However, several references state that an optimization algorithm is needed to improve the classification results of the RF algorithm. The RF algorithm can be optimized using Bagging and a Genetic Algorithm (GA). This study aims to classify Bank Marketing data on loan application acceptance, taken from the www.data.world site. Classification is carried out using the RF algorithm to obtain a predictive model for loan application acceptance with optimal accuracy. This study also compares the RF algorithm with and without Bagging and Genetic Algorithm optimization. The tests show that the best performance on the Bank Marketing data is obtained with the RF algorithm, with an accuracy of 88.30%, AUC (+) of 0.500 and AUC (-) of 0.000. Optimization with Bagging and the Genetic Algorithm did not improve the performance of the RF algorithm on this data.


2021 ◽  
Author(s):  
Anton Korosov ◽  
Hugo Boulze ◽  
Julien Brajard

A new algorithm for classification of sea ice types on Sentinel-1 Synthetic Aperture Radar (SAR) data using a convolutional neural network (CNN) is presented. The CNN is trained on reference ice charts produced by human experts and compared with an existing machine learning algorithm based on texture features and random forest classifier. The CNN is trained on a dataset from winter 2020 for retrieval of four classes: ice free, young ice, first-year ice and old ice. The accuracy of our classification is 91.6%. The error is a bit higher for young ice (76%) and first-year ice (84%). Our algorithm outperforms the existing random forest product for each ice type. It has also proved to be more efficient in computing time and less sensitive to the noise in SAR data.

Our study demonstrates that CNN can be successfully applied for classification of sea ice types in SAR data. The algorithm is applied in small sub-images extracted from a SAR image after preprocessing including thermal noise removal. Validation shows that the errors are mostly attributed to coarse resolution of ice charts or misclassification of training data by human experts.

Several sensitivity experiments were conducted for testing the impact of CNN architecture, hyperparameters, training parameters and data preprocessing on accuracy. It was shown that a CNN with three convolutional layers, two max-pool layers and three hidden dense layers can be applied to a sub-image with size 50 x 50 pixels for achieving the best results. It was also shown that a CNN can be applied to SAR data without thermal noise removal on the preprocessing step. Understandably, the classification accuracy decreases to 89% but remains reasonable.

The main advantages of the new algorithm are the ability to classify several ice types, higher classification accuracy for each ice type and higher speed of processing than in the previous studies. The relative simplicity of the algorithm (both texture analysis and classification are performed by CNN) is also a benefit. In addition to providing ice type labels, the algorithm also derives the probability of belonging to a class. Uncertainty of the method can be derived from these probabilities and used in the assimilation of ice type in numerical models.

Given the high accuracy and processing speed, the CNN-based algorithm is included in the Copernicus Marine Environment Monitoring Service (CMEMS) for operational sea ice type retrieval for generating ice charts in the Arctic Ocean. It is already released as an open source software and available on GitHub: https://github.com/nansencenter/s1_icetype_cnn.
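One way to turn per-class probabilities into an uncertainty value is normalized Shannon entropy, sketched below. This is an assumed measure chosen for illustration; the abstract does not say how uncertainty is derived, and the logits are made-up numbers, not output of the released model.

```python
import math

ICE_CLASSES = ["ice free", "young ice", "first-year ice", "old ice"]

def softmax(logits):
    """Convert raw network outputs to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Shannon entropy divided by log(K): 0 means certain, 1 means uniform."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

# Made-up logits for one 50 x 50 sub-image.
probs = softmax([2.0, 0.5, 0.1, -1.0])
label = ICE_CLASSES[probs.index(max(probs))]
uncertainty = normalized_entropy(probs)
print(label, round(uncertainty, 3))
```

A value near 0 indicates a confident label, while a value near 1 indicates that the four classes are almost equally likely, which is the kind of information a data assimilation scheme could use to weight observations.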


2020 ◽  
Vol 62 (1) ◽  
pp. 15-21
Author(s):  
Changdong Wu

In an online monitoring system for an electrified railway, it is important to classify the catenary equipment successfully. The extreme learning machine (ELM) is an effective image classification algorithm and the genetic algorithm (GA) is a typical optimisation method. In this paper, a coupled genetic algorithm-extreme learning machine (GA-ELM) technique is proposed for the classification of catenary equipment. Firstly, the GA is used to search for optimal features by reducing the initial multi-dimensional features to low-dimensional features. Next, the optimised features are used as the input to the ELM. The ELM algorithm is then used to classify the catenary equipment. In this process, the impacts of the activation function, the number of hidden layer neurons and different models on the performance of the ELM are discussed in turn. Finally, the proposed method is compared with traditional methods in terms of classification accuracy and efficiency. Experimental results show that the number of feature dimensions decreases to 58% of the original number and the computational complexity is greatly decreased. Moreover, the reduced features and the few steps of the ELM improve the classification accuracy and speed. Noticeably, when the performance of the GA-ELM method is compared with that of the ELM method, the classification accuracy rate is 93.33% compared with 85.83% and the time consumption is 2.25 s compared with 8.85 s, respectively. That is to say, the proposed method not only decreases the number of features but also increases the classification accuracy and efficiency. This meets the needs of a real-time online condition monitoring system.


2017 ◽  
Vol 2017 ◽  
pp. 1-14 ◽  
Author(s):  
Wenbo Pang ◽  
Huiyan Jiang ◽  
Siqi Li

Accurate classification of hepatocellular carcinoma (HCC) images is of great importance in pathology diagnosis and treatment. This paper proposes a concave-convex variation (CCV) method to optimize three classifiers (random forest, support vector machine, and extreme learning machine) for more accurate HCC image classification. First, in the preprocessing stage, hematoxylin-eosin (H&E) pathological images are enhanced using a bilateral filter, and each HCC image patch is obtained under the guidance of pathologists. Then, after extracting the complete features of each patch, a new sparse contribution (SC) feature selection model is established to select the beneficial features for each classifier. Finally, a concave-convex variation method is developed to improve the performance of the classifiers. Experiments using 1260 HCC image patches demonstrate that the proposed CCV classifiers improve greatly on each original classifier and that CCV-random forest (CCV-RF) performs best for HCC image recognition.

