Equalization of the Training Set For Backpropagation Networks Applied to Classification Problems

2016 ◽  
Author(s):  
Frederico dos Santos Liporace ◽  
Ricardo José Machado ◽  
Valmir C. Barbosa
2017 ◽  
Vol 2017 ◽  
pp. 1-21 ◽  
Author(s):  
Carlos Fernández ◽  
David Fernández-Llorca ◽  
Miguel A. Sotelo

A hybrid vision-map system is presented to solve the road detection problem in urban scenarios. The standard use of machine learning techniques in classification problems has been merged with digital navigation map information to increase system robustness. The objective of this paper is to create a new environment perception method to detect the road in urban environments, fusing stereo vision with digital maps by detecting road appearance and road limits such as lane markings or curbs. Deep learning approaches make the system tightly coupled to the training set. Even though our approach is based on machine learning techniques, the features are calculated from different sources (GPS, map, curbs, etc.), making our system less dependent on the training set.


2021 ◽  
Vol 13 (18) ◽  
pp. 10435
Author(s):  
Seoro Lee ◽  
Jonggun Kim ◽  
Gwanjae Lee ◽  
Jiyeong Hong ◽  
Joo Hyun Bae ◽  
...  

Changes in hydrological characteristics and increases in various pollutant loadings due to rapid climate change and urbanization have a significant impact on the deterioration of aquatic ecosystem health (AEH). Therefore, it is important to effectively evaluate the AEH in advance and establish appropriate strategic plans. Recently, machine learning (ML) models have been widely used to solve hydrological and environmental problems in various fields. However, in general, collecting sufficient data for ML training is time-consuming and labor-intensive. Especially in classification problems, data imbalance can lead to erroneous prediction results of ML models. In this study, we proposed a method to solve the data imbalance problem through data augmentation based on Wasserstein Generative Adversarial Network (WGAN) and to efficiently predict the grades (from A to E grades) of AEH indices (i.e., Benthic Macroinvertebrate Index (BMI), Trophic Diatom Index (TDI), Fish Assessment Index (FAI)) through the ML models. Raw datasets for the AEH indices composed of various physicochemical factors (i.e., WT, DO, BOD5, SS, TN, TP, and Flow) and AEH grades were built and augmented through the WGAN. The performance of each ML model was evaluated through a 10-fold cross-validation (CV), and the performances of the ML models trained on the raw and WGAN-based training sets were compared and analyzed through AEH grade prediction on the test sets. The results showed that the ML models trained on the WGAN-based training set achieved an average F1-score of 0.9 or greater across the grades of each AEH index on the test set, outperforming the models trained only on the raw training set (which contains fewer data than the other datasets).
Through the above results, it was confirmed that by using the dataset augmented through WGAN, the ML model can yield better AEH grade predictive performance compared to the model trained on limited datasets; this approach reduces the effort needed for actual data collection from rivers which requires enormous time and cost. In the future, the results of this study can be used as basic data to construct big data of aquatic ecosystems, needed to efficiently evaluate and predict AEH in rivers based on the ML models.
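
The per-grade F1 evaluation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the "average F1-score for grades" is an unweighted (macro) mean over grades, which the abstract does not state; the function name `macro_f1` and the toy labels are hypothetical and not taken from the paper.

```python
def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    scores = []
    for c in classes:
        # Count true positives, false positives and false negatives for class c.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with three grades (illustrative data only).
y_true = ["A", "A", "B", "B", "C"]
y_pred = ["A", "A", "B", "C", "C"]
score = macro_f1(y_true, y_pred, ["A", "B", "C"])
```

Because every grade contributes equally to a macro average, a rarely observed grade weighs as much as a common one, which is why this kind of score is sensitive to the class imbalance the abstract discusses.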


2020 ◽  
Vol 44 (2) ◽  
pp. 236-243 ◽  
Author(s):  
B.V. Faizov ◽  
V.I. Shakhuro ◽  
V.V. Sanzharov ◽  
A.S. Konushin

The paper studies the possibility of using neural networks to classify objects that are rare in, or entirely absent from, the training set. The task is illustrated by the classification of rare traffic signs. We consider neural networks trained using a contrastive loss function and its modifications, and we use different methods for generating synthetic samples for classification problems. As a baseline method, classes are indexed using neural network features. We compare classifiers trained with three different types of synthetic samples and their mixtures with real data. We propose a method of classification of rare traffic signs using a neural network discriminator of rare and frequent signs. The experimental evaluation shows that the proposed method allows rare traffic signs to be classified without significant loss of frequent sign classification quality.


2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Chenghao Cai ◽  
Yanyan Xu ◽  
Dengfeng Ke ◽  
Kaile Su

We propose multistate activation functions (MSAFs) for deep neural networks (DNNs). These MSAFs are new kinds of activation functions which are capable of representing more than two states, including the N-order MSAFs and the symmetrical MSAF. DNNs with these MSAFs can be trained via conventional Stochastic Gradient Descent (SGD) as well as mean-normalised SGD. We also discuss how these MSAFs perform when used to resolve classification problems. Experimental results on the TIMIT corpus reveal that, on speech recognition tasks, DNNs with MSAFs perform better than the conventional DNNs, getting a relative improvement of 5.60% on phoneme error rates. Further experiments also reveal that mean-normalised SGD facilitates the training processes of DNNs with MSAFs, especially with large training sets. The models can also be directly trained without pretraining when the training set is sufficiently large, which results in a considerable relative improvement of 5.82% on word error rates.


Symmetry ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 2018
Author(s):  
Yonis Gulzar ◽  
Yasir Hamid ◽  
Arjumand Bano Soomro ◽  
Ali A. Alwan ◽  
Ludovic Journaux

Over the last few years, research into agriculture has gained momentum, showing signs of rapid growth. The latest development is the use of various computational technologies to make agricultural work more convenient. Many factors affect agricultural production, with seed quality topping the list. Seed classification can provide additional knowledge about quality production, seed quality control and impurity identification. The process of categorising seeds has traditionally been based on characteristics like colour, shape and texture. Generally, this is performed by specialists who visually inspect each sample, which is a very tedious and time-consuming task. This procedure can be easily automated, providing a significantly more efficient method for seed sorting than manual inspection. In related areas, computer vision technology based on machine learning (ML), symmetry and, more particularly, convolutional neural networks (CNNs) has been widely applied, often resulting in increased work efficiency. Considering the success of computational intelligence methods in other image classification problems, this research proposes a classification system for seeds by employing CNN and transfer learning. The proposed system contains a model that classifies 14 commonly known seeds using advanced deep learning techniques. The techniques applied in this research include decayed learning rate, model checkpointing and hybrid weight adjustment. This research applies symmetry when sampling the images of the seeds during data formation. The application of symmetry generates homogeneity with regards to resizing and labelling the images to extract their features. This resulted in 99% classification accuracy on the training set. The proposed model produced results with an accuracy of 99% for the test set, which contained 234 images. These results were much higher than the results reported in related research.


2021 ◽  
Author(s):  
Mihai Oltean

Multi Expression Programming (MEP) is a Genetic Programming variant that uses a linear representation of chromosomes. MEP individuals are strings of genes encoding complex computer programs. When MEP individuals encode expressions, their representation is similar to the way in which compilers translate C or Pascal expressions into machine code. A unique MEP feature is the ability to store multiple solutions to a problem in a single chromosome. Usually, the best solution is chosen for fitness assignment. When solving symbolic regression or classification problems (or any other problems for which the training set is known before the problem is solved) MEP has the same complexity as other techniques storing a single solution in a chromosome (such as GP, CGP, GEP, or GE). Evaluation of the expressions encoded into an MEP individual can be performed by a single parsing of the chromosome. Offspring obtained by crossover and mutation are always syntactically correct MEP individuals (computer programs). Thus, no extra processing for repairing newly obtained individuals is needed.
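
The single-pass evaluation and best-gene fitness assignment described above can be sketched as follows. This is a minimal illustration under assumptions, not the author's implementation: the gene encoding (an integer for an input variable, a tuple `(op, i, j)` referring to earlier genes), the absolute-error fitness, and the names `evaluate_chromosome` and `best_gene` are all hypothetical.

```python
import operator

# Assumed function set for the sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate_chromosome(chromosome, inputs):
    """Compute the value of every gene in one left-to-right pass."""
    values = []
    for gene in chromosome:
        if isinstance(gene, int):
            # Terminal gene: index of an input variable.
            values.append(inputs[gene])
        else:
            # Function gene: arguments point to genes earlier in the chromosome,
            # so their values are already available.
            op, i, j = gene
            values.append(OPS[op](values[i], values[j]))
    return values


def best_gene(chromosome, samples, targets):
    """Return (index, error) of the gene with the lowest summed absolute error."""
    errors = [0.0] * len(chromosome)
    for inputs, target in zip(samples, targets):
        for k, value in enumerate(evaluate_chromosome(chromosome, inputs)):
            errors[k] += abs(value - target)
    best = min(range(len(chromosome)), key=errors.__getitem__)
    return best, errors[best]


# Genes 0 and 1 are the inputs x0 and x1; gene 2 is x0 + x1; gene 3 is (x0 + x1) * x0.
chromosome = [0, 1, ("+", 0, 1), ("*", 2, 0)]
samples = [(1, 2), (2, 3), (3, 1)]
targets = [3, 10, 12]  # generated from (x0 + x1) * x0
result = best_gene(chromosome, samples, targets)
```

Every gene is a candidate expression, so one pass over the chromosome scores all of them. Because function genes may only reference lower indices, any recombination that preserves this rule yields a valid individual.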


2019 ◽  
Author(s):  
Rafael S. Pereira ◽  
Fabio Porto

Deep learning models expect a reasonable amount of training instances to improve prediction quality. Moreover, in classification problems, the occurrence of an unbalanced distribution may lead to a biased model. In this paper, we investigate the problem of species classification from plant images, where some species have very few image samples. We explore reduced versions of ImageNet-winning neural network architectures to filter the space of candidate matches, under a target accuracy level. We show through experimental results using real unbalanced plant image datasets that our approach can lead to classifications within the 5 best positions with high probability.
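
The "within the 5 best positions" criterion above corresponds to what is commonly called top-k accuracy. The sketch below shows one way to compute it; the function name `top_k_accuracy` and the toy scores are hypothetical, and the paper's own evaluation procedure may differ.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true class is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Rank class indices by descending score and keep the first k.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example with three classes (illustrative scores only).
scores = [[0.1, 0.5, 0.4], [0.7, 0.2, 0.1]]
labels = [2, 1]
top2 = top_k_accuracy(scores, labels, k=2)
top1 = top_k_accuracy(scores, labels, k=1)
```

In this toy case neither true class is ranked first, but both appear in the top two, which shows how a shortlist of candidate matches can be useful even when the single best guess is wrong.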

