Equalization of the Training Set For Backpropagation Networks Applied to Classification Problems

A hybrid vision-map system is presented to solve the road detection problem in urban scenarios. The standardized use of machine learning techniques in classification problems has been merged with digital navigation map information to increase system robustness. The objective of this paper is to create a new environment perception method to detect the road in urban environments, fusing stereo vision with digital maps by detecting road appearance and road limits such as lane markings or curbs. Deep learning approaches make the system tightly coupled to the training set. Even though our approach is based on machine learning techniques, the features are calculated from different sources (GPS, map, curbs, etc.), making our system less dependent on the training set.


Prediction of Aquatic Ecosystem Health Indices through Machine Learning Models Using the WGAN-Based Data Augmentation Method

Sustainability ◽

10.3390/su131810435 ◽

2021 ◽

Vol 13 (18) ◽

pp. 10435

Author(s):

Seoro Lee ◽

Jonggun Kim ◽

Gwanjae Lee ◽

Jiyeong Hong ◽

Joo Hyun Bae ◽

...

Keyword(s):

Machine Learning ◽

Aquatic Ecosystem ◽

Ecosystem Health ◽

Data Augmentation ◽

Predictive Performance ◽

Classification Problems ◽

Training Set ◽

Generative Adversarial Network ◽

Assessment Index ◽

Data Imbalance

Changes in hydrological characteristics and increases in various pollutant loadings due to rapid climate change and urbanization have a significant impact on the deterioration of aquatic ecosystem health (AEH). Therefore, it is important to effectively evaluate the AEH in advance and establish appropriate strategic plans. Recently, machine learning (ML) models have been widely used to solve hydrological and environmental problems in various fields. However, in general, collecting sufficient data for ML training is time-consuming and labor-intensive. Especially in classification problems, data imbalance can lead to erroneous prediction results of ML models. In this study, we propose a method to solve the data imbalance problem through data augmentation based on the Wasserstein Generative Adversarial Network (WGAN) and to efficiently predict the grades (from A to E) of AEH indices (i.e., Benthic Macroinvertebrate Index (BMI), Trophic Diatom Index (TDI), and Fish Assessment Index (FAI)) through ML models. Raw datasets for the AEH indices, composed of various physicochemical factors (i.e., WT, DO, BOD5, SS, TN, TP, and Flow) and AEH grades, were built and augmented through the WGAN. The performance of each ML model was evaluated through 10-fold cross-validation (CV), and the performances of the ML models trained on the raw and WGAN-based training sets were compared and analyzed through AEH grade prediction on the test sets. The results showed that the ML models trained on the WGAN-based training set had an average F1-score across the grades of each AEH index of 0.9 or greater on the test set, which was superior to the models trained only on the raw training set (which had fewer data than the other datasets). These results confirm that, by using the dataset augmented through WGAN, the ML model can yield better AEH grade predictive performance than the model trained on limited datasets; this approach reduces the effort needed for actual data collection from rivers, which requires enormous time and cost. In the future, the results of this study can be used as basic data to construct big data of aquatic ecosystems, needed to efficiently evaluate and predict AEH in rivers based on ML models.
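
The abstract reports an average F1-score over the grades A to E but does not say how the average is taken. As a minimal sketch, assuming an unweighted (macro) average of one-vs-rest F1-scores, the metric could be computed as follows. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter

GRADES = ["A", "B", "C", "D", "E"]  # AEH grades named in the abstract


def per_grade_f1(y_true, y_pred, labels=GRADES):
    """Return {grade: F1} from one-vs-rest counts for each grade."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted grade p, but it was wrong
            fn[t] += 1  # true grade t was missed
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the grade never occurs
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels=GRADES):
    """Unweighted mean of the per-grade F1-scores (an assumed averaging rule)."""
    scores = per_grade_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


# Hypothetical toy labels, only to exercise the functions.
y_true = ["A", "A", "B", "C", "D", "E", "E", "E"]
y_pred = ["A", "B", "B", "C", "D", "E", "E", "D"]
print(per_grade_f1(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

A macro average weights every grade equally, so a poorly predicted minority grade lowers the score as much as a majority grade would, which is why it is a common choice when the classes are imbalanced.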


Classification of rare traffic signs

Computer Optics ◽

10.18287/2412-6179-co-601 ◽

2020 ◽

Vol 44 (2) ◽

pp. 236-243 ◽

Cited By ~ 1

Author(s):

B.V. Faizov ◽

V.I. Shakhuro ◽

V.V. Sanzharov ◽

A.S. Konushin

Keyword(s):

Neural Network ◽

Neural Networks ◽

Real Data ◽

Significant Loss ◽

Classification Problems ◽

Training Set ◽

Traffic Signs ◽

Different Types ◽

Classification Quality

The paper studies the possibility of using neural networks to classify objects that are rare or entirely absent in the training set. The task is illustrated by the example of classification of rare traffic signs. We consider neural networks trained using a contrastive loss function and its modifications; we also use different methods for generating synthetic samples for classification problems. As a basic method, the indexing of classes using neural network features is used. Classifiers trained with three different types of synthetic samples and their mixtures with real data are compared. We propose a method of classification of rare traffic signs using a neural network discriminator of rare and frequent signs. The experimental evaluation shows that the proposed method allows rare traffic signs to be classified without significant loss of frequent sign classification quality.
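
The abstract mentions a contrastive loss but does not state which form is used. The sketch below shows one commonly used pairwise formulation, assumed here for illustration: pairs from the same class are penalised by their squared distance, and pairs from different classes are penalised only if they are closer than a margin. The function names, the margin value, and the omission of a constant factor of 1/2 are assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, same_class, margin=1.0):
    """Pairwise contrastive loss (one common form, assumed for illustration).

    Same-class pairs are pulled together (loss = d^2); different-class pairs
    are pushed apart until they are at least `margin` away.
    """
    d = euclidean(u, v)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical 2-D embeddings of two sign images.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=True))   # distance 5, squared
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=False))  # already beyond the margin
```

Features trained this way tend to cluster by class, which is what makes the nearest-neighbour style "indexing of classes" mentioned in the abstract feasible for signs with few or no real training samples.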


Deep Neural Networks with Multistate Activation Functions

Computational Intelligence and Neuroscience ◽

10.1155/2015/721367 ◽

2015 ◽

Vol 2015 ◽

pp. 1-10 ◽

Cited By ~ 2

Author(s):

Chenghao Cai ◽

Yanyan Xu ◽

Dengfeng Ke ◽

Kaile Su

Keyword(s):

Neural Networks ◽

Gradient Descent ◽

Deep Neural Networks ◽

Error Rates ◽

Stochastic Gradient Descent ◽

Activation Functions ◽

Classification Problems ◽

Training Set ◽

Relative Improvement ◽

Better Than

We propose multistate activation functions (MSAFs) for deep neural networks (DNNs). These MSAFs are new kinds of activation functions which are capable of representing more than two states, including the N-order MSAFs and the symmetrical MSAF. DNNs with these MSAFs can be trained via conventional Stochastic Gradient Descent (SGD) as well as mean-normalised SGD. We also discuss how these MSAFs perform when used to solve classification problems. Experimental results on the TIMIT corpus reveal that, on speech recognition tasks, DNNs with MSAFs perform better than conventional DNNs, achieving a relative improvement of 5.60% on phoneme error rates. Further experiments also reveal that mean-normalised SGD facilitates the training of DNNs with MSAFs, especially with large training sets. The models can also be trained directly without pretraining when the training set is sufficiently large, which results in a considerable relative improvement of 5.82% on word error rates.
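
The abstract does not give the formula for an MSAF. One way to obtain an activation with more than two saturated states, assumed here purely for illustration, is to sum several shifted logistic functions so that the output plateaus near 0, 1, and 2. The function names and the shift values are hypothetical.

```python
import math


def logistic(x):
    """Numerically stable logistic sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def msaf(x, shifts=(0.0, 8.0)):
    """Sum of shifted logistic functions (an assumed multistate construction).

    With two shifts the output has three plateaus: about 0 for very negative x,
    about 1 between the shifts, and about 2 for large x.
    """
    return sum(logistic(x - s) for s in shifts)


for x in (-20.0, 4.0, 30.0):
    print(x, round(msaf(x), 6))
```

Adding more shifts adds more plateaus, which is one reading of "N-order"; the exact definitions of the N-order and symmetrical variants should be taken from the paper itself.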


Selecting the Training Set in Classification Problems with Rare Events

Studies in Classification, Data Analysis, and Knowledge Organization - New Developments in Classification and Data Analysis ◽

10.1007/3-540-27373-5_5 ◽

2005 ◽

pp. 39-46 ◽

Cited By ~ 1

Author(s):

Bruno Scarpa ◽

Nicola Torelli

Keyword(s):

Rare Events ◽

Classification Problems ◽

Training Set


A Convolution Neural Network-Based Seed Classification System

Symmetry ◽

10.3390/sym12122018 ◽

2020 ◽

Vol 12 (12) ◽

pp. 2018

Author(s):

Yonis Gulzar ◽

Yasir Hamid ◽

Arjumand Bano Soomro ◽

Ali A. Alwan ◽

Ludovic Journaux

Keyword(s):

Classification System ◽

Seed Quality ◽

Classification Problems ◽

Training Set ◽

Learning Techniques ◽

Proposed Model ◽

Human Labour ◽

Computer Vision Technology ◽

Computational Intelligence Methods ◽

Quality Production

Over the last few years, research into agriculture has gained momentum, showing signs of rapid growth. The latest development is the use of various computational technologies to make agricultural work more convenient. Many factors affect agricultural production, with seed quality topping the list. Seed classification can provide additional knowledge about quality production, seed quality control and impurity identification. Seeds have traditionally been categorised based on characteristics like colour, shape and texture. Generally, this is performed by specialists who visually inspect each sample, which is a very tedious and time-consuming task. This procedure can be easily automated, providing a significantly more efficient method for seed sorting than inspection by human labour. In related areas, computer vision technology based on machine learning (ML), symmetry and, more particularly, convolutional neural networks (CNNs) has been widely applied, often resulting in increased work efficiency. Considering the success of computational intelligence methods in other image classification problems, this research proposes a classification system for seeds by employing CNN and transfer learning. The proposed system contains a model that classifies 14 commonly known seeds using advanced deep learning techniques. The techniques applied in this research include decayed learning rate, model checkpointing and hybrid weight adjustment. This research applies symmetry when sampling the images of the seeds during data formation. The application of symmetry generates homogeneity with regard to resizing and labelling the images to extract their features. This resulted in 99% classification accuracy on the training set. The proposed model produced results with an accuracy of 99% on the test set, which contained 234 images. These results were much higher than the results reported in related research.


Multi Expression Programming - an in-depth description

10.21203/rs.3.rs-898407/v1 ◽

2021 ◽

Author(s):

Mihai Oltean

Keyword(s):

Linear Representation ◽

Computer Programs ◽

Classification Problems ◽

Machine Code ◽

Training Set ◽

Single Chromosome ◽

Genes Encoding ◽

Multi Expression Programming ◽

Crossover And Mutation ◽

Complex Computer

Multi Expression Programming (MEP) is a Genetic Programming variant that uses a linear representation of chromosomes. MEP individuals are strings of genes encoding complex computer programs. When MEP individuals encode expressions, their representation is similar to the way in which compilers translate C or Pascal expressions into machine code. A unique MEP feature is the ability to store multiple solutions to a problem in a single chromosome. Usually, the best solution is chosen for fitness assignment. When solving symbolic regression or classification problems (or any other problems for which the training set is known before the problem is solved), MEP has the same complexity as other techniques storing a single solution in a chromosome (such as GP, CGP, GEP, or GE). Evaluation of the expressions encoded into an MEP individual can be performed by a single parsing of the chromosome. Offspring obtained by crossover and mutation are always syntactically correct MEP individuals (computer programs). Thus, no extra processing for repairing newly obtained individuals is needed.
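
The single-pass evaluation and the "best gene" fitness assignment described above can be sketched as follows. This is a minimal illustration under assumptions: the gene encoding (a string for a terminal, a tuple `(op, i, j)` for a function whose arguments point to earlier genes), the operator set, the absolute-error fitness, and all names are hypothetical rather than taken from the paper.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate_chromosome(chromosome, inputs):
    """Evaluate every gene in one left-to-right pass; return one value per gene."""
    values = []
    for gene in chromosome:
        if isinstance(gene, str):
            # Terminal gene: look up the input variable.
            values.append(inputs[gene])
        else:
            # Function gene: arguments refer to genes at lower indices,
            # whose values are already available.
            op, i, j = gene
            values.append(OPS[op](values[i], values[j]))
    return values


def best_gene(chromosome, samples):
    """Return (index, error) of the gene with the lowest summed absolute error."""
    errors = [0.0] * len(chromosome)
    for inputs, target in samples:
        for k, value in enumerate(evaluate_chromosome(chromosome, inputs)):
            errors[k] += abs(value - target)
    best = min(range(len(chromosome)), key=errors.__getitem__)
    return best, errors[best]


# Genes 0..4 encode: a, b, a + b, (a + b) * (a + b), (a + b)^2 - a
chromosome = ["a", "b", ("+", 0, 1), ("*", 2, 2), ("-", 3, 0)]
samples = [({"a": 1, "b": 2}, 9), ({"a": 2, "b": 1}, 9), ({"a": 0, "b": 3}, 9)]
print(best_gene(chromosome, samples))  # gene 3 fits the targets exactly
```

Because every function gene only points backwards, any chromosome built this way is a valid program, which mirrors the abstract's point that offspring need no repair step.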


Deep Learning Application for Plant Classification on Unbalanced Training Set

10.5753/bresci.2019.6304 ◽

2019 ◽

Author(s):

Rafael S. Pereira ◽

Fabio Porto

Keyword(s):

Neural Network ◽

Deep Learning ◽

High Probability ◽

Experimental Results ◽

Learning Models ◽

Classification Problems ◽

Training Set ◽

Species Classification ◽

Accuracy Level ◽

Plant Image

Deep learning models expect a reasonable amount of training instances to improve prediction quality. Moreover, in classification problems, the occurrence of an unbalanced distribution may lead to a biased model. In this paper, we investigate the problem of species classification from plant images, where some species have very few image samples. We explore reduced versions of ImageNet-winning neural network architectures to filter the space of candidate matches, under a target accuracy level. We show through experimental results using real unbalanced plant image datasets that our approach can lead to classifications within the 5 best positions with high probability.
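
"Within the 5 best positions" corresponds to what is usually called top-5 accuracy: a prediction counts as correct if the true species is among the five highest-scoring candidates. The sketch below shows that check; the function names, the species labels, and the scores are hypothetical and not drawn from the paper.

```python
def top_k_labels(scores, k=5):
    """Return the k labels with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def top_k_accuracy(score_dicts, true_labels, k=5):
    """Fraction of samples whose true label is among the k best-scored labels."""
    hits = sum(t in top_k_labels(s, k) for s, t in zip(score_dicts, true_labels))
    return hits / len(true_labels)


# Hypothetical classifier scores for two plant images over six species.
image_scores = [
    {"sp1": 0.40, "sp2": 0.25, "sp3": 0.15, "sp4": 0.10, "sp5": 0.06, "sp6": 0.04},
    {"sp1": 0.05, "sp2": 0.30, "sp3": 0.20, "sp4": 0.25, "sp5": 0.15, "sp6": 0.05},
]
true_species = ["sp5", "sp6"]
print(top_k_accuracy(image_scores, true_species, k=5))
```

Used this way, the network acts as a filter that narrows the candidate species to a short list, which matches the abstract's description of filtering the space of candidate matches.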


A Strategy for Training Set Selection in Text Classification Problems

International Journal of Advanced Computer Science and Applications ◽

10.14569/ijacsa.2013.040608 ◽

2013 ◽

Vol 4 (6) ◽

Cited By ~ 1

Author(s):

Maria Luiza ◽

Katiusca B. ◽

Grazziela P. ◽

Nelson F.

Keyword(s):

Text Classification ◽

Classification Problems ◽

Training Set ◽

Training Set Selection


Deep Learning Application for Plant Classification on Unbalanced Training Set

10.5753/bresci.2019.10023 ◽

2019 ◽

Author(s):

Rafael S. Pereira ◽

Fabio Porto

Keyword(s):

Neural Network ◽

Deep Learning ◽

High Probability ◽

Experimental Results ◽

Learning Models ◽

Classification Problems ◽

Training Set ◽

Species Classification ◽

Accuracy Level ◽

Plant Image

Deep learning models expect a reasonable amount of training instances to improve prediction quality. Moreover, in classification problems, the occurrence of an unbalanced distribution may lead to a biased model. In this paper, we investigate the problem of species classification from plant images, where some species have very few image samples. We explore reduced versions of ImageNet-winning neural network architectures to filter the space of candidate matches, under a target accuracy level. We show through experimental results using real unbalanced plant image datasets that our approach can lead to classifications within the 5 best positions with high probability.
