A Scalable Data Augmentation and Training Pipeline for Logo Detection

Author(s):  
Han Guo ◽  
Viswanathan Swaminathan ◽  
Saayan Mitra
Author(s):  
Prem Enkvetchakul ◽  
Olarik Surinta

Plant disease is the most common problem in agriculture. Usually, the symptoms appear on the leaves of the plants, which allows farmers to diagnose the disease and prevent it from spreading to other areas. An accurate and consistent plant disease recognition system can help prevent the spread of diseases and save maintenance costs. In this research, we present a plant leaf disease recognition system using two deep convolutional neural networks (CNNs): MobileNetV2 and NASNetMobile. These CNN architectures are small enough to be suitable for smartphones. We experimented with three training techniques (online, offline, and mixed) on two plant leaf diseases. As for data augmentation, we found that the combination of rotation, shift, and zoom significantly increases the performance of the CNN architectures. The experimental results show that the most accurate algorithm for plant leaf disease recognition is the NASNetMobile architecture using transfer learning. Additionally, the most accurate result is obtained when combining the offline training technique with data augmentation.
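The rotation, shift, and zoom augmentations named in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`augment`, `random_augment`, `offline_augment`), the parameter ranges, and the nearest-neighbour sampling are assumptions chosen for illustration. "Offline" is taken here to mean generating augmented copies once before training, while an "online" scheme would instead call `random_augment` on each image every time it is fed to the network.

```python
import math
import random


def augment(image, angle_deg=0.0, shift=(0, 0), zoom=1.0, fill=0):
    """Rotate, zoom, and shift a 2D image (list of rows) about its centre.

    Uses inverse mapping with nearest-neighbour sampling; pixels whose
    source falls outside the image are set to `fill`.
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    dy, dx = shift
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Undo the shift, then undo the zoom and rotation about the centre.
            ty, tx = y - dy - cy, x - dx - cx
            sx = (cos_t * tx + sin_t * ty) / zoom + cx
            sy = (-sin_t * tx + cos_t * ty) / zoom + cy
            iy, ix = round(sy), round(sx)
            if 0 <= iy < h and 0 <= ix < w:
                out[y][x] = image[iy][ix]
    return out


def random_augment(image, rng, max_angle=20.0, max_shift=2, zoom_range=(0.9, 1.1)):
    """Draw random rotation, shift, and zoom parameters (assumed ranges)."""
    angle = rng.uniform(-max_angle, max_angle)
    shift = (rng.randint(-max_shift, max_shift), rng.randint(-max_shift, max_shift))
    zoom = rng.uniform(*zoom_range)
    return augment(image, angle, shift, zoom)


def offline_augment(images, copies, seed=0):
    """Offline scheme: build a fixed, enlarged training set before training."""
    rng = random.Random(seed)
    return [random_augment(img, rng) for img in images for _ in range(copies)]
```

For example, `offline_augment(train_images, copies=5)` would return five randomly transformed versions of every image in a hypothetical `train_images` list, which could then be stored alongside the originals.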


2022 ◽  
Vol 12 ◽  
Author(s):  
Shenda Hong ◽  
Wenrui Zhang ◽  
Chenxi Sun ◽  
Yuxi Zhou ◽  
Hongyan Li

Cardiovascular diseases (CVDs) are one of the most fatal disease groups worldwide. The electrocardiogram (ECG) is a widely used tool for automatically detecting cardiac abnormalities, thereby helping to control and manage CVDs. To encourage more multidisciplinary research, the PhysioNet/Computing in Cardiology Challenge 2020 (Challenge 2020) provided a public platform involving multi-center databases and automatic evaluations for ECG classification tasks. As a result, 41 teams successfully submitted their solutions and qualified for the rankings. Although Challenge 2020 was a success, there has been no in-depth methodological meta-analysis of these solutions, making it difficult for researchers to benefit from the solutions and results. In this study, we aim to systematically review the 41 solutions in terms of data processing, feature engineering, model architecture, and training strategy. For each perspective, we visualize and statistically analyze the effectiveness of the common techniques, and discuss the methodological advantages and disadvantages. Finally, we summarize five practical lessons based on the aforementioned analysis: (1) Data augmentation should be employed and adapted to specific scenarios; (2) Combining different features can improve performance; (3) A hybrid design of different types of deep neural networks (DNNs) is better than using a single type; (4) The use of end-to-end architectures should depend on the task being solved; (5) Multiple models are better than one. We expect that our meta-analysis will help accelerate the research related to ECG classification based on machine-learning models.
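Lesson (5), that multiple models are better than one, can be made concrete with a small sketch. The abstract does not say how the reviewed solutions combined their models, so the approach below (averaging per-class probabilities and thresholding for multi-label output) is an assumption, as are the function names and the 0.5 threshold.

```python
def ensemble_average(model_probs):
    """Average per-class probabilities over several models.

    `model_probs` holds one list of class probabilities per model,
    all for the same recording.
    """
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    if any(len(p) != n_classes for p in model_probs):
        raise ValueError("all models must score the same classes")
    return [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]


def predict_labels(avg_probs, threshold=0.5):
    """Multi-label decision: keep every class whose averaged score passes the threshold."""
    return [c for c, p in enumerate(avg_probs) if p >= threshold]
```

With three hypothetical models scoring `[0.9, 0.2, 0.6]`, `[0.7, 0.4, 0.3]`, and `[0.8, 0.3, 0.45]`, the first model alone would report classes 0 and 2, whereas the averaged ensemble keeps only class 0, because the other two models do not support class 2.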


Author(s):  
В’ячеслав Васильович Москаленко ◽  
Микола Олександрович Зарецький ◽  
Ярослав Юрійович Ковальський ◽  
Сергій Сергійович Мартиненко

Video inspection is often used to diagnose sewer pipe defects. To correctly encode detected defects according to existing standards, it is necessary to consider a lot of contextual information about the orientation and location of the camera during sewer pipe video inspection. A model for classifying the context of frames observed during video inspection of sewer pipes and a five-stage machine learning method are proposed. The main idea of the proposed approach is to combine deep learning methods with the principles of information maximization and coding with self-correcting Hamming codes. The proposed model consists of a deep convolutional neural network with a sigmoid layer followed by a rounding output layer and information-extreme decision rules. The first stages of the method are data augmentation and training of the feature extractor in a Siamese model with a softmax triplet loss function. The next steps involve calculating a binary code for each recognition class, which is used as a label in training with a binary cross-entropy loss function to increase the compactness of the distribution of each class's observations in the binary Hamming space. The last stage of the training method optimizes the parameters of the radial-basis decision rules in the Hamming space for each class according to the existing information-extreme criterion. The information criterion, expressed as a logarithmic function of the accuracy characteristics of the decision rules, provides the maximum generalization and reliability of the model under the most difficult conditions in the statistical sense. The effectiveness of this approach was tested on data provided by Ace Pipe Cleaning (Kansas City, USA) and MPWiK (Wroclaw, Poland) by comparing learning results for the proposed and traditional models and training schemes. The resulting image frame classifier achieves a classification accuracy of 96.8 % on the test sample, which is acceptable for practical use and exceeds the result of the traditional training scheme with a softmax output layer by 6.8 %.
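The decision stage described in this abstract (sigmoid outputs rounded to a binary code, then radial-basis rules in Hamming space) can be sketched as follows. This is an illustrative reading, not the authors' code: the class names, codewords, and radii are hypothetical, and the radii are fixed by hand here rather than optimized with the information-extreme criterion.

```python
def binarize(sigmoid_outputs, threshold=0.5):
    """Rounding output layer: map sigmoid activations to a binary code."""
    return tuple(1 if p >= threshold else 0 for p in sigmoid_outputs)


def hamming_distance(a, b):
    """Number of positions in which two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def classify(code, class_codes, radii):
    """Radial-basis rule in Hamming space.

    A code is assigned to the nearest class whose reference codeword lies
    within that class's radius; it is rejected (None) if no class contains it.
    """
    best_label, best_dist = None, None
    for label, reference in class_codes.items():
        dist = hamming_distance(code, reference)
        if dist <= radii[label] and (best_dist is None or dist < best_dist):
            best_label, best_dist = label, dist
    return best_label


# Hypothetical 7-bit codewords and radii for two context classes.
CLASS_CODES = {"class_a": (0, 0, 0, 0, 0, 0, 0), "class_b": (1, 1, 1, 1, 1, 1, 1)}
RADII = {"class_a": 2, "class_b": 2}
```

A frame whose rounded code differs from the `class_b` codeword in one bit is still assigned to `class_b`, while a code that is three or more bits away from every codeword is rejected. That rejection option is one thing a radius-based rule offers and a plain softmax argmax does not.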


White blood cells (WBCs), or white blood corpuscles, are also known as leucocytes. The normal WBC count ranges from 4000 to 11000/mm³. WBCs play a vital role in the human body. Many human diseases start with an abnormal balance of WBCs, which form part of the immune system. To gain adequate knowledge about WBCs, a clinical test such as a blood count is needed, which gives the counts of RBCs, hemoglobin, WBCs, etc. RBCs, otherwise known as erythrocytes, have no nucleus and contain the pigment hemoglobin. Due to the presence of this pigment, blood is red in color. RBCs transport O2 and CO2 into and out of the tissues. Recent research explains that diseases such as cancer, allergy, breast cancer, etc. are caused by a lack of WBCs or by abnormal WBCs. WBCs can be counted in two ways: manually and automatically. In our paper, we concentrate on obtaining the WBC count using data augmentation and a convolutional neural network. The quality of the images is improved in relation to the number of augmented images. The dataset contains 12500 sample images, of which 9957 are used for training and 2487 for validation, and the training accuracy is high as the number of epochs increases.


2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
Ahmad Hasasneh ◽  
Nikolas Kampel ◽  
Praveen Sripad ◽  
N. Jon Shah ◽  
Jürgen Dammers

We propose an artifact classification scheme based on a combined deep and convolutional neural network (DCNN) model, to automatically identify cardiac and ocular artifacts from neuromagnetic data, without the need for additional electrocardiogram (ECG) and electrooculogram (EOG) recordings. From independent components, the model uses both the spatial and temporal information of the decomposed magnetoencephalography (MEG) data. In total, 7122 samples were used after data augmentation, in which task and nontask related MEG recordings from 48 subjects served as the database for this study. Artifact rejection was applied using the combined model, which achieved a sensitivity and specificity of 91.8% and 97.4%, respectively. The overall accuracy of the model was validated using a cross-validation test and revealed a median accuracy of 94.4%, indicating high reliability of the DCNN-based artifact removal in task and nontask related MEG experiments. The major advantages of the proposed method are as follows: (1) it is a fully automated and user independent workflow of artifact classification in MEG data; (2) once the model is trained there is no need for auxiliary signal recordings; (3) the flexibility in the model design and training allows for various modalities (MEG/EEG) and various sensor types.
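The sensitivity and specificity figures quoted in this abstract are standard confusion-matrix ratios: sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). A minimal sketch of the computation follows. The function name and the label encoding (1 for an artifact component, 0 otherwise) are assumptions, and the toy labels in the usage note are not data from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # artifact correctly flagged
            else:
                fn += 1  # artifact missed
        else:
            if pred == positive:
                fp += 1  # clean component wrongly flagged
            else:
                tn += 1  # clean component correctly kept
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)
```

For ten toy components with four true artifacts, of which three are detected, and six clean components, of which one is wrongly flagged, the function returns a sensitivity of 3/4 and a specificity of 5/6.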


Information ◽  
2020 ◽  
Vol 11 (9) ◽  
pp. 422
Author(s):  
Rafael T. Anchiêta ◽  
Rogério F. de Sousa ◽  
Thiago A. S. Pardo

Paraphrase detection is a Natural-Language Processing (NLP) task that aims at automatically identifying whether two sentences convey the same meaning (even with different words). For the Portuguese language, most works model this task as a machine-learning problem, extracting features and training a classifier. In this paper, following a different line, we explore a graph structure representation and model the paraphrase identification task over a heterogeneous network. We also adopt a back-translation strategy for data augmentation to balance the dataset we use. Our approach, although simple, outperforms the best results reported for the paraphrase detection task in Portuguese, showing that graph structures may better capture the semantic relatedness among sentences.


2021 ◽  
Vol 12 (04) ◽  
pp. 33-49
Author(s):  
Ezeofor Chukwunazo ◽  
Akpado Kenneth ◽  
Ulasi Afamefuna

This paper presents a predictive model for stem borer classification in precision farming. The recently reported aggressive attack of stem borers (Spodoptera species) on maize crops in Africa is alarming. These species migrate in large numbers and feed on the leaves, stems, and ears of maize. The males of these species are the target because, after they mate with females, thousands of eggs are laid, producing the larvae that cause the havoc. Currently, Nigerian farmers find it difficult to distinguish between the targeted species (Fall Armyworm-FAW, African Armyworm-AAW, and Egyptian cotton leaf worm-ECLW only) because they look alike. For these reasons, a network model that predicts the presence of these species in the maize farm for farmers is proposed. The moths were captured using delta pheromone traps and laboratory breeding for each category. The captured images were pre-processed and stored in an online Google Drive image dataset folder. The convolutional neural network (CNN) model for classifying these targeted maize moths was designed from scratch. The Google Colab platform with Python libraries was used to train the model, called MothNet. The images of the FAW, AAW, and ECLW were fed to the MothNet model during the learning process. Dropout and data augmentation were added to the model for efficient prediction. After training the MothNet model, the validation accuracy achieved was 90.37% with a validation loss of 24.72%, and the training accuracy was 90.8% with a loss of 23.25%; training took 5 minutes 33 seconds. Due to the small number of images gathered (1674), the model's prediction on each image was of low confidence. Because of this, transfer learning was deployed, and a pretrained ResNet-50 model was selected and modified. The modified ResNet-50 model was fine-tuned and tested. The validation accuracy achieved was 99.21% with a loss of 3.79%, and the training accuracy was 99.75% with a loss of 2.55%, within 10 minutes 5 seconds. Hence, the MothNet model can be improved by gathering more images and retraining the system for optimum performance, while the modified ResNet-50 is recommended for integration into an Internet of Things device for on-site classification of maize moths.


2020 ◽  
Vol 43 ◽  
Author(s):  
Myrthe Faber

Gilead et al. state that abstraction supports mental travel, and that mental travel critically relies on abstraction. I propose an important addition to this theoretical framework, namely that mental travel might also support abstraction. Specifically, I argue that spontaneous mental travel (mind wandering), much like data augmentation in machine learning, provides variability in mental content and context necessary for abstraction.

