A hybrid deep learning-based fruit classification using attention model and convolution autoencoder

Author(s):  
Gang Xue ◽  
Shifeng Liu ◽  
Yicao Ma

Abstract Image recognition supports many applications, such as facial recognition and image classification; accurate fruit and vegetable classification, in particular, is very important in the fresh supply chain, factories, supermarkets, and other settings. In this paper, we develop a hybrid deep learning-based fruit image classification framework, named attention-based densely connected convolutional network with convolution autoencoder (CAE-ADN), which uses a convolution autoencoder (CAE) to pre-train on the images and an attention-based DenseNet (ADN) to extract image features. In the first part of the framework, an unsupervised method is applied to a set of images to pre-train the CAE in a greedy, layer-wise fashion; the CAE structure is then used to initialize a set of weights and biases of the ADN. In the second part of the framework, the supervised ADN is trained with the ground truth. The final part of the framework predicts the fruit category. We test the model on two fruit datasets, and the experimental results demonstrate the effectiveness of the framework, which can improve the efficiency of fruit sorting and thereby reduce costs for the fresh supply chain, factories, supermarkets, and beyond.
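As a minimal illustration of this two-stage idea, the numpy sketch below pre-trains a single tied-weight autoencoder layer on unlabeled data (an assumption-laden stand-in for the paper's convolutional, greedy layer-wise CAE) and keeps the learned encoder weights, which would then initialize the matching layer of the supervised network:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))            # flattened toy "images"

def recon_mse(X, W, b_h, b_v):
    """Reconstruction error of a tied-weight autoencoder layer."""
    H = np.tanh(X @ W + b_h)
    return float(np.mean((H @ W.T + b_v - X) ** 2))

def train_ae_layer(X, W, b_h, b_v, lr=0.05, epochs=100):
    """Unsupervised pre-training of one layer by gradient descent."""
    n = X.shape[0]
    for _ in range(epochs):
        H = np.tanh(X @ W + b_h)          # encode
        err = (H @ W.T + b_v) - X         # tied-weight decode, residual
        dH = (err @ W) * (1.0 - H ** 2)   # backprop through tanh
        W = W - lr * (X.T @ dH + err.T @ H) / n
        b_h = b_h - lr * dH.mean(axis=0)
        b_v = b_v - lr * err.mean(axis=0)
    return W, b_h, b_v

W0 = rng.normal(scale=0.1, size=(64, 16))
mse_before = recon_mse(X, W0, np.zeros(16), np.zeros(64))
W, b_h, b_v = train_ae_layer(X, W0, np.zeros(16), np.zeros(64))
mse_after = recon_mse(X, W, b_h, b_v)
# (W, b_h) would initialise the matching layer of the supervised network.
```

The pre-trained weights give the supervised stage a data-adapted starting point, which is the role the CAE plays for the ADN in the paper.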

Sensors ◽  
2018 ◽  
Vol 18 (9) ◽  
pp. 2929 ◽  
Author(s):  
Yuanyuan Wang ◽  
Chao Wang ◽  
Hong Zhang

With the capability to automatically learn discriminative features, deep learning has seen great success on natural images but has rarely been explored for ship classification in high-resolution SAR images, because small datasets create a training bottleneck. In this paper, convolutional neural networks (CNNs) are applied to ship classification using small datasets of SAR images. First, ship chips are constructed from high-resolution SAR images and split into training and validation datasets. Second, a ship classification model is constructed based on very deep convolutional networks (VGG). Then, VGG is pre-trained on ImageNet, and fine-tuning is used to train our model. Six scenes of COSMO-SkyMed images are used to evaluate the proposed model with regard to classification accuracy. The experimental results reveal that (1) our proposed ship classification model, trained by fine-tuning, achieves more than 95% average classification accuracy, even under 5-fold cross-validation; and (2) compared with other models, the ship classification model based on VGG16 achieves at least 2% higher classification accuracy. These results demonstrate the effectiveness of our proposed method.
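The fine-tuning recipe described here — take an ImageNet-pretrained backbone and retrain on the small target dataset — can be sketched in miniature. In the hedged example below, random vectors stand in for features that a frozen, pretrained VGG16 base might produce for each ship chip, and only a new softmax head is trained; all names and sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-ins for pretrained-backbone features of 150 ship chips, 3 classes
labels = rng.integers(0, 3, size=150)
class_means = rng.normal(scale=2.0, size=(3, 32))
feats = class_means[labels] + rng.normal(size=(150, 32))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_head(feats, labels, classes=3, lr=0.1, epochs=200):
    """Train only the new classification head on frozen features."""
    n, d = feats.shape
    W, b = np.zeros((d, classes)), np.zeros(classes)
    Y = np.eye(classes)[labels]
    for _ in range(epochs):
        G = (softmax(feats @ W + b) - Y) / n   # cross-entropy gradient
        W -= lr * feats.T @ G
        b -= lr * G.sum(axis=0)
    return W, b

W, b = train_head(feats, labels)
train_acc = float(np.mean((feats @ W + b).argmax(axis=1) == labels))
```

In full fine-tuning, the backbone layers would also be updated at a small learning rate; the frozen-base variant above is just the simplest form of the idea.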


Author(s):  
Yun Jiang ◽  
Junyu Zhuo ◽  
Juan Zhang ◽  
Xiao Xiao

With extensive attention and research from scholars in deep learning, the convolutional restricted Boltzmann machine (CRBM) model, based on the restricted Boltzmann machine (RBM), is widely used in image recognition, speech recognition, and related areas. However, time-consuming training remains a problem that cannot be neglected. To address it, this paper parallelizes and optimizes the CRBM on Spark: it proposes a parallel contrastive divergence algorithm based on Spark and uses it to train the CRBM model, improving training speed. Experiments show that this method is faster than the traditional sequential algorithm. We train the CRBM with this method and apply it to breast X-ray image classification; experiments show that it improves both precision and training speed compared with the traditional algorithm.
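A rough sense of how contrastive divergence parallelizes can be given without Spark itself: in the sketch below, each data partition computes a local CD-1 gradient for a plain (non-convolutional) binary RBM — the "map" step — and the gradients are then averaged — the "reduce" step. The RBM sizes, partition count, and learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
V = (rng.random((120, 20)) < 0.3).astype(float)   # toy binary patches

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_gradients(batch, W, b_v, b_h):
    """CD-1 gradient estimate on one data partition (the 'map' step)."""
    ph = sigmoid(batch @ W + b_h)
    h = (rng.random(ph.shape) < ph).astype(float)  # sample hidden units
    pv = sigmoid(h @ W.T + b_v)                    # reconstruct visibles
    ph2 = sigmoid(pv @ W + b_h)
    n = len(batch)
    return ((batch.T @ ph - pv.T @ ph2) / n,       # dW
            (batch - pv).mean(axis=0),             # db_v
            (ph - ph2).mean(axis=0))               # db_h

W = rng.normal(scale=0.01, size=(20, 8))
b_v, b_h = np.zeros(20), np.zeros(8)
for _ in range(30):
    # 'map': local gradients per partition; 'reduce': average them
    grads = [cd1_gradients(p, W, b_v, b_h) for p in np.array_split(V, 4)]
    dW = np.mean([g[0] for g in grads], axis=0)
    db_v = np.mean([g[1] for g in grads], axis=0)
    db_h = np.mean([g[2] for g in grads], axis=0)
    W += 0.1 * dW
    b_v += 0.1 * db_v
    b_h += 0.1 * db_h

recon = sigmoid(sigmoid(V @ W + b_h) @ W.T + b_v)
recon_err = float(np.mean((V - recon) ** 2))
```

In a real Spark job the partitions would live on different workers and the averaging would be a reduce/aggregate over the cluster, which is where the speedup over a sequential implementation comes from.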


Author(s):  
Janarthanan A ◽  
Pandiyarajan C ◽  
Sabarinathan M ◽  
Sudhan M ◽  
Kala R

Optical character recognition (OCR) is the process of recognizing text in images, here one word at a time. The input images are taken from the dataset and first pre-processed. In pre-processing, the images are resized: resizing is necessary when the total number of pixels must be increased or decreased, and remapping occurs when zooming; increasing the number of pixels means that a zoomed image still shows clear content. Next, segmentation splits each word image into its individual characters, and feature values (the test features) are extracted from each character image. In the classification step, a classifier identifies which images contain text and recognizes the text itself. The experimental results demonstrate the accuracy of the approach.
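The character-segmentation step can be illustrated with a classic projection-profile heuristic: columns that contain no ink mark the boundaries between characters. The sketch below is a simplified assumption, not necessarily the paper's exact method:

```python
import numpy as np

# Toy word image: 10x16 binary array, ink = 1; two "characters"
# separated by a band of blank columns.
img = np.zeros((10, 16), dtype=int)
img[2:8, 1:6] = 1      # first character
img[2:8, 9:14] = 1     # second character

def segment_characters(binary_img):
    """Split a word image into per-character slices using the vertical
    projection profile: blank columns separate the characters."""
    ink = binary_img.sum(axis=0) > 0       # does this column have ink?
    segments, start = [], None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x                      # a character begins
        elif not has_ink and start is not None:
            segments.append(binary_img[:, start:x])
            start = None                   # a character ends
    if start is not None:
        segments.append(binary_img[:, start:])
    return segments

chars = segment_characters(img)
# Each slice would then be resized to a fixed size and fed to the classifier.
```

Real word images need binarization and noise handling first, but the blank-column idea is the core of projection-based segmentation.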


2020 ◽  
Author(s):  
Dongshen Ji ◽  
Yanzhong Zhao ◽  
Zhujun Zhang ◽  
Qianchuan Zhao

In view of the large demand for COVID-19 (novel coronavirus pneumonia) image recognition samples, recognition accuracy is often not ideal. In this paper, a COVID-19-positive image recognition method based on small-sample recognition is proposed. First, the CT images are pre-processed and converted into the picture formats required for transfer learning. Second, small-sample image enhancement and expansion are performed on the converted pictures, such as shear transformation, random rotation, and translation. Then, multiple transfer-learning models are used to extract features, which are then fused. Finally, the model is adjusted by fine-tuning and trained to obtain the experimental results. The experimental results show that our method achieves excellent recognition performance on COVID-19 images, even with only a small number of CT samples.
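The enhancement step — shear (the "miscut" transform), random rotation, and translation — amounts to applying affine transforms to each image. The numpy sketch below implements a nearest-neighbour affine warp and the three transforms as an illustration; the transform parameters are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
img = rng.random((32, 32))                 # toy CT slice

def affine_warp(img, M):
    """Nearest-neighbour warp: output (r, c) samples input at M @ (r, c, 1)."""
    h, w = img.shape
    rr, cc = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([rr.ravel(), cc.ravel(), np.ones(h * w)])
    src = M @ coords
    sr = np.rint(src[0]).astype(int)
    sc = np.rint(src[1]).astype(int)
    ok = (sr >= 0) & (sr < h) & (sc >= 0) & (sc < w)
    out = np.zeros(h * w)
    out[ok] = img[sr[ok], sc[ok]]          # out-of-range pixels stay 0
    return out.reshape(h, w)

def shear(img, k):                         # the 'shear' ('miscut') transform
    return affine_warp(img, np.array([[1.0, 0.0, 0.0],
                                      [k, 1.0, 0.0],
                                      [0.0, 0.0, 1.0]]))

def translate(img, dr, dc):
    return affine_warp(img, np.array([[1.0, 0.0, -float(dr)],
                                      [0.0, 1.0, -float(dc)],
                                      [0.0, 0.0, 1.0]]))

def rotate(img, deg):
    t = np.deg2rad(deg)
    ctr = (np.array(img.shape) - 1) / 2.0
    M = np.array([[np.cos(t), -np.sin(t), 0.0],
                  [np.sin(t), np.cos(t), 0.0],
                  [0.0, 0.0, 1.0]])
    M[:2, 2] = ctr - M[:2, :2] @ ctr       # rotate about the image centre
    return affine_warp(img, M)

augmented = [shear(img, 0.2), rotate(img, 10), translate(img, 2, -3)]
```

Applying several randomly parameterized transforms per sample multiplies the effective size of a small training set, which is the point of the expansion step.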


2020 ◽  
Vol 79 (Suppl 1) ◽  
pp. 39.2-40
Author(s):  
T. Deimel ◽  
D. Aletaha ◽  
G. Langs

Background: The prevention of joint destruction is an important goal in the management of rheumatoid arthritis (RA) and a key endpoint in drug trials. To quantify structural damage in radiographs, standardized scoring systems [1], such as the Sharp/van der Heijde (SvdH) score [2], which separately assesses joint space narrowing (JSN) and erosions, have been developed. However, application of these scores is time-consuming, requires specially trained staff, and results are subject to considerable intra- and inter-reader variability [1]. This makes their application poorly feasible in clinical practice and limits their reliability in clinical trials.

Objectives: We aim to develop a fully automated deep learning-based scoring system of radiographic progression in RA to facilitate the introduction of quantitative joint damage assessment into daily clinical practice and circumvent inter-reader variability in clinical trials.

Methods: 5191 hand radiographs and their corresponding SvdH JSN scores from 640 adult patients with RA without visible joint surgery were extracted from the picture archive of a large tertiary hospital. The dataset was split, on a patient level, into training (2207 images/270 patients), validation (1150/133), and test (1834/237) sets. Joints were automatically localized using a deep learning model [3] which utilizes the local appearance of joints combined with information on the spatial relationship between joints. Small regions of interest (ROI) were automatically extracted around each joint. Finally, different deep learning architectures were trained on the extracted ROIs using the manually assigned SvdH JSN scores as ground truth (Fig. 1). The best models were chosen based on their performance on the validation set. Their ability to assign the correct SvdH JSN scores to ROIs was assessed using the unseen data of the test set.

Fig. 1. Three-step approach to automated scoring: joint localization, ROI extraction, JSN scoring.

Results: ROI extraction was successful in 96% of joints, meaning that all structures were visible and joints were not malrotated by more than 30 degrees. For JSN scoring, modifications of the VGG16 [4] architecture seemed to outperform adaptations of DenseNet [5]. The mean obtained accuracy (i.e., the percentage of joints to which the human reader and our system assigned the same score) was 80.5% for MCP joints and 72.3% for PIP joints. In only 1.8% (MCPs) and 1.7% (PIPs) of cases did the predicted score differ by more than one point from the ground truth (Fig. 2).

Fig. 2. Confusion matrices of automatically assigned scores ('predicted score') vs. the human reader ground truth ('true score').

Conclusion: Although a number of previous efforts have been published, none has succeeded in replacing manual scoring systems at scale. To our knowledge, this is the first work that utilizes a dataset of adequate size to apply deep learning to automate JSN scoring. Our results are, even in this early version, in good agreement with human reader ground truth scores. In future versions, this system can be expanded to the detection of erosions and to all joints contained in the SvdH score.

References:
[1] Boini, S. & Guillemin, F. Radiographic scoring methods as outcome measures in rheumatoid arthritis: properties and advantages. Ann. Rheum. Dis. 60, 817–827 (2001).
[2] van der Heijde, D. How to read radiographs according to the Sharp/van der Heijde method. J. Rheumatol. 27, 261–263 (2000).
[3] Payer, C., Štern, D., Bischof, H. & Urschler, M. Regressing Heatmaps for Multiple Landmark Localization Using CNNs. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016, 230–238 (Springer, Cham, 2016). doi:10.1007/978-3-319-46723-8_27.
[4] Simonyan, K. & Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv:1409.1556 [cs] (2015).
[5] Huang, G., Liu, Z., van der Maaten, L. & Weinberger, K. Q. Densely Connected Convolutional Networks. arXiv:1608.06993 [cs] (2016).

Disclosure of Interests: Thomas Deimel: None declared. Daniel Aletaha: Grant/research support from AbbVie, Novartis, Roche; Consultant of AbbVie, Amgen, Celgene, Lilly, Medac, Merck, Novartis, Pfizer, Roche, Sandoz, Sanofi Genzyme; Speakers bureau: AbbVie, Celgene, Lilly, Merck, Novartis, Pfizer, Sanofi Genzyme, UCB. Georg Langs: Shareholder of contextflow GmbH (co-founder); Grant/research support from Novartis, Siemens Healthineers, NVIDIA.
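The headline metrics reported in the Results — exact agreement with the human reader, and the fraction of predictions off by more than one point — are straightforward to compute from paired score arrays. The sketch below uses made-up scores purely for illustration:

```python
import numpy as np

# Made-up ground-truth and predicted SvdH JSN scores (0-4), one per joint
true = np.array([0, 0, 1, 2, 4, 3, 1, 0, 2, 2])
pred = np.array([0, 1, 1, 2, 4, 2, 1, 0, 2, 4])

# 5x5 confusion matrix: rows = true score, columns = predicted score
cm = np.zeros((5, 5), dtype=int)
np.add.at(cm, (true, pred), 1)

# Exact agreement: joints where reader and system assigned the same score
exact_accuracy = cm.trace() / cm.sum()

# Fraction of joints where the prediction is off by more than one point
off_by_more_than_one = float(np.mean(np.abs(pred - true) > 1))
```

The confusion matrix corresponds to the figure described in the abstract; its diagonal carries the exact-agreement count, and the near-diagonal bands carry the off-by-one cases.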


2021 ◽  
Vol 11 (22) ◽  
pp. 10966
Author(s):  
Hsiang-Chieh Chen ◽  
Zheng-Ting Li

This article introduces an automated data-labeling approach for generating crack ground truths (GTs) within concrete images. The main algorithm consists of generating first-round GTs, pre-training a deep learning-based model, and generating second-round GTs. Based on the generated second-round GTs of the training data, a learning-based crack detection model can be trained in a self-supervised manner. The pre-trained deep learning-based model is effective for crack detection after it is re-trained using the second-round GTs. The main contribution of this study is an automated GT-generation process for training a pixel-level crack detection model. Experimental results show that the second-round GTs are similar to manually marked labels. Accordingly, the cost of implementing learning-based methods is reduced significantly because human data labeling is no longer required.
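The two-round idea can be miniaturized as follows: a crude rule (here, a global intensity threshold — an assumption standing in for the paper's first-round generator) produces first-round GTs, a per-pixel model is pre-trained on them, and the model's own predictions then become the refined second-round GTs:

```python
import numpy as np

rng = np.random.default_rng(4)
# Toy "concrete image": dark crack band on a bright background
img = 0.8 + 0.05 * rng.standard_normal((40, 40))
img[18:22, 5:35] = 0.2 + 0.05 * rng.standard_normal((4, 30))

# Round 1: crude rule-based GT (a global threshold here; the paper's
# actual first-round generator is more elaborate)
gt1 = (img < 0.5).astype(float)

# Pre-train a per-pixel logistic model on the first-round GT
x, y = img.ravel(), gt1.ravel()
w, b = 0.0, 0.0
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    g = p - y                               # cross-entropy gradient
    w -= 0.5 * float(np.mean(g * x))
    b -= 0.5 * float(np.mean(g))

# Round 2: the model's own predictions become the refined GT
gt2 = (1.0 / (1.0 + np.exp(-(w * img + b))) > 0.5).astype(float)
agreement = float(np.mean(gt1 == gt2))
```

In the paper the "model" is a deep segmentation network rather than a one-parameter pixel classifier, but the self-supervised loop — rule, pre-train, relabel, re-train — has the same shape.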


2022 ◽  
Vol 9 ◽  
Author(s):  
Maoyi Zhang ◽  
Changqing Ding ◽  
Shuli Guo

Tracheobronchial diverticula (TD) are common cystic lesions that are easily neglected; hence, accurate and rapid identification is critical for later diagnosis. There is a strong need to automate this diagnostic process because traditional manual observation is time-consuming and laborious. However, most studies have only reported cases or listed relationships between the disease and other physiological indicators, and few have adopted advanced technologies such as deep learning for automated identification and diagnosis. To fill this gap, this study casts TD recognition as semantic segmentation and proposes a novel attention-based network for TD semantic segmentation. Since a TD lesion is small and similar in appearance to surrounding organs, we designed atrous spatial pyramid pooling (ASPP) and attention mechanisms, which can efficiently complete the segmentation of TD with robust results. The proposed attention model selectively gathers features from different branches according to the amount of information they contain. Moreover, to the best of our knowledge, no public research data are available yet; for efficient network training, we constructed a dataset containing 218 TD images and the related ground truth (GT). We evaluated different models on the proposed dataset, among which the highest MIoU reached 0.92. The experiments show that our model outperforms state-of-the-art methods, indicating that deep learning has great potential for TD recognition.
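The ASPP idea — parallel convolution branches with different dilation rates plus an image-level pooling branch, stacked so that later layers see several receptive-field sizes at once — can be sketched in one dimension with numpy; the kernels and rates below are illustrative assumptions:

```python
import numpy as np

def dilated_conv1d(x, kernel, rate):
    """'Same'-padded 1-D convolution with the given dilation rate."""
    k = len(kernel)
    span = (k - 1) * rate
    xp = np.pad(x, (span // 2, span - span // 2))
    return np.array([sum(kernel[j] * xp[i + j * rate] for j in range(k))
                     for i in range(len(x))])

def aspp_1d(x, kernel, rates):
    """Atrous spatial pyramid pooling, 1-D toy version: parallel dilated
    branches plus a global-context (image-level pooling) branch."""
    branches = [dilated_conv1d(x, kernel, r) for r in rates]
    branches.append(np.full_like(x, x.mean()))   # global-context branch
    return np.stack(branches)

x = np.sin(np.linspace(0.0, 3.0, 32))
feats = aspp_1d(x, np.ones(3) / 3.0, rates=[1, 2, 4])
```

Larger dilation rates widen the receptive field without adding parameters, which is why ASPP helps with lesions that are small yet must be distinguished from broadly similar surroundings; the attention mechanism then weights these branches by how informative each is.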

