A hybrid deep learning-based fruit classification using attention model and convolution autoencoder

Author(s):  
Gang Xue ◽  
Shifeng Liu ◽  
Yicao Ma

Abstract Image recognition supports many applications, such as facial recognition and image classification; accurate fruit and vegetable classification, in particular, is very important in the fresh supply chain, factories, supermarkets, and other settings. In this paper, we develop a hybrid deep learning-based fruit image classification framework, named attention-based densely connected convolutional network with convolution autoencoder (CAE-ADN), which uses a convolution autoencoder (CAE) to pre-train on the images and an attention-based DenseNet (ADN) to extract image features. In the first part of the framework, an unsupervised method is applied to a set of images to pre-train the CAE in a greedy, layer-wise fashion; the CAE structure is then used to initialize a set of weights and biases of the ADN. In the second part of the framework, the supervised ADN is trained with the ground truth. The final part of the framework predicts the fruit category. We test the model on two fruit datasets, and the experimental results demonstrate the effectiveness of the framework, which can improve the efficiency of fruit sorting and thereby reduce costs for the fresh supply chain, factories, supermarkets, and beyond.
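As a minimal illustration of this two-stage idea, the numpy sketch below pre-trains a single tied-weight autoencoder layer on unlabeled data (an assumption-laden stand-in for the paper's convolutional, greedy layer-wise CAE) and keeps the learned encoder weights, which would then initialize the matching layer of the supervised network:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))            # flattened toy "images"

def recon_mse(X, W, b_h, b_v):
    """Reconstruction error of a tied-weight autoencoder layer."""
    H = np.tanh(X @ W + b_h)
    return float(np.mean((H @ W.T + b_v - X) ** 2))

def train_ae_layer(X, W, b_h, b_v, lr=0.05, epochs=100):
    """Unsupervised pre-training of one layer by gradient descent."""
    n = X.shape[0]
    for _ in range(epochs):
        H = np.tanh(X @ W + b_h)          # encode
        err = (H @ W.T + b_v) - X         # tied-weight decode, residual
        dH = (err @ W) * (1.0 - H ** 2)   # backprop through tanh
        W = W - lr * (X.T @ dH + err.T @ H) / n
        b_h = b_h - lr * dH.mean(axis=0)
        b_v = b_v - lr * err.mean(axis=0)
    return W, b_h, b_v

W0 = rng.normal(scale=0.1, size=(64, 16))
mse_before = recon_mse(X, W0, np.zeros(16), np.zeros(64))
W, b_h, b_v = train_ae_layer(X, W0, np.zeros(16), np.zeros(64))
mse_after = recon_mse(X, W, b_h, b_v)
# (W, b_h) would initialise the matching layer of the supervised network.
```

The pre-trained weights give the supervised stage a data-adapted starting point, which is the role the CAE plays for the ADN in the paper.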

Sensors ◽  
2018 ◽  
Vol 18 (9) ◽  
pp. 2929 ◽  
Author(s):  
Yuanyuan Wang ◽  
Chao Wang ◽  
Hong Zhang

With the capability to automatically learn discriminative features, deep learning has seen great success on natural images but has rarely been explored for ship classification in high-resolution SAR images, because small datasets create a training bottleneck. In this paper, convolutional neural networks (CNNs) are applied to ship classification using small datasets of SAR images. First, ship chips are constructed from high-resolution SAR images and split into training and validation datasets. Second, a ship classification model is constructed based on very deep convolutional networks (VGG). Then, VGG is pre-trained on ImageNet, and fine-tuning is used to train our model. Six scenes of COSMO-SkyMed images are used to evaluate the proposed model with regard to classification accuracy. The experimental results reveal that (1) our proposed ship classification model, trained by fine-tuning, achieves more than 95% average classification accuracy, even under 5-fold cross-validation; and (2) compared with other models, the ship classification model based on VGG16 achieves at least 2% higher classification accuracy. These results demonstrate the effectiveness of our proposed method.
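The fine-tuning recipe described here — take an ImageNet-pretrained backbone and retrain on the small target dataset — can be sketched in miniature. In the hedged example below, random vectors stand in for features that a frozen, pretrained VGG16 base might produce for each ship chip, and only a new softmax head is trained; all names and sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-ins for pretrained-backbone features of 150 ship chips, 3 classes
labels = rng.integers(0, 3, size=150)
class_means = rng.normal(scale=2.0, size=(3, 32))
feats = class_means[labels] + rng.normal(size=(150, 32))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_head(feats, labels, classes=3, lr=0.1, epochs=200):
    """Train only the new classification head on frozen features."""
    n, d = feats.shape
    W, b = np.zeros((d, classes)), np.zeros(classes)
    Y = np.eye(classes)[labels]
    for _ in range(epochs):
        G = (softmax(feats @ W + b) - Y) / n   # cross-entropy gradient
        W -= lr * feats.T @ G
        b -= lr * G.sum(axis=0)
    return W, b

W, b = train_head(feats, labels)
train_acc = float(np.mean((feats @ W + b).argmax(axis=1) == labels))
```

In full fine-tuning, the backbone layers would also be updated at a small learning rate; the frozen-base variant above is just the simplest form of the idea.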


Author(s):  
Yun Jiang ◽  
Junyu Zhuo ◽  
Juan Zhang ◽  
Xiao Xiao

With extensive attention and research from scholars in deep learning, the convolutional restricted Boltzmann machine (CRBM) model, based on the restricted Boltzmann machine (RBM), is widely used in image recognition, speech recognition, and related areas. However, time-consuming training remains a problem that cannot be neglected. To address it, this paper parallelizes and optimizes the CRBM on Spark: it proposes a parallel contrastive divergence algorithm based on Spark and uses it to train the CRBM model, improving training speed. Experiments show that this method is faster than the traditional sequential algorithm. We train the CRBM with this method and apply it to breast X-ray image classification; experiments show that it improves both precision and training speed compared with the traditional algorithm.
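A rough sense of how contrastive divergence parallelizes can be given without Spark itself: in the sketch below, each data partition computes a local CD-1 gradient for a plain (non-convolutional) binary RBM — the "map" step — and the gradients are then averaged — the "reduce" step. The RBM sizes, partition count, and learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
V = (rng.random((120, 20)) < 0.3).astype(float)   # toy binary patches

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_gradients(batch, W, b_v, b_h):
    """CD-1 gradient estimate on one data partition (the 'map' step)."""
    ph = sigmoid(batch @ W + b_h)
    h = (rng.random(ph.shape) < ph).astype(float)  # sample hidden units
    pv = sigmoid(h @ W.T + b_v)                    # reconstruct visibles
    ph2 = sigmoid(pv @ W + b_h)
    n = len(batch)
    return ((batch.T @ ph - pv.T @ ph2) / n,       # dW
            (batch - pv).mean(axis=0),             # db_v
            (ph - ph2).mean(axis=0))               # db_h

W = rng.normal(scale=0.01, size=(20, 8))
b_v, b_h = np.zeros(20), np.zeros(8)
for _ in range(30):
    # 'map': local gradients per partition; 'reduce': average them
    grads = [cd1_gradients(p, W, b_v, b_h) for p in np.array_split(V, 4)]
    dW = np.mean([g[0] for g in grads], axis=0)
    db_v = np.mean([g[1] for g in grads], axis=0)
    db_h = np.mean([g[2] for g in grads], axis=0)
    W += 0.1 * dW
    b_v += 0.1 * db_v
    b_h += 0.1 * db_h

recon = sigmoid(sigmoid(V @ W + b_h) @ W.T + b_v)
recon_err = float(np.mean((V - recon) ** 2))
```

In a real Spark job the partitions would live on different workers and the averaging would be a reduce/aggregate over the cluster, which is where the speedup over a sequential implementation comes from.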


Author(s):  
Janarthanan A ◽  
Pandiyarajan C ◽  
Sabarinathan M ◽  
Sudhan M ◽  
Kala R

Optical character recognition (OCR) is the process of recognizing text in images, here one word at a time. The input images are taken from the dataset and first pre-processed. In pre-processing, the images are resized: resizing is necessary when the total number of pixels must be increased or decreased, and remapping occurs when zooming; increasing the number of pixels means that a zoomed image still shows clear content. Next, segmentation splits each word image into its individual characters, and feature values (the test features) are extracted from each character image. In the classification step, a classifier identifies which images contain text and recognizes the text itself. The experimental results demonstrate the accuracy of the approach.
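The character-segmentation step can be illustrated with a classic projection-profile heuristic: columns that contain no ink mark the boundaries between characters. The sketch below is a simplified assumption, not necessarily the paper's exact method:

```python
import numpy as np

# Toy word image: 10x16 binary array, ink = 1; two "characters"
# separated by a band of blank columns.
img = np.zeros((10, 16), dtype=int)
img[2:8, 1:6] = 1      # first character
img[2:8, 9:14] = 1     # second character

def segment_characters(binary_img):
    """Split a word image into per-character slices using the vertical
    projection profile: blank columns separate the characters."""
    ink = binary_img.sum(axis=0) > 0       # does this column have ink?
    segments, start = [], None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x                      # a character begins
        elif not has_ink and start is not None:
            segments.append(binary_img[:, start:x])
            start = None                   # a character ends
    if start is not None:
        segments.append(binary_img[:, start:])
    return segments

chars = segment_characters(img)
# Each slice would then be resized to a fixed size and fed to the classifier.
```

Real word images need binarization and noise handling first, but the blank-column idea is the core of projection-based segmentation.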


2020 ◽  
Author(s):  
Dongshen Ji ◽  
Yanzhong Zhao ◽  
Zhujun Zhang ◽  
Qianchuan Zhao

In view of the large demand for COVID-19 (novel coronavirus pneumonia) image recognition samples, recognition accuracy is often not ideal. In this paper, a COVID-19-positive image recognition method based on small-sample recognition is proposed. First, the CT images are pre-processed and converted into the picture formats required for transfer learning. Second, small-sample image enhancement and expansion are performed on the converted pictures, such as shear transformation, random rotation, and translation. Then, multiple transfer-learning models are used to extract features, which are then fused. Finally, the model is adjusted by fine-tuning and trained to obtain the experimental results. The experimental results show that our method achieves excellent recognition performance on COVID-19 images, even with only a small number of CT samples.
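The enhancement step — shear (the "miscut" transform), random rotation, and translation — amounts to applying affine transforms to each image. The numpy sketch below implements a nearest-neighbour affine warp and the three transforms as an illustration; the transform parameters are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
img = rng.random((32, 32))                 # toy CT slice

def affine_warp(img, M):
    """Nearest-neighbour warp: output (r, c) samples input at M @ (r, c, 1)."""
    h, w = img.shape
    rr, cc = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([rr.ravel(), cc.ravel(), np.ones(h * w)])
    src = M @ coords
    sr = np.rint(src[0]).astype(int)
    sc = np.rint(src[1]).astype(int)
    ok = (sr >= 0) & (sr < h) & (sc >= 0) & (sc < w)
    out = np.zeros(h * w)
    out[ok] = img[sr[ok], sc[ok]]          # out-of-range pixels stay 0
    return out.reshape(h, w)

def shear(img, k):                         # the 'shear' ('miscut') transform
    return affine_warp(img, np.array([[1.0, 0.0, 0.0],
                                      [k, 1.0, 0.0],
                                      [0.0, 0.0, 1.0]]))

def translate(img, dr, dc):
    return affine_warp(img, np.array([[1.0, 0.0, -float(dr)],
                                      [0.0, 1.0, -float(dc)],
                                      [0.0, 0.0, 1.0]]))

def rotate(img, deg):
    t = np.deg2rad(deg)
    ctr = (np.array(img.shape) - 1) / 2.0
    M = np.array([[np.cos(t), -np.sin(t), 0.0],
                  [np.sin(t), np.cos(t), 0.0],
                  [0.0, 0.0, 1.0]])
    M[:2, 2] = ctr - M[:2, :2] @ ctr       # rotate about the image centre
    return affine_warp(img, M)

augmented = [shear(img, 0.2), rotate(img, 10), translate(img, 2, -3)]
```

Applying several randomly parameterized transforms per sample multiplies the effective size of a small training set, which is the point of the expansion step.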


2020 ◽  
Vol 79 (Suppl 1) ◽  
pp. 39.2-40
Author(s):  
T. Deimel ◽  
D. Aletaha ◽  
G. Langs

Background: The prevention of joint destruction is an important goal in the management of rheumatoid arthritis (RA) and a key endpoint in drug trials. To quantify structural damage in radiographs, standardized scoring systems [1], such as the Sharp/van der Heijde (SvdH) score [2], which separately assesses joint space narrowing (JSN) and erosions, have been developed. However, application of these scores is time-consuming, requires specially trained staff, and results are subject to considerable intra- and inter-reader variability [1]. This makes their application poorly feasible in clinical practice and limits their reliability in clinical trials.

Objectives: We aim to develop a fully automated deep learning-based scoring system of radiographic progression in RA to facilitate the introduction of quantitative joint damage assessment into daily clinical practice and circumvent inter-reader variability in clinical trials.

Methods: 5191 hand radiographs and their corresponding SvdH JSN scores from 640 adult patients with RA without visible joint surgery were extracted from the picture archive of a large tertiary hospital. The dataset was split, on a patient level, into training (2207 images/270 patients), validation (1150/133), and test (1834/237) sets. Joints were automatically localized using a deep learning model [3] which utilizes the local appearance of joints combined with information on the spatial relationship between joints. Small regions of interest (ROI) were automatically extracted around each joint. Finally, different deep learning architectures were trained on the extracted ROIs using the manually assigned SvdH JSN scores as ground truth (Fig. 1). The best models were chosen based on their performance on the validation set. Their ability to assign the correct SvdH JSN scores to ROIs was assessed using the unseen data of the test set.

Fig. 1. Three-step approach to automated scoring: joint localization, ROI extraction, JSN scoring.

Results: ROI extraction was successful in 96% of joints, meaning that all structures were visible and joints were not malrotated by more than 30 degrees. For JSN scoring, modifications of the VGG16 [4] architecture seemed to outperform adaptations of DenseNet [5]. The mean obtained accuracy (i.e., the percentage of joints to which the human reader and our system assigned the same score) was 80.5% for MCP joints and 72.3% for PIP joints. In only 1.8% (MCPs) and 1.7% (PIPs) of cases did the predicted score differ by more than one point from the ground truth (Fig. 2).

Fig. 2. Confusion matrices of automatically assigned scores ('predicted score') vs. the human reader ground truth ('true score').

Conclusion: Although a number of previous efforts have been published, none has succeeded in replacing manual scoring systems at scale. To our knowledge, this is the first work that utilizes a dataset of adequate size to apply deep learning to automate JSN scoring. Our results are, even in this early version, in good agreement with human reader ground truth scores. In future versions, this system can be expanded to the detection of erosions and to all joints contained in the SvdH score.

References:
[1] Boini, S. & Guillemin, F. Radiographic scoring methods as outcome measures in rheumatoid arthritis: properties and advantages. Ann. Rheum. Dis. 60, 817–827 (2001).
[2] van der Heijde, D. How to read radiographs according to the Sharp/van der Heijde method. J. Rheumatol. 27, 261–263 (2000).
[3] Payer, C., Štern, D., Bischof, H. & Urschler, M. Regressing Heatmaps for Multiple Landmark Localization Using CNNs. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016, 230–238 (Springer, Cham, 2016). doi:10.1007/978-3-319-46723-8_27.
[4] Simonyan, K. & Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv:1409.1556 [cs] (2015).
[5] Huang, G., Liu, Z., van der Maaten, L. & Weinberger, K. Q. Densely Connected Convolutional Networks. arXiv:1608.06993 [cs] (2016).

Disclosure of Interests: Thomas Deimel: None declared. Daniel Aletaha: Grant/research support from AbbVie, Novartis, Roche; Consultant of AbbVie, Amgen, Celgene, Lilly, Medac, Merck, Novartis, Pfizer, Roche, Sandoz, Sanofi Genzyme; Speakers bureau: AbbVie, Celgene, Lilly, Merck, Novartis, Pfizer, Sanofi Genzyme, UCB. Georg Langs: Shareholder of contextflow GmbH (co-founder); Grant/research support from Novartis, Siemens Healthineers, NVIDIA.
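The headline metrics reported in the Results — exact agreement with the human reader, and the fraction of predictions off by more than one point — are straightforward to compute from paired score arrays. The sketch below uses made-up scores purely for illustration:

```python
import numpy as np

# Made-up ground-truth and predicted SvdH JSN scores (0-4), one per joint
true = np.array([0, 0, 1, 2, 4, 3, 1, 0, 2, 2])
pred = np.array([0, 1, 1, 2, 4, 2, 1, 0, 2, 4])

# 5x5 confusion matrix: rows = true score, columns = predicted score
cm = np.zeros((5, 5), dtype=int)
np.add.at(cm, (true, pred), 1)

# Exact agreement: joints where reader and system assigned the same score
exact_accuracy = cm.trace() / cm.sum()

# Fraction of joints where the prediction is off by more than one point
off_by_more_than_one = float(np.mean(np.abs(pred - true) > 1))
```

The confusion matrix corresponds to the figure described in the abstract; its diagonal carries the exact-agreement count, and the near-diagonal bands carry the off-by-one cases.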


2021 ◽  
Vol 11 (22) ◽  
pp. 10966
Author(s):  
Hsiang-Chieh Chen ◽  
Zheng-Ting Li

This article introduces an automated data-labeling approach for generating crack ground truths (GTs) within concrete images. The main algorithm consists of generating first-round GTs, pre-training a deep learning-based model, and generating second-round GTs. Based on the generated second-round GTs of the training data, a learning-based crack detection model can be trained in a self-supervised manner. The pre-trained deep learning-based model is effective for crack detection after it is re-trained using the second-round GTs. The main contribution of this study is an automated GT-generation process for training a pixel-level crack detection model. Experimental results show that the second-round GTs are similar to manually marked labels. Accordingly, the cost of implementing learning-based methods is reduced significantly because human data labeling is no longer required.
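The two-round idea can be miniaturized as follows: a crude rule (here, a global intensity threshold — an assumption standing in for the paper's first-round generator) produces first-round GTs, a per-pixel model is pre-trained on them, and the model's own predictions then become the refined second-round GTs:

```python
import numpy as np

rng = np.random.default_rng(4)
# Toy "concrete image": dark crack band on a bright background
img = 0.8 + 0.05 * rng.standard_normal((40, 40))
img[18:22, 5:35] = 0.2 + 0.05 * rng.standard_normal((4, 30))

# Round 1: crude rule-based GT (a global threshold here; the paper's
# actual first-round generator is more elaborate)
gt1 = (img < 0.5).astype(float)

# Pre-train a per-pixel logistic model on the first-round GT
x, y = img.ravel(), gt1.ravel()
w, b = 0.0, 0.0
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    g = p - y                               # cross-entropy gradient
    w -= 0.5 * float(np.mean(g * x))
    b -= 0.5 * float(np.mean(g))

# Round 2: the model's own predictions become the refined GT
gt2 = (1.0 / (1.0 + np.exp(-(w * img + b))) > 0.5).astype(float)
agreement = float(np.mean(gt1 == gt2))
```

In the paper the "model" is a deep segmentation network rather than a one-parameter pixel classifier, but the self-supervised loop — rule, pre-train, relabel, re-train — has the same shape.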


2022 ◽  
Vol 9 ◽  
Author(s):  
Maoyi Zhang ◽  
Changqing Ding ◽  
Shuli Guo

Tracheobronchial diverticula (TD) are common cystic lesions that are easily neglected; hence, accurate and rapid identification is critical for later diagnosis. There is a strong need to automate this diagnostic process because traditional manual observation is time-consuming and laborious. However, most studies have only reported cases or listed relationships between the disease and other physiological indicators, and few have adopted advanced technologies such as deep learning for automated identification and diagnosis. To fill this gap, this study casts TD recognition as semantic segmentation and proposes a novel attention-based network for TD semantic segmentation. Since a TD lesion is small and similar in appearance to surrounding organs, we designed atrous spatial pyramid pooling (ASPP) and attention mechanisms, which can efficiently complete the segmentation of TD with robust results. The proposed attention model selectively gathers features from different branches according to the amount of information they contain. Moreover, to the best of our knowledge, no public research data are available yet; for efficient network training, we constructed a dataset containing 218 TD images and the related ground truth (GT). We evaluated different models on the proposed dataset, among which the highest MIoU reached 0.92. The experiments show that our model outperforms state-of-the-art methods, indicating that deep learning has great potential for TD recognition.
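The ASPP idea — parallel convolution branches with different dilation rates plus an image-level pooling branch, stacked so that later layers see several receptive-field sizes at once — can be sketched in one dimension with numpy; the kernels and rates below are illustrative assumptions:

```python
import numpy as np

def dilated_conv1d(x, kernel, rate):
    """'Same'-padded 1-D convolution with the given dilation rate."""
    k = len(kernel)
    span = (k - 1) * rate
    xp = np.pad(x, (span // 2, span - span // 2))
    return np.array([sum(kernel[j] * xp[i + j * rate] for j in range(k))
                     for i in range(len(x))])

def aspp_1d(x, kernel, rates):
    """Atrous spatial pyramid pooling, 1-D toy version: parallel dilated
    branches plus a global-context (image-level pooling) branch."""
    branches = [dilated_conv1d(x, kernel, r) for r in rates]
    branches.append(np.full_like(x, x.mean()))   # global-context branch
    return np.stack(branches)

x = np.sin(np.linspace(0.0, 3.0, 32))
feats = aspp_1d(x, np.ones(3) / 3.0, rates=[1, 2, 4])
```

Larger dilation rates widen the receptive field without adding parameters, which is why ASPP helps with lesions that are small yet must be distinguished from broadly similar surroundings; the attention mechanism then weights these branches by how informative each is.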

