Training Convolutional Neural Network for Sketch Recognition on Large-Scale Dataset

Author(s):  
Wen Zhou ◽  
Jinyuan Jia

With the rapid development of computer vision technology, increasing attention has been paid to image recognition. More specifically, a sketch is an important hand-drawn image that is garnering increased attention. Moreover, as handheld devices such as tablets and smartphones have become more popular, it has become increasingly convenient for people to hand-draw sketches using this equipment. Hence, sketch recognition is a necessary task to improve the performance of intelligent equipment. In this paper, a sketch recognition learning approach is proposed that is based on the Visual Geometry Group 16 Convolutional Neural Network (VGG16 CNN). In particular, in order to diminish the effect of the number of sketches on the learning method, we adopt a strategy of increasing the quantity to improve the diversity and scale of sketches. Initially, sketch features are extracted via the pretrained VGG16 CNN. Additionally, we obtain contextual features based on the traverse stroke scheme. Then, the VGG16 CNN is trained using a joint Bayesian method to update the related network parameters. Moreover, this network has been applied to predict the labels of input sketches in order to automatically recognize the label of a sketch. Finally, related experiments are conducted, and a comparison of our method with state-of-the-art methods is performed, which shows that our approach is superior and feasible.

2021 ◽  
Vol 2078 (1) ◽  
pp. 012053
Author(s):  
Yangfeng Wang ◽  
Tao Chen

With the rapid development of science and technology, biotechnology has advanced quickly. Among the many biometric technologies, finger vein technology has the characteristics of vitality, portability, and non-replicability, so it is considered to be the most promising biometric technology. However, the accuracy of finger vein recognition is affected by the collection device, the surrounding temperature and the algorithm. Because of these flaws, it cannot yet be applied to real life on a large scale. This paper designs a finger vein recognition system based on a convolutional neural network and Android, which mainly includes the following three parts. First, the system hardware includes the design of the acquisition device, the selection of the core development board and the display screen. Second, the design of the entire system software architecture is based on the MVVM architecture, which ensures low coupling of the program and is easy for later expansion and maintenance. The software includes a collection function, a recognition function and an administrator function. Finally, a lightweight neural network is proposed for finger vein feature extraction, and a storage method based on MMKV is proposed to meet the real-time performance requirements of the system.


2020 ◽  
Vol 39 (4) ◽  
pp. 5319-5327
Author(s):  
Feng Lei ◽  
You Yu ◽  
Daijun Zhang ◽  
Li Feng ◽  
Jinsong Guo ◽  
...  

In recent years, with the rapid development of satellite technology, remote sensing inversion has become an important part of environmental monitoring. Remote sensing inversion makes large-scale water environment monitoring feasible in watersheds that are difficult for traditional water environment monitoring methods to cover. This paper discusses some shortcomings of traditional remote sensing inversion methods and proposes a remote sensing inversion method based on a convolutional neural network, which realizes large-scale, smart and automatic remote sensing inversion monitoring of the water environment. The results show that the method is practical and effective, and can achieve high recognition accuracy for water blooms.


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 122784-122795
Author(s):  
Panle Li ◽  
Xiaohui He ◽  
Xijie Cheng ◽  
Xu Gao ◽  
Runchuan Li ◽  
...  

2020 ◽  
Vol 2020 (10) ◽  
pp. 181-1-181-7
Author(s):  
Takahiro Kudo ◽  
Takanori Fujisawa ◽  
Takuro Yamaguchi ◽  
Masaaki Ikehara

Image deconvolution has been an important issue recently. There are two kinds of approaches: non-blind and blind. Non-blind deconvolution is a classic problem of image deblurring, which assumes that the PSF is known and does not change universally in space. Recently, Convolutional Neural Networks (CNNs) have been used for non-blind deconvolution. Though CNNs can deal with complex changes for unknown images, some conventional CNN-based methods can only handle small PSFs and do not consider the use of large PSFs in the real world. In this paper we propose a non-blind deconvolution framework based on a CNN that can remove large-scale ringing in a deblurred image. Our method has three key points. The first is that our network architecture is able to preserve both large and small features in the image. The second is that the training dataset is created to preserve the details. The third is that we extend the images to minimize the effects of large ringing on the image borders. In our experiments, we used three kinds of large PSFs and were able to observe high-precision results from our method both quantitatively and qualitatively.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Sangmin Jeon ◽  
Kyungmin Clara Lee

Objective: The rapid development of artificial intelligence technologies for medical imaging has recently enabled automatic identification of anatomical landmarks on radiographs. The purpose of this study was to compare the results of an automatic cephalometric analysis using a convolutional neural network with those obtained by a conventional cephalometric approach. Material and methods: Cephalometric measurements of lateral cephalograms from 35 patients were obtained using an automatic program and a conventional program. Fifteen skeletal cephalometric measurements, nine dental cephalometric measurements, and two soft tissue cephalometric measurements obtained by the two methods were compared using the paired t test and Bland-Altman plots. Results: A comparison between the measurements from the automatic and conventional cephalometric analyses in terms of the paired t test confirmed that the saddle angle, linear measurements of maxillary incisor to NA line, and mandibular incisor to NB line showed statistically significant differences. All measurements were within the limits of agreement based on the Bland-Altman plots. The limits of agreement were wider for the dental measurements than for the skeletal measurements. Conclusions: Automatic cephalometric analyses based on a convolutional neural network may offer clinically acceptable diagnostic performance. Careful consideration and additional manual adjustment are needed for dental measurements regarding tooth structures for higher accuracy and better performance.
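The abstract does not give the computation behind its two comparison tools, so the following is only a minimal sketch of the standard textbook definitions: the paired t statistic on per-patient differences, and Bland-Altman limits of agreement taken as the mean difference plus or minus 1.96 sample standard deviations. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math
import statistics


def paired_differences(a, b):
    """Element-wise differences between two matched measurement lists."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    return [x - y for x, y in zip(a, b)]


def paired_t_statistic(a, b):
    """Return (t statistic, degrees of freedom) for a paired t test."""
    diffs = paired_differences(a, b)
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_d / (sd_d / math.sqrt(n)), n - 1


def bland_altman_limits(a, b):
    """Return (bias, lower limit, upper limit) using bias +/- 1.96 * SD."""
    diffs = paired_differences(a, b)
    bias = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd_d, bias + 1.96 * sd_d


# Hypothetical angle measurements (degrees) for five patients.
automatic = [82.1, 79.4, 85.0, 80.2, 83.3]
conventional = [81.5, 80.0, 84.1, 80.9, 82.7]

t_value, dof = paired_t_statistic(automatic, conventional)
bias, lower, upper = bland_altman_limits(automatic, conventional)
print(f"t = {t_value:.3f} with {dof} degrees of freedom")
print(f"bias = {bias:.3f}, limits of agreement = [{lower:.3f}, {upper:.3f}]")
```

A wider interval between the lower and upper limits corresponds to the wider limits of agreement reported for the dental measurements.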


Entropy ◽  
2021 ◽  
Vol 23 (7) ◽  
pp. 816
Author(s):  
Pingping Liu ◽  
Xiaokang Yang ◽  
Baixin Jin ◽  
Qiuzhan Zhou

Diabetic retinopathy (DR) is a common complication of diabetes mellitus (DM), and it is necessary to diagnose DR in the early stages of treatment. With the rapid development of convolutional neural networks in the field of image processing, deep learning methods have achieved great success in the field of medical image processing. Various medical lesion detection systems have been proposed to detect fundus lesions. At present, in the image classification process of diabetic retinopathy, the fine-grained properties of the diseased image are ignored and most of the retinopathy image data sets have serious uneven distribution problems, which limits the ability of the network to predict the classification of lesions to a large extent. We propose a new non-homologous bilinear pooling convolutional neural network model and combine it with the attention mechanism to further improve the network’s ability to extract specific features of the image. The experimental results show that, compared with the most popular fundus image classification models, the network model we proposed can greatly improve the prediction accuracy of the network while maintaining computational efficiency.


Author(s):  
Anil S. Baslamisli ◽  
Partha Das ◽  
Hoang-An Le ◽  
Sezer Karaoglu ◽  
Theo Gevers

In general, intrinsic image decomposition algorithms interpret shading as one unified component including all photometric effects. As shading transitions are generally smoother than reflectance (albedo) changes, these methods may fail in distinguishing strong photometric effects from reflectance variations. Therefore, in this paper, we propose to decompose the shading component into direct (illumination) and indirect shading (ambient light and shadows) subcomponents. The aim is to distinguish strong photometric effects from reflectance variations. An end-to-end deep convolutional neural network (ShadingNet) is proposed that operates in a fine-to-coarse manner with a specialized fusion and refinement unit exploiting the fine-grained shading model. It is designed to learn specific reflectance cues separated from specific photometric effects to analyze the disentanglement capability. A large-scale dataset of scene-level synthetic images of outdoor natural environments is provided with fine-grained intrinsic image ground-truths. Large scale experiments show that our approach using fine-grained shading decompositions outperforms state-of-the-art algorithms utilizing unified shading on NED, MPI Sintel, GTA V, IIW, MIT Intrinsic Images, 3DRMS and SRD datasets.


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2852
Author(s):  
Parvathaneni Naga Srinivasu ◽  
Jalluri Gnana SivaSai ◽  
Muhammad Fazal Ijaz ◽  
Akash Kumar Bhoi ◽  
Wonjoon Kim ◽  
...  

Deep learning models are efficient in learning the features that assist in understanding complex patterns precisely. This study proposed a computerized process of classifying skin disease through deep-learning-based MobileNet V2 and Long Short Term Memory (LSTM). The MobileNet V2 model proved to be efficient, with better accuracy, and can work on lightweight computational devices. The proposed model is efficient in maintaining stateful information for precise predictions. A grey-level co-occurrence matrix is used for assessing the progress of diseased growth. The performance has been compared against other state-of-the-art models such as Fine-Tuned Neural Networks (FTNN), Convolutional Neural Network (CNN), Very Deep Convolutional Networks for Large-Scale Image Recognition developed by Visual Geometry Group (VGG), and a convolutional neural network architecture expanded with few changes. The HAM10000 dataset is used and the proposed method has outperformed other methods with more than 85% accuracy. It recognizes the affected region much faster, with almost 2× fewer computations than the conventional MobileNet model, which keeps the computational effort minimal. Furthermore, a mobile application is designed for instant and proper action. It helps the patient and dermatologists identify the type of disease from the affected region’s image at the initial stage of the skin disease. These findings suggest that the proposed system can help general practitioners efficiently and effectively diagnose skin conditions, thereby reducing further complications and morbidity.
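The abstract names the grey-level co-occurrence matrix (GLCM) without describing how it is built. As a rough illustration only, the sketch below counts how often grey level i is followed by grey level j at a fixed pixel offset and derives the common contrast statistic from those counts. The helper names, the offset, and the tiny test image are assumptions made for this example, not details of the authors' pipeline.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count pairs (i, j) where a pixel of level i has level j at offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Weighted mean of squared level differences over all counted pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


# Hypothetical 2 x 3 image quantized to three grey levels (0, 1, 2).
image = [
    [0, 0, 1],
    [1, 2, 2],
]
counts = glcm(image, levels=3)  # horizontal neighbour, one pixel to the right
print(counts)
print(contrast(counts))
```

Texture statistics such as contrast can be tracked across images taken at different times, which is one plausible way a GLCM could support assessing lesion growth.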


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Changming Wu ◽  
Heshan Yu ◽  
Seokhyeong Lee ◽  
Ruoming Peng ◽  
Ichiro Takeuchi ◽  
...  

Neuromorphic photonics has recently emerged as a promising hardware accelerator, with significant potential speed and energy advantages over digital electronics for machine learning algorithms, such as neural networks of various types. Integrated photonic networks are particularly powerful in performing analog computing of matrix-vector multiplication (MVM) as they afford unparalleled speed and bandwidth density for data transmission. Incorporating nonvolatile phase-change materials in integrated photonic devices enables indispensable programming and in-memory computing capabilities for on-chip optical computing. Here, we demonstrate a multimode photonic computing core consisting of an array of programmable mode converters based on on-waveguide metasurfaces made of phase-change materials. The programmable converters utilize the refractive index change of the phase-change material Ge2Sb2Te5 during phase transition to control the waveguide spatial modes with a very high precision of up to 64 levels in modal contrast. This contrast is used to represent the matrix elements, with 6-bit resolution and both positive and negative values, to perform MVM computation in neural network algorithms. We demonstrate a prototypical optical convolutional neural network that can perform image processing and recognition tasks with high accuracy. With a broad operation bandwidth and a compact device footprint, the demonstrated multimode photonic core is promising toward large-scale photonic neural networks with ultrahigh computation throughputs.
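To make the 6-bit figure concrete, the sketch below simulates in software a matrix-vector multiplication whose matrix elements are restricted to 64 evenly spaced signed values. The uniform spacing over [-1, 1] and the function names are assumptions for illustration; the abstract does not state how modal contrast is mapped to numeric weights.

```python
def quantize(value, levels=64, lo=-1.0, hi=1.0):
    """Snap value to the nearest of `levels` evenly spaced points in [lo, hi]."""
    clipped = max(lo, min(hi, value))
    step = (hi - lo) / (levels - 1)
    index = round((clipped - lo) / step)  # integer level index, 0 .. levels - 1
    return lo + index * step


def quantized_mvm(matrix, vector, levels=64):
    """Multiply a matrix by a vector after quantizing every matrix element."""
    return [
        sum(quantize(weight, levels) * x for weight, x in zip(row, vector))
        for row in matrix
    ]


# Hypothetical 2 x 3 weight matrix and input vector.
weights = [
    [0.30, -0.75, 0.10],
    [-1.00, 0.50, 0.90],
]
inputs = [1.0, 2.0, 3.0]

exact = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
approx = quantized_mvm(weights, inputs)
print(exact)
print(approx)
```

With 64 levels on a symmetric range, each weight is off by at most half a step (about 0.016 here), and zero itself is not exactly representable under this particular mapping.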

