A Compact High-Quality Image Demosaicking Neural Network for Edge-Computing Devices

Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3265
Author(s):  
Shuyu Wang ◽  
Mingxin Zhao ◽  
Runjiang Dou ◽  
Shuangming Yu ◽  
Liyuan Liu ◽  
...  

Image demosaicking is an essential and challenging problem among the most crucial steps of image processing behind image sensors. Owing to the rapid development of deep-learning-based intelligent processors, several demosaicking methods based on convolutional neural networks (CNNs) have been proposed. However, with their large numbers of model parameters, these networks are difficult to run in real time on edge-computing devices. This paper presents a compact demosaicking neural network based on the UNet++ structure. The network inserts densely connected layer blocks and adopts Gaussian smoothing layers instead of down-sampling operations before the backbone network. The densely connected blocks extract mosaic image features efficiently by exploiting the correlation between feature maps, and adopt depthwise separable convolutions to reduce the number of model parameters; the Gaussian smoothing layers expand the receptive fields without down-sampling the image or discarding image information. The size constraints on the input and output images can also be relaxed, and the quality of the demosaicked images is improved. Experimental results show that the proposed network improves running speed by 42% over the fastest CNN-based method while achieving comparable reconstruction quality on four mainstream datasets. Moreover, when inference is performed on the demosaicked images with typical deep CNN networks, MobileNet v1 and SSD, the accuracy reaches 85.83% (top-5) and 75.44% (mAP), respectively, comparable to existing methods. The proposed network has the highest computing efficiency and the lowest parameter count among all methods, demonstrating that it is well suited to modern edge-computing devices.
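Since the abstract credits depthwise separable convolutions with the parameter reduction, a short sketch of the parameter arithmetic may help; the channel counts and kernel size below are illustrative, not the paper's actual configuration.

```python
# Hedged sketch: parameter-count comparison between a standard convolution
# and a depthwise separable convolution. Values are illustrative only.

def standard_conv_params(c_in, c_out, k):
    # One k x k kernel per (input channel, output channel) pair, plus biases.
    return c_in * c_out * k * k + c_out

def depthwise_separable_params(c_in, c_out, k):
    # Depthwise stage: one k x k kernel per input channel.
    depthwise = c_in * k * k + c_in
    # Pointwise stage: 1 x 1 convolution mixing channels.
    pointwise = c_in * c_out + c_out
    return depthwise + pointwise

if __name__ == "__main__":
    c_in, c_out, k = 64, 128, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

For these hypothetical sizes the separable variant needs roughly an order of magnitude fewer weights, which is the kind of saving that makes edge deployment plausible.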

2021 ◽  
Author(s):  
Yuguang Ye

Abstract With the rapid development of intelligent algorithms and image processing technology, the limitations of traditional image processing methods have become increasingly obvious. Against this background, this paper studies sparse representation optimization of image Gaussian mixture features based on a convolutional neural network (CNN), and designs a CNN-based sparse representation system model for vehicle detection images. Vehicle image data are collected from multiple sources, and the CNN is used for comprehensive analysis and evaluation. By building a real-time vehicle detection imaging scheme and exploiting the image features together with the CNN algorithm, the model extracts feature information from vehicle detection images more effectively. The results show that the CNN-based Gaussian mixture feature sparse representation optimization model offers high feasibility, high data accuracy, and fast response, which enhances the processing efficiency of vehicle detection images and improves the utilization of local environmental information in the image.
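The abstract does not spell out the Gaussian mixture feature itself, but the usual construction is to model intensities with a mixture of Gaussians and use per-component responsibilities as features. A minimal 1-D NumPy sketch, with hand-fixed mixture parameters standing in for the learned ones:

```python
# Minimal sketch of a "Gaussian mixture feature": per-component posterior
# responsibilities for each sample. Mixture parameters are fixed by hand
# here; the paper learns them from vehicle image data.
import numpy as np

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def responsibilities(x, mus, sigmas, weights):
    # Posterior probability of each mixture component given each sample.
    probs = np.stack([w * gaussian_pdf(x, m, s)
                      for m, s, w in zip(mus, sigmas, weights)])
    return probs / probs.sum(axis=0)

if __name__ == "__main__":
    pixels = np.array([0.1, 0.5, 0.9])
    r = responsibilities(pixels, mus=[0.0, 1.0], sigmas=[0.3, 0.3],
                         weights=[0.5, 0.5])
    print(r.round(3))
```

Dark pixels load on the low-mean component and bright pixels on the high-mean one; such responsibility vectors are what a sparse representation stage would then compress.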


2021 ◽  
Vol 13 (18) ◽  
pp. 3605
Author(s):  
Xin Luo ◽  
Guangling Lai ◽  
Xiao Wang ◽  
Yuwei Jin ◽  
Xixu He ◽  
...  

With the rapid development of unmanned aerial vehicle (UAV) technology, the volume of UAV remote sensing images is increasing sharply. However, owing to the limited field of view of UAV remote sensing, images obtained from different viewpoints of the same scene need to be stitched together for further applications. Therefore, an automatic registration method for UAV remote sensing images based on deep residual features is proposed in this work. It needs no additional training and does not depend on image features, such as points, lines, and shapes, or on specific image contents. The registration framework is built as follows: to address the problem that most traditional registration methods use only low-level features, deep residual features are extracted by an excellent deep neural network, ResNet-50. A tensor product is then employed to construct feature description vectors from the extracted high-level abstract features. Finally, the progressive sample consensus (PROSAC) algorithm is exploited to remove false matches and fit a geometric transformation model so as to enhance registration accuracy. Experimental results on typical scene images with different resolutions acquired by different UAV image sensors indicate that the improved algorithm achieves higher registration accuracy than a state-of-the-art deep learning registration algorithm and other popular registration algorithms.
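The descriptor-construction step can be sketched with an outer product: pairwise interactions between two feature vectors, flattened to a fixed length. The small random vectors below merely stand in for real ResNet-50 activations; the actual dimensions and pipeline are those of the paper.

```python
# Sketch of building a feature description vector via a tensor (outer)
# product of two high-level feature vectors. Inputs are toy stand-ins
# for network activations.
import numpy as np

def tensor_product_descriptor(feat_a, feat_b):
    # The outer product captures pairwise interactions between the two
    # feature vectors; flattening yields a fixed-length descriptor.
    return np.outer(feat_a, feat_b).ravel()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fa, fb = rng.standard_normal(8), rng.standard_normal(8)
    desc = tensor_product_descriptor(fa, fb)
    print(desc.shape)  # (64,)
```

Descriptors built this way can be compared with any standard distance before PROSAC prunes the resulting tentative matches.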


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Qunjing Ji

With the rapid development of image recognition technology, freehand sketch recognition has attracted increasing attention. Achieving good recognition in the absence of color and texture information is key to the development of freehand sketch recognition. Traditional non-learning classical models depend heavily on manually selected features. To solve this problem, a neural network sketch recognition method based on the DSCN structure is proposed in this paper. First, the stroke sequence of the sketch is recorded; features are then extracted from the stroke sequence with a neural network, and the extracted image features serve as the model input to construct the temporal relationship between different image features. In controlled experiments on the TU-Berlin dataset, the recognition accuracy of the DSCN network improves on the traditional non-learning methods HOG-SVM, SIFT-Fisher Vector, MKL-SVM, and FV-SP by 15.8%, 10.3%, 6.0%, and 2.9%, respectively, and on the classical deep learning model AlexNet by 5.6%. These results show that the proposed DSCN network has strong feature extraction and nonlinear expression abilities and, by introducing the stroke order, can effectively improve the recognition accuracy of freehand sketches.


2021 ◽  
Author(s):  
Wei-Che Chien ◽  
Hsin-Hung Cho ◽  
Fan-Hsun Tseng ◽  
Shih-Yeh Chen

Abstract The rapid development of the Internet of Things and multimedia applications has led to exponential year-on-year growth in mobile network traffic. To meet the demand for large amounts of data transmission and alleviate the shortage of spectrum resources, millimeter waves are adopted for 5G communication. For B5G/6G, effective use of spectrum resources is one of the key technologies for the development of mobile communications. Therefore, this study uses a lightweight neural network to predict cellular traffic from regional data, considering both the temporal and the spatial dependence of the data. Furthermore, to optimize prediction performance and reduce the number of neural network parameters, the study uses a meta-heuristic algorithm to adjust the hyperparameters, and combines local and global explanations to interpret the improvement in traffic prediction. The local explanations show the adjustment results for a single hyperparameter, while the global explanations show the correlations between different hyperparameters and their influence on the number of model parameters and on accuracy. Simulation results show that, compared with manual adjustment and a greedy algorithm, the proposed explainable learning method effectively improves the accuracy of cellular traffic prediction, reduces the number of parameters, and provides a reasonable explanation.
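The abstract does not name the meta-heuristic, so a simulated-annealing loop over two hypothetical hyperparameters stands in below; the toy objective replaces the real "prediction error plus parameter count" score and is purely illustrative.

```python
# Hedged sketch of meta-heuristic hyperparameter search via simulated
# annealing. The objective is a made-up bowl with a pretend optimum at
# (lr=0.01, hidden=64); the real score would come from model training.
import math, random

def objective(lr, hidden):
    # Hypothetical score to minimise (lower is better).
    return (math.log10(lr) + 2) ** 2 + ((hidden - 64) / 64) ** 2

def anneal(steps=500, temp=1.0, seed=0):
    rng = random.Random(seed)
    lr, hidden = 0.1, 256                      # deliberately poor start
    best = (objective(lr, hidden), lr, hidden)
    for t in range(steps):
        # Propose a neighbour: multiplicative jitter on lr, additive on width.
        cand_lr = min(1.0, max(1e-5, lr * 10 ** rng.uniform(-0.3, 0.3)))
        cand_hidden = min(512, max(8, hidden + rng.randint(-32, 32)))
        delta = objective(cand_lr, cand_hidden) - objective(lr, hidden)
        cooling = max(temp * (1 - t / steps), 1e-9)
        # Always accept improvements; accept worse moves with decaying odds.
        if delta < 0 or rng.random() < math.exp(-delta / cooling):
            lr, hidden = cand_lr, cand_hidden
            score = objective(lr, hidden)
            if score < best[0]:
                best = (score, lr, hidden)
    return best

if __name__ == "__main__":
    score, lr, hidden = anneal()
    print(f"best score {score:.4f} at lr={lr:.4g}, hidden={hidden}")
```

Logging which single-parameter move improved the score at each acceptance is one simple route to the "local explanations" the abstract describes.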


2021 ◽  
Author(s):  
Sofia Krasovskaya ◽  
Georgii Zhulikov ◽  
Joseph MacInnes

Approximately twenty years ago, Laurent Itti and Christof Koch created a model of saliency in visual attention that recreates the behavior of biological pyramidal neurons by mimicking neurons with centre-surround receptive fields. The saliency model has launched many studies that contributed to the understanding of the layers of vision and of visual attention. The aim of the current study is to improve this model by using an artificial neural network as the spatial component of a model that generates saccades similar to human saccadic eye movements. The proposed model uses a Leaky Integrate-and-Fire layer for temporal predictions and replaces the parallel feature maps with a deep neural network, creating a generative model that is precise for both spatial and temporal shifts of attention. Our model was able to predict eye movements from raw image input through unsupervised learning, combined with supervised learning from fixation maps retrieved during an eye-tracking experiment. The results imply that it is possible to match the model's spatial and temporal distributions to the corresponding human distributions.
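The Leaky Integrate-and-Fire unit behind the temporal predictions is simple enough to sketch directly: the membrane potential leaks toward rest, integrates the input drive, and emits a spike with a reset when it crosses threshold. The constants below are illustrative, not fitted to the eye-tracking data.

```python
# Minimal Leaky Integrate-and-Fire (LIF) neuron. Leak and threshold
# values are illustrative only.

def lif(inputs, leak=0.9, threshold=1.0):
    v, spikes = 0.0, []
    for x in inputs:
        v = leak * v + x          # leaky integration of the input drive
        if v >= threshold:        # fire and reset
            spikes.append(1)
            v = 0.0
        else:
            spikes.append(0)
    return spikes

if __name__ == "__main__":
    drive = [0.3] * 10            # constant sub-threshold drive
    print(lif(drive))             # periodic spiking despite weak input
```

Because the potential accumulates across time steps, even a sub-threshold drive eventually produces regular spikes, which is what makes the unit useful for predicting when, not just where, the next saccade occurs.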


Entropy ◽  
2020 ◽  
Vol 22 (12) ◽  
pp. 1351
Author(s):  
Tomasz Hachaj ◽  
Justyna Miazga

Hashtag-based image descriptions are a popular approach to labeling images on social media platforms. In practice, images are often described by more than one hashtag. Due to the rapid development of deep neural networks specialized in image embedding and classification, it is now possible to generate those descriptions automatically. In this paper we propose a novel Voting Deep Neural Network with Associative Rules Mining (VDNN-ARM) algorithm that can be used to solve multi-label hashtag recommendation problems. VDNN-ARM is a machine learning approach that uses an ensemble of deep neural networks to generate image features, which are then classified into potential hashtag sets. The proposed hashtags are then filtered by a voting scheme; the remaining hashtags may be included in the final recommended set through association rules mining, which explores dependencies within certain hashtag groups. Our approach is evaluated on the HARRISON benchmark dataset as a multi-label classification problem. The highest values of our evaluation measures, including precision, recall, and accuracy, were obtained for VDNN-ARM with a confidence threshold of 0.95. VDNN-ARM outperforms state-of-the-art algorithms, exceeding VGG-Object + VGG-Scene in precision by 17.91%, and ensemble-FFNN (intersection) in recall by 32.33% and in accuracy by 27.00%. Both the dataset and all the source code we implemented for this research are available for download, and our results can be reproduced.
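The voting stage can be sketched in a few lines: each ensemble member proposes a hashtag set, and a tag survives only if enough members agree. The association-rule filtering that follows in VDNN-ARM is omitted here, and `min_votes` is an assumed parameter, not the paper's.

```python
# Sketch of ensemble voting over proposed hashtag sets.
from collections import Counter

def vote_hashtags(proposals, min_votes=2):
    # Count in how many proposal sets each tag appears, then keep the
    # tags that reach the vote threshold.
    counts = Counter(tag for tags in proposals for tag in set(tags))
    return {tag for tag, n in counts.items() if n >= min_votes}

if __name__ == "__main__":
    proposals = [{"beach", "sunset"}, {"beach", "food"}, {"beach", "sunset"}]
    print(sorted(vote_hashtags(proposals)))  # ['beach', 'sunset']
```

Tags rejected here could still be rescued downstream by association rules (e.g. "sunset" frequently co-occurring with "beach"), which is the role the ARM stage plays in the full algorithm.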


Author(s):  
Leilei Jin ◽  
Hong LIANG ◽  
Changsheng Yang

Underwater target recognition is a core technology of underwater unmanned detection. To improve the accuracy of underwater automatic target recognition, a sonar image recognition method based on a convolutional neural network is proposed, and an underwater target recognition model is established according to the characteristics of sonar images. First, the sonar image is segmented and cropped with a saliency detection method to reduce the dimensionality of the input data and the interference of the image background in the feature extraction process. Second, stacked convolutional and pooling layers automatically learn the high-level semantic information of the target from the input sonar image, avoiding the loss of effective information caused by manual feature extraction. Finally, the spatial pyramid pooling method extracts multi-scale information from the sonar feature maps, compensating for the lack of detailed information in sonar images and solving the problem caused by inconsistent input image sizes. Experimental results on the collected sonar image dataset show that the proposed method recognizes underwater targets more accurately and efficiently than conventional convolutional neural networks.
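Spatial pyramid pooling is what lets inputs of any size produce a fixed-length vector: the feature map is max-pooled over a pyramid of grids. A single-channel NumPy sketch, using the common 1x1/2x2/4x4 levels (not necessarily the paper's choice):

```python
# Sketch of spatial pyramid pooling (SPP) on one feature map channel:
# max-pool over 1x1, 2x2 and 4x4 grids, concatenating the cell maxima
# into a fixed-length vector regardless of input size.
import numpy as np

def spp(feature_map, levels=(1, 2, 4)):
    h, w = feature_map.shape
    pooled = []
    for n in levels:
        # Split the map into an n x n grid and max-pool each cell.
        rows = np.array_split(np.arange(h), n)
        cols = np.array_split(np.arange(w), n)
        for r in rows:
            for c in cols:
                pooled.append(feature_map[np.ix_(r, c)].max())
    return np.array(pooled)

if __name__ == "__main__":
    fm = np.arange(35.0).reshape(5, 7)   # odd, non-square size on purpose
    vec = spp(fm)
    print(vec.shape)  # (21,) = 1 + 4 + 16 cells regardless of input size
```

A 5x7 map and an 8x8 map both yield 21 values per channel, which is exactly the property that frees the network from a fixed input resolution.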


2021 ◽  
Vol 38 (3) ◽  
pp. 895-906
Author(s):  
Ruiyang Qi ◽  
Zhiqiang Liu

Fire image monitoring systems are being applied in more and more fields owing to their large monitoring area. However, existing image-processing-based fire detection technology cannot issue real-time fire warnings effectively in actual scenes, and the relevant fire recognition algorithms are not robust enough. To solve these problems, this paper extracts and classifies image features for fire recognition based on a convolutional neural network (CNN). Specifically, the authors set up the framework of a fire recognition system based on fire video images (FVIFRS) and extracted both static and dynamic flame features. To improve the efficiency of image analysis, a Gaussian mixture model was established to extract features from the smoke movement areas. Finally, the CNN was improved to process and classify the fire feature maps. The proposed algorithm and model were shown to be feasible and effective through experiments.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Guiyong Xu ◽  
Yang Xu ◽  
Sicong Zhang ◽  
Xiaoyao Xie

In the era of big data, the convolutional neural network (CNN) has been widely used in the field of image classification and has achieved excellent performance. In recent years, more and more researchers have begun to combine deep neural networks with steganalysis to improve performance. However, most steganalysis algorithms based on convolutional neural networks have only been tested against the WOW and S-UNIWARD algorithms; meanwhile, their versatility is insufficient due to long training times and limits on image size. This paper proposes a new network architecture, called SFRNet, to solve these problems. The feature extraction and fusion layer extracts more features from the digital image. The RepVGG block is used to accelerate inference and increase memory utilization. The SE block improves the detection accuracy because it learns feature weights, giving effective feature maps significant weights and invalid or ineffective feature maps small weights. Experimental results show that SFRNet achieves excellent detection accuracy against four state-of-the-art spatial-domain steganography algorithms, i.e., HUGO, WOW, S-UNIWARD, and MiPOD, under different payloads. SFRNet achieves a detection accuracy of 89.6% against the S-UNIWARD algorithm at a payload of 0.4 bpp and 72.5% at 0.2 bpp. At the same time, the training time of our network is reduced by 35% compared with Yedroudj-Net.
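The channel-reweighting behaviour attributed to the SE block can be sketched in NumPy: squeeze each channel to a scalar by global average pooling, pass the scalars through a small two-layer gate, and rescale the channels by the resulting weights. The gate weights are random stand-ins here; in SFRNet they are learned.

```python
# Sketch of a squeeze-and-excitation (SE) block: global average pooling
# per channel, a ReLU + sigmoid gate, then channel-wise rescaling.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(x, w1, w2):
    # x: (channels, h, w); w1: (channels, c_mid); w2: (c_mid, channels)
    squeezed = x.mean(axis=(1, 2))                      # squeeze step
    gate = sigmoid(np.maximum(squeezed @ w1, 0) @ w2)   # excitation step
    return x * gate[:, None, None]                      # rescale channels

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 5, 5))
    w1, w2 = rng.standard_normal((8, 2)), rng.standard_normal((2, 8))
    y = se_block(x, w1, w2)
    print(y.shape)  # (8, 5, 5)
```

Because the gate values lie in (0, 1), the block can only attenuate channels, which is how "invalid or ineffective" feature maps end up with small weights.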


2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Mingyu Gao ◽  
Peng Song ◽  
Fei Wang ◽  
Junyan Liu ◽  
Andreas Mandelis ◽  
...  

Wood defects can be identified quickly from an optical image using deep learning methodology, which effectively improves wood utilization. Traditional neural network techniques have not yet been employed for wood defect detection because of long training times, low recognition accuracy, and the need for manual extraction of defect image features. In this work, a model (called ReSENet-18) for wood knot defect detection that combines deep learning and transfer learning is proposed. The "squeeze-and-excitation" (SE) module is first embedded into the "residual basic block" structure to construct an "SE-Basic-Block" module. This design extracts features in the channel dimension and fuses them with the original features at multiple scales. In addition, the fully connected layer is replaced with global average pooling, so the model parameters can be reduced effectively. The experimental results show that the accuracy reaches 99.02% while the training time is also reduced. The proposed deep convolutional neural network based on ReSENet-18 combined with transfer learning can thus improve the accuracy of defect recognition and has potential application in the detection of wood knot defects.
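Why swapping the fully connected head for global average pooling (GAP) shrinks the model can be shown with a line of arithmetic: the FC head needs a weight per (flattened activation, class) pair, while GAP needs none. The sizes below are illustrative, not ReSENet-18's.

```python
# Sketch: parameter cost of a fully connected classification head versus
# global average pooling (GAP), which has no learned parameters at all.
import numpy as np

def fc_head_params(channels, h, w, n_classes):
    # Every flattened activation connects to every class, plus biases.
    return channels * h * w * n_classes + n_classes

def gap_head(x):
    # x: (channels, h, w) -> one value per channel, no learned parameters.
    return x.mean(axis=(1, 2))

if __name__ == "__main__":
    print(fc_head_params(512, 7, 7, 10))   # FC head weight count
    x = np.ones((512, 7, 7))
    print(gap_head(x).shape)               # (512,)
```

For this hypothetical 512x7x7 feature volume and 10 classes, the FC head alone costs roughly a quarter of a million weights that GAP simply removes.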

