convolution kernels
Recently Published Documents


TOTAL DOCUMENTS

182
(FIVE YEARS 78)

H-INDEX

15
(FIVE YEARS 4)

2022 ◽  
pp. 1-1
Author(s):  
Zipeng Ye ◽  
Mengfei Xia ◽  
Ran Yi ◽  
Juyong Zhang ◽  
Yu-Kun Lai ◽  
...  

2022 ◽  
pp. 103397
Author(s):  
Huibin Zhang ◽  
Liping Feng ◽  
Xiaohua Zhang ◽  
Yuchi Yang ◽  
Jing Li

2021 ◽  
pp. 1-10
Author(s):  
Rui Cao ◽  
Feng Jiang ◽  
Zhao Wu ◽  
Jia Ren

With the advancement of computer performance, deep learning is playing a vital role on hardware platforms. Indoor scene segmentation is a challenging deep learning task because indoor objects tend to obscure each other, and the dense layout increases the difficulty of segmentation. Yet current networks pursue accuracy improvements at the cost of speed and increased memory usage. To achieve a compromise between accuracy, speed, and model size, this paper proposes the Multichannel Fusion Network (MFNet) for indoor scene segmentation, which mainly consists of a Dense Residual Module (DRM) and a Multi-scale Feature Extraction Module (MFEM). The MFEM uses depthwise separable convolution to cut the number of parameters and matches convolution kernels of different sizes with different dilation rates to achieve an optimal receptive field; the DRM fuses feature maps at several levels of resolution to refine segmentation details. Experimental results on the NYU V2 dataset show that the proposed method achieves very competitive results compared with other advanced algorithms, with a segmentation speed of 38.47 fps, nearly twice that of DeepLab v3+, while using only 1/5 of its parameters. Its segmentation results are close to those of advanced segmentation networks, making it well suited to real-time image processing.
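The parameter saving from depthwise separable convolution mentioned in this abstract can be sketched by simple counting; the layer sizes below are illustrative assumptions, not MFNet's actual configuration.

```python
# Parameter-count comparison: standard vs. depthwise separable convolution.

def standard_conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k conv (one filter per input channel)
    followed by a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise

c_in, c_out, k = 128, 128, 3
std = standard_conv_params(c_in, c_out, k)        # 147456 weights
dsc = depthwise_separable_params(c_in, c_out, k)  # 1152 + 16384 = 17536 weights
print(std, dsc, round(std / dsc, 1))              # roughly 8.4x fewer parameters
```

For a 3x3 kernel the factorization costs roughly 1/9 of the standard layer's weights, which is the kind of saving that lets a network trade parameters for speed.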


2021 ◽  
Author(s):  
Jian Zhao ◽  
ZhiWei Zhang ◽  
Jinping Qiu ◽  
Lijuan Shi ◽  
Zhejun KUANG ◽  
...  

Abstract With the rapid development of deep learning in recent years, automatic electroencephalography (EEG) emotion recognition has received widespread attention. At present, most deep learning methods do not normalize EEG data properly and do not fully extract time- and frequency-domain features, which affects the accuracy of EEG emotion recognition. To solve these problems, we propose GTScepeion, a deep learning EEG emotion recognition model. In pre-processing, the EEG time-slice data, including all channels, are pre-processed. In our model, global convolution kernels extract overall semantic features, followed by three kinds of temporal convolution kernels representing different emotional periods and two kinds of spatial convolution kernels highlighting brain-hemispheric differences to extract spatial features; finally, emotions are classified into two classes by a fully connected layer. The experiments are based on the DEAP dataset, and our model can effectively normalize the data and fully extract features. For Arousal, our accuracy is 8.76% higher than that of the current optimal Inception-based emotion recognition model. For Valence, the best accuracy of our model reaches 91.51%.
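The abstract stresses proper normalization of EEG data. A common choice is per-channel z-score normalization of each time slice; the exact scheme GTScepeion uses is not specified here, so the following is only an illustrative assumption (shape convention: channels x samples).

```python
import numpy as np

def zscore_per_channel(eeg, eps=1e-8):
    """Normalize each EEG channel to zero mean and unit variance
    over the time axis. eeg: array of shape (channels, samples)."""
    mean = eeg.mean(axis=1, keepdims=True)
    std = eeg.std(axis=1, keepdims=True)
    return (eeg - mean) / (std + eps)

rng = np.random.default_rng(0)
eeg = rng.normal(loc=5.0, scale=2.0, size=(32, 512))  # 32 channels, 512 samples
norm = zscore_per_channel(eeg)
print(norm.mean(axis=1).max(), norm.std(axis=1).min())  # both near 0 and 1
```

Normalizing per channel rather than globally prevents high-amplitude channels from dominating the convolutional features.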


Author(s):  
Lei Liu ◽  
Hao Chen ◽  
Yinghong Sun

Sentiment analysis of social media texts has become a research hotspot in information processing. Sentiment analysis methods based on the combination of machine learning and a sentiment lexicon require feature selection. The selected emotional features are often subjective, which can easily lead to overfitted models with poor generalization ability. Sentiment analysis models based on deep learning can automatically extract effective emotional text features, which greatly improves the accuracy of text sentiment analysis. However, due to the lack of a multi-class emotional corpus, existing models cannot accurately express emotional polarity. Therefore, we propose a multi-class sentiment analysis model, GLU-RCNN, based on Gated Linear Units and an attention mechanism. Our model uses a Gated Linear Units based attention mechanism to integrate the local features extracted by a CNN with the semantic features extracted by an LSTM. Local features of short texts are extracted and concatenated using multi-size convolution kernels. At the classification layer, the emotional features extracted by the CNN and the LSTM are concatenated to represent the emotional features of the text. A detailed evaluation on two benchmark datasets shows that the proposed model outperforms state-of-the-art approaches.
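The Gated Linear Unit at the core of GLU-RCNN computes an elementwise product between a linear projection and a sigmoid gate. A minimal NumPy sketch of that gating idea follows; the dimensions and weights are illustrative, not the paper's configuration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glu(x, w, b):
    """Gated Linear Unit: project x to 2*d_out, split into value A and
    gate B, return A * sigmoid(B).
    x: (n, d_in); w: (d_in, 2*d_out); b: (2*d_out,)."""
    proj = x @ w + b
    a, g = np.split(proj, 2, axis=-1)
    return a * sigmoid(g)  # gate in (0, 1) controls how much of `a` passes

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 8))    # 4 tokens, 8 features each
w = rng.normal(size=(8, 12))   # d_out = 6
b = np.zeros(12)
y = glu(x, w, b)
print(y.shape)  # (4, 6)
```

Because the gate is a smooth value in (0, 1) rather than a hard cutoff, gradients flow through both halves, which is why GLUs pair well with attention-style feature weighting.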


Mathematics ◽  
2021 ◽  
Vol 9 (23) ◽  
pp. 3035
Author(s):  
Feiyue Deng ◽  
Yan Bi ◽  
Yongqiang Liu ◽  
Shaopu Yang

Remaining useful life (RUL) prediction of key components is an important factor in making accurate maintenance decisions for mechanical systems. With the rapid development of deep learning (DL) techniques, research on RUL prediction based on data-driven models is increasingly widespread. Compared with conventional convolutional neural networks (CNNs), multi-scale CNNs can extract feature information at different scales, which gives them better performance in RUL prediction. However, existing multi-scale CNNs employ multiple convolution kernels of different sizes to construct the network framework, an approach with two main shortcomings: (1) convolution based on multiple kernel sizes requires enormous computation and has low operational efficiency, which severely restricts its application in practical engineering; (2) a convolutional layer with a large convolution kernel needs a large number of weight parameters, leading to a dramatic increase in network training time and making it prone to overfitting on small datasets. To address these issues, a multi-scale dilated convolution network (MsDCN) is proposed for RUL prediction in this article. The MsDCN adopts a new multi-scale dilated convolution fusion unit (MsDCFU), in which the multi-scale framework is composed of convolution operations with different dilation factors. This effectively expands the receptive field (RF) of the convolution kernel without additional computational burden. Moreover, the MsDCFU employs depthwise separable convolution (DSC) to further improve the operational efficiency of the prognostics model. Finally, the proposed method was validated on accelerated degradation test data of rolling element bearings (REBs). The experimental results demonstrate that the proposed MsDCN achieves higher RUL prediction accuracy than several typical CNNs and better operational efficiency than existing multi-scale CNNs based on different convolution kernel sizes.
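The receptive-field expansion that dilated convolution provides "without additional computational burden" follows from a simple identity: a kernel of size k with dilation d spans the same window as a dense kernel of size k + (k-1)(d-1), while keeping only k weights per axis. The dilation factors below are illustrative; the paper's actual MsDCFU settings are not given here.

```python
# Effective kernel size of a dilated convolution:
#   k_eff = k + (k - 1) * (d - 1)

def effective_kernel(k, d):
    """Span covered by a size-k kernel with dilation factor d."""
    return k + (k - 1) * (d - 1)

# A 3x3 kernel at dilations 1, 2, 4 covers the span of 3x3, 5x5, 9x9
# dense kernels, always with just 9 weights.
for d in (1, 2, 4):
    print(d, effective_kernel(3, d))
```

Stacking branches with different dilation factors therefore yields a multi-scale receptive field at the parameter cost of a single small kernel per branch.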


2021 ◽  
Vol 22 (S5) ◽  
Author(s):  
Hui Yu ◽  
Jinqiu Li ◽  
Lixin Zhang ◽  
Yuzhen Cao ◽  
Xuyao Yu ◽  
...  

Abstract Background An accurate segmentation and recognition algorithm for lung nodules is of great reference value for the early diagnosis of lung cancer. In this paper, an algorithm for 3D CT sequence images is proposed based on a 3D Res U-Net segmentation network and a 3D ResNet50 classification network. The common convolutional layers in the encoding and decoding paths of U-Net are replaced by residual units, while the loss function is switched from cross-entropy loss to Dice loss to accelerate network convergence. Since lung nodules are small and rich in 3D information, ResNet50 is improved by replacing the 2D convolutional layers with 3D convolutional layers and reducing the sizes of some convolution kernels; the resulting 3D ResNet50 network is used for the diagnosis of benign and malignant lung nodules. Results 3D Res U-Net was trained and tested on 1044 CT subcases from the LIDC-IDRI database. The segmentation results show that the Dice coefficient of 3D Res U-Net is above 0.8 for the segmentation of lung nodules larger than 10 mm in diameter. 3D ResNet50 was trained and tested on 2960 lung nodules from the LIDC-IDRI database. The classification results show that the diagnostic accuracy of 3D ResNet50 is 87.3% and the AUC is 0.907. Conclusion Thanks to its residual learning mechanism, the 3D Res U-Net model improves segmentation performance significantly compared with the 3D U-Net model. 3D Res U-Net identifies small nodules more effectively and improves segmentation accuracy for large nodules. Compared with the original network, the classification performance of 3D ResNet50 is significantly improved, especially for small benign nodules.
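The Dice loss this abstract adopts, and the Dice coefficient it reports, come from the overlap measure 2|P∩G| / (|P|+|G|). A minimal NumPy sketch follows; the smoothing constant is an assumption, not the paper's exact value.

```python
import numpy as np

def dice_loss(pred, target, smooth=1e-6):
    """Dice loss = 1 - Dice coefficient.
    pred: predicted probabilities in [0, 1]; target: binary ground-truth
    mask of the same shape. `smooth` avoids division by zero on empty masks."""
    pred = pred.ravel()
    target = target.ravel()
    intersection = (pred * target).sum()
    return 1.0 - (2.0 * intersection + smooth) / (pred.sum() + target.sum() + smooth)

mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0
print(dice_loss(mask, mask))        # perfect overlap  -> loss near 0
print(dice_loss(1.0 - mask, mask))  # disjoint regions -> loss near 1
```

Unlike cross-entropy, Dice loss is driven by region overlap, so it is not swamped by the many easy background voxels around a small nodule, which is one reason it helps convergence on this task.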


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Weixian Song ◽  
Junlong Fang ◽  
Runtao Wang ◽  
Kezhu Tan ◽  
Marwan Aouad

Abstract The behaviours of pigs are often closely related to their health, and pig recognition is very important for pig behaviour analysis and digital breeding. Currently, detecting early signs and abnormal behaviours of sick pigs on breeding farms relies mainly on human observation. However, visual inspection is labour intensive and time-consuming, and it suffers from varying individual experience and environments. In this study, an improved ResNet model based on deep learning is proposed and applied to detect individual pigs. The developed model captures the features of pigs using cross-layer connections, and its feature-expression ability is improved by adding a new residual module. The number of layers is reduced to minimise network complexity. In general, the ResNet framework is developed by reducing the number of convolution layers, constructing different types of residual modules and increasing the number of convolution kernels. With the improved model, the training and testing accuracies reached 98.2% and 96.4%, respectively. The experimental results show that the proposed method has potential for monitoring the living conditions of commercial pigs and for disease prevention on pig farms.
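The cross-layer connection behind every ResNet variant, including this improved one, computes y = F(x) + x, so each block learns a residual F rather than the full mapping. The toy branch below uses a linear transform for brevity; the real modules use convolutions.

```python
import numpy as np

def residual_block(x, weight):
    """Identity-shortcut residual block: y = F(x) + x.
    F here is a toy linear-plus-ReLU branch for illustration."""
    fx = np.maximum(x @ weight, 0.0)  # branch output F(x)
    return fx + x                     # identity shortcut added back

x = np.ones((2, 4))
w = np.zeros((4, 4))       # a branch that contributes nothing...
y = residual_block(x, w)
print(np.array_equal(y, x))  # ...still passes the input through: True
```

Because the shortcut carries the signal even when the branch output is near zero, gradients reach early layers easily, which is what makes deeper (or here, deliberately shallower) networks trainable.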


2021 ◽  
pp. 1-13
Author(s):  
Daxin Zhou ◽  
Yurong Qian ◽  
Yuanyuan Ma ◽  
Yingying Fan ◽  
Jianeng Yang ◽  
...  

Low-illumination image restoration has been widely used in many fields. Aiming at the problems of low resolution and noise amplification in low-light environments, this paper applies the style transfer of CycleGAN (Cycle-Consistent Generative Adversarial Networks) to low-illumination image enhancement. In the designed network structure, convolution kernels of different sizes are used to extract features along three paths, and a deep residual shrinkage network is designed to suppress noise after convolution. The color deviation of the image is resolved by the identity loss of CycleGAN. In the discriminator, convolution kernels of different sizes are used to extract image features along two paths. Compared with the training and testing results of the Deep-Retinex, GLAD, KinD and other network methods on the LOL and Brightening datasets, the multi-scale deep residual shrinkage CycleGAN proposed in this work achieves image quality scores of PSNR = 24.62, NIQE = 4.9856 and SSIM = 0.8628 on the LOL dataset, and PSNR = 27.85, NIQE = 4.7652 and SSIM = 0.8753 on the Brightening dataset. Both the visual effect and the objective indices prove that the multi-scale deep residual shrinkage CycleGAN performs excellently in low-illumination enhancement, detail recovery and denoising.
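The noise-suppression step inside a deep residual shrinkage network is soft thresholding: values whose magnitude falls below a threshold are zeroed, the rest shrink toward zero. In the actual network the threshold is learned per channel by a small attention branch; the fixed threshold below is only for illustration.

```python
import numpy as np

def soft_threshold(x, tau):
    """Soft thresholding: zero out |x| <= tau, shrink the rest by tau.
    This is the denoising nonlinearity inside residual shrinkage blocks."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

x = np.array([-2.0, -0.3, 0.0, 0.4, 1.5])
print(soft_threshold(x, 0.5))  # [-1.5  0.   0.   0.   1. ]
```

Placed inside a residual block (y = soft_threshold(F(x)) + x), this lets the network discard small, noise-like activations while the shortcut preserves the underlying image signal.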

