convolution kernels
Recently Published Documents


TOTAL DOCUMENTS

182
(FIVE YEARS 78)

H-INDEX

15
(FIVE YEARS 4)

2022 ◽  
pp. 1-1
Author(s):  
Zipeng Ye ◽  
Mengfei Xia ◽  
Ran Yi ◽  
Juyong Zhang ◽  
Yu-Kun Lai ◽  
...  

2022 ◽  
pp. 103397
Author(s):  
Huibin Zhang ◽  
Liping Feng ◽  
Xiaohua Zhang ◽  
Yuchi Yang ◽  
Jing Li

2021 ◽  
pp. 1-10
Author(s):  
Rui Cao ◽  
Feng Jiang ◽  
Zhao Wu ◽  
Jia Ren

With the advancement of computer performance, deep learning is playing a vital role on hardware platforms. Indoor scene segmentation is a challenging deep learning task because indoor objects tend to obscure each other, and the dense layout increases the difficulty of segmentation. Yet current networks pursue accuracy improvements at the cost of speed and increased memory usage. To achieve a compromise between accuracy, speed, and model size, this paper proposes the Multichannel Fusion Network (MFNet) for indoor scene segmentation, which mainly consists of a Dense Residual Module (DRM) and a Multi-scale Feature Extraction Module (MFEM). The MFEM uses depthwise separable convolution to cut the number of parameters and matches convolution kernels of different sizes with different dilation rates to achieve an optimal receptive field; the DRM fuses feature maps at several levels of resolution to refine segmentation details. Experimental results on the NYU V2 dataset show that the proposed method achieves very competitive results compared with other advanced algorithms, with a segmentation speed of 38.47 fps, nearly twice that of DeepLab v3+, while using only 1/5 of its parameters. Its segmentation results are close to those of advanced segmentation networks, making it well suited to real-time image processing.
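The parameter saving from depthwise separable convolution mentioned in this abstract can be sketched by simple counting; the layer sizes below are illustrative assumptions, not MFNet's actual configuration.

```python
# Parameter-count comparison: standard vs. depthwise separable convolution.

def standard_conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k conv (one filter per input channel)
    followed by a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise

c_in, c_out, k = 128, 128, 3
std = standard_conv_params(c_in, c_out, k)        # 147456 weights
dsc = depthwise_separable_params(c_in, c_out, k)  # 1152 + 16384 = 17536 weights
print(std, dsc, round(std / dsc, 1))              # roughly 8.4x fewer parameters
```

For a 3x3 kernel the factorization costs roughly 1/9 of the standard layer's weights, which is the kind of saving that lets a network trade parameters for speed.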


2021 ◽  
Author(s):  
Jian Zhao ◽  
ZhiWei Zhang ◽  
Jinping Qiu ◽  
Lijuan Shi ◽  
Zhejun KUANG ◽  
...  

Abstract With the rapid development of deep learning in recent years, automatic electroencephalography (EEG) emotion recognition has received widespread attention. At present, most deep learning methods do not normalize EEG data properly and do not fully extract time- and frequency-domain features, which affects the accuracy of EEG emotion recognition. To solve these problems, we propose GTScepeion, a deep learning EEG emotion recognition model. In pre-processing, the EEG time-slice data, including all channels, are pre-processed. In our model, global convolution kernels extract overall semantic features, followed by three kinds of temporal convolution kernels representing different emotional periods and two kinds of spatial convolution kernels highlighting brain-hemispheric differences to extract spatial features; finally, emotions are classified into two classes by a fully connected layer. The experiments are based on the DEAP dataset, and our model can effectively normalize the data and fully extract features. For Arousal, our accuracy is 8.76% higher than that of the current optimal Inception-based emotion recognition model. For Valence, the best accuracy of our model reaches 91.51%.
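The abstract stresses proper normalization of EEG data. A common choice is per-channel z-score normalization of each time slice; the exact scheme GTScepeion uses is not specified here, so the following is only an illustrative assumption (shape convention: channels x samples).

```python
import numpy as np

def zscore_per_channel(eeg, eps=1e-8):
    """Normalize each EEG channel to zero mean and unit variance
    over the time axis. eeg: array of shape (channels, samples)."""
    mean = eeg.mean(axis=1, keepdims=True)
    std = eeg.std(axis=1, keepdims=True)
    return (eeg - mean) / (std + eps)

rng = np.random.default_rng(0)
eeg = rng.normal(loc=5.0, scale=2.0, size=(32, 512))  # 32 channels, 512 samples
norm = zscore_per_channel(eeg)
print(norm.mean(axis=1).max(), norm.std(axis=1).min())  # both near 0 and 1
```

Normalizing per channel rather than globally prevents high-amplitude channels from dominating the convolutional features.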


Author(s):  
Lei Liu ◽  
Hao Chen ◽  
Yinghong Sun

Sentiment analysis of social media texts has become a research hotspot in information processing. Sentiment analysis methods based on the combination of machine learning and a sentiment lexicon require feature selection. The selected emotional features are often subjective, which can easily lead to overfitted models with poor generalization ability. Sentiment analysis models based on deep learning can automatically extract effective emotional text features, which greatly improves the accuracy of text sentiment analysis. However, due to the lack of a multi-class emotional corpus, existing models cannot accurately express emotional polarity. Therefore, we propose a multi-class sentiment analysis model, GLU-RCNN, based on Gated Linear Units and an attention mechanism. Our model uses a Gated Linear Units based attention mechanism to integrate the local features extracted by a CNN with the semantic features extracted by an LSTM. Local features of short texts are extracted and concatenated using multi-size convolution kernels. At the classification layer, the emotional features extracted by the CNN and the LSTM are concatenated to represent the emotional features of the text. A detailed evaluation on two benchmark datasets shows that the proposed model outperforms state-of-the-art approaches.
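The Gated Linear Unit at the core of GLU-RCNN computes an elementwise product between a linear projection and a sigmoid gate. A minimal NumPy sketch of that gating idea follows; the dimensions and weights are illustrative, not the paper's configuration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glu(x, w, b):
    """Gated Linear Unit: project x to 2*d_out, split into value A and
    gate B, return A * sigmoid(B).
    x: (n, d_in); w: (d_in, 2*d_out); b: (2*d_out,)."""
    proj = x @ w + b
    a, g = np.split(proj, 2, axis=-1)
    return a * sigmoid(g)  # gate in (0, 1) controls how much of `a` passes

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 8))    # 4 tokens, 8 features each
w = rng.normal(size=(8, 12))   # d_out = 6
b = np.zeros(12)
y = glu(x, w, b)
print(y.shape)  # (4, 6)
```

Because the gate is a smooth value in (0, 1) rather than a hard cutoff, gradients flow through both halves, which is why GLUs pair well with attention-style feature weighting.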


Mathematics ◽  
2021 ◽  
Vol 9 (23) ◽  
pp. 3035
Author(s):  
Feiyue Deng ◽  
Yan Bi ◽  
Yongqiang Liu ◽  
Shaopu Yang

Remaining useful life (RUL) prediction of key components is an important factor in making accurate maintenance decisions for mechanical systems. With the rapid development of deep learning (DL) techniques, research on RUL prediction based on data-driven models is increasingly widespread. Compared with conventional convolutional neural networks (CNNs), multi-scale CNNs can extract feature information at different scales, which gives them better performance in RUL prediction. However, existing multi-scale CNNs employ multiple convolution kernels of different sizes to construct the network framework, an approach with two main shortcomings: (1) convolution based on multiple kernel sizes requires enormous computation and has low operational efficiency, which severely restricts its application in practical engineering; (2) a convolutional layer with a large convolution kernel needs a large number of weight parameters, leading to a dramatic increase in network training time and making it prone to overfitting on small datasets. To address these issues, a multi-scale dilated convolution network (MsDCN) is proposed for RUL prediction in this article. The MsDCN adopts a new multi-scale dilated convolution fusion unit (MsDCFU), in which the multi-scale framework is composed of convolution operations with different dilation factors. This effectively expands the receptive field (RF) of the convolution kernel without additional computational burden. Moreover, the MsDCFU employs depthwise separable convolution (DSC) to further improve the operational efficiency of the prognostics model. Finally, the proposed method was validated on accelerated degradation test data of rolling element bearings (REBs). The experimental results demonstrate that the proposed MsDCN achieves higher RUL prediction accuracy than several typical CNNs and better operational efficiency than existing multi-scale CNNs based on different convolution kernel sizes.
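The receptive-field expansion that dilated convolution provides "without additional computational burden" follows from a simple identity: a kernel of size k with dilation d spans the same window as a dense kernel of size k + (k-1)(d-1), while keeping only k weights per axis. The dilation factors below are illustrative; the paper's actual MsDCFU settings are not given here.

```python
# Effective kernel size of a dilated convolution:
#   k_eff = k + (k - 1) * (d - 1)

def effective_kernel(k, d):
    """Span covered by a size-k kernel with dilation factor d."""
    return k + (k - 1) * (d - 1)

# A 3x3 kernel at dilations 1, 2, 4 covers the span of 3x3, 5x5, 9x9
# dense kernels, always with just 9 weights.
for d in (1, 2, 4):
    print(d, effective_kernel(3, d))
```

Stacking branches with different dilation factors therefore yields a multi-scale receptive field at the parameter cost of a single small kernel per branch.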


2021 ◽  
Vol 22 (S5) ◽  
Author(s):  
Hui Yu ◽  
Jinqiu Li ◽  
Lixin Zhang ◽  
Yuzhen Cao ◽  
Xuyao Yu ◽  
...  

Abstract Background An accurate segmentation and recognition algorithm for lung nodules is of great reference value for the early diagnosis of lung cancer. In this paper, an algorithm for 3D CT sequence images is proposed based on a 3D Res U-Net segmentation network and a 3D ResNet50 classification network. The common convolutional layers in the encoding and decoding paths of U-Net are replaced by residual units, while the loss function is switched from cross-entropy loss to Dice loss to accelerate network convergence. Since lung nodules are small and rich in 3D information, ResNet50 is improved by replacing the 2D convolutional layers with 3D convolutional layers and reducing the sizes of some convolution kernels; the resulting 3D ResNet50 network is used for the diagnosis of benign and malignant lung nodules. Results 3D Res U-Net was trained and tested on 1044 CT subcases from the LIDC-IDRI database. The segmentation results show that the Dice coefficient of 3D Res U-Net is above 0.8 for the segmentation of lung nodules larger than 10 mm in diameter. 3D ResNet50 was trained and tested on 2960 lung nodules from the LIDC-IDRI database. The classification results show that the diagnostic accuracy of 3D ResNet50 is 87.3% and the AUC is 0.907. Conclusion Thanks to its residual learning mechanism, the 3D Res U-Net model improves segmentation performance significantly compared with the 3D U-Net model. 3D Res U-Net identifies small nodules more effectively and improves segmentation accuracy for large nodules. Compared with the original network, the classification performance of 3D ResNet50 is significantly improved, especially for small benign nodules.
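The Dice loss this abstract adopts, and the Dice coefficient it reports, come from the overlap measure 2|P∩G| / (|P|+|G|). A minimal NumPy sketch follows; the smoothing constant is an assumption, not the paper's exact value.

```python
import numpy as np

def dice_loss(pred, target, smooth=1e-6):
    """Dice loss = 1 - Dice coefficient.
    pred: predicted probabilities in [0, 1]; target: binary ground-truth
    mask of the same shape. `smooth` avoids division by zero on empty masks."""
    pred = pred.ravel()
    target = target.ravel()
    intersection = (pred * target).sum()
    return 1.0 - (2.0 * intersection + smooth) / (pred.sum() + target.sum() + smooth)

mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0
print(dice_loss(mask, mask))        # perfect overlap  -> loss near 0
print(dice_loss(1.0 - mask, mask))  # disjoint regions -> loss near 1
```

Unlike cross-entropy, Dice loss is driven by region overlap, so it is not swamped by the many easy background voxels around a small nodule, which is one reason it helps convergence on this task.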


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Weixian Song ◽  
Junlong Fang ◽  
Runtao Wang ◽  
Kezhu Tan ◽  
Marwan Aouad

Abstract The behaviours of pigs are often closely related to their health, and pig recognition is very important for pig behaviour analysis and digital breeding. Currently, detecting early signs and abnormal behaviours of sick pigs on breeding farms relies mainly on human observation. However, visual inspection is labour intensive and time-consuming, and it suffers from varying individual experience and environments. In this study, an improved ResNet model based on deep learning is proposed and applied to detect individual pigs. The developed model captures the features of pigs using cross-layer connections, and its feature-expression ability is improved by adding a new residual module. The number of layers is reduced to minimise network complexity. In general, the ResNet framework is developed by reducing the number of convolution layers, constructing different types of residual modules and increasing the number of convolution kernels. With the improved model, the training and testing accuracies reached 98.2% and 96.4%, respectively. The experimental results show that the proposed method has potential for monitoring the living conditions of commercial pigs and for disease prevention on pig farms.
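The cross-layer connection behind every ResNet variant, including this improved one, computes y = F(x) + x, so each block learns a residual F rather than the full mapping. The toy branch below uses a linear transform for brevity; the real modules use convolutions.

```python
import numpy as np

def residual_block(x, weight):
    """Identity-shortcut residual block: y = F(x) + x.
    F here is a toy linear-plus-ReLU branch for illustration."""
    fx = np.maximum(x @ weight, 0.0)  # branch output F(x)
    return fx + x                     # identity shortcut added back

x = np.ones((2, 4))
w = np.zeros((4, 4))       # a branch that contributes nothing...
y = residual_block(x, w)
print(np.array_equal(y, x))  # ...still passes the input through: True
```

Because the shortcut carries the signal even when the branch output is near zero, gradients reach early layers easily, which is what makes deeper (or here, deliberately shallower) networks trainable.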


2021 ◽  
pp. 1-13
Author(s):  
Daxin Zhou ◽  
Yurong Qian ◽  
Yuanyuan Ma ◽  
Yingying Fan ◽  
Jianeng Yang ◽  
...  

Low-illumination image restoration has been widely used in many fields. Aiming at the problems of low resolution and noise amplification in low-light environments, this paper applies the style transfer of CycleGAN (Cycle-Consistent Generative Adversarial Networks) to low-illumination image enhancement. In the designed network structure, convolution kernels of different sizes are used to extract features along three paths, and a deep residual shrinkage network is designed to suppress noise after convolution. The color deviation of the image is resolved by the identity loss of CycleGAN. In the discriminator, convolution kernels of different sizes are used to extract image features along two paths. Compared with the training and testing results of the Deep-Retinex, GLAD, KinD and other network methods on the LOL and Brightening datasets, the multi-scale deep residual shrinkage CycleGAN proposed in this work achieves image quality scores of PSNR = 24.62, NIQE = 4.9856 and SSIM = 0.8628 on the LOL dataset, and PSNR = 27.85, NIQE = 4.7652 and SSIM = 0.8753 on the Brightening dataset. Both the visual effect and the objective indices prove that the multi-scale deep residual shrinkage CycleGAN performs excellently in low-illumination enhancement, detail recovery and denoising.
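The noise-suppression step inside a deep residual shrinkage network is soft thresholding: values whose magnitude falls below a threshold are zeroed, the rest shrink toward zero. In the actual network the threshold is learned per channel by a small attention branch; the fixed threshold below is only for illustration.

```python
import numpy as np

def soft_threshold(x, tau):
    """Soft thresholding: zero out |x| <= tau, shrink the rest by tau.
    This is the denoising nonlinearity inside residual shrinkage blocks."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

x = np.array([-2.0, -0.3, 0.0, 0.4, 1.5])
print(soft_threshold(x, 0.5))  # [-1.5  0.   0.   0.   1. ]
```

Placed inside a residual block (y = soft_threshold(F(x)) + x), this lets the network discard small, noise-like activations while the shortcut preserves the underlying image signal.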

