dilated convolution
Recently Published Documents



2022 ◽  
Author(s):  
Jun Mutaguchi ◽  
Ken'ichi Morooka ◽  
Satoshi Kobayashi ◽  
Aiko Umehara ◽  
Shoko Miyauchi ◽  
...  

Author(s):  
Song-Toan Tran ◽  
Thanh-Tuan Nguyen ◽  
Minh-Hai Le ◽  
Ching-Hwa Cheng ◽  
Don-Gey Liu

Author(s):  
Zhiwu Shang ◽  
Baoren Zhang ◽  
Wanxiang Li ◽  
Shiqi Qian ◽  
Jie Zhang

Abstract: Convolutional neural networks (CNNs) have been widely used for remaining useful life (RUL) prediction. However, CNN-based RUL prediction methods have some limitations. The receptive field of a CNN is limited, and the gradient vanishing problem tends to occur when the network is too deep. Differences in the contribution of different channels and time steps to RUL prediction are not considered, and prediction relies on either deep learning features or handcrafted statistical features alone. These limitations can lead to inaccurate prediction results. To solve these problems, this paper proposes an RUL prediction method based on multi-layer self-attention (MLSA) and a temporal convolution network (TCN). The TCN is used to extract deep learning features. Dilated convolution and residual connections are adopted in the TCN structure. Dilated convolution is an efficient way to widen the receptive field, and the residual structure can avoid the gradient vanishing problem. In addition, we propose a feature fusion method to fuse deep learning features and statistical features, and the MLSA is designed to adaptively assign feature weights. Finally, the turbofan engine dataset is used to verify the proposed method. Experimental results indicate the effectiveness of the proposed method.
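To make the two TCN ingredients named in this abstract concrete, the following is a minimal pure-Python sketch of a dilated causal 1D convolution and a residual connection around it. It is not the authors' implementation: the function names, the ReLU activation, and the single-convolution residual block are assumptions chosen for illustration, and a real TCN block typically also has multiple channels, normalization, and dropout.

```python
def dilated_causal_conv1d(x, weights, dilation):
    """Compute y[t] = sum_i weights[i] * x[t - i * dilation].

    Positions before the start of the sequence are treated as zero,
    so the output at time t never depends on future inputs.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(weights):
            idx = t - i * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def residual_block(x, weights, dilation):
    """Simplified residual block (assumption): x + ReLU(dilated conv of x)."""
    conv = dilated_causal_conv1d(x, weights, dilation)
    return [xi + max(ci, 0.0) for xi, ci in zip(x, conv)]


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked dilated convs, one conv layer per dilation."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5]
    print(dilated_causal_conv1d(signal, [1.0, 1.0], dilation=2))
    print(residual_block(signal, [1.0, 1.0], dilation=2))
    print(receptive_field(3, [1, 2, 4, 8]))
```

With kernel size 3 and dilations 1, 2, 4, 8, four layers cover 31 time steps, whereas four undilated layers with the same kernel size would cover only 9. This is the sense in which dilation widens the receptive field without adding weights.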


Sensors ◽  
2021 ◽  
Vol 21 (24) ◽  
pp. 8406
Author(s):  
Khaled R. Ahmed

Roads make a huge contribution to the economy and act as a platform for transportation. Potholes in roads are one of the major concerns in transportation infrastructure. Much research has proposed using computer vision techniques to automate pothole detection, covering a wide range of image processing and object detection algorithms. There is a need to automate the pothole detection process with adequate accuracy and speed, and to implement the process easily and with low setup cost. In this paper, we have developed efficient deep learning convolution neural networks (CNNs) to detect potholes in real time with adequate accuracy. To reduce the computational cost and improve the training results, this paper proposes a modified VGG16 (MVGG16) network by removing some convolution layers and using different dilation rates. Moreover, this paper uses the MVGG16 as a backbone network for the Faster R-CNN. In addition, this work compares the performance of YOLOv5 (Large (Yl), Medium (Ym), and Small (Ys)) models with ResNet101 backbone and Faster R-CNN with ResNet50 (FPN), VGG16, MobileNetV2, InceptionV3, and MVGG16 backbones. The experimental results show that the Ys model is more applicable for real-time pothole detection because of its speed. In addition, using the MVGG16 network as the backbone of the Faster R-CNN provides better mean precision and shorter inference time than using VGG16, InceptionV3, or MobileNetV2 backbones. The proposed MVGG16 succeeds in balancing pothole detection accuracy and speed.


Author(s):  
Zhao Qiu ◽  
Lin Yuan ◽  
Lihao Liu ◽  
Zheng Yuan ◽  
Tao Chen ◽  
...  

An image generation and completion model fills in the missing area of the image to be repaired, using either the image itself or information from an image library, so that the repaired image looks natural and is difficult to distinguish from an undamaged image. The difficulty of image generation and completion lies in producing reasonable image semantics and clear, realistic texture in the generated image. In this paper, a Wasserstein generative adversarial network with dilated convolution and deformable convolution (DDC-WGAN) is proposed for image completion. A deformable offset is added on top of dilated convolution, which enlarges the receptive field and provides a more stable representation of geometric deformation. Experiments show that the proposed DDC-WGAN method performs better in image generation and completion than the traditional generative adversarial completion network.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Qi Zhang

Abstract: Image classification plays an important role in computer vision. Existing convolutional neural network methods have some problems in image classification, such as low accuracy of tumor classification and poor feature expression and feature extraction. Therefore, we propose a novel ResNet101 model based on dense dilated convolution for medical liver tumor classification. The multi-scale feature extraction module is used to extract multi-scale features of images, and the receptive field of the network is increased. The depth feature extraction module is used to reduce background noise information and focus on effective features of the focal region. To obtain broader and deeper semantic information, a dense dilated convolution module is deployed in the network. This module combines the advantages of Inception, residual structure, and multi-scale dilated convolution to obtain a deeper level of feature information without causing gradient explosion or gradient vanishing. To solve the common feature loss problems in the classification network, the up- and down-sampling module in the network is improved, and multiple convolution kernels with different scales are cascaded to widen the network, which can effectively avoid feature loss. Finally, experiments are carried out on the proposed method. Compared with the existing mainstream classification networks, the proposed method improves classification performance and achieves accurate classification of liver tumors. The effectiveness of the proposed method is further verified by ablation experiments.
Highlights: The multi-scale feature extraction module is introduced to extract multi-scale features of images; it can extract deep context information of the lesion region and surrounding tissues to enhance the feature extraction ability of the network. The depth feature extraction module is used to focus on the local features of the lesion region in both the channel and spatial dimensions, weaken the influence of irrelevant information, and strengthen the recognition ability for the lesion region. The feature extraction module is enhanced by the parallel structure of dense dilated convolution, and deeper feature information is obtained without losing image feature information, improving the classification accuracy.


Mathematics ◽  
2021 ◽  
Vol 9 (23) ◽  
pp. 3035
Author(s):  
Feiyue Deng ◽  
Yan Bi ◽  
Yongqiang Liu ◽  
Shaopu Yang

Remaining useful life (RUL) prediction of key components is an important factor in making accurate maintenance decisions for mechanical systems. With the rapid development of deep learning (DL) techniques, research on RUL prediction based on data-driven models is increasingly widespread. Compared with conventional convolution neural networks (CNNs), multi-scale CNNs can extract feature information at different scales and exhibit better performance in RUL prediction. However, existing multi-scale CNNs employ multiple convolution kernels with different sizes to construct the network framework. This approach has two main shortcomings: (1) the convolution operation based on kernels of multiple sizes requires enormous computation and has low operational efficiency, which severely restricts its application in practical engineering; (2) a convolutional layer with a large kernel needs a large number of weight parameters, leading to a dramatic increase in network training time and making it prone to overfitting on small datasets. To address these issues, a multi-scale dilated convolution network (MsDCN) is proposed for RUL prediction in this article. The MsDCN adopts a new multi-scale dilation convolution fusion unit (MsDCFU), in which the multi-scale network framework is composed of convolution operations with different dilation factors. This effectively expands the receptive field (RF) of the convolution kernel without an additional computational burden. Moreover, the MsDCFU employs depthwise separable convolution (DSC) to further improve the operational efficiency of the prognostics model. Finally, the proposed method was validated with accelerated degradation test data of rolling element bearings (REBs). The experimental results demonstrate that the proposed MsDCN has higher RUL prediction accuracy than some typical CNNs and better operational efficiency than existing multi-scale CNNs based on different convolution kernel sizes.
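The trade-off described in this abstract can be checked with simple arithmetic. The sketch below assumes 1D convolutions without bias terms and uses hypothetical helper names and channel counts; it is not taken from the paper. It compares the span and weight count of a dilated small kernel against a large undilated kernel, and shows the further reduction from a depthwise separable layer.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv1d_params(kernel_size, c_in, c_out):
    """Weights in a standard 1D convolution (bias omitted, an assumption)."""
    return kernel_size * c_in * c_out


def depthwise_separable_params(kernel_size, c_in, c_out):
    """Depthwise (one k-tap filter per input channel) plus 1x1 pointwise conv."""
    return kernel_size * c_in + c_in * c_out


if __name__ == "__main__":
    c_in, c_out = 16, 16  # hypothetical channel counts
    for d in (1, 2, 3):
        span = effective_kernel_size(3, d)
        print(
            f"k=3, dilation={d}: span={span}, "
            f"params={conv1d_params(3, c_in, c_out)}, "
            f"same span undilated params={conv1d_params(span, c_in, c_out)}, "
            f"separable params={depthwise_separable_params(3, c_in, c_out)}"
        )
```

A kernel of size 3 with dilation 3 spans 7 input positions using the same 768 weights as the undilated size-3 kernel, while an undilated size-7 kernel would need 1792. The depthwise separable variant brings the count down to 304 for these channel counts.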


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 7862
Author(s):  
Sangun Park ◽  
Dong Eui Chang

Robot vision is an essential research field that enables machines to perform various tasks by classifying/detecting/segmenting objects as humans do. The classification accuracy of machine learning algorithms already exceeds that of a well-trained human, and the results are rather saturated. Hence, in recent years, many studies have been conducted in the direction of reducing the weight of the model and applying it to mobile devices. For this purpose, we propose a multipath lightweight deep network using randomly selected dilated convolutions. The proposed network consists of two sets of multipath networks (minimum 2, maximum 8), where the output feature maps of one path are concatenated with the input feature maps of the other path so that the features are reusable and abundant. We also replace the 3×3 standard convolution of each path with a randomly selected dilated convolution, which has the effect of increasing the receptive field. The proposed network lowers the number of floating point operations (FLOPs) and parameters by more than 50% and the classification error by 0.8% as compared to the state-of-the-art. We show that the proposed network is efficient.
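As a rough illustration of what replacing a 3×3 standard convolution with a randomly selected dilated convolution involves, the sketch below picks a dilation rate per path and derives the tap offsets and the padding needed to keep the feature map size unchanged. The candidate dilation set, the seeding, and all function names are assumptions for illustration; the abstract does not specify how the random selection is done.

```python
import random


def dilated_tap_offsets(kernel_size, dilation):
    """Offsets (along one axis) of the kernel taps relative to the center."""
    half = kernel_size // 2
    return [(i - half) * dilation for i in range(kernel_size)]


def same_padding(kernel_size, dilation):
    """Padding per side that keeps the output size equal to the input size
    for an odd kernel size and stride 1."""
    return dilation * (kernel_size - 1) // 2


def pick_dilations(num_paths, candidates=(1, 2, 3), seed=0):
    """Choose one dilation rate per path at random (hypothetical scheme)."""
    rng = random.Random(seed)
    return [rng.choice(candidates) for _ in range(num_paths)]


if __name__ == "__main__":
    for path, d in enumerate(pick_dilations(num_paths=4)):
        print(
            f"path {path}: dilation={d}, "
            f"taps={dilated_tap_offsets(3, d)}, padding={same_padding(3, d)}"
        )
```

Whatever dilation is chosen, each path still has nine weights per 3×3 filter; only the spacing of the sampled positions changes, which is why the receptive field grows without adding parameters.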


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Fazeel Abid ◽  
Ikram Ud Din ◽  
Ahmad Almogren ◽  
Hasan Ali Khattak ◽  
Mirza Waqar Baig

Deep learning-based methodologies are important for performing sentiment analysis on social media data. The valuable insights gained from social media data through sentiment analysis can be employed to develop intelligent applications. Among many networks, convolution neural networks (CNNs) are widely used in conventional text classification tasks and play a significant role. However, to capture long-term contextual information and address the detail loss problem, CNNs require stacking multiple convolutional layers. Stacking convolutional layers in turn requires massive computation and the tuning of additional parameters. To solve these problems, in this paper, contextualized concatenated word representations (CCWRs) are first initialized from text-based social media data, which is essential for handling misspelled and out-of-vocabulary (OOV) words. In CCWRs, different word representation models, for example, Word2Vec, its optimized version FastText, and Global Vectors (GloVe), collectively create contextualized representations of the input sequence. Second, a three-layered dilated convolutional neural network (3D-CNN) is proposed that uses dilated convolution kernels instead of conventional CNN kernels. Extending the receptive field's size in this way solves the detail loss problem and captures long-term context information with different dilation rates. Experiments on datasets demonstrate that the proposed framework achieves reliable results, and that selecting hyperparameters and configurations for improved optimization leads to reduced computational resources and reliable accuracy.


Author(s):  
Rujie Hou ◽  
Zhenyi Chen ◽  
Jinglong Chen ◽  
Shuilong He ◽  
Zitong Zhou

Abstract: In practical engineering, the number of fault samples acquired from different categories can differ greatly because key equipment rarely malfunctions. When training on imbalanced data, many methods focus on balancing the number of samples between categories, which can be time-consuming and prone to overfitting. To address this problem, we propose the Embedding-augmented Gaussian Prototype Network (EGPN), which applies a new training mechanism from the perspective of meta-learning. We train only on the categories with many samples; the remaining categories appear only in the testing process, where they are used to calculate untrained prototypes. EGPN includes a feature embedding augmentation module, a weighted prototype module, and a metric module. First, ordinary convolution and dilated convolution are mixed to capture different frequency bands simultaneously, and a residual-attention module is added to highlight key features and suppress unimportant ones. Prototypes are then calculated by weighting the embedding vectors with a Gaussian covariance matrix. Finally, classification is performed according to the modified distance. Experiments on two datasets indicate that the proposed method can effectively recognize untrained categories using only a few samples as prototypes, and can thus efficiently tackle the problem of identifying imbalanced fault data.
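The weighted-prototype and modified-distance steps can be sketched in a few lines. The version below follows the general idea of Gaussian prototypical networks with a diagonal covariance, where each support embedding carries a per-dimension precision (inverse variance) that weights its contribution. This is an assumption for illustration only: the abstract does not give EGPN's exact weighting or distance, and all names and numbers here are hypothetical.

```python
import math


def weighted_prototype(embeddings, precisions):
    """Precision-weighted mean of support embeddings (diagonal case).

    Returns the prototype and the summed per-dimension precision of the class.
    """
    dim = len(embeddings[0])
    total = [sum(p[j] for p in precisions) for j in range(dim)]
    proto = [
        sum(e[j] * p[j] for e, p in zip(embeddings, precisions)) / total[j]
        for j in range(dim)
    ]
    return proto, total


def scaled_distance(query, proto, precision):
    """Distance in which each dimension is scaled by the class precision."""
    return math.sqrt(
        sum(s * (q - c) ** 2 for q, c, s in zip(query, proto, precision))
    )


def classify(query, prototypes):
    """Return the label whose prototype is nearest under the scaled distance."""
    return min(
        prototypes,
        key=lambda label: scaled_distance(query, *prototypes[label]),
    )


if __name__ == "__main__":
    # Hypothetical 2-D embeddings and precisions for two fault categories.
    prototypes = {
        "normal": weighted_prototype([[0.0, 0.0], [2.0, 2.0]],
                                     [[1.0, 1.0], [3.0, 3.0]]),
        "fault": weighted_prototype([[8.0, 8.0], [10.0, 10.0]],
                                    [[1.0, 1.0], [1.0, 1.0]]),
    }
    print(classify([1.0, 2.0], prototypes))
```

Because prototypes are computed from a handful of support embeddings at test time, a category never seen during training can still be recognized, which matches the abstract's description of untrained prototypes.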

