Detection of Wildfire Smoke Images Based on a Densely Dilated Convolutional Network

Electronics ◽  
2019 ◽  
Vol 8 (10) ◽  
pp. 1131 ◽  
Author(s):  
Li ◽  
Zhao ◽  
Zhang ◽  
Hu

Recently, many researchers have attempted to use convolutional neural networks (CNNs) for wildfire smoke detection. However, the application of CNNs to wildfire smoke detection still faces several issues, e.g., a high false-alarm rate and imbalanced training data. To address these issues, we propose a novel framework that integrates conventional methods into a CNN for wildfire smoke detection. It consists of a candidate smoke region segmentation strategy and an advanced network architecture, namely the wildfire smoke dilated DenseNet (WSDD-Net). Candidate smoke region segmentation removes the complex backgrounds of the wildfire smoke images. The proposed WSDD-Net achieves multi-scale feature extraction by combining dilated convolutions with dense blocks. To address the dataset imbalance, an improved cross-entropy loss function, namely balanced cross entropy (BCE), is used instead of the original cross-entropy loss function during training. The proposed WSDD-Net was evaluated on two smoke datasets, i.e., WS and Yuan, and achieved a high AR (99.20%) and a low FAR (0.24%). The experimental results demonstrate that the proposed framework has better detection capabilities under different negative sample interferences.
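
The abstract does not give the formula for its balanced cross entropy, so the following is only a minimal sketch of one common class-balanced variant: each class is weighted by the frequency of the other class, so that the rarer class counts more per sample. The function name `balanced_cross_entropy` and the weighting scheme are assumptions for illustration, not the authors' definition.

```python
import math

def balanced_cross_entropy(probs, labels, eps=1e-12):
    """Class-balanced binary cross entropy (illustrative variant).

    probs  : predicted probability of the positive (smoke) class per sample
    labels : 1 for smoke, 0 for non-smoke
    The weighting below is an assumption; the abstract does not state
    the exact BCE formula.
    """
    n = len(labels)
    n_pos = sum(labels)
    beta = 1.0 - n_pos / n  # weight for positives = fraction of negatives
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            total += -beta * math.log(p)
        else:
            total += -(1.0 - beta) * math.log(1.0 - p)
    return total / n

# One positive among four samples: the single positive carries weight 0.75,
# and the three negatives carry 0.25 each, so both classes contribute equally.
loss = balanced_cross_entropy([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 0])
```

With this weighting, a batch dominated by non-smoke samples no longer lets the majority class dominate the loss.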

Author(s):  
Zhenzhen Yang ◽  
Pengfei Xu ◽  
Yongpeng Yang ◽  
Bing-Kun Bao

The U-Net has become the most popular structure in medical image segmentation in recent years. Although its performance for medical image segmentation is outstanding, a large number of experiments demonstrate that the classical U-Net architecture seems to be insufficient when the size of segmentation targets changes and when target and background are imbalanced in different forms of segmentation. To improve the U-Net architecture, we develop a new architecture named the densely connected U-Net (DenseUNet) in this article. The proposed DenseUNet adopts a dense block to improve the feature extraction capability and employs a multi-feature fuse block that fuses feature maps of different levels to increase the accuracy of feature extraction. In addition, in view of the advantages of the cross-entropy and Dice loss functions, a new loss function for the DenseUNet is proposed to deal with the imbalance between target and background. Finally, we test the proposed DenseUNet and compare it with the multi-resolutional U-Net (MultiResUNet) and the classic U-Net on three different datasets. The experimental results show that the DenseUNet performs significantly better than the MultiResUNet and the classic U-Net.
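
As a rough illustration of how a cross-entropy term and a Dice term can be combined into one segmentation loss, here is a minimal sketch on flattened per-pixel probabilities. The mixing weight `alpha`, the smoothing constant, and all function names are assumptions; the abstract does not specify how the two terms are combined.

```python
import math

def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * intersection + s) / (sum(P) + sum(T) + s)."""
    inter = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)

def bce_loss(probs, targets, eps=1e-12):
    """Mean binary cross entropy over pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)

def combined_loss(probs, targets, alpha=0.5):
    """Weighted sum of the two terms; alpha is a hypothetical mixing weight."""
    return alpha * bce_loss(probs, targets) + (1.0 - alpha) * dice_loss(probs, targets)

# A mask with one foreground pixel out of four, predicted as all background:
# cross entropy averages over the mostly correct background, while the Dice
# term reacts strongly to the missed foreground pixel.
missed = dice_loss([0.0, 0.0, 0.0, 0.0], [1, 0, 0, 0])
```

The Dice term depends on overlap with the target region rather than on per-pixel averages, which is why it is often paired with cross entropy when the foreground is small.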


Sensors ◽  
2019 ◽  
Vol 19 (18) ◽  
pp. 3873 ◽  
Author(s):  
Jong Taek Lee ◽  
Eunhee Park ◽  
Tae-Du Jung

Videofluoroscopic swallowing study (VFSS) is a standard diagnostic tool for dysphagia. To detect the presence of aspiration during a swallow, a manual search is commonly used to mark the time intervals of the pharyngeal phase on the corresponding VFSS image. In this study, we present a novel approach that uses 3D convolutional networks to detect the pharyngeal phase in raw VFSS videos without manual annotations. For efficient collection of training data, we propose a cascade framework that requires neither time intervals of the swallowing process nor manual marking of anatomical positions for detection. For video classification, we applied the inflated 3D convolutional network (I3D), one of the state-of-the-art networks for action classification, as a baseline architecture. We also present a modified 3D convolutional network architecture derived from the baseline I3D architecture. The classification and detection performance of these two architectures was evaluated for comparison. The experimental results show that the proposed model outperformed the baseline I3D model when both models were initialized with random weights. We conclude that the proposed method greatly reduces the examination time of the VFSS images with a low miss rate.


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 111626-111635
Author(s):  
Li Li ◽  
Milos Doroslovacki ◽  
Murray H. Loew

2020 ◽  
Vol 10 (3) ◽  
pp. 661-666 ◽  
Author(s):  
Shaoguo Cui ◽  
Moyu Chen ◽  
Chang Liu

Breast cancer is one of the leading causes of death among women worldwide. The clinical medical system urgently needs an accurate and automatic breast segmentation method in order to detect breast ultrasound lesions. Recent studies show that deep learning methods based on fully convolutional networks have demonstrated competitive performance in breast ultrasound segmentation. However, some features are lost in the Unet during down-sampling, which results in low segmentation accuracy. Furthermore, there is a semantic gap between the feature maps of the decoder and the encoder in Unet, so the simple fusion of high- and low-level features is not conducive to the semantic classification of pixels. In addition, the poor quality of breast ultrasound images also affects the accuracy of image segmentation. To solve these problems, we propose a new end-to-end network model called Dense skip Unet (DsUnet), which consists of the Unet backbone, short skip connections and deep supervision. The proposed method can effectively avoid the loss of feature information caused by down-sampling and implement the fusion of multilevel semantic information. We used a new loss function to optimize the DsUnet, which is composed of a binary cross-entropy term and a Dice coefficient term. We employed the True Positive Fraction (TPF), False Positives per image (FPs) and F-measure as performance metrics for evaluating various methods. We adopted the UDIAT 212 dataset, and the experimental results validate that our new approach achieved better performance than other existing methods in detecting and segmenting ultrasound breast lesions. With the DsUnet model and the new loss function (binary cross-entropy + Dice coefficient), the best performance indexes were achieved, i.e., 0.87 in TPF, 0.13 in FPs/image and 0.86 in F-measure.
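
For readers unfamiliar with the three metrics, the sketch below computes them from detection counts under commonly used definitions: TPF as the fraction of true lesions detected, FPs/image as false detections divided by the number of images, and F-measure as the harmonic mean of precision and TPF. These definitions, the function name `detection_metrics`, and the counts in the example are assumptions; the abstract does not state how the metrics were computed.

```python
def detection_metrics(tp, fp, fn, n_images):
    """Return (TPF, FPs per image, F-measure) from detection counts.

    tp, fp, fn : true positive, false positive, false negative detections
    n_images   : number of evaluated images
    Definitions are the commonly used ones and are assumed here.
    """
    tpf = tp / (tp + fn) if (tp + fn) else 0.0            # recall over true lesions
    fps_per_image = fp / n_images
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + tpf == 0.0:
        f_measure = 0.0
    else:
        f_measure = 2.0 * precision * tpf / (precision + tpf)
    return tpf, fps_per_image, f_measure

# Hypothetical counts, not taken from the paper.
tpf, fps, f1 = detection_metrics(tp=80, fp=20, fn=20, n_images=100)
```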


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 146331-146341 ◽  
Author(s):  
Yangfan Zhou ◽  
Xin Wang ◽  
Mingchuan Zhang ◽  
Junlong Zhu ◽  
Ruijuan Zheng ◽  
...  

2019 ◽  
Vol 117 (1) ◽  
pp. 161-170 ◽  
Author(s):  
Carlo Baldassi ◽  
Fabrizio Pittorino ◽  
Riccardo Zecchina

Learning in deep neural networks takes place by minimizing a nonconvex high-dimensional loss function, typically by a stochastic gradient descent (SGD) strategy. The learning process is observed to be able to find good minimizers without getting stuck in local critical points and such minimizers are often satisfactory at avoiding overfitting. How these 2 features can be kept under control in nonlinear devices composed of millions of tunable connections is a profound and far-reaching open question. In this paper we study basic nonconvex 1- and 2-layer neural network models that learn random patterns and derive a number of basic geometrical and algorithmic features which suggest some answers. We first show that the error loss function presents few extremely wide flat minima (WFM) which coexist with narrower minima and critical points. We then show that the minimizers of the cross-entropy loss function overlap with the WFM of the error loss. We also show examples of learning devices for which WFM do not exist. From the algorithmic perspective we derive entropy-driven greedy and message-passing algorithms that focus their search on wide flat regions of minimizers. In the case of SGD and cross-entropy loss, we show that a slow reduction of the norm of the weights along the learning process also leads to WFM. We corroborate the results by a numerical study of the correlations between the volumes of the minimizers, their Hessian, and their generalization performance on real data.
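
The distinction between wide flat minima and narrow ones can be made concrete with a toy probe: average the loss over small random perturbations of the weights and compare minima of equal depth. The sketch below is not the paper's local-entropy or message-passing algorithm; the 1D loss `toy_loss` and the Monte Carlo probe `perturbed_loss` are hypothetical constructions for illustration only.

```python
import random

def toy_loss(w):
    """Hypothetical 1D loss: sharp minimum at w = -2, wide minimum at w = +2.

    Both minima have loss exactly 0, so only their widths differ.
    """
    return min(50.0 * (w + 2.0) ** 2, 0.5 * (w - 2.0) ** 2)

def perturbed_loss(loss_fn, w, sigma=0.1, n_samples=1000, seed=0):
    """Average loss under Gaussian perturbations of w.

    Among minima of equal depth, a lower value indicates a flatter basin.
    """
    rng = random.Random(seed)
    total = sum(loss_fn(w + rng.gauss(0.0, sigma)) for _ in range(n_samples))
    return total / n_samples

sharp = perturbed_loss(toy_loss, -2.0)  # roughly 50 * sigma**2
wide = perturbed_loss(toy_loss, 2.0)    # roughly 0.5 * sigma**2
```

A solution sitting in the wide basin is much less sensitive to weight perturbations, which is the intuition behind associating wide flat regions with better generalization.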


2019 ◽  
Vol 11 (17) ◽  
pp. 1996 ◽  
Author(s):  
Zhu ◽  
Yan ◽  
Mo ◽  
Liu

Scene classification of high-resolution remote sensing images (HRRSI) is one of the most important means of land-cover classification. Deep learning techniques, especially the convolutional neural network (CNN), have been widely applied to the scene classification of HRRSI due to the advancement of graphics processing units (GPUs). However, they tend to extract features from whole images rather than discriminative regions. The visual attention mechanism can force the CNN to focus on discriminative regions, but it may suffer from the influence of intra-class diversity and repeated texture. Motivated by these problems, we propose an attention-based deep feature fusion (ADFF) framework that comprises three parts, namely attention maps generated by Gradient-weighted Class Activation Mapping (Grad-CAM), a multiplicative fusion of deep features and the center-based cross-entropy loss function. First, we use attention maps generated by Grad-CAM as an explicit input in order to force the network to concentrate on discriminative regions. Then, deep features derived from the original images and the attention maps are fused by multiplicative fusion in order to improve the ability to distinguish scenes of repeated texture while accounting for the salient regions. Finally, the center-based cross-entropy loss function, which combines the cross-entropy loss and the center loss, is used to backpropagate the fused features so as to reduce the effect of intra-class diversity on feature representations. The proposed ADFF architecture is tested on three benchmark datasets to show its performance in scene classification. The experiments confirm that the proposed method outperforms most competitive scene classification methods, with an average overall accuracy of 94% under different training ratios.
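
A loss that combines cross entropy with a center loss can be sketched as follows for a single sample. The center loss is written in its usual form, half the squared distance between a feature vector and the center of its class. The weight `lam`, the function names, and the way the two terms are summed are assumptions; the abstract does not give the exact formulation.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def center_loss(feature, center):
    """Half the squared Euclidean distance to the class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))

def center_based_cross_entropy(logits, feature, label, centers, lam=0.01):
    """Cross entropy plus a weighted center loss; lam is a hypothetical weight."""
    ce = softmax_cross_entropy(logits, label)
    return ce + lam * center_loss(feature, centers[label])

# Hypothetical two-class example with 2D features.
centers = {0: [1.0, 2.0], 1: [0.0, 0.0]}
on_center = center_based_cross_entropy([0.0, 0.0], [1.0, 2.0], 0, centers)
off_center = center_based_cross_entropy([0.0, 0.0], [1.0, 2.0], 1, centers)
```

The added term pulls features of the same class toward a shared center, which is how such a loss can reduce intra-class spread in the learned representation.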


2021 ◽  
Vol 13 (16) ◽  
pp. 3187
Author(s):  
Xinchun Wei ◽  
Xing Li ◽  
Wei Liu ◽  
Lianpeng Zhang ◽  
Dayu Cheng ◽  
...  

Deep learning techniques have greatly improved the efficiency and accuracy of building extraction using remote sensing images. However, producing high-quality building outline extraction results that can be applied to the field of surveying and mapping remains a significant challenge. In practice, most building extraction tasks are manually executed. Therefore, an automated procedure that extracts building outlines with precise positions is required. In this study, we directly used the U2-net semantic segmentation model to extract the building outline. Comparisons with semantic segmentation models (Segnet, U-Net, and FCN) and edge detection models (RCF, HED, and DexiNed) on two datasets (Nanjing and Wuhan University (WHU)) showed that the U2-net model provides the building outline with better accuracy and a more precise position than the other models. We also modified the binary cross-entropy loss function in the U2-net model into a multiclass cross-entropy loss function to directly generate the binary map with the building outline and background. We achieved a further refined outline of the building, thus showing that with the modified U2-net model, it is not necessary to use non-maximum suppression as a post-processing step, as in the other edge detection models, to refine the edge map. Moreover, the modified model is less affected by the sample imbalance problem. Finally, we created an image-to-image program to further validate the modified U2-net semantic segmentation model for building outline extraction.


2021 ◽  
Author(s):  
Chris Pettit ◽  
D. Wilson

We describe what we believe is the first effort to develop a physics-informed neural network (PINN) to predict sound propagation through the atmospheric boundary layer. PINN is a recent innovation in the application of deep learning to simulate physics. The motivation is to combine the strengths of data-driven models and physics models, thereby producing a regularized surrogate model using less data than a purely data-driven model. In a PINN, the data-driven loss function is augmented with penalty terms for deviations from the underlying physics, e.g., a governing equation or a boundary condition. Training data are obtained from Crank-Nicolson solutions of the parabolic equation with homogeneous ground impedance and Monin-Obukhov similarity theory for the effective sound speed in the moving atmosphere. Training data are random samples from an ensemble of solutions for combinations of parameters governing the impedance and the effective sound speed. PINN output is processed to produce realizations of transmission loss that look much like the Crank-Nicolson solutions. We describe the framework for implementing PINN for outdoor sound, and we outline practical matters related to network architecture, the size of the training set, the physics-informed loss function, and the challenge of managing the spatial complexity of the complex pressure.
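
The structure of a physics-informed loss, a data term plus penalties for violating a governing equation and a boundary condition, can be shown on a deliberately small problem. The sketch below replaces the parabolic equation with a toy ODE, du/dx + u = 0 with u(0) = 1, replaces the neural network with a two-parameter surrogate, and uses a finite difference where a real PINN would use automatic differentiation. All names (`model`, `pinn_loss`, `lam_pde`, `lam_bc`) are hypothetical.

```python
import math

def model(x, a, k):
    """Hypothetical two-parameter surrogate u(x) = a * exp(-k * x)."""
    return a * math.exp(-k * x)

def pinn_loss(a, k, data, collocation, lam_pde=1.0, lam_bc=1.0, h=1e-4):
    """Data loss plus penalties for the toy equation du/dx + u = 0, u(0) = 1."""
    # Data-driven term: mean squared error against observations (x, u_obs).
    data_term = sum((model(x, a, k) - u) ** 2 for x, u in data) / len(data)

    # Physics term: squared residual of du/dx + u = 0 at collocation points,
    # with du/dx approximated by a central finite difference.
    def residual(x):
        du_dx = (model(x + h, a, k) - model(x - h, a, k)) / (2.0 * h)
        return du_dx + model(x, a, k)

    pde_term = sum(residual(x) ** 2 for x in collocation) / len(collocation)

    # Boundary term: penalize deviation from u(0) = 1.
    bc_term = (model(0.0, a, k) - 1.0) ** 2
    return data_term + lam_pde * pde_term + lam_bc * bc_term

# Three observations from the exact solution exp(-x); collocation points
# need no observed values, only the governing equation.
data = [(x, math.exp(-x)) for x in (0.0, 0.5, 1.0)]
collocation = [0.1 * i for i in range(11)]
exact = pinn_loss(1.0, 1.0, data, collocation)
wrong_decay = pinn_loss(1.0, 2.0, data, collocation)
```

Because the physics penalty is evaluated at collocation points that carry no measurements, it constrains the surrogate between and beyond the data, which is how this kind of loss can reduce the amount of training data needed.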

