scholarly journals ORTHOSEG: A DEEP MULTIMODAL CONVOLUTONAL NEURAL NETWORK ARCHITECTURE FOR SEMANTIC SEGMENTATION OF ORTHOIMAGERY

Author(s):  
P. Bodani ◽  
K. Shreshtha ◽  
S. Sharma

<p><strong>Abstract.</strong> This paper addresses the task of semantic segmentation of orthoimagery using multimodal data e.g. optical RGB, infrared and digital surface model. We propose a deep convolutional neural network architecture termed OrthoSeg for semantic segmentation using multimodal, orthorectified and coregistered data. We also propose a training procedure for supervised training of OrthoSeg. The training procedure complements the inherent architectural characteristics of OrthoSeg for preventing complex co-adaptations of learned features, which may arise due to probable high dimensionality and spatial correlation in multimodal and/or multispectral coregistered data. OrthoSeg consists of parallel encoding networks for independent encoding of multimodal feature maps and a decoder designed for efficiently fusing independently encoded multimodal feature maps. A softmax layer at the end of the network uses the features generated by the decoder for pixel-wise classification. The decoder fuses feature maps from the parallel encoders locally as well as contextually at multiple scales to generate per-pixel feature maps for final pixel-wise classification resulting in segmented output. We experimentally show the merits of OrthoSeg by demonstrating state-of-the-art accuracy on the ISPRS Potsdam 2D Semantic Segmentation dataset. Adaptability is one of the key motivations behind OrthoSeg so that it serves as a useful architectural option for a wide range of problems involving the task of semantic segmentation of coregistered multimodal and/or multispectral imagery. Hence, OrthoSeg is designed to enable independent scaling of parallel encoder networks and decoder network to better match application requirements, such as the number of input channels, the effective field-of-view, and model capacity.</p>

2019 ◽  
Vol 53 (1) ◽  
pp. 2-19 ◽  
Author(s):  
Erion Çano ◽  
Maurizio Morisio

Purpose The fabulous results of convolution neural networks in image-related tasks attracted attention of text mining, sentiment analysis and other text analysis researchers. It is, however, difficult to find enough data for feeding such networks, optimize their parameters, and make the right design choices when constructing network architectures. The purpose of this paper is to present the creation steps of two big data sets of song emotions. The authors also explore usage of convolution and max-pooling neural layers on song lyrics, product and movie review text data sets. Three variants of a simple and flexible neural network architecture are also compared. Design/methodology/approach The intention was to spot any important patterns that can serve as guidelines for parameter optimization of similar models. The authors also wanted to identify architecture design choices which lead to high performing sentiment analysis models. To this end, the authors conducted a series of experiments with neural architectures of various configurations. Findings The results indicate that parallel convolutions of filter lengths up to 3 are usually enough for capturing relevant text features. Also, max-pooling region size should be adapted to the length of text documents for producing the best feature maps. Originality/value Top results the authors got are obtained with feature maps of lengths 6–18. An improvement on future neural network models for sentiment analysis could be generating sentiment polarity prediction of documents using aggregation of predictions on smaller excerpt of the entire text.


2020 ◽  
Vol 10 (18) ◽  
pp. 6386
Author(s):  
Xing Bai ◽  
Jun Zhou

Benefiting from the booming of deep learning, the state-of-the-art models achieved great progress. But they are huge in terms of parameters and floating point operations, which makes it hard to apply them to real-time applications. In this paper, we propose a novel deep neural network architecture, named MPDNet, for fast and efficient semantic segmentation under resource constraints. First, we use a light-weight classification model pretrained on ImageNet as the encoder. Second, we use a cost-effective upsampling datapath to restore prediction resolution and convert features for classification into features for segmentation. Finally, we propose to use a multi-path decoder to extract different types of features, which are not ideal to process inside only one convolutional neural network. The experimental results of our model outperform other models aiming at real-time semantic segmentation on Cityscapes. Based on our proposed MPDNet, we achieve 76.7% mean IoU on Cityscapes test set with only 118.84GFLOPs and achieves 37.6 Hz on 768 × 1536 images on a standard GPU.


2020 ◽  
Author(s):  
Gogulan Karunanithy ◽  
Flemming Hansen

<p>In recent years, the transformative potential of deep neural networks (DNNs) for analysing and interpreting NMR data has clearly been recognised. However, most applications of DNNs in NMR to date either struggle to outperform existing methodologies or are limited in scope to a narrow range of data that closely resemble the data that the network was trained on. These limitations have prevented a widescale uptake of DNNs in NMR. Addressing this, we introduce FID-Net, a deep neural network architecture inspired by WaveNet, for performing analyses on time domain NMR data. We first demonstrate the effectiveness of this architecture in reconstructing non-uniformly sampled (NUS) biomolecular NMR spectra. It is shown that a single network is able to reconstruct a diverse range of 2D NUS spectra that have been obtained with arbitrary sampling schedules, with a range of sweep widths, and a variety of other acquisition parameters. The performance of the trained FID-Net in this case exceeds or matches existing methods currently used for the reconstruction of NUS NMR spectra. Secondly, we present a network based on the FID-Net architecture that can efficiently virtually decouple <sup>13</sup>C<sub>α</sub>-<sup>13</sup>C<sub>β</sub> couplings in HNCA protein NMR spectra in a single shot analysis, while at the same time leaving glycine residues unmodulated. The ability for these DNNs to work effectively in a wide range of scenarios, without retraining, paves the way for their widespread usage in analysing NMR data. </p>


2021 ◽  
Author(s):  
Gogulan Karunanithy ◽  
Flemming Hansen

<p>In recent years, the transformative potential of deep neural networks (DNNs) for analysing and interpreting NMR data has clearly been recognised. However, most applications of DNNs in NMR to date either struggle to outperform existing methodologies or are limited in scope to a narrow range of data that closely resemble the data that the network was trained on. These limitations have prevented a widescale uptake of DNNs in NMR. Addressing this, we introduce FID-Net, a deep neural network architecture inspired by WaveNet, for performing analyses on time domain NMR data. We first demonstrate the effectiveness of this architecture in reconstructing non-uniformly sampled (NUS) biomolecular NMR spectra. It is shown that a single network is able to reconstruct a diverse range of 2D NUS spectra that have been obtained with arbitrary sampling schedules, with a range of sweep widths, and a variety of other acquisition parameters. The performance of the trained FID-Net in this case exceeds or matches existing methods currently used for the reconstruction of NUS NMR spectra. Secondly, we present a network based on the FID-Net architecture that can efficiently virtually decouple <sup>13</sup>C<sub>α</sub>-<sup>13</sup>C<sub>β</sub> couplings in HNCA protein NMR spectra in a single shot analysis, while at the same time leaving glycine residues unmodulated. The ability for these DNNs to work effectively in a wide range of scenarios, without retraining, paves the way for their widespread usage in analysing NMR data. </p>


Sensors ◽  
2020 ◽  
Vol 20 (16) ◽  
pp. 4403
Author(s):  
Umme Hafsa Billah ◽  
Hung Manh La ◽  
Alireza Tavakkoli

An autonomous concrete crack inspection system is necessary for preventing hazardous incidents arising from deteriorated concrete surfaces. In this paper, we present a concrete crack detection framework to aid the process of automated inspection. The proposed approach employs a deep convolutional neural network architecture for crack segmentation, while addressing the effect of gradient vanishing problem. A feature silencing module is incorporated in the proposed framework, capable of eliminating non-discriminative feature maps from the network to improve performance. Experimental results support the benefit of incorporating feature silencing within a convolutional neural network architecture for improving the network’s robustness, sensitivity, and specificity. An added benefit of the proposed architecture is its ability to accommodate for the trade-off between specificity (positive class detection accuracy) and sensitivity (negative class detection accuracy) with respect to the target application. Furthermore, the proposed framework achieves a high precision rate and processing time than the state-of-the-art crack detection architectures.


IoT ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 222-235
Author(s):  
Guillaume Coiffier ◽  
Ghouthi Boukli Hacene ◽  
Vincent Gripon

Deep Neural Networks are state-of-the-art in a large number of challenges in machine learning. However, to reach the best performance they require a huge pool of parameters. Indeed, typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lay in the final layers, while a large portion of the computations are performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network at its maximum, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In complement, normalization, non-linearities, downsamplings and shortcut ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameters budget, exceeding 91% accuracy on CIFAR-10 with less than 40 k parameters in total, 74.3% on CIFAR-100 with less than 600 k parameters, and 67.1% On ImageNet ILSVRC 2012 with no more than 4.15 M parameters. However, the proposed method typically requires more computations than existing counterparts.


Metals ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 549
Author(s):  
Ihor Konovalenko ◽  
Pavlo Maruschak ◽  
Vitaly Brevus ◽  
Olegas Prentkovskis

Classification of steel surface defects in steel industry is essential for their detection and also fundamental for the analysis of causes that lead to damages. Timely detection of defects allows to reduce the frequency of their appearance in the final product. This paper considers the classifiers for the recognition of scratches, scrapes and abrasions on metal surfaces. Classifiers are based on the ResNet50 and ResNet152 deep residual neural network architecture. The proposed technique supports the recognition of defects in images and does this with high accuracy. The binary accuracy of the classification based on the test data is 97.14%. The influence of a number of training conditions on the accuracy metrics of the model have been studied. The augmentation conditions have been figured out to make the greatest contribution to improving the accuracy during training. The peculiarities of damages that cause difficulties in their recognition have been studied. The fields of neuron activation have been investigated in the convolutional layers of the model. Feature maps which developed in this case have been found to correspond to the location of the objects of interest. Erroneous cases of the classifier application have been considered. The peculiarities of damages that cause difficulties in their recognition have been studied.


2020 ◽  
Author(s):  
Gogulan Karunanithy ◽  
Flemming Hansen

<p>In recent years, the transformative potential of deep neural networks (DNNs) for analysing and interpreting NMR data has clearly been recognised. However, most applications of DNNs in NMR to date either struggle to outperform existing methodologies or are limited in scope to a narrow range of data that closely resemble the data that the network was trained on. These limitations have prevented a widescale uptake of DNNs in NMR. Addressing this, we introduce FID-Net, a deep neural network architecture inspired by WaveNet, for performing analyses on time domain NMR data. We first demonstrate the effectiveness of this architecture in reconstructing non-uniformly sampled (NUS) biomolecular NMR spectra. It is shown that a single network is able to reconstruct a diverse range of 2D NUS spectra that have been obtained with arbitrary sampling schedules, with a range of sweep widths, and a variety of other acquisition parameters. The performance of the trained FID-Net in this case exceeds or matches existing methods currently used for the reconstruction of NUS NMR spectra. Secondly, we present a network based on the FID-Net architecture that can efficiently virtually decouple <sup>13</sup>C<sub>α</sub>-<sup>13</sup>C<sub>β</sub> couplings in HNCA protein NMR spectra in a single shot analysis, while at the same time leaving glycine residues unmodulated. The ability for these DNNs to work effectively in a wide range of scenarios, without retraining, paves the way for their widespread usage in analysing NMR data. </p>


Author(s):  
Yang Li ◽  
Zhan Xu ◽  
Jianke Zhu

Albeit convolutional neural network (CNN) has shown promising capacity in many computer vision tasks, applying it to visual tracking is yet far from solved. Existing methods either employ a large external dataset to undertake exhaustive pre-training or suffer from less satisfactory results in terms of accuracy and robustness. To track single target in a wide range of videos, we present a novel Correlation Filter Neural Network architecture, as well as a complete visual tracking pipeline, The proposed approach is a special case of CNN, whose initialization does not need any pre-training on the external dataset. The initialization of network enjoys the merits of cyclic sampling to achieve the appealing discriminative capability, while the network updating scheme adopts advantages from back-propagation in order to capture new appearance variations. The tracking pipeline integrates both aspects well by making them complementary to each other. We validate our tracker on OTB-2013 benchmark. The proposed tracker obtains the promising results compared to most of existing representative trackers.


Sign in / Sign up

Export Citation Format

Share Document