TGV Upsampling: A Making-Up Operation for Semantic Segmentation

With the widespread use of deep learning methods, semantic segmentation has achieved great improvements in recent years. However, many researchers have pointed out that with multiple uses of convolution and pooling operations, great information loss would occur in the extraction processes. To solve this problem, various operations or network architectures have been suggested to make up for the loss of information. We observed a trend in many studies to design a network as a symmetric type, with both parts representing the “encoding” and “decoding” stages. By “upsampling” operations in the “decoding” stage, feature maps are constructed in a certain way that would more or less make up for the losses in previous layers. In this paper, we focus on upsampling operations, make a detailed analysis, and compare current methods used in several famous neural networks. We also combine the knowledge on image restoration and design a new upsampled layer (or operation) named the TGV upsampling algorithm. We successfully replaced upsampling layers in the previous research with our new method. We found that our model can better preserve detailed textures and edges of feature maps and can, on average, achieve 1.4–2.3% improved accuracy compared to the original models.

Download Full-text

Random Shifting for CNN: a Solution to Reduce Information Loss in Down-Sampling Layers

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/486 ◽

2017 ◽

Cited By ~ 3

Author(s):

Gangming Zhao ◽

Jingdong Wang ◽

Zhaoxiang Zhang

Keyword(s):

Neural Networks ◽

Computational Cost ◽

Receptive Fields ◽

Information Loss ◽

Network Architectures ◽

Training Process ◽

Feature Maps ◽

Improve Performance ◽

Deep Convolutional Neural Networks ◽

Random Strategy

Down-sampling is widely adopted in deep convolutional neural networks (DCNN) for reducing the number of network parameters while preserving the transformation invariance. However, it cannot utilize information effectively because it only adopts a fixed stride strategy, which may result in poor generalization ability and information loss. In this paper, we propose a novel random strategy to alleviate these problems by embedding random shifting in the down-sampling layers during the training process. Random shifting can be universally applied to diverse DCNN models to dynamically adjust receptive fields by shifting kernel centers on feature maps in different directions. Thus, it can generate more robust features in networks and further enhance the transformation invariance of down-sampling operators. In addition, random shifting cannot only be integrated in all down-sampling layers including strided convolutional layers and pooling layers, but also improve performance of DCNN with negligible additional computational cost. We evaluate our method in different tasks (e.g., image classification and segmentation) with various network architectures (i.e., AlexNet, FCN and DFN-MR). Experimental results demonstrate the effectiveness of our proposed method.

Download Full-text

Convolutional Neural Network for the Semantic Segmentation of Remote Sensing Images

Mobile Networks and Applications ◽

10.1007/s11036-020-01703-3 ◽

2021 ◽

Vol 26 (1) ◽

pp. 200-215

Author(s):

Muhammad Alam ◽

Jian-Feng Wang ◽

Cong Guangpei ◽

LV Yunrong ◽

Yuanfang Chen

Keyword(s):

Neural Network ◽

Remote Sensing ◽

Neural Networks ◽

Image Processing ◽

Deep Learning ◽

Semantic Segmentation ◽

Natural Scene ◽

Remote Sensing Images ◽

Advantages And Disadvantages ◽

Target Segmentation

AbstractIn recent years, the success of deep learning in natural scene image processing boosted its application in the analysis of remote sensing images. In this paper, we applied Convolutional Neural Networks (CNN) on the semantic segmentation of remote sensing images. We improve the Encoder- Decoder CNN structure SegNet with index pooling and U-net to make them suitable for multi-targets semantic segmentation of remote sensing images. The results show that these two models have their own advantages and disadvantages on the segmentation of different objects. In addition, we propose an integrated algorithm that integrates these two models. Experimental results show that the presented integrated algorithm can exploite the advantages of both the models for multi-target segmentation and achieve a better segmentation compared to these two models.

Download Full-text

Understanding Memories of the Past in the Context of Different Complex Neural Network Architectures

Neural Computation ◽

10.1162/neco_a_01469 ◽

2022 ◽

pp. 1-27

Author(s):

Clifford Bohm ◽

Douglas Kirkpatrick ◽

Arend Hintze

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Mental Representations ◽

Network Architectures ◽

Information Theoretic ◽

Time Points ◽

The Past ◽

Black Boxes ◽

Past Experiences ◽

Computational Systems

Abstract Deep learning (primarily using backpropagation) and neuroevolution are the preeminent methods of optimizing artificial neural networks. However, they often create black boxes that are as hard to understand as the natural brains they seek to mimic. Previous work has identified an information-theoretic tool, referred to as R, which allows us to quantify and identify mental representations in artificial cognitive systems. The use of such measures has allowed us to make previous black boxes more transparent. Here we extend R to not only identify where complex computational systems store memory about their environment but also to differentiate between different time points in the past. We show how this extended measure can identify the location of memory related to past experiences in neural networks optimized by deep learning as well as a genetic algorithm.

Download Full-text

Application of convolutional neural networks for stellar spectral classification

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/stz3100 ◽

2019 ◽

Vol 491 (2) ◽

pp. 2280-2300 ◽

Cited By ~ 6

Author(s):

Kaushal Sharma ◽

Ajit Kembhavi ◽

Aniruddha Kembhavi ◽

T Sivarani ◽

Sheelu Abraham ◽

...

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

Spectral Classification ◽

Stellar Spectra ◽

The Past ◽

Training Samples ◽

Optical Region ◽

Spectral Libraries ◽

Improved Accuracy

ABSTRACT Due to the ever-expanding volume of observed spectroscopic data from surveys such as SDSS and LAMOST, it has become important to apply artificial intelligence (AI) techniques for analysing stellar spectra to solve spectral classification and regression problems like the determination of stellar atmospheric parameters Teff, $\rm {\log g}$, and [Fe/H]. We propose an automated approach for the classification of stellar spectra in the optical region using convolutional neural networks (CNNs). Traditional machine learning (ML) methods with ‘shallow’ architecture (usually up to two hidden layers) have been trained for these purposes in the past. However, deep learning methods with a larger number of hidden layers allow the use of finer details in the spectrum which results in improved accuracy and better generalization. Studying finer spectral signatures also enables us to determine accurate differential stellar parameters and find rare objects. We examine various machine and deep learning algorithms like artificial neural networks, Random Forest, and CNN to classify stellar spectra using the Jacoby Atlas, ELODIE, and MILES spectral libraries as training samples. We test the performance of the trained networks on the Indo-U.S. Library of Coudé Feed Stellar Spectra (CFLIB). We show that using CNNs, we are able to lower the error up to 1.23 spectral subclasses as compared to that of two subclasses achieved in the past studies with ML approach. We further apply the trained model to classify stellar spectra retrieved from the SDSS data base with SNR > 20.

Download Full-text

Semantic Segmentation Using Deep Learning for Brain Tumor MRI via Fully Convolution Neural Networks

Information and Communication Technology for Intelligent Systems - Smart Innovation, Systems and Technologies ◽

10.1007/978-981-13-1742-2_2 ◽

2018 ◽

pp. 11-19 ◽

Cited By ~ 4

Author(s):

Sanjay Kumar ◽

Ashish Negi ◽

J. N. Singh

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Brain Tumor ◽

Semantic Segmentation ◽

Convolution Neural Networks ◽

Tumor Mri

Download Full-text

Text Separation From Document Images

Machine Learning and Deep Learning in Real-Time Applications - Advances in Computer and Electrical Engineering ◽

10.4018/978-1-7998-3095-5.ch013 ◽

2020 ◽

pp. 283-313

Author(s):

Priti P. Rege ◽

Shaheera Akhter

Keyword(s):

Deep Learning ◽

Character Recognition ◽

Optical Character Recognition ◽

Semantic Segmentation ◽

Document Image ◽

Training Data ◽

Document Images ◽

Learning Techniques ◽

Extraction Processes ◽

Segmentation Image

Text separation in document image analysis is an important preprocessing step before executing an optical character recognition (OCR) task. It is necessary to improve the accuracy of an OCR system. Traditionally, for separating text from a document, different feature extraction processes have been used that require handcrafting of the features. However, deep learning-based methods are excellent feature extractors that learn features from the training data automatically. Deep learning gives state-of-the-art results on various computer vision, image classification, segmentation, image captioning, object detection, and recognition tasks. This chapter compares various traditional as well as deep-learning techniques and uses a semantic segmentation method for separating text from Devanagari document images using U-Net and ResU-Net models. These models are further fine-tuned for transfer learning to get more precise results. The final results show that deep learning methods give more accurate results compared with conventional methods of image processing for Devanagari text extraction.

Download Full-text

Conditional Random Fields Meet Deep Neural Networks for Semantic Segmentation: Combining Probabilistic Graphical Models with Deep Learning for Structured Prediction

IEEE Signal Processing Magazine ◽

10.1109/msp.2017.2762355 ◽

2018 ◽

Vol 35 (1) ◽

pp. 37-52 ◽

Cited By ~ 36

Author(s):

Anurag Arnab ◽

Shuai Zheng ◽

Sadeep Jayasumana ◽

Bernardino Romera-Paredes ◽

Mans Larsson ◽

...

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Graphical Models ◽

Random Fields ◽

Deep Neural Networks ◽

Conditional Random Fields ◽

Probabilistic Graphical Models ◽

Semantic Segmentation ◽

Structured Prediction

Download Full-text

A mixed-scale dense convolutional neural network for image analysis

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1715832114 ◽

2017 ◽

Vol 115 (2) ◽

pp. 254-259 ◽

Cited By ~ 60

Author(s):

Daniël M. Pelt ◽

James A. Sethian

Keyword(s):

Neural Network ◽

Neural Networks ◽

Image Processing ◽

Network Architecture ◽

Training Data ◽

Network Architectures ◽

Feature Maps ◽

Deep Convolutional Neural Networks ◽

Single Set ◽

Reduced Risk

Deep convolutional neural networks have been successfully applied to many image-processing problems in recent works. Popular network architectures often add additional operations and connections to the standard architecture to enable training deeper networks. To achieve accurate results in practice, a large number of trainable parameters are often required. Here, we introduce a network architecture based on using dilated convolutions to capture features at different image scales and densely connecting all feature maps with each other. The resulting architecture is able to achieve accurate results with relatively few parameters and consists of a single set of operations, making it easier to implement, train, and apply in practice, and automatically adapts to different problems. We compare results of the proposed network architecture with popular existing architectures for several segmentation problems, showing that the proposed architecture is able to achieve accurate results with fewer parameters, with a reduced risk of overfitting the training data.

Download Full-text

Deep learning for semantic segmentation of organs and tissues in laparoscopic surgery

Current Directions in Biomedical Engineering ◽

10.1515/cdbme-2020-0016 ◽

2020 ◽

Vol 6 (1) ◽

Author(s):

Paul Maria Scheikl ◽

Stefan Laschewski ◽

Anna Kisilenko ◽

Tornike Davitashvili ◽

Benjamin Müller ◽

...

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Laparoscopic Surgery ◽

Semantic Segmentation ◽

Cognitive Robotics ◽

Loss Functions ◽

Cognitive Systems ◽

Context Aware ◽

Segmentation Quality ◽

And Training

AbstractSemantic segmentation of organs and tissue types is an important sub-problem in image based scene understanding for laparoscopic surgery and is a prerequisite for context-aware assistance and cognitive robotics. Deep Learning (DL) approaches are prominently applied to segmentation and tracking of laparoscopic instruments. This work compares different combinations of neural networks, loss functions, and training strategies in their application to semantic segmentation of different organs and tissue types in human laparoscopic images in order to investigate their applicability as components in cognitive systems. TernausNet-11 trained on Soft-Jaccard loss with a pretrained, trainable encoder performs best in regard to segmentation quality (78.31% mean Intersection over Union [IoU]) and inference time (28.07 ms) on a single GTX 1070 GPU.

Download Full-text

GAIT RECOGNITION BASED ON CONVOLUTIONAL NEURAL NETWORKS

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-2-w4-207-2017 ◽

2017 ◽

Vol XLII-2/W4 ◽

pp. 207-212 ◽

Cited By ~ 10

Author(s):

A. Sokolova ◽

A. Konushin

Keyword(s):

Neural Network ◽

Neural Networks ◽

Feature Extraction ◽

Deep Learning ◽

Deep Neural Network ◽

Gait Recognition ◽

Learning Approach ◽

Network Architectures ◽

Motion Information ◽

Advantages And Disadvantages

In this work we investigate the problem of people recognition by their gait. For this task, we implement deep learning approach using the optical flow as the main source of motion information and combine neural feature extraction with the additional embedding of descriptors for representation improvement. In order to find the best heuristics, we compare several deep neural network architectures, learning and classification strategies. The experiments were made on two popular datasets for gait recognition, so we investigate their advantages and disadvantages and the transferability of considered methods.

Download Full-text