Deep Ensembles Based on Stochastic Activations for Semantic Segmentation

Alessandra Lumini; Loris Nanni; Gianluca Maguolo

doi:10.3390/signals2040047

Deep Ensembles Based on Stochastic Activations for Semantic Segmentation

Signals ◽

10.3390/signals2040047 ◽

2021 ◽

Vol 2 (4) ◽

pp. 820-833

Author(s):

Alessandra Lumini ◽

Loris Nanni ◽

Gianluca Maguolo

Keyword(s):

Image Segmentation ◽

High Performance ◽

Comprehensive Evaluation ◽

Semantic Segmentation ◽

Large Set ◽

Skin Detection ◽

Activation Functions ◽

Medical Field ◽

Backbone Networks ◽

Modern Computer

Semantic segmentation is a very popular topic in modern computer vision, and it has applications in many fields. Researchers have proposed a variety of architectures for semantic image segmentation. The most common ones exploit an encoder–decoder structure that aims to capture both the semantics of the image and its low-level features. The encoder uses convolutional layers, generally with a stride larger than one, to extract the features, while the decoder recreates the image by upsampling and using skip connections with the first layers. The objective of this study is to propose a method for creating an ensemble of CNNs in which diversity among the networks is enhanced by using different activation functions. In this work, we use DeepLabV3+ as the architecture to test the effectiveness of creating an ensemble of networks by randomly changing the activation functions inside the network multiple times. We also use different backbone networks in our DeepLabV3+ to validate our findings. A comprehensive evaluation of the proposed approach is conducted on two different image segmentation problems: the first is from the medical field, i.e., polyp segmentation for early detection of colorectal cancer, and the second is skin detection, which serves several applications, including face detection, hand gesture recognition, and many others. On the first problem, we reach a Dice coefficient of 0.888 and a mean intersection over union (mIoU) of 0.825 on the competitive Kvasir-SEG dataset. The high performance of the proposed ensemble is confirmed in skin detection, where the proposed approach ranks first compared with other state-of-the-art approaches (including HarDNet) on a large set of testing datasets.
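The idea of building an ensemble by randomly changing activation functions can be illustrated with a toy sketch. This is not the authors' code: the activation pool, the scalar "layers", and the function names below are hypothetical, and the training step that each network variant would need is omitted.

```python
import math
import random

# Hypothetical pool of candidate activations (an assumption; the abstract
# does not list the functions actually used).
ACTIVATIONS = {
    "relu": lambda x: max(0.0, x),
    "leaky_relu": lambda x: x if x > 0 else 0.01 * x,
    "elu": lambda x: x if x > 0 else math.exp(x) - 1.0,
    "tanh": math.tanh,
}


def make_stochastic_variant(layers, rng):
    """Copy a toy network, drawing a random activation for every layer.

    `layers` is a list of (weight, bias) pairs for scalar layers.
    """
    names = sorted(ACTIVATIONS)
    return [(w, b, rng.choice(names)) for (w, b) in layers]


def forward(variant, x):
    """Run the toy network and return a foreground probability in (0, 1)."""
    for w, b, name in variant:
        x = ACTIVATIONS[name](w * x + b)
    return 1.0 / (1.0 + math.exp(-x))  # sigmoid output


def ensemble_predict(variants, x):
    """Fuse the ensemble by averaging the members' probabilities."""
    return sum(forward(v, x) for v in variants) / len(variants)


rng = random.Random(0)
base_layers = [(0.5, 0.1), (-1.2, 0.3), (0.8, 0.0)]
ensemble = [make_stochastic_variant(base_layers, rng) for _ in range(5)]
print(ensemble_predict(ensemble, 1.0))
```

In a real setting each variant would be a full segmentation network trained separately after its activations are replaced, and the averaging would be done per pixel.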

Deep ensembles based on Stochastic Activations for Semantic Segmentation

10.20944/preprints202107.0691.v1 ◽

2021 ◽

Author(s):

Alessandra Lumini ◽

Loris Nanni ◽

Gianluca Maguolo

Keyword(s):

Computer Vision ◽

State Of The Art ◽

Semantic Segmentation ◽

Large Set ◽

Skin Detection ◽

Activation Functions ◽

Dice Coefficient ◽

Backbone Networks ◽

Modern Computer ◽

Over Time

Semantic segmentation is a very popular topic in modern computer vision and it has applications in many fields. Researchers have proposed a variety of architectures over time, but the most common ones exploit an encoder-decoder structure that aims to capture both the semantics of the image and its low-level features. The encoder uses convolutional layers, generally with a stride larger than one, to extract the features, while the decoder recreates the image by upsampling and using skip connections with the first layers. In this work, we use DeepLab as the architecture to test the effectiveness of creating an ensemble of networks by randomly changing the activation functions inside the network multiple times. We also use different backbone networks in our DeepLab to validate our findings. We reach a Dice coefficient of 0.888 and a mean Intersection over Union (mIoU) of 0.825 on the competitive Kvasir-SEG dataset. Results in skin detection also confirm the performance of the proposed ensemble, which ranks first with respect to other state-of-the-art approaches (including HarDNet) on a large set of testing datasets. The developed code will be available at https://github.com/LorisNanni.
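For readers unfamiliar with the two reported metrics, the following minimal sketch computes the Dice coefficient and intersection over union (IoU) for binary masks. The function names and the convention for two empty masks are assumptions made for illustration; the sketch also assumes the metrics are averaged per image, which the abstract does not state.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two flat binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    if tp + fp + fn == 0:
        # Both masks are empty: treat as a perfect match (a convention).
        return 1.0, 1.0
    dice = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return dice, iou


def mean_dice_and_iou(pairs):
    """Average Dice and IoU over (pred, target) pairs, one pair per image."""
    scores = [dice_and_iou(p, t) for p, t in pairs]
    n = len(scores)
    return sum(d for d, _ in scores) / n, sum(i for _, i in scores) / n


dice, iou = dice_and_iou([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(round(dice, 4), iou)  # 0.6667 0.5
```

For a single image the two metrics are linked by Dice = 2 IoU / (1 + IoU); this identity does not carry over to values averaged across images.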

Chest X-ray pneumothorax segmentation using U-Net with EfficientNet and ResNet architectures

PeerJ Computer Science ◽

10.7717/peerj-cs.607 ◽

2021 ◽

Vol 7 ◽

pp. e607

Author(s):

Ayat Abedalla ◽

Malak Abdullah ◽

Mahmoud Al-Ayyoub ◽

Elhadj Benkhelifa

Keyword(s):

Image Segmentation ◽

Medical Image ◽

Medical Images ◽

Weighted Average ◽

Semantic Segmentation ◽

Image Understanding ◽

Medical Image Segmentation ◽

Test Time ◽

Dice Similarity Coefficient ◽

Backbone Networks

Medical imaging refers to visualization techniques that provide valuable information about the internal structures of the human body for clinical applications, diagnosis, treatment, and scientific research. Segmentation is one of the primary methods for analyzing and processing medical images; it helps doctors diagnose accurately by providing detailed information on the body part of interest. However, manual segmentation of medical images is challenging: it requires trained medical experts and is time-consuming and error-prone. An automatic medical image segmentation system is therefore needed. Deep learning algorithms have recently shown outstanding performance on segmentation tasks, especially semantic segmentation networks that provide pixel-level image understanding. Since the introduction of the first fully convolutional network (FCN) for semantic image segmentation, several segmentation networks have been proposed on its basis. One of the state-of-the-art convolutional networks in the medical image field is U-Net. This paper presents a novel end-to-end semantic segmentation model for medical images, named Ens4B-UNet, that ensembles four U-Net architectures with pre-trained backbone networks. Ens4B-UNet builds on U-Net’s success with several significant improvements: it adopts powerful and robust convolutional neural networks (CNNs) as backbones for the U-Net encoders and uses nearest-neighbor up-sampling in the decoders. Ens4B-UNet is a weighted average ensemble of four encoder-decoder segmentation models. The backbone networks of all ensembled models are pre-trained on the ImageNet dataset to exploit the benefit of transfer learning. To improve our models, we apply several techniques for training and prediction, including stochastic weight averaging (SWA), data augmentation, test-time augmentation (TTA), and different types of optimal thresholds. We evaluate and test our models on the 2019 Pneumothorax Challenge dataset, which contains 12,047 training images with 12,954 masks and 3,205 test images. Our proposed segmentation network achieves a 0.8608 mean Dice similarity coefficient (DSC) on the test set, which places it among the top one percent of systems in the Kaggle competition.
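A weighted average ensemble of segmentation models can be sketched as follows. This is an illustrative example, not the Ens4B-UNet implementation: the weights, the 0.5 threshold, and the function name are assumptions, and the probability maps are flattened to plain lists.

```python
def weighted_average_ensemble(prob_maps, weights, threshold=0.5):
    """Fuse per-pixel probabilities from several models into one mask.

    prob_maps: one flat list of foreground probabilities per model.
    weights:   one non-negative weight per model (normalized here).
    Returns (fused probabilities, binary mask).
    """
    if len(prob_maps) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_pixels = len(prob_maps[0])
    fused = [
        sum(w * pm[i] for w, pm in zip(weights, prob_maps)) / total
        for i in range(n_pixels)
    ]
    mask = [1 if p >= threshold else 0 for p in fused]
    return fused, mask


# Four hypothetical model outputs for a two-pixel image.
maps = [[0.9, 0.2], [0.6, 0.4], [0.3, 0.8], [0.8, 0.1]]
fused, mask = weighted_average_ensemble(maps, [0.4, 0.3, 0.2, 0.1])
print(mask)  # [1, 0]
```

In practice the weights and the threshold would be tuned on a validation set.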

Content and composition of phospholipids, fatty acids and sterols in commercial natural phospholipid excipients

Current Pharmaceutical Analysis ◽

10.2174/1573412916999200605162707 ◽

2020 ◽

Vol 16 ◽

Author(s):

Luxia Zheng ◽

Xiong Shen ◽

Yingchun Wang ◽

Jian Liang ◽

Mingming Xu ◽

...

Keyword(s):

Fatty Acids ◽

Cluster Analysis ◽

Fatty Acid ◽

High Performance ◽

Comprehensive Evaluation ◽

Phospholipid Fatty Acid ◽

Egg Yolk ◽

Systematic Research ◽

Manufacturing Enterprises ◽

The Many

Background: Phospholipids are widely used in the food and pharmaceutical industries as functional excipients. Despite the many analytical methods reported, there are very few reports of systematic research on, and comparison of, phospholipid excipients. Objective: To present a comprehensive evaluation of commercial natural phospholipid excipients (CNPEs). Methods: Seventeen batches of CNPEs from five manufacturing enterprises, isolated from either soybean or egg yolk, were investigated. The content and composition of phospholipids, fatty acids and sterols as a whole were taken as the evaluative index of CNPEs. Eight kinds of phospholipids were determined by supercritical fluid chromatography (SFC), twenty-one kinds of fatty acids were determined by gas chromatography (GC) after boron trifluoride-methanol derivatization, and nine kinds of sterols were determined by high performance liquid chromatography (HPLC) after separation and derivatization of the unsaponifiable matter. Cluster analysis was employed for classification and identification of the CNPEs. Results: The results showed that each kind of CNPE had its characteristic content and composition of phospholipids, fatty acids and sterols. The seventeen batches of samples were divided into eight groups in the cluster analysis. CNPEs of the same type from different sources (soybean or egg yolk) or enterprises presented different content and composition of phospholipids, fatty acids and sterols. Conclusion: Each type of CNPE had its characteristic content and composition of phospholipids, fatty acids and sterols. The composition of phospholipids, fatty acids and sterols as a whole can be applied as an indicator of the quality and characteristics of CNPEs.

Recent advances in skin collagen: functionality and non-medical applications

Journal of Leather Science and Engineering ◽

10.1186/s42825-020-00046-9 ◽

2021 ◽

Vol 3 (1) ◽

Author(s):

Yanting Han ◽

Jinlian Hu ◽

Gang Sun

Keyword(s):

Smart Materials ◽

High Performance ◽

Preparation Method ◽

Chemical Reactivity ◽

Medical Field ◽

Recent Advances ◽

Synthetic Materials ◽

Skin Collagen ◽

Promising Area ◽

Living Organisms

During natural evolution, living organisms have gradually adapted to the environment and become adept at synthesizing high-performance structural materials under mild conditions using fairly simple building elements. The skin, as the largest organ of animals, is a representative example. Owing to its intricate organization, in which collagen fibers are arranged in a randomly interwoven network, skin collagen (SC), defined as a biomass derived from skin by removing non-collagen components, displays a remarkable combination of mechanical properties, chemical reactivity, and biocompatibility that far surpasses that of synthetic materials. At present, the application of SC in the medical field has been widely studied, and many reviews have summarized these efforts. However, a general view of SC as a smart material in non-medical fields is still lacking, although SC has shown great potential in terms of its intrinsic properties and functionality. Hence, this review provides a comprehensive summary of recent advances in SC, including its preparation method, structure, reactivity, and functionality, as well as its applications, particularly in the promising area of smart materials.

Research on Distance Transform and Neural Network Lidar Information Sampling Classification-Based Semantic Segmentation of 2D Indoor Room Maps

Sensors ◽

10.3390/s21041365 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1365

Author(s):

Tao Zheng ◽

Zhizhao Duan ◽

Jin Wang ◽

Guodong Lu ◽

Shengjie Li ◽

...

Keyword(s):

Neural Network ◽

High Speed ◽

High Performance ◽

High Efficiency ◽

Semantic Segmentation ◽

Raspberry Pi ◽

Distance Transform ◽

Testing Stage ◽

Sampling Points ◽

Information Sampling

Semantic segmentation of room maps is essential for mobile robots to execute tasks. In this work, we propose a new approach to obtaining the semantic labels of 2D lidar room maps that combines distance transform watershed-based pre-segmentation with a purpose-designed neural network for lidar information sampling classification. To label room maps with high efficiency, high precision and high speed, we have designed a low-power, high-performance method that can be deployed on Raspberry Pi devices with low computing power. In the training stage, a lidar is simulated to collect the lidar detection line maps of each point in the manually labelled map, and these line maps and the corresponding labels are then used to train the designed neural network. In the testing stage, the new map is first pre-segmented into simple cells with the distance transform watershed method, and the lidar detection line maps are then classified with the trained neural network. Optimized areas for sparse sampling points are derived from the distance transform computed during pre-segmentation, which prevents sampling points in boundary regions from influencing the semantic labeling results. A prototype mobile robot was developed to verify the proposed method, and its feasibility, validity, robustness and high efficiency were verified by a series of tests. The proposed method achieved higher scores in recall and precision. Specifically, the mean recall is 0.965 and the mean precision is 0.943.
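The distance transform that underlies the pre-segmentation step can be illustrated on a small occupancy grid. The sketch below is an assumption-laden example rather than the paper's method: it uses a city-block (4-connected) metric computed by breadth-first search, encodes walls as 0 and free space as 1, and uses a hypothetical function name. The metric actually used in the paper is not given in the abstract.

```python
from collections import deque


def cityblock_distance_transform(grid):
    """Distance from every cell to the nearest wall cell (value 0).

    Uses multi-source breadth-first search over 4-connected neighbors.
    Cells stay None if the grid contains no wall at all.
    """
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0:  # wall cells are the BFS sources
                dist[r][c] = 0
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


room = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
for row in cityblock_distance_transform(room):
    print(row)
```

Cells with large distance values lie far from walls, which is why such a map can help keep sampling points away from boundary regions.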

Supervised Domain Adaptation for Automated Semantic Segmentation of the Atrial Cavity

Entropy ◽

10.3390/e23070898 ◽

2021 ◽

Vol 23 (7) ◽

pp. 898

Author(s):

Marta Saiz-Vivó ◽

Adrián Colomer ◽

Carles Fonfría ◽

Luis Martí-Bonmatí ◽

Valery Naranjo

Keyword(s):

High Performance ◽

Computational Models ◽

Domain Adaptation ◽

Semantic Segmentation ◽

Patient Specific ◽

Mr Images ◽

Training Samples ◽

Volumetric Images ◽

Acquisition Costs ◽

Left And Right

Atrial fibrillation (AF) is the most common cardiac arrhythmia. At present, cardiac ablation is the main treatment procedure for AF. To guide and plan this procedure, it is essential for clinicians to obtain patient-specific 3D geometrical models of the atria. For this reason, there is interest in automatic image segmentation algorithms, such as deep learning (DL) methods, as opposed to manual segmentation, which is error-prone and time-consuming. However, optimizing DL algorithms requires many annotated examples, which increases acquisition costs. The aim of this work is to develop automatic, high-performance computational models for left and right atrium (LA and RA) segmentation from a few labelled MRI volumetric images with a 3D Dual U-Net algorithm. To this end, a supervised domain adaptation (SDA) method is introduced to transfer knowledge from late gadolinium enhanced (LGE) MRI volumetric training samples (80 LA annotated samples) to a network trained with balanced steady-state free precession (bSSFP) MR images with a limited number of annotations (19 RA and LA annotated samples). The resulting knowledge-transferred SDA model outperformed the same network trained from scratch in both RA (Dice equals 0.9160) and LA (Dice equals 0.8813) segmentation tasks.

Road Image Segmentation using Unmanned Aerial Vehicle Images and DeepLab V3+ Semantic Segmentation Model

10.1109/iccsce52189.2021.9530950 ◽

2021 ◽

Author(s):

Mat Nizam Mahmud ◽

Muhammad Khusairi Osman ◽

Ahmad Puad Ismail ◽

Fadzil Ahmad ◽

Khairul Azman Ahmad ◽

...

Keyword(s):

Image Segmentation ◽

Unmanned Aerial Vehicle ◽

Semantic Segmentation ◽

Aerial Vehicle

Optimal Scale of Hierarchical Image Segmentation with Scribbles Guidance for Weakly Supervised Semantic Segmentation

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001421540264 ◽

2021 ◽

pp. 2154026

Author(s):

Zaid Al-Huda ◽

Donghai Zhai ◽

Yan Yang ◽

Riyadh Nazar Ali Algburi

Keyword(s):

Image Segmentation ◽

Graphical Model ◽

Semantic Segmentation ◽

Saliency Map ◽

Training Data ◽

Deep Convolutional Neural Networks ◽

High Quality ◽

Optimal Scale ◽

Supervised Segmentation ◽

Weakly Supervised

Deep convolutional neural networks (DCNNs) trained on pixel-level annotated images have achieved improvements in semantic segmentation. Because labeling training data is costly, their applications may be severely limited. Weakly supervised segmentation approaches, however, can significantly reduce human labeling effort. In this paper, we introduce a new framework to generate high-quality initial pixel-level annotations. By using a hierarchical image segmentation algorithm to predict the boundary map, we select the optimal scale of high-quality hierarchies. In the initialization step, scribble annotations and the saliency map are combined to construct a graphical model over the optimal-scale segmentation. Solving the minimal cut problem spreads information from the scribbles to unmarked regions. In the training process, the segmentation network is trained using the initial pixel-level annotations. To iteratively optimize the segmentation, we use a graphical model to refine the segmentation masks and retrain the segmentation network to get more precise pixel-level annotations. The experimental results on the Pascal VOC 2012 dataset demonstrate that the proposed framework outperforms most weakly supervised semantic segmentation methods and achieves state-of-the-art performance, which is [Formula: see text] mIoU.

Wavelets as activation functions in Neural Networks

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-219225 ◽

2021 ◽

pp. 1-11

Author(s):

Oscar Herrera ◽

Belém Priego

Keyword(s):

Neural Networks ◽

Deep Learning ◽

High Performance ◽

Research Area ◽

Experimental Results ◽

Activation Functions ◽

Hyperbolic Tangent ◽

Open Research ◽

Bounded Functions

Traditionally, only a few activation functions have been considered in neural networks, including bounded functions such as threshold, sigmoidal and hyperbolic tangent, as well as unbounded functions such as ReLU, GELU, and Softplus, among others used in deep learning, but the search for new activation functions is still an open research area. In this paper, wavelets are reconsidered as activation functions in neural networks, and the performance of Gaussian family wavelets (first, second and third derivatives) is studied together with other functions available in Keras-TensorFlow. Experimental results show how the combination of these activation functions can improve performance, and they support the idea of extending the list of activation functions to wavelets, which could be made available on high-performance platforms.
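As a concrete reference for the Gaussian family wavelets mentioned above, the sketch below writes out the first three derivatives of the unit Gaussian exp(-x^2/2) as plain scalar functions. The function names, the unit variance, and the sign and normalization conventions are assumptions; the paper may scale or negate these functions differently.

```python
import math


def gaussian(x):
    """Unnormalized unit Gaussian, exp(-x^2 / 2)."""
    return math.exp(-0.5 * x * x)


def gaussian_wavelet_1(x):
    """First derivative of the Gaussian: -x * exp(-x^2 / 2)."""
    return -x * gaussian(x)


def gaussian_wavelet_2(x):
    """Second derivative: (x^2 - 1) * exp(-x^2 / 2)."""
    return (x * x - 1.0) * gaussian(x)


def gaussian_wavelet_3(x):
    """Third derivative: (3x - x^3) * exp(-x^2 / 2)."""
    return (3.0 * x - x ** 3) * gaussian(x)


for x in (-2.0, 0.0, 1.0):
    print(x, gaussian_wavelet_1(x), gaussian_wavelet_2(x), gaussian_wavelet_3(x))
```

Unlike ReLU, these functions are bounded and decay to zero for large |x|, so a unit using them responds only within a limited input range.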

Persistent memory hash indexes

Proceedings of the VLDB Endowment ◽

10.14778/3446095.3446101 ◽

2021 ◽

Vol 14 (5) ◽

pp. 785-798

Author(s):

Daokun Hu ◽

Zhiwen Chen ◽

Jianbing Wu ◽

Jianhua Sun ◽

Hao Chen

Keyword(s):

Future Development ◽

High Performance ◽

Performance Metrics ◽

Comprehensive Evaluation ◽

State Of The Art ◽

Hash Tables ◽

Trade Offs ◽

Depth Analysis ◽

Persistent Memory ◽

Memory Modules

Persistent memory (PM) is increasingly being leveraged to build hash-based indexing structures featuring cheap persistence, high performance, and instant recovery, especially with the recent release of Intel Optane DC Persistent Memory Modules. However, most of them are evaluated on DRAM-based emulators with unrealistic assumptions, or focus on the evaluation of specific metrics while sidestepping important properties. Thus, it is essential to understand how well the proposed hash indexes perform on real PM and how they differ from each other when a wider range of performance metrics is considered. To this end, this paper provides a comprehensive evaluation of persistent hash tables. In particular, we focus on the evaluation of six state-of-the-art hash tables, including Level hashing, CCEH, Dash, PCLHT, Clevel, and SOFT, on real PM hardware. Our evaluation was conducted using a unified benchmarking framework and representative workloads. Besides characterizing common performance properties, we also explore how hardware configurations (such as PM bandwidth, CPU instructions, and NUMA) affect the performance of PM-based hash tables. With our in-depth analysis, we identify design trade-offs and good paradigms in prior art, and suggest desirable optimizations and directions for the future development of PM-based hash tables.
