Erroneous pixel prediction for semantic image segmentation

Abstract: We consider semantic image segmentation. Our method is inspired by Bayesian deep learning, which improves image segmentation accuracy by modeling the uncertainty of the network output. Rather than modeling uncertainty, our method directly learns to predict the erroneous pixels of a segmentation network, which is modeled as a binary classification problem. This can speed up training compared to the Monte Carlo integration often used in Bayesian deep learning. It also allows us to train a branch to correct the labels of erroneous pixels. Our method consists of three stages: (i) predict the pixel-wise error probability of the initial result, (ii) redetermine new labels for pixels with high error probability, and (iii) fuse the initial result and the redetermined result with respect to the error probability. We formulate the error-pixel prediction problem as a classification task and employ an error-prediction branch in the network to predict pixel-wise error probabilities. We also introduce a detail branch to focus the training process on the erroneous pixels. We have experimentally validated our method on the Cityscapes and ADE20K datasets. Our model can be easily added to various advanced segmentation networks to improve their performance. Taking DeepLabv3+ as an example, our network achieves 82.88% mIoU on the Cityscapes test set and 45.73% on the ADE20K validation set, improving the corresponding DeepLabv3+ results by 0.74% and 0.13%, respectively.
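
Stage (iii) of the abstract above, fusing the initial and redetermined results by error probability, can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so the per-pixel linear blend below is an assumption, and the names `fuse_pixel` and `fuse_map` and the toy probabilities are hypothetical.

```python
def fuse_pixel(initial_probs, corrected_probs, error_prob):
    """Blend two class-probability vectors for a single pixel.

    ASSUMPTION: a linear blend weighted by the predicted error
    probability; the abstract does not specify the actual rule.
    """
    return [(1.0 - error_prob) * p0 + error_prob * p1
            for p0, p1 in zip(initial_probs, corrected_probs)]


def fuse_map(initial, corrected, error):
    """Fuse a flat list of pixels and return the winning class per pixel."""
    labels = []
    for p0, p1, e in zip(initial, corrected, error):
        fused = fuse_pixel(p0, p1, e)
        # argmax over classes
        labels.append(max(range(len(fused)), key=fused.__getitem__))
    return labels


# Two pixels, two classes (toy values, not from the paper).
initial = [[0.9, 0.1], [0.6, 0.4]]    # output of the base segmentation head
corrected = [[0.2, 0.8], [0.1, 0.9]]  # output of the correction branch
error = [0.05, 0.9]                   # predicted probability the initial label is wrong

labels = fuse_map(initial, corrected, error)
print(labels)
```

With a low error probability the initial prediction dominates; with a high one the redetermined prediction takes over, which is the behavior the three-stage description implies.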

Advancing Stress Detection Methodology with Deep Learning Techniques Targeting UX Evaluation in AAL Scenarios: Applying Embeddings for Categorical Variables

Electronics ◽ 10.3390/electronics10131550 ◽ 2021 ◽ Vol 10 (13) ◽ pp. 1550

Author(s): Alexandros Liapis ◽ Evanthia Faliagka ◽ Christos P. Antonopoulos ◽ Georgios Keramidas ◽ Nikolaos Voros

Keyword(s): Machine Learning ◽ Deep Learning ◽ User Experience ◽ Electrodermal Activity ◽ Binary Classification ◽ Research Question ◽ Classification Problem ◽ Categorical Variables ◽ Stress Detection ◽ Software Failures

Physiological measurements have been widely used by researchers and practitioners to address the stress detection challenge. So far, various datasets for stress detection have been recorded and are available to the research community for testing and benchmarking. The majority of the available stress-related datasets have been recorded while users were exposed to intense stressors, such as songs, movie clips, major hardware/software failures, image datasets, and gaming scenarios. However, it remains an open research question whether such datasets can be used to create models that effectively detect stress in different contexts. This paper investigates the performance of the publicly available physiological dataset named WESAD (wearable stress and affect detection) in the context of user experience (UX) evaluation. More specifically, electrodermal activity (EDA) and skin temperature (ST) signals from WESAD were used to train three traditional machine learning classifiers and a simple feed-forward deep learning artificial neural network combining continuous variables and entity embeddings. Regarding the binary classification problem (stress vs. no stress), high accuracy (up to 97.4%) was achieved for both training approaches (deep learning, machine learning). Regarding the stress detection effectiveness of the created models in another context, such as user experience (UX) evaluation, the results were quite impressive. More specifically, the deep-learning model achieved a rather high agreement when a user-annotated dataset was used for validation.
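
To make "combining continuous variables and entity embeddings" concrete, here is a minimal sketch of how such a network input is typically assembled: each categorical value indexes a row in a small lookup table, and the looked-up vectors are concatenated with the continuous features. The table sizes, the two categorical variables, and the feature values are hypothetical and not taken from the paper; in a real model the tables would be learned by backpropagation, whereas here they are only randomly initialized.

```python
import random


def make_embedding(num_categories, dim, seed=0):
    """Create a lookup table with one small random vector per category."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(num_categories)]


def build_input(continuous, category_ids, tables):
    """Concatenate continuous features with one embedding per categorical variable."""
    features = list(continuous)
    for cat_id, table in zip(category_ids, tables):
        features.extend(table[cat_id])
    return features


# HYPOTHETICAL setup: two categorical variables with 3 and 5 levels,
# embedded in 2 and 3 dimensions respectively.
tables = [make_embedding(3, 2, seed=1), make_embedding(5, 3, seed=2)]

# Two continuous features (stand-ins for, e.g., an EDA and an ST statistic)
# plus the category index of each categorical variable.
x = build_input([0.42, 33.1], [2, 4], tables)
print(len(x))
```

The resulting vector `x` is what would be fed to the first dense layer of the feed-forward network.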

Deep Learning for Laryngopharyngeal Reflux Diagnosis

Applied Sciences ◽ 10.3390/app11114753 ◽ 2021 ◽ Vol 11 (11) ◽ pp. 4753

Author(s): Gen Ye ◽ Chen Du ◽ Tong Lin ◽ Yan Yan ◽ Jack Jiang

Keyword(s): Deep Learning ◽ Speech Processing ◽ Data Augmentation ◽ Laryngopharyngeal Reflux ◽ pH Monitoring ◽ Binary Classification ◽ Classification Problem ◽ Learning Approaches ◽ Learning Techniques ◽ AUC Value

(1) Background: Deep learning has become ubiquitous due to its impressive performance in domains as varied as computer vision, natural language and speech processing, and game-playing. In this work, we investigated the performance of recent deep learning approaches on the laryngopharyngeal reflux (LPR) diagnosis task. (2) Methods: Our dataset is composed of 114 subjects with 37 pH-positive cases and 77 control cases. In contrast to prior work based on either reflux finding score (RFS) or pH monitoring, we directly take laryngoscope images as inputs to neural networks, as laryngoscopy is the most common and simple diagnostic method. The diagnosis task is formulated as a binary classification problem. We first tested a powerful backbone network that incorporates residual modules, an attention mechanism, and data augmentation. Furthermore, recent methods in transfer learning and few-shot learning were investigated. (3) Results: On our dataset, the best test classification accuracy is 73.4%, while the best AUC value is 76.2%. (4) Conclusions: This study demonstrates that deep learning techniques can be applied to classify LPR images automatically. Although the number of pH-positive images used for training is limited, a deep network can still learn discriminant features with the advantage of technique.
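
The abstract reports both accuracy and AUC for the binary task. As a reminder of what the AUC value measures, the sketch below computes it from its pairwise definition: the fraction of (positive, negative) pairs in which the positive case receives the higher score, counting ties as one half. The function name `roc_auc` and the toy labels and scores are hypothetical and unrelated to the paper's data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: 2 positive and 3 negative cases.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.3, 0.4, 0.6, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 4))
```

Unlike accuracy, this quantity does not depend on a decision threshold, which is one reason it is often reported alongside accuracy for imbalanced datasets such as the 37 versus 77 split described above.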

Review and Evaluation of Deep Learning Architectures for Efficient Land Cover Mapping with UAS Hyper-Spatial Imagery: A Case Study Over a Wetland

Remote Sensing ◽ 10.3390/rs12060959 ◽ 2020 ◽ Vol 12 (6) ◽ pp. 959 ◽ Cited By ~ 7

Author(s): Mohammad Pashaei ◽ Hamid Kamangir ◽ Michael J. Starek ◽ Philippe Tissot

Keyword(s): Image Segmentation ◽ Deep Learning ◽ Land Cover ◽ Spatial Resolution ◽ Unmanned Aircraft ◽ High Temporal Resolution ◽ Semantic Image Segmentation ◽ Image Labeling ◽ Training Samples ◽ Learning Architectures

Deep learning has already proven to be a powerful state-of-the-art technique for many image understanding tasks in computer vision and other applications, including remote sensing (RS) image analysis. Unmanned aircraft systems (UASs) offer a viable and economical alternative to conventional sensors and platforms for acquiring high spatial and high temporal resolution data with high operational flexibility. Coastal wetlands are among the most challenging and complex ecosystems for land cover prediction and mapping tasks because land cover targets often show high intra-class and low inter-class variances. In recent years, several deep convolutional neural network (CNN) architectures have been proposed for pixel-wise image labeling, commonly called semantic image segmentation. In this paper, some of the more recent deep CNN architectures proposed for semantic image segmentation are reviewed, and each model’s training efficiency and classification performance are evaluated by training it on a limited labeled image set. Training samples are provided using hyper-spatial resolution UAS imagery over a wetland area, and the required ground truth images are prepared by manual image labeling. Experimental results demonstrate that deep CNNs have great potential for accurate land cover prediction using UAS hyper-spatial resolution images. Some simple deep learning architectures perform comparably to or even better than complex and very deep architectures with remarkably fewer training epochs. This performance is especially valuable when limited training samples are available, which is a common case in most RS applications.

Generative Scatternet Hybrid Deep Learning (G-Shdl) Network with Structural Priors for Semantic Image Segmentation

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽ 10.1109/icassp.2018.8461542 ◽ 2018

Author(s): Amarjot Singh ◽ Nick Kingsbury

Keyword(s): Image Segmentation ◽ Deep Learning ◽ Semantic Image Segmentation

Distance-based Global Descriptors for Multi-view Object Recognition

Robotica ◽ 10.1017/s0263574719000493 ◽ 2019 ◽ Vol 38 (1) ◽ pp. 106-117

Author(s): Prasanna Kannappan ◽ Herbert G. Tanner

Keyword(s): Image Segmentation ◽ Binary Classification ◽ Classification Problem ◽ Target Object ◽ Global Feature ◽ Data Set ◽ Multiple Images ◽ Feature Descriptors ◽ Single Target ◽ Binary Classification Problem

Summary: The paper reports on a new multi-view algorithm that combines information from multiple images of a single target object, captured at different distances, to determine the identity of the object. Because it uses global feature descriptors, the method does not involve image segmentation. The performance of the algorithm has been evaluated on a binary classification problem for a data set consisting of a series of underwater images.

Multimodal Deep Learning in Semantic Image Segmentation

Proceedings of the 2018 International Conference on Cloud Computing and Internet of Things - CCIOT 2018 ◽ 10.1145/3291064.3291067 ◽ 2018

Author(s): Vishal Raman ◽ Madhu Kumari

Keyword(s): Image Segmentation ◽ Deep Learning ◽ Semantic Image Segmentation

Deep-learning-based semantic image segmentation of graphene field-effect transistors

Applied Physics Express ◽ 10.35848/1882-0786/abe3db ◽ 2021 ◽ Vol 14 (3) ◽ pp. 036504

Author(s): Shota Ushiba ◽ Naruto Miyakawa ◽ Naoya Ito ◽ Ayumi Shinagawa ◽ Tomomi Nakano ◽ ...

Keyword(s): Image Segmentation ◽ Deep Learning ◽ Field Effect ◽ Field Effect Transistors ◽ Semantic Image Segmentation

SegFast-V2: Semantic image segmentation with less parameters in deep learning for autonomous driving

International Journal of Machine Learning and Cybernetics ◽ 10.1007/s13042-019-01005-5 ◽ 2019 ◽ Vol 10 (11) ◽ pp. 3145-3154 ◽ Cited By ~ 7

Author(s): Swarnendu Ghosh ◽ Anisha Pal ◽ Shourya Jaiswal ◽ K. C. Santosh ◽ Nibaran Das ◽ ...

Keyword(s): Image Segmentation ◽ Deep Learning ◽ Autonomous Driving ◽ Semantic Image Segmentation

Characterization of Symbolic Rules Embedded in Deep DIMLP Networks: A Challenge to Transparency of Deep Learning

Journal of Artificial Intelligence and Soft Computing Research ◽ 10.1515/jaiscr-2017-0019 ◽ 2017 ◽ Vol 7 (4) ◽ pp. 265-286 ◽ Cited By ~ 41

Author(s): Guido Bologna ◽ Yoichi Hayashi

Keyword(s): Neural Networks ◽ Deep Learning ◽ Predictive Accuracy ◽ Binary Classification ◽ Classification Problem ◽ Rule Extraction ◽ Breast Cancer Dataset ◽ Cancer Dataset ◽ Digit Recognition ◽ Rule Complexity

Abstract: Rule extraction from neural networks is a fervent research topic. In the last 20 years, many authors have presented techniques showing how to extract symbolic rules from Multi Layer Perceptrons (MLPs). Nevertheless, very few were related to ensembles of neural networks and even fewer to networks trained by deep learning. On several datasets we performed rule extraction from ensembles of Discretized Interpretable Multi Layer Perceptrons (DIMLP), and DIMLPs trained by deep learning. The results obtained on the Thyroid dataset and the Wisconsin Breast Cancer dataset show that the predictive accuracy of the extracted rules compares very favorably with state-of-the-art results. Finally, in the last classification problem on digit recognition, rules generated from the MNIST dataset can be viewed as discriminatory features in particular digit areas. Qualitatively, with respect to rule complexity in terms of the number of generated rules and the number of antecedents per rule, deep DIMLPs and DIMLPs trained by arcing give similar results on a binary classification problem involving digits 5 and 8. On the whole MNIST problem we showed that it is possible to determine the feature detectors created by neural networks and also that the complexity of the extracted rulesets can be well balanced between accuracy and interpretability.

ABCnet : Self-Attention based Atom, Bond Message Passing Network for Predicting Drug-Target Interaction

10.1101/2021.12.27.474154 ◽ 2021

Author(s): Segyu Lee ◽ Junil Bang ◽ Sungeun Hong ◽ Woojung Jang

Keyword(s): Deep Learning ◽ Drug Development ◽ Binding Affinity ◽ Message Passing ◽ Drug Target ◽ Binary Classification ◽ Classification Problem ◽ Molecular Structures ◽ Target Interaction ◽ Molecular Features

Drug-target interaction (DTI) is a methodology for predicting the binding affinity between a compound and a target protein, and a key technology in the derivation of candidate substances in drug discovery. As DTI experiments have been conducted for a long time, a substantial volume of chemical, biomedical, and pharmaceutical data has accumulated. This accumulation of data has occurred contemporaneously with the advent of the field of big data, and data-based machine learning methods could significantly reduce the time and cost of drug development. In particular, deep learning has shown potential when applied to the fields of vision and speech recognition, and studies applying deep learning to various other fields have emerged. Research applying deep learning is underway in drug development, and among various deep learning models, graph-based models that can effectively learn molecular structures have received more attention as they achieved SOTA experimental results. Our study focused on molecular structure information among graph-based models in message passing neural networks. In this paper, we propose a self-attention-based bond and atom message passing neural network which predicts DTI by extracting molecular features through a graph model using an attention mechanism. Model validation experiments were performed after defining binding affinity as both a regression and a classification problem: binary classification to predict the presence or absence of binding to the drug target, and regression to predict binding affinity to the drug target. Classification was performed with BindingDB, and regression was performed with the DAVIS dataset. In the classification problem, ABCnet showed higher performance than MPNN, as it does in the existing study, and in regression, the potential of ABCnet was checked against that of SOTA. Experiments indicated that in binary classification, ABCnet improves average performance by 1% over other MPNNs on the DTI task, and in regression, ABCnet shows a degradation in CI and performance of between 0.01 and 0.02 compared to SOTA.