Erroneous pixel prediction for semantic image segmentation

2021 ◽  
Vol 8 (1) ◽  
pp. 165-175
Author(s):  
Lixue Gong ◽  
Yiqun Zhang ◽  
Yunke Zhang ◽  
Yin Yang ◽  
Weiwei Xu

Abstract: We consider semantic image segmentation. Our method is inspired by Bayesian deep learning, which improves image segmentation accuracy by modeling the uncertainty of the network output. Rather than modeling uncertainty, our method directly learns to predict the erroneous pixels of a segmentation network, which is modeled as a binary classification problem. It can speed up training compared to the Monte Carlo integration often used in Bayesian deep learning. It also allows us to train a branch to correct the labels of erroneous pixels. Our method consists of three stages: (i) predict the pixel-wise error probability of the initial result, (ii) redetermine new labels for pixels with high error probability, and (iii) fuse the initial result and the redetermined result with respect to the error probability. We formulate the error-pixel prediction problem as a classification task and employ an error-prediction branch in the network to predict pixel-wise error probabilities. We also introduce a detail branch to focus the training process on the erroneous pixels. We have experimentally validated our method on the Cityscapes and ADE20K datasets. Our model can be easily added to various advanced segmentation networks to improve their performance. Taking DeepLabv3+ as an example, our network achieves 82.88% mIoU on the Cityscapes test set and 45.73% on the ADE20K validation set, improving on the corresponding DeepLabv3+ results by 0.74% and 0.13%, respectively.
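Stage (iii) of the abstract above can be illustrated with a small sketch. The abstract only says the two results are fused "with respect to the error probability"; the convex blend used here, and the names `fuse_predictions` and `argmax_labels`, are assumptions made for illustration and may differ from what the paper actually does.

```python
from typing import List

# One row per pixel, one column per class (class probabilities).
Probs = List[List[float]]


def fuse_predictions(initial: Probs, refined: Probs, error_prob: List[float]) -> Probs:
    """Blend two per-pixel class distributions.

    Assumption (not stated in the abstract): a convex combination
    fused = (1 - e) * initial + e * refined, where e is the predicted
    probability that the initial label of that pixel is wrong.
    """
    fused = []
    for p_init, p_ref, e in zip(initial, refined, error_prob):
        fused.append([(1.0 - e) * a + e * b for a, b in zip(p_init, p_ref)])
    return fused


def argmax_labels(probs: Probs) -> List[int]:
    """Return the most probable class index for every pixel."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


# Two pixels, two classes. The second pixel has a high error probability,
# so the redetermined prediction dominates after fusion.
initial = [[0.9, 0.1], [0.6, 0.4]]
refined = [[0.8, 0.2], [0.1, 0.9]]
error_prob = [0.05, 0.9]

fused = fuse_predictions(initial, refined, error_prob)
labels = argmax_labels(fused)
```

With these toy values the first pixel keeps its initial label, while the second pixel, flagged as likely erroneous, switches to the redetermined label.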

Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1550
Author(s):  
Alexandros Liapis ◽  
Evanthia Faliagka ◽  
Christos P. Antonopoulos ◽  
Georgios Keramidas ◽  
Nikolaos Voros

Physiological measurements have been widely used by researchers and practitioners to address the stress detection challenge. So far, various datasets for stress detection have been recorded and are available to the research community for testing and benchmarking. The majority of the available stress-related datasets have been recorded while users were exposed to intense stressors, such as songs, movie clips, major hardware/software failures, image datasets, and gaming scenarios. However, it remains an open research question whether such datasets can be used to create models that will effectively detect stress in different contexts. This paper investigates the performance of the publicly available physiological dataset named WESAD (wearable stress and affect detection) in the context of user experience (UX) evaluation. More specifically, electrodermal activity (EDA) and skin temperature (ST) signals from WESAD were used to train three traditional machine learning classifiers and a simple feed-forward deep learning artificial neural network combining continuous variables and entity embeddings. Regarding the binary classification problem (stress vs. no stress), high accuracy (up to 97.4%) was achieved for both training approaches (deep learning and machine learning). Regarding the stress detection effectiveness of the created models in another context, such as user experience (UX) evaluation, the results were quite impressive. More specifically, the deep learning model achieved rather high agreement when a user-annotated dataset was used for validation.


2021 ◽  
Vol 11 (11) ◽  
pp. 4753
Author(s):  
Gen Ye ◽  
Chen Du ◽  
Tong Lin ◽  
Yan Yan ◽  
Jack Jiang

(1) Background: Deep learning has become ubiquitous due to its impressive performance in domains as varied as computer vision, natural language and speech processing, and game-playing. In this work, we investigated the performance of recent deep learning approaches on the laryngopharyngeal reflux (LPR) diagnosis task. (2) Methods: Our dataset is composed of 114 subjects, with 37 pH-positive cases and 77 control cases. In contrast to prior work based on either the reflux finding score (RFS) or pH monitoring, we directly take laryngoscope images as inputs to neural networks, as laryngoscopy is the most common and simple diagnostic method. The diagnosis task is formulated as a binary classification problem. We first tested a powerful backbone network that incorporates residual modules, an attention mechanism, and data augmentation. Furthermore, recent methods in transfer learning and few-shot learning were investigated. (3) Results: On our dataset, the best test classification accuracy is 73.4%, while the best AUC value is 76.2%. (4) Conclusions: This study demonstrates that deep learning techniques can be applied to classify LPR images automatically. Although the number of pH-positive images used for training is limited, deep networks are still capable of learning discriminant features by taking advantage of these techniques.
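The two metrics reported in the Results, classification accuracy and AUC, can be computed for any binary classifier as sketched below. This is a generic illustration in plain Python, not the authors' evaluation code; the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded score matches the true label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    AUC equals the probability that a randomly chosen positive sample is
    scored higher than a randomly chosen negative one; ties count as 0.5.
    """
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = positive case, 0 = control.
y_true = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.45, 0.6, 0.1]

acc = accuracy(y_true, scores)
auc = roc_auc(y_true, scores)
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason both are often reported for imbalanced datasets such as the 37 versus 77 split described above.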


2020 ◽  
Vol 12 (6) ◽  
pp. 959 ◽  
Author(s):  
Mohammad Pashaei ◽  
Hamid Kamangir ◽  
Michael J. Starek ◽  
Philippe Tissot

Deep learning has already proven to be a powerful state-of-the-art technique for many image understanding tasks in computer vision and other applications, including remote sensing (RS) image analysis. Unmanned aircraft systems (UASs) offer a viable and economical alternative to conventional sensors and platforms for acquiring high spatial and high temporal resolution data with high operational flexibility. Coastal wetlands are among the most challenging and complex ecosystems for land cover prediction and mapping tasks because land cover targets often show high intra-class and low inter-class variances. In recent years, several deep convolutional neural network (CNN) architectures have been proposed for pixel-wise image labeling, commonly called semantic image segmentation. In this paper, some of the more recent deep CNN architectures proposed for semantic image segmentation are reviewed, and each model’s training efficiency and classification performance are evaluated by training it on a limited labeled image set. Training samples are provided using hyper-spatial resolution UAS imagery over a wetland area, and the required ground truth images are prepared by manual image labeling. Experimental results demonstrate that deep CNNs have great potential for accurate land cover prediction using UAS hyper-spatial resolution images. Some simple deep learning architectures perform comparably to or even better than complex and very deep architectures, with remarkably fewer training epochs. This performance is especially valuable when limited training samples are available, which is a common case in most RS applications.


Robotica ◽  
2019 ◽  
Vol 38 (1) ◽  
pp. 106-117
Author(s):  
Prasanna Kannappan ◽  
Herbert G. Tanner

Summary: The paper reports on a new multi-view algorithm that combines information from multiple images of a single target object, captured at different distances, to determine the identity of an object. Due to the use of global feature descriptors, the method does not involve image segmentation. The performance of the algorithm has been evaluated on a binary classification problem for a data set consisting of a series of underwater images.


2021 ◽  
Vol 14 (3) ◽  
pp. 036504
Author(s):  
Shota Ushiba ◽  
Naruto Miyakawa ◽  
Naoya Ito ◽  
Ayumi Shinagawa ◽  
Tomomi Nakano ◽  
...  

2019 ◽  
Vol 10 (11) ◽  
pp. 3145-3154 ◽  
Author(s):  
Swarnendu Ghosh ◽  
Anisha Pal ◽  
Shourya Jaiswal ◽  
K. C. Santosh ◽  
Nibaran Das ◽  
...  

2017 ◽  
Vol 7 (4) ◽  
pp. 265-286 ◽  
Author(s):  
Guido Bologna ◽  
Yoichi Hayashi

Abstract: Rule extraction from neural networks is a fervent research topic. In the last 20 years, many authors have presented techniques showing how to extract symbolic rules from Multi Layer Perceptrons (MLPs). Nevertheless, very few were related to ensembles of neural networks, and even fewer to networks trained by deep learning. On several datasets we performed rule extraction from ensembles of Discretized Interpretable Multi Layer Perceptrons (DIMLPs) and from DIMLPs trained by deep learning. The results obtained on the Thyroid dataset and the Wisconsin Breast Cancer dataset show that the predictive accuracy of the extracted rules compares very favorably with state-of-the-art results. Finally, in the last classification problem, on digit recognition, rules generated from the MNIST dataset can be viewed as discriminatory features in particular digit areas. Qualitatively, with respect to rule complexity in terms of the number of generated rules and the number of antecedents per rule, deep DIMLPs and DIMLPs trained by arcing give similar results on a binary classification problem involving digits 5 and 8. On the whole MNIST problem, we showed that it is possible to determine the feature detectors created by neural networks and also that the complexity of the extracted rulesets can be well balanced between accuracy and interpretability.


2021 ◽  
Author(s):  
Segyu Lee ◽  
Junil Bang ◽  
Sungeun Hong ◽  
Woojung Jang

Drug-target interaction (DTI) prediction is a methodology for predicting the binding affinity between a compound and a target protein, and a key technology for deriving candidate substances in drug discovery. As DTI experiments have been conducted for a long time, a substantial volume of chemical, biomedical, and pharmaceutical data has accumulated. This accumulation of data has occurred contemporaneously with the advent of the field of big data, and data-based machine learning methods could significantly reduce the time and cost of drug development. In particular, deep learning has shown potential in the fields of vision and speech recognition, and studies applying deep learning to various other fields have emerged. Research applying deep learning is underway in drug development, and among various deep learning models, graph-based models that can effectively learn molecular structures have received more attention as they have achieved SOTA experimental results. Our study focused on molecular structure information among graph-based models in message passing neural networks. In this paper, we propose a self-attention-based bond and atom message passing neural network which predicts DTI by extracting molecular features through a graph model using an attention mechanism. Model validation experiments were performed after defining binding affinity prediction as both a classification and a regression problem: binary classification to predict the presence or absence of binding to the drug target, and regression to predict binding affinity to the drug target. Classification was performed with BindingDB, and regression was performed with the DAVIS dataset. In the classification problem, ABCnet showed higher performance than MPNN, as it does in the existing study, and in regression, the potential of ABCnet was checked compared to that of SOTA. Experiments indicated that in binary classification, ABCnet has an average performance improvement of 1% over other MPNNs on the DTI task, and in regression, ABCnet has a degradation in CI and performance of between 0.01 and 0.02 compared to SOTA.
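The CI reported for the regression task is presumably the concordance index commonly used in binding-affinity benchmarks; the abstract does not define it, so that reading is an assumption. A minimal plain-Python sketch of the standard pairwise definition follows; the function name and toy affinity values are hypothetical.

```python
from typing import Sequence


def concordance_index(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of comparable pairs whose predicted order matches the true order.

    A pair is comparable when its true values differ. A correctly ordered
    pair scores 1, a tie in the predictions scores 0.5, otherwise 0.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            true_diff = y_true[i] - y_true[j]
            if true_diff == 0:
                continue  # equal true affinities carry no ordering information
            comparable += 1
            pred_diff = y_pred[i] - y_pred[j]
            if pred_diff == 0:
                concordant += 0.5
            elif (true_diff > 0) == (pred_diff > 0):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs: all true values are equal")
    return concordant / comparable


# Toy affinities (hypothetical): only the last pair is ordered incorrectly.
y_true = [5.0, 6.2, 7.1, 8.4]
y_pred = [5.1, 6.0, 7.5, 7.0]

ci = concordance_index(y_true, y_pred)
```

A CI of 1.0 means every pair is ranked correctly and 0.5 corresponds to random ranking, so a gap of 0.01 to 0.02 is a small difference in ranking quality.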

