Joint Optic Disc and Cup Segmentation Using Self-Supervised Multimodal Reconstruction Pre-Training

Proceedings ◽  
2020 ◽  
Vol 54 (1) ◽  
pp. 25
Author(s):  
Álvaro S. Hervella ◽  
Lucía Ramos ◽  
José Rouco ◽  
Jorge Novo ◽  
Marcos Ortega

The analysis of the optic disc and cup in retinal images is important for the early diagnosis of glaucoma. In order to improve the joint segmentation of these relevant retinal structures, we propose a novel approach that applies the self-supervised multimodal reconstruction of retinal images as pre-training for deep neural networks. The proposed approach is evaluated on different public datasets. The obtained results indicate that the self-supervised multimodal reconstruction pre-training improves the segmentation performance. Thus, the proposed approach shows great potential for also improving the interpretable diagnosis of glaucoma.
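As a rough illustration of the two-phase idea (not the authors' architecture), the sketch below pre-trains a toy linear "network" on a paired multimodal reconstruction task and then reuses the learned weights to warm-start a downstream model; all shapes, data and names here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for paired multimodal data: retinography features as input,
# angiography features as the reconstruction target (no manual labels needed).
X = rng.normal(size=(200, 16))          # "retinography" features
W_true = rng.normal(size=(16, 16))
Y = X @ W_true                          # paired "angiography" features

# Phase 1: self-supervised pre-training -- learn to reconstruct the
# complementary modality from the input modality by gradient descent.
W = np.zeros((16, 16))
for _ in range(500):
    grad = X.T @ (X @ W - Y) / len(X)   # mean-squared-error gradient
    W -= 0.05 * grad

pretrain_err = np.mean((X @ W - Y) ** 2)

# Phase 2: the pre-trained weights initialise the segmentation model,
# which would then be fine-tuned on the (smaller) labelled segmentation set.
W_seg = W.copy()                        # warm start from multimodal reconstruction
```

The point of the sketch is the transfer: the reconstruction task supplies an initialisation learned without segmentation labels, which the second phase then refines.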

2021 ◽  
Vol 40 (1) ◽  
pp. 551-563
Author(s):  
Liqiong Lu ◽  
Dong Wu ◽  
Ziwei Tang ◽  
Yaohua Yi ◽  
Faliang Huang

This paper focuses on script identification in natural scene images. Traditional CNNs (Convolutional Neural Networks) cannot solve this problem well for two reasons: first, the arbitrary aspect ratios of scene images cause difficulty for traditional CNNs, which take a fixed-size image as input; second, scripts with minor differences are easily confused because they share a subset of characters with the same shapes. We propose a novel approach combining a Score CNN, an Attention CNN and image patches. The Attention CNN determines whether a patch is discriminative and calculates the contribution weight of that patch to script identification of the whole image. The Score CNN takes a discriminative patch as input and predicts the score of each script type. First, patches of the same size are extracted from the scene images. Second, these patches are used as inputs to the Score CNN and the Attention CNN to train two patch-level classifiers. Finally, the results of multiple discriminative patches extracted from the same image via the two classifiers are fused to obtain the script type of the image. Using same-sized patches as CNN inputs avoids the problems caused by the arbitrary aspect ratios of scene images, and the trained classifiers can mine discriminative patches to accurately identify confusing scripts. Experimental results show the good performance of our approach on four public datasets.
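The fusion step described above, where the Attention CNN's weights modulate per-patch Score CNN outputs, can be sketched as a weighted average over patches; the function and numbers below are illustrative assumptions, not the paper's code:

```python
import numpy as np

def fuse_patch_predictions(patch_scores, patch_weights):
    """Fuse per-patch script scores into an image-level decision.

    patch_scores  -- (n_patches, n_scripts) softmax outputs of a Score CNN
    patch_weights -- (n_patches,) contribution weights from an Attention CNN
    """
    w = np.asarray(patch_weights, dtype=float)
    w = w / w.sum()                              # normalise attention weights
    image_score = w @ np.asarray(patch_scores)   # weighted average over patches
    return int(np.argmax(image_score)), image_score

# Three patches, four candidate scripts; the second patch is most discriminative.
scores = [[0.40, 0.30, 0.20, 0.10],
          [0.05, 0.80, 0.10, 0.05],
          [0.25, 0.25, 0.25, 0.25]]
weights = [0.2, 0.7, 0.1]
label, fused = fuse_patch_predictions(scores, weights)   # label is script index 1
```

A highly weighted discriminative patch dominates the fused decision, which is how a confident local patch can disambiguate scripts that look similar globally.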


2020 ◽  
Vol 10 (14) ◽  
pp. 4916
Author(s):  
Syna Sreng ◽  
Noppadol Maneerat ◽  
Kazuhiko Hamamoto ◽  
Khin Yadanar Win

Glaucoma is a major global cause of blindness. As the symptoms of glaucoma appear only when the disease reaches an advanced stage, proper screening of glaucoma in the early stages is challenging. Therefore, regular glaucoma screening is essential and recommended. However, eye screening is currently subjective, time-consuming and labor-intensive, and there are insufficient eye specialists available. We present an automatic two-stage glaucoma screening system to reduce the workload of ophthalmologists. The system first segments the optic disc region using a DeepLabv3+ architecture, substituting the encoder module with multiple deep convolutional neural networks. For the classification stage, we used pretrained deep convolutional neural networks in three configurations: (1) transfer learning, (2) learning feature descriptors with a support vector machine, and (3) an ensemble of the methods in (1) and (2). We evaluated our methods on five available datasets containing 2787 retinal images and found that the best option for optic disc segmentation is a combination of DeepLabv3+ and MobileNet. For glaucoma classification, the ensemble of methods performed better than the conventional methods on the RIM-ONE, ORIGA, DRISHTI-GS1 and ACRIMA datasets, with accuracies of 97.37%, 90.00%, 86.84% and 99.53% and Areas Under the Curve (AUC) of 100%, 92.06%, 91.67% and 99.98%, respectively, and performed comparably with CUHKMED, the top team in the REFUGE challenge, on the REFUGE dataset, with an accuracy of 95.59% and an AUC of 95.10%.
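The abstract does not spell out the ensemble rule; assuming a simple soft-voting scheme over the two classifiers from configurations (1) and (2), the classification stage could be sketched as below (all probabilities hypothetical):

```python
import numpy as np

def ensemble_predict(proba_transfer, proba_svm):
    """Soft-voting ensemble: average the class probabilities of the
    fine-tuned CNN classifier and the CNN-features + SVM classifier,
    then take the argmax per image."""
    p = (np.asarray(proba_transfer) + np.asarray(proba_svm)) / 2.0
    return np.argmax(p, axis=1), p

# Hypothetical probabilities for 3 fundus images; classes: [normal, glaucoma].
p_cnn = [[0.90, 0.10], [0.40, 0.60], [0.55, 0.45]]
p_svm = [[0.80, 0.20], [0.30, 0.70], [0.35, 0.65]]
labels, p = ensemble_predict(p_cnn, p_svm)   # -> [0, 1, 1]
```

Note how the third image flips to "glaucoma" only after averaging: the ensemble can outvote a single uncertain classifier, which is one plausible reason configuration (3) outperformed (1) and (2) alone.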


2020 ◽  
Vol 34 (07) ◽  
pp. 11229-11236
Author(s):  
Zhiwei Ke ◽  
Zhiwei Wen ◽  
Weicheng Xie ◽  
Yi Wang ◽  
Linlin Shen

Dropout regularization has been widely used in various deep neural networks to combat overfitting. It works by training a network to be more robust on information-degraded data points for better generalization. Conventional dropout and its variants are often applied to individual hidden units in a layer to break up co-adaptations of feature detectors. In this paper, we propose an adaptive dropout that reduces co-adaptations in a group-wise manner using coarse semantic information, improving feature discriminability. In particular, we show that adjusting the dropout probability based on local feature densities can not only improve classification performance significantly but also enhance the network's robustness against adversarial examples in some cases. The proposed approach is evaluated against the baseline and several state-of-the-art adaptive dropouts on four public datasets: Fashion-MNIST, CIFAR-10, CIFAR-100 and SVHN.
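A minimal sketch of density-adaptive, group-wise dropout, assuming the dropout probability scales linearly with a group's mean absolute activation (the paper's exact rule may differ):

```python
import numpy as np

def adaptive_group_dropout(x, n_groups, base_p=0.3, rng=None):
    """Group-wise dropout whose rate per group scales with that group's
    activation density -- a simplified sketch of density-adaptive dropout.

    x -- (batch, features) activations; features are split into n_groups groups.
    """
    rng = rng or np.random.default_rng()
    b, f = x.shape
    groups = x.reshape(b, n_groups, f // n_groups)
    density = np.abs(groups).mean(axis=(0, 2))        # per-group feature density
    p = base_p * density / density.mean()             # denser groups dropped more
    p = np.clip(p, 0.0, 0.9)
    keep = rng.random((b, n_groups)) >= p             # drop whole groups at once
    out = groups * keep[:, :, None] / (1.0 - p)[None, :, None]  # inverted scaling
    return out.reshape(b, f), p
```

Dropping an entire feature group, rather than independent units, is what breaks co-adaptations at the group level; the inverted scaling keeps the expected activation unchanged at test time.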


2019 ◽  
Author(s):  
David Beniaguev ◽  
Idan Segev ◽  
Michael London

We introduce a novel approach to study neurons as sophisticated I/O information processing units by utilizing recent advances in the field of machine learning. We trained deep neural networks (DNNs) to mimic the I/O behavior of a detailed nonlinear model of a layer 5 cortical pyramidal cell receiving rich spatio-temporal patterns of input synapse activations. A temporally convolutional DNN (TCN) with seven layers was required to accurately, and very efficiently, capture the I/O of this neuron at millisecond resolution. This complexity primarily arises from local NMDA-based nonlinear dendritic conductances. The weight matrices of the DNN provide new insights into the I/O function of cortical pyramidal neurons, and the approach presented can provide a systematic characterization of the functional complexity of different neuron types. Our results demonstrate that cortical neurons can be conceptualized as multi-layered “deep” processing units, implying that the cortical networks they form have a non-classical architecture and are potentially more computationally powerful than previously assumed.
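The building block of such a TCN is a causal 1-D convolution, where the output at time t depends only on inputs at times up to t; a minimal NumPy sketch (the paper's seven-layer network stacks such layers with nonlinearities):

```python
import numpy as np

def causal_conv1d(x, kernel):
    """Causal 1-D convolution: y[t] = sum_i kernel[i] * x[t - i].
    Left-padding with zeros guarantees no future input leaks into y[t],
    matching the causality of a neuron's millisecond-resolution I/O."""
    k = len(kernel)
    padded = np.concatenate([np.zeros(k - 1), x])
    return np.array([padded[t:t + k] @ kernel[::-1] for t in range(len(x))])

# An input spike at t=0 produces a response only at t >= 0.
y = causal_conv1d(np.array([1.0, 0.0, 0.0, 0.0]), np.array([1.0, 0.5]))
# y == [1.0, 0.5, 0.0, 0.0]
```

Causality is the key property here: a model of a neuron's response must not use synaptic input from the future, which is why temporal convolutions (rather than plain 2-D ones) fit this task.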


2021 ◽  
Vol 16 (1) ◽  
pp. 1-23
Author(s):  
Keyu Yang ◽  
Yunjun Gao ◽  
Lei Liang ◽  
Song Bian ◽  
Lu Chen ◽  
...  

Text classification is a fundamental task in content analysis. Nowadays, deep learning has demonstrated promising performance in text classification compared with shallow models. However, almost none of the existing models take advantage of human wisdom to help text classification. Human beings are more intelligent and capable than machine learning models at understanding and capturing implicit semantic information in text. In this article, we take guidance from human beings to classify text. We propose Crowd-powered learning for Text Classification (CrowdTC for short). We design and post questions on a crowdsourcing platform to extract keywords from text, using sampling and clustering techniques to reduce the cost of crowdsourcing. We also present an attention-based neural network and a hybrid neural network to incorporate the extracted keywords as human guidance into deep neural networks. Extensive experiments on public datasets confirm that CrowdTC improves the text classification accuracy of neural networks by using crowd-powered keyword guidance.
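One simple way to inject crowd-extracted keywords into an attention layer is an additive score boost at keyword positions before the softmax; this is a hypothetical stand-in for CrowdTC's attention network, not its actual architecture:

```python
import numpy as np

def keyword_attention(word_vecs, is_keyword, boost=2.0):
    """Attention pooling where crowd-extracted keywords receive an additive
    score boost before the softmax, steering the document vector toward them.

    word_vecs  -- (n_words, dim) word embeddings
    is_keyword -- (n_words,) 1.0 for crowd keywords, else 0.0
    """
    v = np.asarray(word_vecs, dtype=float)
    scores = v.mean(axis=1) + boost * np.asarray(is_keyword)  # toy base score
    w = np.exp(scores - scores.max())                         # stable softmax
    w = w / w.sum()
    return w @ v, w     # document vector and attention weights

# Three toy word vectors; the crowd marked the second word as a keyword.
doc, w = keyword_attention(np.eye(3), [0.0, 1.0, 0.0])
```

The keyword position ends up with the largest attention weight, so the pooled document representation leans on the human-selected evidence.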


Proceedings ◽  
2019 ◽  
Vol 21 (1) ◽  
pp. 45 ◽  
Author(s):  
Álvaro Hervella ◽  
José Rouco ◽  
Jorge Novo ◽  
Marcos Ortega

This work explores the use of paired and unpaired data for training deep neural networks in the multimodal reconstruction of retinal images. In particular, we focus on the reconstruction of fluorescein angiography from retinography, two complementary representations of the eye fundus. The experiments performed allow a comparison of the paired and unpaired alternatives.


2020 ◽  
Vol 10 (5) ◽  
pp. 1816 ◽  
Author(s):  
Zaccharie Ramzi ◽  
Philippe Ciuciu ◽  
Jean-Luc Starck

Deep learning is starting to offer promising results for reconstruction in Magnetic Resonance Imaging (MRI). Many networks are being developed, but comparisons remain difficult because studies do not use the same frameworks or datasets, and the networks are not properly re-trained. The recent release of a public dataset of raw k-space data, fastMRI, encouraged us to write a consistent benchmark of several deep neural networks for MR image reconstruction. This paper presents the results of this benchmark, allowing the networks to be compared, and links to the open-source Keras implementations of all these networks. The main finding of the benchmark is that it is beneficial to perform more iterations between the image and measurement spaces than to have a deeper per-space network.
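The "iterations between the image and measurement spaces" refer to unrolled cross-domain schemes that alternate a data-consistency step in k-space with a correction in image space; a crude sketch, with a fixed smoothing step standing in for the learned per-space network:

```python
import numpy as np

def unrolled_recon(kspace, mask, n_iters=10, denoise_strength=0.1):
    """Alternate between the measurement space (re-impose the sampled k-space
    values) and the image space (here a crude smoothing stand-in for a learned
    network). The benchmark's finding: more such cross-domain iterations beat
    a deeper network in either single space.

    kspace -- fully indexed k-space array (unmeasured entries may be anything)
    mask   -- binary sampling mask, 1 where k-space was actually measured
    """
    x = np.fft.ifft2(kspace * mask)                 # zero-filled starting image
    for _ in range(n_iters):
        # image-space step (a learned denoiser in the real networks):
        x = (1 - denoise_strength) * x + denoise_strength * np.roll(x, 1, axis=0)
        # measurement-space step: put back the actually measured samples
        k = np.fft.fft2(x)
        k = mask * kspace + (1 - mask) * k
        x = np.fft.ifft2(k)
    return x
```

With a fully sampled mask the data-consistency step restores the exact image every iteration, which illustrates why enforcing consistency with the measurements is the anchor of these unrolled schemes.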
