Multi-Task Learning for Diabetic Retinopathy Grading and Lesion Segmentation

Alex Foo; Wynne Hsu; Mong Li Lee; Gilbert Lim; Tien Yin Wong

doi:10.1609/aaai.v34i08.7035

Multi-Task Learning for Diabetic Retinopathy Grading and Lesion Segmentation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i08.7035 ◽

2020 ◽

Vol 34 (08) ◽

pp. 13267-13272

Author(s):

Alex Foo ◽

Wynne Hsu ◽

Mong Li Lee ◽

Gilbert Lim ◽

Tien Yin Wong

Keyword(s):

Diabetic Retinopathy ◽

State Of The Art ◽

Great Success ◽

Automated Segmentation ◽

Lesion Segmentation ◽

Severity Level ◽

Fine Grained ◽

Acceptable Accuracy ◽

Task Learning ◽

Different Types

Although deep learning for Diabetic Retinopathy (DR) screening has shown great success in achieving clinically acceptable accuracy for referable versus non-referable DR, there remains a need to provide more fine-grained grading of the DR severity level as well as automated segmentation of lesions (if any) in the retina images. We observe that the DR severity level of an image is dependent on the presence of different types of lesions and their prevalence. In this work, we adopt a multi-task learning approach to perform the DR grading and lesion segmentation tasks. In light of the lack of lesion segmentation mask ground-truths, we further propose a semi-supervised learning process to obtain the segmentation masks for the various datasets. Experiments results on publicly available datasets and a real world dataset obtained from population screening demonstrate the effectiveness of the multi-task solution over state-of-the-art networks.

Download Full-text

F3SNet: A Four-Step Strategy for QIM Steganalysis of Compressed Speech Based on Hierarchical Attention Network

Security and Communication Networks ◽

10.1155/2021/1627486 ◽

2021 ◽

Vol 2021 ◽

pp. 1-15

Author(s):

Chuanpeng Guo ◽

Wei Yang ◽

Mengxia Shuai ◽

Liusheng Huang

Keyword(s):

Linear Prediction ◽

State Of The Art ◽

Communication Security ◽

Great Success ◽

Attention Network ◽

Compressed Speech ◽

Different Types ◽

Comprehensive Comparison ◽

Embedding Rate ◽

Memory Characteristics

Traditional machine learning-based steganalysis methods on compressed speech have achieved great success in the field of communication security. However, previous studies lacked mathematical modeling of the correlation between codewords, and there is still room for improvement in steganalysis for small-sized and low embedding rate samples. To deal with the challenge, we use Bayesian networks to measure different types of correlations between codewords in linear prediction code and present F3SNet—a four-step strategy: embedding, encoding, attention, and classification for quantization index modulation steganalysis of compressed speech based on the hierarchical attention network. Among them, embedding converts codewords into high-density numerical vectors, encoding uses the memory characteristics of LSTM to retain more information by distributing it among all its vectors, and attention further determines which vectors have a greater impact on the final classification result. To evaluate the performance of F3SNet, we make a comprehensive comparison of F3SNet with existing steganography methods. Experimental results show that F3SNet surpasses the state-of-the-art methods, particularly for small-sized and low embedding rate samples.

Download Full-text

CARNet: Cascade attentive RefineNet for multi-lesion segmentation of diabetic retinopathy images

Complex & Intelligent Systems ◽

10.1007/s40747-021-00630-4 ◽

2022 ◽

Author(s):

Yanfei Guo ◽

Yanjun Peng

Keyword(s):

Diabetic Retinopathy ◽

Information Fusion ◽

State Of The Art ◽

Complex Structure ◽

Data Sets ◽

Lesion Segmentation ◽

Lesion Area ◽

Patch Image ◽

High Level ◽

Local Image

AbstractDiabetic retinopathy is the leading cause of blindness in working population. Lesion segmentation from fundus images helps ophthalmologists accurately diagnose and grade of diabetic retinopathy. However, the task of lesion segmentation is full of challenges due to the complex structure, the various sizes and the interclass similarity with other fundus tissues. To address the issue, this paper proposes a cascade attentive RefineNet (CARNet) for automatic and accurate multi-lesion segmentation of diabetic retinopathy. It can make full use of the fine local details and coarse global information from the fundus image. CARNet is composed of global image encoder, local image encoder and attention refinement decoder. We take the whole image and the patch image as the dual input, and feed them to ResNet50 and ResNet101, respectively, for downsampling to extract lesion features. The high-level refinement decoder uses dual attention mechanism to integrate the same-level features in the two encoders with the output of the low-level attention refinement module for multiscale information fusion, which focus the model on the lesion area to generate accurate predictions. We evaluated the segmentation performance of the proposed CARNet on the IDRiD, E-ophtha and DDR data sets. Extensive comparison experiments and ablation studies on various data sets demonstrate the proposed framework outperforms the state-of-the-art approaches and has better accuracy and robustness. It not only overcomes the interference of similar tissues and noises to achieve accurate multi-lesion segmentation, but also preserves the contour details and shape features of small lesions without overloading GPU memory usage.

Download Full-text

Deep Multi-Task Learning with Adversarial-and-Cooperative Nets

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/566 ◽

2019 ◽

Cited By ~ 1

Author(s):

Pei Yang ◽

Qi Tan ◽

Jieping Ye ◽

Hanghang Tong ◽

Jingrui He

Keyword(s):

Knowledge Sharing ◽

Domain Adaptation ◽

State Of The Art ◽

Specific Knowledge ◽

Fine Grained ◽

Task Learning ◽

Combine Strategy ◽

Benchmark Datasets ◽

Adaptation Scenarios ◽

And Task

In this paper, we propose a deep multi-Task learning model based on Adversarial-and-COoperative nets (TACO). The goal is to use an adversarial-and-cooperative strategy to decouple the task-common and task-specific knowledge, facilitating the fine-grained knowledge sharing among tasks. TACO accommodates multiple game players, i.e., feature extractors, domain discriminator, and tri-classifiers. They play the MinMax games adversarially and cooperatively to distill the task-common and task-specific features, while respecting their discriminative structures. Moreover, it adopts a divide-and-combine strategy to leverage the decoupled multi-view information to further improve the generalization performance of the model. The experimental results show that our proposed method significantly outperforms the state-of-the-art algorithms on the benchmark datasets in both multi-task learning and semi-supervised domain adaptation scenarios.

Download Full-text

FFU-Net: Feature Fusion U-Net for Lesion Segmentation of Diabetic Retinopathy

BioMed Research International ◽

10.1155/2021/6644071 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Yifei Xu ◽

Zhuming Zhou ◽

Xiao Li ◽

Nuo Zhang ◽

Meizi Zhang ◽

...

Keyword(s):

Diabetic Retinopathy ◽

Feature Fusion ◽

State Of The Art ◽

Small Lesion ◽

Fundus Images ◽

Fundus Image ◽

Lesion Segmentation ◽

Basic Work ◽

Ablation Study ◽

Human Eyes

Diabetic retinopathy is one of the main causes of blindness in human eyes, and lesion segmentation is an important basic work for the diagnosis of diabetic retinopathy. Due to the small lesion areas scattered in fundus images, it is laborious to segment the lesion of diabetic retinopathy effectively with the existing U-Net model. In this paper, we proposed a new lesion segmentation model named FFU-Net (Feature Fusion U-Net) that enhances U-Net from the following points. Firstly, the pooling layer in the network is replaced with a convolutional layer to reduce spatial loss of the fundus image. Then, we integrate multiscale feature fusion (MSFF) block into the encoders which helps the network to learn multiscale features efficiently and enrich the information carried with skip connection and lower-resolution decoder by fusing contextual channel attention (CCA) models. Finally, in order to solve the problems of data imbalance and misclassification, we present a Balanced Focal Loss function. In the experiments on benchmark dataset IDRID, we make an ablation study to verify the effectiveness of each component and compare FFU-Net against several state-of-the-art models. In comparison with baseline U-Net, FFU-Net improves the segmentation performance by 11.97%, 10.68%, and 5.79% on metrics SEN, IOU, and DICE, respectively. The quantitative and qualitative results demonstrate the superiority of our FFU-Net in the task of lesion segmentation of diabetic retinopathy.

Download Full-text

Behavioral Genetics: Concepts for Research and Practice in Language Development and Disorders

Journal of Speech Language and Hearing Research ◽

10.1044/jshr.3805.1126 ◽

1995 ◽

Vol 38 (5) ◽

pp. 1126-1142 ◽

Cited By ~ 14

Author(s):

Jeffrey W. Gilger

Keyword(s):

Language Development ◽

Behavioral Genetics ◽

State Of The Art ◽

Genetic Research ◽

Great Promise ◽

Behavioral Genetic ◽

Fine Grained ◽

Future Goals ◽

Current State ◽

Research Designs

This paper is an introduction to behavioral genetics for researchers and practioners in language development and disorders. The specific aims are to illustrate some essential concepts and to show how behavioral genetic research can be applied to the language sciences. Past genetic research on language-related traits has tended to focus on simple etiology (i.e., the heritability or familiality of language skills). The current state of the art, however, suggests that great promise lies in addressing more complex questions through behavioral genetic paradigms. In terms of future goals it is suggested that: (a) more behavioral genetic work of all types should be done—including replications and expansions of preliminary studies already in print; (b) work should focus on fine-grained, theory-based phenotypes with research designs that can address complex questions in language development; and (c) work in this area should utilize a variety of samples and methods (e.g., twin and family samples, heritability and segregation analyses, linkage and association tests, etc.).

Download Full-text

Attention-Based Multi-Task Learning For Fine-Grained Image Classification

10.1109/icip42928.2021.9506745 ◽

2021 ◽

Author(s):

Dichao Liu ◽

Yu Wang ◽

Kenji Mase ◽

Jien Kato

Keyword(s):

Image Classification ◽

Fine Grained ◽

Task Learning

Download Full-text

Multiple objects tracking in the UAV system based on hierarchical deep high-resolution network

Multimedia Tools and Applications ◽

10.1007/s11042-020-10427-1 ◽

2021 ◽

Author(s):

Wei Huang ◽

Xiaoshu Zhou ◽

Mingchao Dong ◽

Huaiyu Xu

Keyword(s):

High Resolution ◽

Object Tracking ◽

High Performance ◽

State Of The Art ◽

Class Imbalance ◽

Unified Framework ◽

Multiple Objects ◽

Tracking Process ◽

Objects Tracking ◽

Different Types

AbstractRobust and high-performance visual multi-object tracking is a big challenge in computer vision, especially in a drone scenario. In this paper, an online Multi-Object Tracking (MOT) approach in the UAV system is proposed to handle small target detections and class imbalance challenges, which integrates the merits of deep high-resolution representation network and data association method in a unified framework. Specifically, while applying tracking-by-detection architecture to our tracking framework, a Hierarchical Deep High-resolution network (HDHNet) is proposed, which encourages the model to handle different types and scales of targets, and extract more effective and comprehensive features during online learning. After that, the extracted features are fed into different prediction networks for interesting targets recognition. Besides, an adjustable fusion loss function is proposed by combining focal loss and GIoU loss to solve the problems of class imbalance and hard samples. During the tracking process, these detection results are applied to an improved DeepSORT MOT algorithm in each frame, which is available to make full use of the target appearance features to match one by one on a practical basis. The experimental results on the VisDrone2019 MOT benchmark show that the proposed UAV MOT system achieves the highest accuracy and the best robustness compared with state-of-the-art methods.

Download Full-text

Representation Learning for Fine-Grained Change Detection

Sensors ◽

10.3390/s21134486 ◽

2021 ◽

Vol 21 (13) ◽

pp. 4486

Author(s):

Niall O’Mahony ◽

Sean Campbell ◽

Lenka Krpalkova ◽

Anderson Carvalho ◽

Joseph Walsh ◽

...

Keyword(s):

Deep Learning ◽

Change Detection ◽

Model Calibration ◽

State Of The Art ◽

Representation Learning ◽

Machine Intelligence ◽

The State ◽

Sensor Data ◽

Fine Grained ◽

Learning Techniques

Fine-grained change detection in sensor data is very challenging for artificial intelligence though it is critically important in practice. It is the process of identifying differences in the state of an object or phenomenon where the differences are class-specific and are difficult to generalise. As a result, many recent technologies that leverage big data and deep learning struggle with this task. This review focuses on the state-of-the-art methods, applications, and challenges of representation learning for fine-grained change detection. Our research focuses on methods of harnessing the latent metric space of representation learning techniques as an interim output for hybrid human-machine intelligence. We review methods for transforming and projecting embedding space such that significant changes can be communicated more effectively and a more comprehensive interpretation of underlying relationships in sensor data is facilitated. We conduct this research in our work towards developing a method for aligning the axes of latent embedding space with meaningful real-world metrics so that the reasoning behind the detection of change in relation to past observations may be revealed and adjusted. This is an important topic in many fields concerned with producing more meaningful and explainable outputs from deep learning and also for providing means for knowledge injection and model calibration in order to maintain user confidence.

Download Full-text

Fighting Together against the Pandemic: Learning Multiple Models on Tomography Images for COVID-19 Diagnosis

AI ◽

10.3390/ai2020016 ◽

2021 ◽

Vol 2 (2) ◽

pp. 261-273

Author(s):

Mario Manzo ◽

Simone Pellino

Keyword(s):

Network Architecture ◽

State Of The Art ◽

Ensemble Classification ◽

Effective Vaccine ◽

Rt Pcr ◽

Neural Network Architecture ◽

Experimental Phase ◽

Different Types ◽

Polymerase Chain ◽

Better Than

COVID-19 has been a great challenge for humanity since the year 2020. The whole world has made a huge effort to find an effective vaccine in order to save those not yet infected. The alternative solution is early diagnosis, carried out through real-time polymerase chain reaction (RT-PCR) tests or thorax Computer Tomography (CT) scan images. Deep learning algorithms, specifically convolutional neural networks, represent a methodology for image analysis. They optimize the classification design task, which is essential for an automatic approach with different types of images, including medical. In this paper, we adopt a pretrained deep convolutional neural network architecture in order to diagnose COVID-19 disease from CT images. Our idea is inspired by what the whole of humanity is achieving, as the set of multiple contributions is better than any single one for the fight against the pandemic. First, we adapt, and subsequently retrain for our assumption, some neural architectures that have been adopted in other application domains. Secondly, we combine the knowledge extracted from images by the neural architectures in an ensemble classification context. Our experimental phase is performed on a CT image dataset, and the results obtained show the effectiveness of the proposed approach with respect to the state-of-the-art competitors.

Download Full-text

Diabetic Retinal Grading Using Attention-Based Bilinear Convolutional Neural Network and Complement Cross Entropy

Entropy ◽

10.3390/e23070816 ◽

2021 ◽

Vol 23 (7) ◽

pp. 816

Author(s):

Pingping Liu ◽

Xiaokang Yang ◽

Baixin Jin ◽

Qiuzhan Zhou

Keyword(s):

Neural Network ◽

Image Processing ◽

Diabetic Retinopathy ◽

Convolutional Neural Network ◽

Image Classification ◽

Network Model ◽

Rapid Development ◽

Image Data ◽

Lesion Detection ◽

Great Success

Diabetic retinopathy (DR) is a common complication of diabetes mellitus (DM), and it is necessary to diagnose DR in the early stages of treatment. With the rapid development of convolutional neural networks in the field of image processing, deep learning methods have achieved great success in the field of medical image processing. Various medical lesion detection systems have been proposed to detect fundus lesions. At present, in the image classification process of diabetic retinopathy, the fine-grained properties of the diseased image are ignored and most of the retinopathy image data sets have serious uneven distribution problems, which limits the ability of the network to predict the classification of lesions to a large extent. We propose a new non-homologous bilinear pooling convolutional neural network model and combine it with the attention mechanism to further improve the network’s ability to extract specific features of the image. The experimental results show that, compared with the most popular fundus image classification models, the network model we proposed can greatly improve the prediction accuracy of the network while maintaining computational efficiency.

Download Full-text