Skin Lesion Classification Using Densely Connected Convolutional Networks with Attention Residual Learning

Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7080
Author(s):  
Jing Wu ◽  
Wei Hu ◽  
Yuan Wen ◽  
Wenli Tu ◽  
Xiaoming Liu

Skin lesion classification is an effective computer-vision-aided approach to the diagnosis of skin cancer. Although deep learning models have shown advantages over traditional methods and brought tremendous breakthroughs, precise diagnosis remains challenging because of the intra-class variation and inter-class similarity caused by the diversity of imaging methods and clinicopathology. In this paper, we propose a densely connected convolutional network with attention and residual learning (ARDT-DenseNet) for skin lesion classification. Each ARDT block consists of dense blocks, transition blocks, and attention and residual modules. Compared to a residual network with the same number of convolutional layers, the parameter size of the proposed densely connected network is reduced by half, while the accuracy of skin lesion classification is preserved. Our improved densely connected network adds an attention mechanism and residual learning after each dense block and transition block without introducing additional parameters. We evaluate the ARDT-DenseNet model on the ISIC 2016 and ISIC 2017 datasets. Our method achieves an accuracy (ACC) of 85.7% and an AUC of 83.7% on ISIC 2016 and an average AUC of 91.8% on ISIC 2017. The experimental results show that the proposed method achieves a significant improvement in skin lesion classification and outperforms the state-of-the-art method.
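The abstract does not give the formula for its attention and residual modules, so the following is only a minimal sketch of one way an attention-weighted residual can be built with no learnable parameters. It assumes a spatial softmax over a single-channel feature map and the combination `x + attention * x`; the function names `spatial_softmax` and `attention_residual` are hypothetical and are not taken from the paper.

```python
import math


def spatial_softmax(feature_map):
    """Softmax over all spatial positions of a 2D map (weights sum to 1)."""
    flat = [value for row in feature_map for value in row]
    peak = max(flat)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in flat]
    total = sum(exps)
    width = len(feature_map[0])
    return [
        [exps[r * width + c] / total for c in range(width)]
        for r in range(len(feature_map))
    ]


def attention_residual(feature_map):
    """Assumed form: identity path plus an attention-weighted copy of the input.

    The attention weights are computed from the input itself, so this step
    adds no trainable parameters.
    """
    attention = spatial_softmax(feature_map)
    return [
        [x + a * x for x, a in zip(row, attention_row)]
        for row, attention_row in zip(feature_map, attention)
    ]


# Toy 2x2 single-channel feature map.
feature_map = [[0.0, 1.0], [2.0, 3.0]]
refined = attention_residual(feature_map)
```

Because the weights come from a softmax of the features, larger activations are amplified more than smaller ones while the identity path keeps the original signal intact.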

2021 ◽  
Vol 11 (15) ◽  
pp. 6975
Author(s):  
Tao Zhang ◽  
Lun He ◽  
Xudong Li ◽  
Guoqing Feng

Lipreading aims to recognize sentences being spoken by a talking face. In recent years, lipreading methods have achieved a high level of accuracy on large datasets and made breakthrough progress. However, lipreading is still far from solved: existing methods tend to have high error rates on in-the-wild data and suffer from vanishing gradients during training and slow convergence. To overcome these problems, we propose an efficient end-to-end sentence-level lipreading model that uses an encoder based on a 3D convolutional network, ResNet50, and a Temporal Convolutional Network (TCN), with a CTC objective function as the decoder. More importantly, the proposed architecture incorporates the TCN as a feature learner to decode features. It partly eliminates the vanishing-gradient and insufficient-performance defects of RNNs (LSTM, GRU), which yields a notable performance improvement as well as faster convergence. Experiments show that training and convergence are 50% faster than with the state-of-the-art method and that accuracy improves by 2.4% on the GRID dataset.
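To make the CTC step concrete, here is a minimal sketch of greedy (best-path) CTC decoding: take the highest-scoring label per frame, merge consecutive repeats, then drop the blank symbol. The abstract only states that a CTC objective is used, so greedy decoding, the blank index `BLANK = 0`, and the toy alphabet are assumptions made for illustration.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_scores, alphabet):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Highest-scoring label index at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    # Merge runs of identical labels into one label each.
    collapsed = [label for label, _ in groupby(best_path)]
    # Remove blanks; a blank between two equal labels keeps them separate.
    return "".join(alphabet[label] for label in collapsed if label != BLANK)


# Toy alphabet: index 0 is the blank, 1 is 'a', 2 is 'b'.
alphabet = ["-", "a", "b"]
frame_scores = [
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.7, 0.2],    # a (repeat, merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank
    [0.2, 0.7, 0.1],    # a (a new 'a' because a blank intervened)
    [0.1, 0.2, 0.7],    # b
]
decoded = ctc_greedy_decode(frame_scores, alphabet)  # "aab"
```

The blank symbol is what lets CTC emit doubled characters: without the blank in frame three, the two runs of `a` would collapse into one.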


Author(s):  
Zhichao Huang ◽  
Xutao Li ◽  
Yunming Ye ◽  
Michael K. Ng

Graph Convolutional Networks (GCNs) have been extensively studied in recent years. Most existing GCN approaches are designed for homogeneous graphs with a single type of relation. However, heterogeneous graphs with multiple types of relations are also ubiquitous, and there is a lack of methodologies to tackle such graphs. Some previous studies address the issue by performing conventional GCN on each single relation and then blending the results. However, because the convolutional kernels neglect the correlations across relations, this strategy is sub-optimal. In this paper, we propose the Multi-Relational Graph Convolutional Network (MR-GCN) framework by developing a novel convolution operator on multi-relational graphs. In particular, our multi-dimensional convolution operator extends graph spectral analysis to the eigen-decomposition of a Laplacian tensor. The eigen-decomposition is formulated with a generalized tensor product, which can correspond to any unitary transform rather than being limited to the Fourier transform. We conduct comprehensive experiments on four real-world multi-relational graphs on the semi-supervised node classification task, and the results show the superiority of MR-GCN over state-of-the-art competitors.


Author(s):  
Liang Yang ◽  
Zesheng Kang ◽  
Xiaochun Cao ◽  
Di Jin ◽  
Bo Yang ◽  
...  

In the past few years, semi-supervised node classification in attributed networks has developed rapidly. Inspired by the success of deep learning, researchers have adapted the convolutional neural network to develop Graph Convolutional Networks (GCNs), which achieve surprising classification accuracy by considering the topological information and employing a fully connected network (FCN). However, the given network topology may also degrade performance if it is employed directly in classification, because it may be highly sparse and noisy. Moreover, the lack of learnable filters in GCN also limits performance. In this paper, we propose novel Topology Optimization based Graph Convolutional Networks (TO-GCN) to fully utilize the potential information by jointly refining the network topology and learning the parameters of the FCN. According to our derivations, TO-GCN is more flexible than GCN, in which the filters are fixed and only the classifier can be updated during the learning process. Extensive experiments on real attributed networks demonstrate the superiority of the proposed TO-GCN over state-of-the-art approaches.
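The remark that GCN filters are fixed can be illustrated with the widely used GCN propagation rule, in which features are smoothed by the symmetrically normalized adjacency matrix with self-loops and only the weight matrix is learned. The sketch below is a plain-Python illustration of that baseline rule, not of TO-GCN itself; that the abstract's GCN baseline takes exactly this form is an assumption, and the helper names and toy values are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Dense matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, features, weights):
    """One layer: ReLU(normalized adjacency @ features @ weights)."""
    propagated = matmul(normalize_adjacency(adj), features)  # fixed filter
    transformed = matmul(propagated, weights)                # learnable part
    return [[max(0.0, value) for value in row] for row in transformed]


# Toy 3-node path graph (0 - 1 - 2) with one-hot node features.
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]  # hypothetical values
hidden = gcn_layer(adj, features, weights)
```

In this form the topology enters only through `normalize_adjacency`, which is why a sparse or noisy input graph directly affects what each node aggregates.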


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Yang Xu ◽  
Zixi Fu ◽  
Guiyong Xu ◽  
Sicong Zhang ◽  
Xiaoyao Xie

Convolutional neural networks used for steganalysis suffer from problems such as poor versatility, long training time, and limited image size. To address these problems, we present a heterogeneous kernel residual learning framework called DRHNet (Dual Residual Heterogeneous Network) to reduce network training time. Instead of using the image as the input of the network, we extract and merge the images into a feature matrix using the rich model and use the generated feature matrix as the real input of the network. The proposed architecture has good versatility and can reduce the computation and the number of parameters while still achieving higher accuracy. We evaluate the performance of DRHNet on BOSSbase 1.01 in both the spatial domain and the frequency domain. Preliminary experimental results show that DRHNet achieves excellent steganalysis performance against state-of-the-art steganographic algorithms.


2020 ◽  
Vol 34 (04) ◽  
pp. 5892-5899
Author(s):  
Ke Sun ◽  
Zhouchen Lin ◽  
Zhanxing Zhu

Graph Convolutional Networks (GCNs) play a crucial role in graph learning tasks; however, learning graph embeddings with few supervised signals is still a difficult problem. In this paper, we propose a novel training algorithm for Graph Convolutional Networks, called the Multi-Stage Self-Supervised (M3S) Training Algorithm, which is combined with a self-supervised learning approach and focuses on improving the generalization performance of GCNs on graphs with few labeled nodes. First, a Multi-Stage Training Framework is provided as the basis of the M3S training method. Then we leverage the DeepCluster technique, a popular form of self-supervised learning, and design a corresponding aligning mechanism on the embedding space to refine the Multi-Stage Training Framework, resulting in the M3S Training Algorithm. Finally, extensive experimental results verify the superior performance of our algorithm on graphs with few labeled nodes under different label rates compared with other state-of-the-art approaches.


Author(s):  
Min Shi ◽  
Yufei Tang ◽  
Xingquan Zhu ◽  
David Wilson ◽  
Jianxun Liu

Networked data often demonstrate the Pareto principle (i.e., the 80/20 rule) with skewed class distributions, where most vertices belong to a few majority classes and minority classes contain only a handful of instances. When presented with imbalanced class distributions, existing graph embedding learning methods tend to be biased toward nodes from majority classes, leaving nodes from minority classes under-trained. In this paper, we propose Dual-Regularized Graph Convolutional Networks (DR-GCN) to handle multi-class imbalanced graphs, where two types of regularization are imposed to tackle class-imbalanced representation learning. To ensure that all classes are equally represented, we propose a class-conditioned adversarial training process to facilitate the separation of labeled nodes. Meanwhile, to maintain training equilibrium (i.e., retaining quality of fit across all classes), we force unlabeled nodes to follow a latent distribution similar to that of the labeled nodes by minimizing their difference in the embedding space. Experiments on real-world imbalanced graphs demonstrate that DR-GCN outperforms state-of-the-art methods in node classification, graph clustering, and visualization.


2019 ◽  
Vol 11 (18) ◽  
pp. 2142 ◽  
Author(s):  
Lianfa Li

Semantic segmentation is a fundamental means of extracting information from remotely sensed images at the pixel level. Deep learning has enabled considerable improvements in the efficiency and accuracy of semantic segmentation of general images. Typical models range from benchmarks such as fully convolutional networks, U-Net, Micro-Net, and dilated residual networks to the more recently developed DeepLab 3+. However, many of these models were originally developed for segmentation of general or medical images and videos, and are not directly relevant to remotely sensed images. Studies of deep learning for semantic segmentation of remotely sensed images are limited. This paper presents a novel, flexible autoencoder-based deep learning architecture that makes extensive use of residual learning and multiscaling for robust semantic segmentation of remotely sensed land-use images. In this architecture, a deep residual autoencoder is generalized to a fully convolutional network in which residual connections are implemented within and between all encoding and decoding layers. Compared with the concatenated shortcuts in U-Net, these residual connections reduce the number of trainable parameters and improve the learning efficiency by enabling extensive backpropagation of errors. In addition, resizing or atrous spatial pyramid pooling (ASPP) can be leveraged to capture multiscale information from the input images to enhance the robustness to scale variations. The residual learning and multiscaling strategies improve the trained model’s generalizability, as demonstrated in the semantic segmentation of land-use types in two real-world datasets of remotely sensed images. Compared with U-Net, the proposed method improves the Jaccard index (JI) or the mean intersection over union (MIoU) by 4-11% in the training phase and by 3-9% in the validation and testing phases.
With its flexible deep learning architecture, the proposed approach can be easily applied and transferred to semantic segmentation of land-use variables and other surface variables of remotely sensed images.
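For readers unfamiliar with the reported metrics, the sketch below computes the Jaccard index for one class (intersection over union of predicted and true pixels) and the mean IoU across classes. It works on flattened label lists; the function names and the toy labels are illustrative assumptions, and skipping classes absent from both prediction and ground truth is one common convention rather than something the abstract specifies.

```python
def jaccard_index(pred, target, cls):
    """Intersection over union for one class over flattened pixel labels."""
    intersection = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    # Undefined when the class appears in neither prediction nor ground truth.
    return intersection / union if union else None


def mean_iou(pred, target, classes):
    """Average the per-class Jaccard index, skipping undefined classes."""
    scores = [jaccard_index(pred, target, cls) for cls in classes]
    defined = [score for score in scores if score is not None]
    return sum(defined) / len(defined)


# Toy flattened label maps with three land-use classes (0, 1, 2).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
per_class = [jaccard_index(pred, target, cls) for cls in (0, 1, 2)]
miou = mean_iou(pred, target, classes=(0, 1, 2))
```

In this toy case the per-class scores are 1/3, 2/3, and 1/2, so the mean IoU is 0.5.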


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
He Bing ◽  
Xu Zhifeng ◽  
Xu Yangjie ◽  
Hu Jinxing ◽  
Ma Zhanwu

Road link speed is an important indicator of traffic state. To incorporate the spatiotemporal dynamics and correlation characteristics of road links into speed prediction, this paper proposes a method based on LDA and GCN. First, we construct a trajectory dataset from map-matched GPS location data of taxis. Then, we use the LDA algorithm to extract the semantic function vectors of urban zones and quantify the spatial dynamic characteristics of road links based on taxi trajectories. Finally, we add the semantic function vectors to the dataset and train a graph convolutional network to learn the spatial and temporal dependencies of road links. The learned model is used to predict the future speed of road links. The proposed method is compared with six baseline models on the same dataset, generated from GPS devices on taxis in Shenzhen, China, and the results show that our method has better prediction performance when semantic zoning information is added. Composite and single-valued semantic zoning information improve the performance of graph convolutional networks by 6.46% and 8.35%, respectively, while the baseline machine learning models work only with single-valued semantic zoning information on the experimental dataset.


2019 ◽  
Vol 38 (9) ◽  
pp. 2092-2103 ◽  
Author(s):  
Jianpeng Zhang ◽  
Yutong Xie ◽  
Yong Xia ◽  
Chunhua Shen
