Unsupervised Deep Learning via Affinity Diffusion

2020 ◽  
Vol 34 (07) ◽  
pp. 11029-11036
Author(s):  
Jiabo Huang ◽  
Qi Dong ◽  
Shaogang Gong ◽  
Xiatian Zhu

Convolutional neural networks (CNNs) have achieved unprecedented success in a variety of computer vision tasks. However, they usually rely on supervised model learning with massive labelled training data, dramatically limiting their usability and deployability in real-world scenarios without any labelling budget. In this work, we introduce a general-purpose unsupervised deep learning approach for deriving discriminative feature representations. It is based on self-discovering semantically consistent groups of unlabelled training samples with the same class concepts through a progressive affinity diffusion process. Extensive experiments on object image classification and clustering show the performance superiority of the proposed method over state-of-the-art unsupervised learning models on six common image recognition benchmarks: MNIST, SVHN, STL10, CIFAR10, CIFAR100 and ImageNet.
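To make the grouping idea concrete, here is a minimal sketch of a generic affinity-diffusion step: build a k-nearest-neighbour affinity graph over extracted features, then propagate pairwise affinities through the graph so that samples of the same underlying class become strongly connected. The neighbourhood size, diffusion rule and stopping rule are illustrative assumptions, not the authors' exact algorithm.

```python
# A generic affinity-diffusion sketch (illustrative; not the paper's exact method).
import numpy as np

def knn_affinity(features, k=10):
    """Sparse symmetric affinity matrix from L2-normalised features."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T
    np.fill_diagonal(sim, -np.inf)                 # exclude self-similarity
    idx = np.argsort(-sim, axis=1)[:, :k]          # k nearest neighbours per row
    A = np.zeros_like(sim)
    rows = np.repeat(np.arange(len(f)), k)
    A[rows, idx.ravel()] = sim[rows, idx.ravel()]
    return np.maximum(A, A.T)                      # symmetrise

def diffuse(A, alpha=0.9, iters=20):
    """Propagate affinities through the graph (random-walk-style update)."""
    P = A / (A.sum(axis=1, keepdims=True) + 1e-8)  # row-stochastic transitions
    W = A.copy()
    for _ in range(iters):
        W = alpha * (P @ W @ P.T) + (1 - alpha) * A
    return W    # diffused affinities; threshold to extract sample groups
```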

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Mantun Chen ◽  
Yongjun Wang ◽  
Zhiquan Qin ◽  
Xiatian Zhu

This work introduces a novel data augmentation method for few-shot website fingerprinting (WF) attacks, where only a handful of training samples per website are available for deep learning model optimization. Moving beyond earlier WF methods that rely on manually engineered feature representations, more advanced deep learning alternatives demonstrate that learning feature representations automatically from training data is superior. Nonetheless, this advantage rests on the unrealistic assumption that many training samples exist per website, and it disappears otherwise. To address this, we introduce a model-agnostic, efficient, and harmonious data augmentation (HDA) method that can improve deep WF attacking methods significantly. HDA involves both intra-sample and inter-sample data transformations that can be used in a harmonious manner to expand a tiny training dataset to an arbitrarily large collection, thereby effectively and explicitly addressing the intrinsic data scarcity problem. We conducted extensive experiments to validate HDA for boosting state-of-the-art deep learning WF attack models in both closed-world and open-world attacking scenarios, in both the absence and presence of a strong defense. For instance, in the more challenging and realistic evaluation scenario with WTF-PAD-based defense, our HDA method surpasses the previous state-of-the-art results by nearly 3% in classification accuracy in the 20-shot learning case. An earlier version of this work, Chen et al. (2021), has been presented as a preprint on arXiv (https://arxiv.org/abs/2101.10063).
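As a rough illustration of the two transformation families named above, the sketch below perturbs a single website-fingerprinting trace (intra-sample) and splices two traces of the same website (inter-sample). The concrete operators, shift sizes and masking rate are plausible stand-ins, not the paper's actual HDA operators.

```python
# Hedged sketch of intra-/inter-sample augmentation on a 1-D trace of signed
# packet directions (not the exact HDA transformations).
import numpy as np

def intra_sample(trace, max_shift=50, mask_frac=0.05):
    """Perturb one trace: random circular shift plus sparse packet masking."""
    t = np.roll(trace, np.random.randint(-max_shift, max_shift + 1))
    t[np.random.rand(len(t)) < mask_frac] = 0      # drop a small fraction of packets
    return t

def inter_sample(trace_a, trace_b):
    """Splice two traces of the SAME website at a random cut point."""
    cut = np.random.randint(1, min(len(trace_a), len(trace_b)))
    return np.concatenate([trace_a[:cut], trace_b[cut:]])

# Expanding a tiny per-site training set: each stored trace yields many variants.
site_traces = [np.sign(np.random.randn(5000)) for _ in range(5)]   # toy 5-shot data
augmented = [intra_sample(t) for t in site_traces]
augmented += [inter_sample(a, b) for a in site_traces for b in site_traces if a is not b]
```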


2019 ◽  
Vol 9 (22) ◽  
pp. 4749
Author(s):  
Lingyun Jiang ◽  
Kai Qiao ◽  
Linyuan Wang ◽  
Chi Zhang ◽  
Jian Chen ◽  
...  

Decoding human brain activity, especially reconstructing human visual stimuli via functional magnetic resonance imaging (fMRI), has gained increasing attention in recent years. However, the high dimensionality and small quantity of fMRI data impose restrictions on satisfactory reconstruction, especially for deep learning reconstruction methods that require large amounts of labelled samples. In contrast to deep learning methods, humans can recognize a new image because the human visual system naturally extracts features from any object and compares them. Inspired by this visual mechanism, we introduced the mechanism of comparison into a deep learning method to achieve better visual reconstruction, making full use of each sample and of the relationship within each sample pair by learning to compare. To this end, we propose a Siamese reconstruction network (SRN). Using the SRN, we achieved satisfying results on two fMRI recording datasets: 72.5% accuracy on the digit dataset and 44.6% accuracy on the character dataset. Essentially, this strategy increases the training data from n samples to about 2n sample pairs, taking full advantage of the limited quantity of training samples. The SRN learns to converge sample pairs of the same class and disperse sample pairs of different classes in feature space.
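The "learning to compare" idea can be illustrated with a standard Siamese/contrastive setup: n labelled samples are turned into labelled sample pairs, and a contrastive objective pulls same-class pairs together while pushing different-class pairs apart. The pairing scheme and margin below are generic assumptions, not the paper's exact configuration.

```python
# Generic Siamese pair construction + contrastive loss (illustrative sketch).
import itertools
import torch
import torch.nn.functional as F

def make_pairs(features, labels):
    """Turn n labelled samples into (feat_a, feat_b, same_class) pairs."""
    return [(features[i], features[j], float(labels[i] == labels[j]))
            for i, j in itertools.combinations(range(len(labels)), 2)]

def contrastive_loss(fa, fb, same, margin=1.0):
    """fa, fb, same are batched tensors: converge same-class pairs,
    disperse different-class pairs in feature space."""
    d = F.pairwise_distance(fa, fb)
    return (same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)).mean()
```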


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 1962
Author(s):  
Enrico Buratto ◽  
Adriano Simonetto ◽  
Gianluca Agresti ◽  
Henrik Schäfer ◽  
Pietro Zanuttigh

In this work, we propose a novel approach for correcting multi-path interference (MPI) in Time-of-Flight (ToF) cameras by estimating the direct and global components of the incoming light. MPI is an error source linked to the multiple reflections of light inside a scene; each sensor pixel receives information coming from different light paths, which generally leads to an overestimation of the depth. We introduce a novel deep learning approach, which estimates the structure of the time-dependent scene impulse response and from it recovers a depth image with a reduced amount of MPI. The model consists of two main blocks: a predictive model that learns a compact encoded representation of the backscattering vector from the noisy input data, and a fixed backscattering model that translates the encoded representation into the high-dimensional light response. Experimental results on real data show the effectiveness of the proposed approach, which reaches state-of-the-art performance.
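The two-block structure described above might look roughly like the following: a learned encoder predicts a compact per-pixel code from the noisy ToF input, and a fixed (non-trainable) backscattering model expands that code into a dense time-domain light response. The input channel count, code size and the decaying-exponential basis are assumptions standing in for the paper's actual backscattering model.

```python
# Two-block sketch: learned predictive encoder + fixed backscattering decoder.
import torch
import torch.nn as nn

class MPINet(nn.Module):
    def __init__(self, in_ch=9, code_dim=16, time_bins=512):
        super().__init__()
        self.encoder = nn.Sequential(               # learned predictive block
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, code_dim, 3, padding=1))
        t = torch.linspace(0, 1, time_bins)
        rates = torch.linspace(1.0, 50.0, code_dim)
        basis = torch.exp(-rates[:, None] * t[None, :])   # (code_dim, time_bins)
        self.register_buffer("basis", basis)        # fixed backscattering block

    def forward(self, x):                           # x: (B, in_ch, H, W)
        code = self.encoder(x)                      # compact encoded representation
        # expand each per-pixel code into the high-dimensional light response
        return torch.einsum("bchw,ct->bthw", code, self.basis)
```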


2021 ◽  
Vol 13 (3) ◽  
pp. 364
Author(s):  
Han Gao ◽  
Jinhui Guo ◽  
Peng Guo ◽  
Xiuwan Chen

Recently, deep learning has become the most innovative trend for a variety of high-spatial-resolution remote sensing imaging applications. However, large-scale land cover classification via traditional convolutional neural networks (CNNs) with sliding windows is computationally expensive and produces coarse results. Additionally, although such supervised learning approaches have performed well, collecting and annotating datasets for every task is extremely laborious, especially in fully supervised cases where the pixel-level ground-truth labels are dense. In this work, we propose a new object-oriented deep learning framework that leverages residual networks with different depths to learn adjacent feature representations by embedding a multibranch architecture in the deep learning pipeline. The idea is to exploit limited training data at different neighboring scales to make a tradeoff between weak semantics and strong feature representations for operational land cover mapping tasks. We draw on established geographic object-based image analysis (GEOBIA) as an auxiliary module to reduce the computational burden of spatial reasoning and optimize the classification boundaries. We evaluated the proposed approach on two subdecimeter-resolution datasets involving both urban and rural landscapes. It achieved better classification accuracy (88.9%) than traditional object-based deep learning methods and an excellent inference time (11.3 s/ha).
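A minimal sketch of the multibranch idea, assuming two residual branches of different depths that consume image patches at different neighboring scales and fuse their features for per-object classification; the branch depths, patch scales and fusion rule are illustrative choices, not the paper's exact design.

```python
# Multibranch residual sketch: shallow branch on a small patch, deeper branch
# on a larger-context patch, concatenated features (illustrative only).
import torch
import torch.nn as nn
from torchvision.models import resnet18, resnet34

class MultiBranchClassifier(nn.Module):
    def __init__(self, n_classes=6):
        super().__init__()
        self.shallow = resnet18(weights=None)       # smaller context window
        self.deep = resnet34(weights=None)          # larger context window
        self.shallow.fc = nn.Identity()             # expose 512-d features
        self.deep.fc = nn.Identity()
        self.head = nn.Linear(512 + 512, n_classes)

    def forward(self, patch_small, patch_large):
        f = torch.cat([self.shallow(patch_small), self.deep(patch_large)], dim=1)
        return self.head(f)
```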


2021 ◽  
Author(s):  
Matthias Zech ◽  
Lueder von Bremen

Cloudiness is a difficult parameter to forecast, and its prediction has improved relatively little over the last decade in numerical weather prediction models such as the ECMWF IFS. However, surface downward solar radiation (ssrd) forecast errors are becoming more important with the higher penetration of photovoltaics in Europe, as forecast errors induce power imbalances that may lead to high balancing costs. This study continues recent approaches to better understanding clouds from satellite images with Deep Learning. Unlike other studies, which focus on shallow trade-wind cumulus clouds over the ocean, this study investigates the European land area. To better understand the clouds, we use the daily MODIS optical cloud thickness product, which shows both the water and ice phases of a cloud. This allows us to consider both cloud structure and cloud formation during learning, and it is also much easier to distinguish between snow and cloud than with visible bands. Methodologically, we use the unsupervised learning approach tile2vec to derive a lower-dimensional representation of the clouds. Three cloud regions, comprising two similar neighboring tiles and one tile from a different time and location, are sampled to learn low-rank embeddings. In contrast to the original tile2vec implementation, this study does not sample arbitrarily distant tiles but uses the fractal dimension of the clouds in a pseudo-random sampling fashion to improve model learning.

The usefulness of the cloud segments is shown by applying them in a case study of the statistical properties of ssrd forecast errors over Europe, derived from hourly ECMWF IFS forecasts and ERA5 reanalysis data. This study shows how unsupervised learning has high potential despite its relatively low usage in academia compared to supervised learning. It further shows how the generated land cloud product can be used to better characterize ssrd forecast errors over Europe.
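A minimal sketch of the tile2vec-style triplet objective referenced above: an anchor tile should embed close to a spatial neighbor and far from a tile drawn from a different time and location. The toy encoder and margin are generic assumptions; the fractal-dimension-guided sampling of the distant tile is indicated only by a placeholder.

```python
# tile2vec-style triplet loss over cloud tiles (illustrative sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(                 # toy CNN embedding for single-band tiles
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(16, 32))

def triplet_loss(anchor, neighbor, distant, margin=1.0):
    """Pull (anchor, neighbor) together, push (anchor, distant) apart."""
    za, zn, zd = encoder(anchor), encoder(neighbor), encoder(distant)
    return F.relu(torch.norm(za - zn, dim=1)
                  - torch.norm(za - zd, dim=1) + margin).mean()

# distant = sample_by_fractal_dimension(...)  # the paper's pseudo-random scheme (not shown)
```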


Entropy ◽  
2022 ◽  
Vol 24 (1) ◽  
pp. 128
Author(s):  
Zhenwei Guan ◽  
Feng Min ◽  
Wei He ◽  
Wenhua Fang ◽  
Tao Lu

Forest fire detection from videos or images is vital to forest firefighting. Most deep-learning-based approaches rely on a converging image loss, which ignores the differing content of fire scenes. In fact, complex image content always has higher entropy. From this perspective, we propose a novel feature-entropy-guided neural network for forest fire detection, which balances the content complexity of different training samples. Specifically, a larger weight is given to the features of a sample with a high-entropy source when calculating the classification loss. In addition, we also propose a color attention neural network, which mainly consists of several repeated multiple blocks of color-attention modules (MCM). Each MCM module can adequately extract the color feature information of fire. The experimental results show that our proposed method outperforms state-of-the-art methods.
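In the spirit of the entropy-guided weighting described above, the sketch below estimates each training image's source entropy from its grey-level histogram and scales the per-sample classification loss accordingly; the histogram estimator and normalization are assumptions, not the paper's exact formulation.

```python
# Entropy-weighted classification loss (illustrative sketch).
import torch
import torch.nn.functional as F

def image_entropy(img, bins=256):
    """Shannon entropy (bits) of an image's grey-level histogram.
    Assumes pixel values normalised to [0, 1]."""
    hist = torch.histc(img, bins=bins, min=0.0, max=1.0)
    p = hist / hist.sum()
    p = p[p > 0]
    return -(p * p.log2()).sum()

def entropy_weighted_loss(logits, targets, images):
    """Give samples with higher-entropy content a larger loss weight."""
    weights = torch.stack([image_entropy(im) for im in images])
    weights = weights / weights.mean()              # normalise weights around 1
    ce = F.cross_entropy(logits, targets, reduction="none")
    return (weights * ce).mean()
```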


2020 ◽  
Vol 12 (2) ◽  
pp. 21-34
Author(s):  
Mostefai Abdelkader

In recent years, increasing attention has been paid to sentiment analysis on microblogging platforms such as Twitter. Sentiment analysis refers to the task of detecting whether a textual item (e.g., a tweet) contains an opinion about a topic. This paper proposes a probabilistic deep learning approach to sentiment analysis. The deep learning model used is a convolutional neural network (CNN). The main contribution of this approach is a new probabilistic representation of the text to be fed as input to the CNN. This representation is a matrix that stores, for each word in the message, the probability that it belongs to a positive class and the probability that it belongs to a negative class. The proposed approach is evaluated on four well-known datasets: HCR, OMD, STS-gold, and a dataset provided by the SemEval-2017 Workshop. The experimental results show that the proposed approach competes with state-of-the-art sentiment analyzers and can detect sentiment in textual data effectively.
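The probabilistic input representation can be illustrated directly: estimate, for every word, the probability of it belonging to the positive and negative classes, and stack one two-value row per word of the message. The toy corpus and Laplace smoothing below are assumptions for illustration.

```python
# Building the per-word probability matrix fed to the CNN (illustrative sketch).
from collections import Counter

pos_counts, neg_counts = Counter(), Counter()
for text, label in [("great phone", 1), ("bad battery", 0), ("great battery", 1)]:
    (pos_counts if label else neg_counts).update(text.split())

def word_probs(word, alpha=1.0):
    """Estimate (P(positive|word), P(negative|word)) with Laplace smoothing."""
    p = (pos_counts[word] + alpha) / (sum(pos_counts.values()) + 2 * alpha)
    n = (neg_counts[word] + alpha) / (sum(neg_counts.values()) + 2 * alpha)
    return (p / (p + n), n / (p + n))               # normalise to class posteriors

def to_matrix(message):
    """n_words x 2 matrix used in place of word embeddings as CNN input."""
    return [word_probs(w) for w in message.split()]

print(to_matrix("great battery"))
```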


2020 ◽  
Vol 12 (7) ◽  
pp. 1092
Author(s):  
David Browne ◽  
Michael Giering ◽  
Steven Prestwich

Scene classification is an important aspect of image/video understanding and segmentation. However, remote-sensing scene classification is a challenging image recognition task, partly due to the limited training data, which causes deep-learning Convolutional Neural Networks (CNNs) to overfit. Another difficulty is that images often have very different scales and orientations (viewing angles). Yet another is that the resulting networks may be very large, again making them prone to overfitting and unsuitable for deployment on memory- and energy-limited devices. We propose an efficient deep-learning approach to tackle these problems. We use transfer learning to compensate for the lack of data, and data augmentation to tackle varying scale and orientation. To reduce network size, we use a novel unsupervised learning approach based on k-means clustering, applied to all parts of the network: most network reduction methods use computationally expensive supervised learning methods, and apply only to the convolutional or fully connected layers, but not both. In experiments, we set new standards in classification accuracy on four remote-sensing and two scene-recognition image datasets.
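A hedged sketch of k-means-based weight sharing in the spirit described above: cluster a layer's weights and replace each weight with its cluster centroid, so the layer stores only k distinct values plus small indices. Applying the same scheme to both convolutional and fully connected layers follows the text; the rest is a generic quantization scheme, not necessarily the authors' exact procedure.

```python
# k-means weight sharing for network reduction (illustrative sketch).
import numpy as np
from sklearn.cluster import KMeans

def kmeans_compress(weight, k=16):
    """Quantise a weight tensor to k shared values via k-means."""
    flat = weight.reshape(-1, 1)
    km = KMeans(n_clusters=k, n_init=10).fit(flat)
    compressed = km.cluster_centers_[km.labels_].reshape(weight.shape)
    return compressed, km.labels_.astype(np.uint8)  # the indices are what gets stored

w = np.random.randn(64, 3, 3, 3)                    # e.g. a conv layer's filters
w_q, codes = kmeans_compress(w)
print(np.unique(w_q).size)                          # at most k distinct values remain
```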


Author(s):  
Yue Jiang ◽  
Zhouhui Lian ◽  
Yingmin Tang ◽  
Jianguo Xiao

Automatic generation of Chinese fonts, which consist of large numbers of glyphs with complicated structures, remains a challenging and ongoing problem in AI and Computer Graphics (CG). Traditional CG-based methods typically rely heavily on manual intervention, while recently popularized deep-learning-based end-to-end approaches often produce synthesis results with incorrect structures and/or serious artifacts. To address these problems, this paper proposes a structure-guided Chinese font generation system, SCFont, using deep stacked networks. The key idea is to integrate the domain knowledge of Chinese characters with deep generative networks to ensure that high-quality glyphs with correct structures can be synthesized. More specifically, we first apply a CNN model that learns to transfer the writing trajectories, with separated strokes, of the reference font style into those of the target style. Then, we train another CNN model that learns to recover shape details on the contours of the synthesized writing trajectories. Experimental results validate the superiority of the proposed SCFont over the state of the art in both visual and quantitative assessments.
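The two-stage pipeline might be sketched as two stacked CNNs applied in sequence, one transferring stroke trajectories into the target style and one recovering contour detail; both networks below are toy placeholders, far simpler than the actual SCFont architectures.

```python
# Two-stage stacked-CNN sketch of the SCFont pipeline (toy placeholders only).
import torch
import torch.nn as nn

trajectory_net = nn.Sequential(          # stage 1: stroke-trajectory style transfer
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1))

contour_net = nn.Sequential(             # stage 2: shape-detail recovery on contours
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1))

glyph_ref = torch.rand(1, 1, 64, 64)     # rendered reference-style trajectory image
traj_target = trajectory_net(glyph_ref)  # trajectories transferred to target style
glyph_out = contour_net(traj_target)     # final glyph with recovered shape details
```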

