Recognizing Instagram Filtered Images with Feature De-Stylization

2020 · Vol 34 (07) · pp. 12418-12425
Author(s): Zhe Wu, Zuxuan Wu, Bharat Singh, Larry Davis

Deep neural networks have been shown to suffer from poor generalization when small perturbations, such as Gaussian noise, are added, yet little work has been done to evaluate their robustness to more natural image transformations like photo filters. This paper presents a study of how popular pretrained models are affected by commonly used Instagram filters. To this end, we introduce ImageNet-Instagram, a filtered version of ImageNet in which 20 popular Instagram filters are applied to each image. Our analysis suggests that simple structure-preserving filters, which only alter the global appearance of an image, can lead to large differences in the convolutional feature space. To improve generalization, we introduce a lightweight de-stylization module that predicts parameters for scaling and shifting feature maps to “undo” the changes incurred by filters, inverting the process of style transfer. We further demonstrate that the module can be readily plugged into modern CNN architectures together with skip connections. We conduct extensive studies on ImageNet-Instagram and show, quantitatively and qualitatively, that the proposed module can effectively improve generalization by simply learning normalization parameters, without retraining the entire network, thus recovering from the alterations in the feature space caused by the filters.
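To make the mechanism concrete, here is a minimal PyTorch sketch of a de-stylization module in the spirit described above: a small predictor maps a global descriptor of the (possibly filtered) feature map to per-channel scale and shift parameters, applied with a skip connection. All names, sizes, and the descriptor design are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class DeStylization(nn.Module):
    """Illustrative sketch: predict per-channel scale/shift to 'undo' a filter."""
    def __init__(self, channels: int, descriptor_dim: int = 128):
        super().__init__()
        # Pool the feature map into a global descriptor of the filtered input.
        self.encoder = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, descriptor_dim), nn.ReLU(),
        )
        # Predict scaling (gamma) and shifting (beta) parameters per channel.
        self.to_gamma = nn.Linear(descriptor_dim, channels)
        self.to_beta = nn.Linear(descriptor_dim, channels)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (N, C, H, W) convolutional feature map from a frozen backbone.
        desc = self.encoder(feat)
        gamma = self.to_gamma(desc)[:, :, None, None]
        beta = self.to_beta(desc)[:, :, None, None]
        # Scale-and-shift correction plus a skip connection.
        return feat + gamma * feat + beta
```

In such a setup only the module's parameters would be trained, with the backbone kept frozen, matching the idea of learning normalization parameters without retraining the entire network.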

2022 · Vol 2022 · pp. 1-9
Author(s): R. Dinesh Kumar, E. Golden Julie, Y. Harold Robinson, S. Vimal, Gaurav Dhiman, ...

Humans have long mastered the skill of creativity. Recently, this mechanism has been replicated using neural networks, which mimic the functioning of the human brain: each unit in the network represents a neuron that transmits messages to other neurons to perform subconscious tasks. A common application is rendering an input image in the style of famous artworks, a problem known as non-photorealistic rendering. Previous approaches rely on directly manipulating the pixel representation of the image. Using deep neural networks built for image recognition, this paper instead works in a feature space that represents the higher-level content of the image. Deep neural networks have previously been used for object recognition and for style recognition to categorize artworks by their creation time. This paper uses the Visual Geometry Group (VGG16) network to replicate this task performed subconsciously by humans. Three images are involved: a content image, which contains the features to retain in the output; a style reference image, which contains the patterns of a famous painting; and an input image to be stylized. These are blended to produce a new image in which the input is transformed to match the content image while being “sketched” to look like the style image.
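Since the abstract describes the classic feature-space approach, a compact sketch may help. The following uses torchvision's pretrained VGG16 to optimize an image so that its deep features match the content image while its Gram matrices match the style image; the layer indices, weights, and step counts are illustrative choices, not the paper's configuration. Inputs are assumed to be ImageNet-normalized tensors of shape (1, 3, H, W).

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

features = vgg16(weights="IMAGENET1K_V1").features.eval()
for p in features.parameters():
    p.requires_grad_(False)

CONTENT_LAYERS = {19}          # around conv4_2 (illustrative choice)
STYLE_LAYERS = {0, 5, 10, 17}  # conv1_1, conv2_1, conv3_1, conv4_1

def extract(x):
    """Collect content features and style Gram matrices along the network."""
    content, style = [], []
    for i, layer in enumerate(features):
        x = layer(x)
        if i in CONTENT_LAYERS:
            content.append(x)
        if i in STYLE_LAYERS:
            n, c, h, w = x.shape
            f = x.reshape(n, c, h * w)
            style.append(f @ f.transpose(1, 2) / (c * h * w))  # Gram matrix
    return content, style

def style_transfer(content_img, style_img, steps=300, style_weight=1e4):
    with torch.no_grad():
        target_c, _ = extract(content_img)
        _, target_s = extract(style_img)
    out = content_img.clone().requires_grad_(True)
    opt = torch.optim.Adam([out], lr=0.02)
    for _ in range(steps):
        opt.zero_grad()
        c, s = extract(out)
        loss = sum(F.mse_loss(a, b) for a, b in zip(c, target_c))
        loss = loss + style_weight * sum(F.mse_loss(a, b) for a, b in zip(s, target_s))
        loss.backward()
        opt.step()
    return out.detach()
```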


2021 · Vol 12 (3) · pp. 1-16
Author(s): Yukai Shi, Sen Zhang, Chenxing Zhou, Xiaodan Liang, Xiaojun Yang, ...

Non-parallel text style transfer has attracted increasing research interest in recent years. Despite successes in transferring style with the encoder-decoder framework, current approaches still lack the ability to preserve the content and even the logic of original sentences, mainly due to a large unconstrained model space or overly simplified assumptions about the latent embedding space. Since language is an intelligent human product with definite grammars and, by its nature, a limited rule-based model space, alleviating this problem requires reconciling the model capacity of deep neural networks with the intrinsic constraints of human linguistic rules. To this end, we propose a method called the Graph Transformer–based Auto-Encoder, which models a sentence as a linguistic graph and performs feature extraction and style transfer at the graph level, so as to maximally retain the content and the linguistic structure of original sentences. Quantitative results on three non-parallel text style transfer tasks show that our model outperforms state-of-the-art methods in content preservation, while achieving comparable transfer accuracy and sentence naturalness.
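As a rough illustration of performing feature extraction at the graph level, the sketch below masks transformer self-attention with a linguistic graph (e.g., a dependency parse) so information flows only along graph edges. It is a simplified stand-in, not the paper's Graph Transformer–based Auto-Encoder.

```python
import torch
import torch.nn as nn

class GraphAttentionLayer(nn.Module):
    """Self-attention over tokens, restricted to edges of a linguistic graph."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, tokens: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # tokens: (N, L, dim); adj: (N, L, L) boolean adjacency of the
        # linguistic graph (True where an edge exists, self-loops included).
        mask = ~adj  # MultiheadAttention masks out positions where True
        mask = mask.repeat_interleave(self.attn.num_heads, dim=0)  # (N*heads, L, L)
        out, _ = self.attn(tokens, tokens, tokens, attn_mask=mask)
        return self.norm(tokens + out)
```

Stacking such layers in an encoder and decoder would then operate on sentence structure directly, which is the intuition behind retaining content and linguistic structure.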


2020 · Vol 34 (05) · pp. 8376-8383
Author(s): Dayiheng Liu, Jie Fu, Yidan Zhang, Chris Pal, Jiancheng Lv

Typical methods for unsupervised text style transfer often rely on two key ingredients: 1) explicit disentanglement of the content and the attributes, and 2) troublesome adversarial learning. In this paper, we show that neither of these components is indispensable. We propose a new framework that uses gradients to revise the sentence in a continuous space during inference to achieve text style transfer. Our method consists of three key components: a variational auto-encoder (VAE), attribute predictors (one for each attribute), and a content predictor. The VAE and the two types of predictors enable gradient-based optimization in the continuous space, mapped from sentences in the discrete space, to find the representation of a target sentence with the desired attributes and preserved content. Moreover, the proposed method can naturally manipulate multiple fine-grained attributes simultaneously, such as sentence length and the presence of specific words. Compared with previous methods based on adversarial learning, the proposed method is more interpretable and controllable, and easier to train. Extensive experiments on three popular text style transfer tasks show that the proposed method significantly outperforms five state-of-the-art methods.
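A minimal sketch of the inference-time revision step, under assumed predictor interfaces (each predictor maps a latent code to a logit): starting from the VAE encoding of the source sentence, gradient steps move the latent code toward the desired attributes while a content predictor anchors the content.

```python
import torch
import torch.nn.functional as F

def revise_latent(z0, attribute_predictors, targets, content_predictor,
                  steps=50, lr=0.1, content_weight=1.0):
    """z0: latent code of the source sentence, shape (1, latent_dim).
    attribute_predictors/targets: one (predictor, target) pair per attribute,
    which is what allows several fine-grained attributes to be edited at once."""
    z = z0.clone().requires_grad_(True)
    content0 = content_predictor(z0).detach()  # content of the source sentence
    opt = torch.optim.SGD([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        attr_loss = sum(F.binary_cross_entropy_with_logits(p(z), t)
                        for p, t in zip(attribute_predictors, targets))
        content_loss = F.mse_loss(content_predictor(z), content0)
        (attr_loss + content_weight * content_loss).backward()
        opt.step()
    return z.detach()  # decode with the VAE decoder to obtain the new sentence
```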


Author(s): Le Hui, Xiang Li, Chen Gong, Meng Fang, Joey Tianyi Zhou, ...

Convolutional Neural Networks (CNNs) have shown great power in various classification tasks and achieved remarkable results in practical applications. However, the distinct learning difficulty of discriminating different pairs of classes is largely ignored by existing networks. For instance, in the CIFAR-10 dataset, distinguishing cats from dogs is usually harder than distinguishing horses from ships. By carefully studying the behavior of CNN models during training, we observe that the confusion level of two classes is strongly correlated with their angular separability in the feature space: the larger the inter-class angle, the lower the confusion. Based on this observation, we propose a novel loss function dubbed “Inter-Class Angular Loss” (ICAL), which explicitly models class correlation and can be directly applied to many existing deep networks. By minimizing ICAL, networks can effectively discriminate examples of similar classes by enlarging the angle between their corresponding class vectors. Thorough experimental results on a series of vision and non-vision datasets confirm that ICAL substantially improves the discriminative ability of various representative deep neural networks and yields performance superior to that of the original networks with the conventional softmax loss.
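The exact ICAL formulation is in the paper; the sketch below captures the stated idea with an assumed surrogate: penalize the pairwise cosine similarity between class weight vectors, which enlarges inter-class angles when minimized alongside the softmax loss.

```python
import torch
import torch.nn.functional as F

def angular_regularizer(class_weights: torch.Tensor) -> torch.Tensor:
    # class_weights: (num_classes, dim) rows of the final linear layer.
    w = F.normalize(class_weights, dim=1)
    cos = w @ w.t()                                    # pairwise cosine similarities
    off_diag = cos - torch.eye(len(w), device=w.device)
    # Penalize only positive similarities, i.e. inter-class angles below 90°.
    return off_diag.clamp(min=0).sum() / (len(w) * (len(w) - 1))

def total_loss(logits, labels, classifier: torch.nn.Linear, lam: float = 0.1):
    # Conventional softmax cross-entropy plus the angular term.
    return F.cross_entropy(logits, labels) + lam * angular_regularizer(classifier.weight)
```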


Author(s): Waldemar Villamayor-Venialbo, Horacio Legal-Ayala, Edson J. R. Justino, Jacques Facon

This article introduces a partial matching framework, based on set-theoretic criteria, for measuring shape similarity. The matching framework is described abstractly because the proposed scheme is independent of the choice of segmentation method and feature space. This paradigm ensures the high adaptability of the algorithm and gives the implementer wide control over robustness, the ability to balance selectivity against sensitivity, and the freedom to handle the more general and arbitrary image transformations required by particular problems. A strategy to build a descriptor set from components segmented from the main shape is expounded, and two exclusion measure functions are formulated. Proofs are given showing that it is not necessary to match the entire descriptor sets to determine that two shapes are similar. The methodology provides a dissimilarity score that may be used for shape-based retrieval and object recognition; this is demonstrated by applying the proposed approach in a cattle brand identification system.
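Since the article defines its exclusion measures abstractly, the following Python sketch instantiates the idea with one concrete, assumed measure: the fraction of descriptors of one shape that find no sufficiently close descriptor in the other, with early termination in both directions, echoing the result that full matching of the descriptor sets is not required.

```python
import numpy as np

def excluded(A, B, tol=0.1, threshold=0.2):
    """A, B: lists of descriptor vectors (one per segmented component).
    Returns True if A is 'excluded' by B, i.e. the shapes are dissimilar."""
    allowed_misses = int(threshold * len(A))
    misses = 0
    for i, a in enumerate(A):
        if min(np.linalg.norm(a - b) for b in B) > tol:
            misses += 1
            if misses > allowed_misses:
                return True   # early rejection: too many unmatched descriptors
        # Early acceptance: even if every remaining descriptor missed,
        # the miss threshold could no longer be exceeded.
        if misses + (len(A) - 1 - i) <= allowed_misses:
            return False
    return False
```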


2015 · Vol 63 (1) · pp. 215-222
Author(s): Mykola Perestyuk, Petro Feketa

Abstract: New conditions are established for the preservation of an exponentially stable invariant toroidal manifold of a linear extension of a one-dimensional dynamical system on a torus under small perturbations in the ω-limit set. This approach is applied to investigating the qualitative behaviour of solutions of linear extensions of dynamical systems with a simple structure of limit sets.
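For orientation, linear extensions of a dynamical system on a torus are conventionally written in the following standard form (the article's precise hypotheses on a, A, f, and the admissible perturbations should be taken from the text itself):

```latex
\begin{aligned}
  \frac{d\varphi}{dt} &= a(\varphi), & \varphi &\in \mathbb{T}^{1},\\
  \frac{dx}{dt} &= A(\varphi)\,x + f(\varphi), & x &\in \mathbb{R}^{n},
\end{aligned}
```

where an invariant toroidal manifold takes the form $x = u(\varphi)$, and the result concerns the persistence of its exponential stability when the system is perturbed slightly on the ω-limit set of the flow on the torus.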


2021 · Vol 14 (1) · pp. 102
Author(s): Xin Li, Tao Li, Ziqi Chen, Kaiwen Zhang, Runliang Xia

Semantic segmentation is a fundamental task in interpreting remote sensing imagery (RSI) for various downstream applications. Due to high intra-class variance and inter-class similarity, inflexibly transferring networks designed for natural images to RSI is inadvisable. To enhance the distinguishability of learnt representations, attention modules have been developed and applied to RSI, yielding satisfactory improvements. However, these designs capture contextual information by handling all pixels equally, regardless of whether they lie near edges. Blurry boundaries are therefore generated, raising high uncertainty in classifying the many adjacent pixels. Hereby, we propose an edge distribution attention module (EDA) to highlight the edge distributions of learnt feature maps in a self-attentive fashion. In this module, we first formulate and model column-wise and row-wise edge attention maps based on covariance matrix analysis. Furthermore, a hybrid attention module (HAM) that emphasizes both edge distributions and position-wise dependencies is devised by combining EDA with a non-local block. Consequently, a conceptually end-to-end neural network, termed EDENet, is proposed to integrate HAM hierarchically for the detailed strengthening of multi-level representations. EDENet implicitly learns representative and discriminative features, providing reliable and reasonable cues for dense prediction. Experimental results on the ISPRS Vaihingen, Potsdam, and DeepGlobe datasets show the efficacy of EDENet and its superiority to state-of-the-art methods in overall accuracy (OA) and mean intersection over union (mIoU). In addition, an ablation study further validates the effect of EDA.
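The paper's EDA construction is more elaborate; as a rough, assumption-laden illustration of column-wise attention derived from covariance analysis (row-wise attention would be obtained symmetrically over rows):

```python
import torch
import torch.nn.functional as F

def columnwise_edge_attention(feat: torch.Tensor) -> torch.Tensor:
    """Illustrative sketch, not the paper's EDA: treat each column of the
    feature map as a random vector over channels and rows, build a covariance
    matrix between columns, and use its softmax as a column-mixing attention."""
    n, c, h, w = feat.shape
    cols = feat.permute(0, 3, 1, 2).reshape(n, w, c * h)   # (N, W, C*H)
    cols = cols - cols.mean(dim=2, keepdim=True)           # center each column
    cov = cols @ cols.transpose(1, 2) / (c * h - 1)        # (N, W, W) covariance
    attn = F.softmax(cov, dim=-1)
    out = torch.einsum("nwv,nchv->nchw", attn, feat)       # mix columns by attention
    return feat + out                                      # residual connection
```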


Author(s): Hyung-Hwa Ko, GeunTae Kim, Hyunmin Kim

As deep learning applications in object recognition, object detection, segmentation, and image generation are increasingly in demand, related research has been actively conducted. In this paper, a method is proposed that uses segmentation and style transfer together to produce desired imagery in a desired area of real-time video. Two deep neural networks were used to get as close to real-time operation as possible, given the trade-off between speed and accuracy: a modified BiSeNet for segmentation and CycleGAN for style transfer, both running on a desktop PC equipped with two RTX 2080 Ti GPU boards. This enables near-real-time processing of SD video at a decent quality level. We obtained good subjective quality when segmenting the road area in city street video and restyling it as grass at no less than 6 fps.
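A minimal sketch of the compositing step such a pipeline implies, where `seg_model` and `style_model` are stand-ins for the modified BiSeNet and the CycleGAN generator (both names and interfaces are assumptions):

```python
import torch

@torch.no_grad()
def stylize_region(frame, seg_model, style_model, target_class: int):
    """frame: (1, 3, H, W) tensor in [0, 1]; returns the composited frame."""
    labels = seg_model(frame).argmax(dim=1, keepdim=True)  # (1, 1, H, W) class map
    mask = (labels == target_class).float()                # e.g. the Road class
    styled = style_model(frame)                            # full-frame style transfer
    return mask * styled + (1 - mask) * frame              # composite per pixel
```

Running segmentation and stylization on separate GPUs, as the two-board setup suggests, is one way to keep the two networks from serializing each frame.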


Author(s): Xiaohui Wang, Yiran Lyu, Junfeng Huang, Ziying Wang, Jingyan Qin

Abstract: Artistic style transfer renders an image in the style of another image, a challenging problem in both image processing and art. Deep neural networks have been adopted for artistic style transfer with remarkable success, including AdaIN (adaptive instance normalization), WCT (whitening and coloring transforms), MST (multimodal style transfer), and SEMST (structure-emphasized multimodal style transfer). These algorithms modify the content image as a whole using only one style and one algorithm, which easily causes the foreground and background to blur together. In this paper, an iterative artistic multi-style transfer system is built to edit an image with multiple styles through flexible user interaction. First, a subjective evaluation experiment with art professionals is conducted to build an open evaluation framework for style transfer, including universal evaluation questions and personalized answers for ten typical artistic styles. Then, we propose the interactive artistic multi-style transfer system, in which an interactive image crop tool is designed to cut a content image into several parts. For each part, users select a style image and an algorithm from among AdaIN, WCT, MST, and SEMST by referring to the characteristics of the styles and algorithms summarized by the evaluation experiments. To obtain richer results, the system provides a semantic-based parameter adjustment mode and a function for preserving the colors of the content image. Finally, case studies show the effectiveness and flexibility of the system.
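The compositing behind such a system can be sketched as below; the interface is an illustrative assumption, with each user-cropped region carrying its own mask, style image, and chosen algorithm:

```python
import torch

def multi_style_composite(content, regions, algorithms):
    """content: (1, 3, H, W) image tensor.
    regions: list of (mask, style_img, algo_name), masks (1, 1, H, W) in {0, 1}.
    algorithms: dict mapping names such as "adain" or "wct" to callables
    stylize(content, style) -> stylized image."""
    out = content.clone()
    for mask, style_img, algo_name in regions:
        styled = algorithms[algo_name](content, style_img)
        out = mask * styled + (1 - mask) * out  # paste the stylized region back
    return out
```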

