What and Where the Themes Dominate in Image

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33019021 ◽

2019 ◽

Vol 33 ◽

pp. 9021-9029

Author(s):

Xinyu Xiao ◽

Lingfeng Wang ◽

Shiming Xiang ◽

Chunhong Pan

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Spatial Attention ◽

Deep Neural Network ◽

State Of The Art ◽

Learning Method ◽

Salient Object ◽

Image Captioning ◽

Object A ◽

Substantial Progress

Image captioning aims to describe an image in natural language as a human would; it has benefited from advances in deep neural networks and achieved substantial progress in performance. However, the perspective from which humans describe a scene has not been fully considered in this task. Human description of a scene is tightly related to both endogenous knowledge and exogenous salient objects, which implies that the content of a description is confined to the known salient objects. Inspired by this observation, this paper proposes a novel framework that explicitly applies the known salient objects in image captioning. Under this framework, the known salient objects serve as themes that guide description generation. According to the properties of a known salient object, a theme is composed of two components: its endogenous concept (what) and its exogenous spatial attention feature (where). Specifically, during caption prediction, the prediction of each word is dominated by the concept and spatial attention feature of the corresponding theme. Moreover, we introduce a novel learning method, Distinctive Learning (DL), to make generated captions more specific, like human descriptions. It formulates two constraints in the theme learning process to encourage distinctiveness between different images. In addition, reinforcement learning is introduced into the framework to address the exposure bias problem between the training and testing modes. Extensive experiments on the COCO and Flickr30K datasets show superior results compared with state-of-the-art methods.
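The "where" component above is a spatial attention feature. As a rough illustration only, the following sketch shows generic soft attention over image regions: relevance scores are normalised with a softmax and used to form a weighted average of region features. The function names, the score values, and the feature layout are assumptions made for this example; this is not the theme model described in the abstract.

```python
import math


def softmax(scores):
    """Turn arbitrary real-valued scores into weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, scores):
    """Generic soft attention (hypothetical helper, not the paper's method).

    region_features: one feature vector (list of floats) per image region.
    scores: one relevance score per region.
    Returns the attention weights and the weighted-average context vector.
    """
    weights = softmax(scores)
    dim = len(region_features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, context
```

With equal scores the context vector reduces to the plain mean of the region features; a much larger score for one region pulls the context toward that region's feature.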

Neural Twins Talk and Alternative Calculations

International Journal of Semantic Computing ◽

10.1142/s1793351x21500045 ◽

2021 ◽

Vol 15 (01) ◽

pp. 93-116

Author(s):

Zanyar Zohourianshahzadi ◽

Jugal K. Kalita

Keyword(s):

Neural Network ◽

Deep Learning ◽

Deep Neural Network ◽

State Of The Art ◽

Input Image ◽

Neural Pathways ◽

Image Captioning ◽

Attention Model ◽

Deep Learning Model ◽

Previous Image

Inspired by how the human brain employs more neural pathways when increasing the focus on a subject, we introduce a novel twin cascaded attention model that outperforms a state-of-the-art image captioning model that was originally implemented using one channel of attention for the visual grounding task. Visual grounding ensures the existence of words in the caption sentence that are grounded into a particular region in the input image. After a deep learning model is trained on the visual grounding task, the model employs the learned patterns regarding visual grounding and the order of objects in caption sentences when generating captions. We report the results of our experiments in three image captioning tasks on the COCO dataset. The results are reported using standard image captioning metrics to show the improvements achieved by our model over the previous image captioning model. The results gathered from our experiments suggest that employing more parallel attention pathways in a deep neural network leads to higher performance. Our implementation of Neural Twins Talk (NTT) is publicly available at: https://github.com/zanyarz/NeuralTwinsTalk .

Fast Accurate and Automatic Brushstroke Extraction

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3429742 ◽

2021 ◽

Vol 17 (2) ◽

pp. 1-24

Author(s):

Yunfei Fu ◽

Hongchuan Yu ◽

Chih-Kuo Yeh ◽

Tong-Yee Lee ◽

Jian J. Zhang

Keyword(s):

Neural Network ◽

Efficient Algorithm ◽

Deep Neural Network ◽

High Efficiency ◽

State Of The Art ◽

High Reliability ◽

The Other ◽

Manual Annotation ◽

Stroke Extraction ◽

Art Research

Brushstrokes are viewed as the artist’s “handwriting” in a painting. In many applications such as style learning and transfer, mimicking painting, and painting authentication, it is highly desirable to quantitatively and accurately identify brushstroke characteristics from old masters’ pieces using computer programs. However, because a painting contains hundreds or thousands of intermingling brushstrokes, this remains challenging. This article proposes an efficient algorithm for brush Stroke extraction based on a Deep neural network, i.e., DStroke. Compared to state-of-the-art research, the main merit of the proposed DStroke is that it automatically and rapidly extracts brushstrokes from a painting without manual annotation, while accurately approximating the real brushstrokes with high reliability. Recovering the faithful soft transitions between brushstrokes is often ignored by other methods. In fact, the details of brushstrokes in a masterpiece (e.g., shapes, colors, texture, overlaps) are highly desired by artists since they hold promise to enhance and extend the artists’ powers, just like microscopes extend biologists’ powers. To demonstrate the high efficiency of the proposed DStroke, we apply it to a set of real scans of paintings and a set of synthetic paintings. Experiments show that the proposed DStroke is noticeably faster and more accurate at identifying and extracting brushstrokes, outperforming the other methods.

Fully distributed actor-critic architecture for multitask deep reinforcement learning

The Knowledge Engineering Review ◽

10.1017/s0269888921000023 ◽

2021 ◽

Vol 36 ◽

Author(s):

Sergio Valcarcel Macua ◽

Ian Davies ◽

Aleksi Tukiainen ◽

Enrique Munoz de Cote

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Duality Theory ◽

Deep Neural Network ◽

Original Problem ◽

Almost Sure Convergence ◽

Continuous Control ◽

Access Data ◽

Central Station ◽

Common Policy

We propose a fully distributed actor-critic architecture, named diffusion-distributed-actor-critic (Diff-DAC), with application to multitask reinforcement learning (MRL). During the learning process, agents communicate their value and policy parameters to their neighbours, diffusing the information across a network of agents with no need for a central station. Each agent can only access data from its local task, but aims to learn a common policy that performs well for the whole set of tasks. The architecture is scalable, since the computational and communication cost per agent depends on the number of neighbours rather than the overall number of agents. We derive Diff-DAC from duality theory and provide novel insights into the actor-critic framework, showing that it is actually an instance of the dual-ascent method. We prove almost sure convergence of Diff-DAC to a common policy under general assumptions that hold even for deep neural network approximations. For more restrictive assumptions, we also prove that this common policy is a stationary point of an approximation of the original problem. Numerical results on multitask extensions of common continuous control benchmarks demonstrate that Diff-DAC stabilises learning and has a regularising effect that induces higher performance and better generalisation properties than previous architectures.
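To make the idea of diffusing parameters across a network without a central station more concrete, here is a minimal sketch of a single combine step in which every agent averages its parameter vector with those of its direct neighbours. The uniform averaging weights, the function name `diffusion_step`, and the data layout are assumptions chosen for illustration; the abstract does not specify the combination rule used by Diff-DAC, and the local actor-critic updates are omitted entirely.

```python
def diffusion_step(params, neighbours):
    """One neighbour-averaging step (illustrative assumption, not Diff-DAC itself).

    params: dict mapping agent id -> parameter vector (list of floats).
    neighbours: dict mapping agent id -> list of neighbouring agent ids.
    Each agent only reads parameters from its own neighbourhood, so the
    per-agent cost depends on the number of neighbours, not on the total
    number of agents.
    """
    updated = {}
    for agent, own in params.items():
        group = [own] + [params[n] for n in neighbours[agent]]
        # Component-wise uniform average over the agent and its neighbours.
        updated[agent] = [sum(values) / len(group) for values in zip(*group)]
    return updated
```

On a connected network, repeating this step drives all agents toward a common parameter vector, which mirrors the abstract's goal of a common policy. With uniform weights the consensus value is not necessarily the exact average of the initial parameters.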

Optimising Performance for NB-IoT UE Devices through Data Driven Models

Journal of Sensor and Actuator Networks ◽

10.3390/jsan10010021 ◽

2021 ◽

Vol 10 (1) ◽

pp. 21

Author(s):

Omar Nassef ◽

Toktam Mahmoodi ◽

Foivos Michelinakis ◽

Kashif Mahmood ◽

Ahmed Elmokashfi

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Gradient Descent ◽

Deep Neural Network ◽

Narrow Band ◽

Learning Algorithm ◽

Base Station ◽

User Equipment ◽

Data Driven ◽

Superior Performance

This paper presents a data-driven framework for performance optimisation of Narrow-Band IoT user equipment. The proposed framework is an edge micro-service that suggests one-time configurations to user equipment communicating with a base station. Suggested configurations are delivered from a Configuration Advocate to improve energy consumption, delay, throughput, or a combination of those metrics, depending on the user-end device and the application. Reinforcement learning utilising gradient descent and a genetic algorithm is adopted synchronously with machine and deep learning algorithms to predict the environmental states and suggest an optimal configuration. The results highlight the adaptability of the Deep Neural Network in the prediction of intermediary environmental states. Additionally, the results show superior performance of the genetic reinforcement learning algorithm with respect to performance optimisation.

Deep Neural Network Based Salient Object Detection with Image Enhancement

Neural Information Processing - Lecture Notes in Computer Science ◽

10.1007/978-3-030-04212-7_39 ◽

2018 ◽

pp. 444-453

Author(s):

Lecheng Zhou ◽

Xiaodong Gu

Keyword(s):

Neural Network ◽

Object Detection ◽

Image Enhancement ◽

Deep Neural Network ◽

Salient Object Detection ◽

Salient Object

Incentive-based demand response for smart grid with reinforcement learning and deep neural network

Applied Energy ◽

10.1016/j.apenergy.2018.12.061 ◽

2019 ◽

Vol 236 ◽

pp. 937-949 ◽

Cited By ~ 60

Author(s):

Renzhi Lu ◽

Seung Ho Hong

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Smart Grid ◽

Demand Response ◽

Deep Neural Network

An Efficient Method for Detection of DDoS Attacks on the Web Using Deep Learning Algorithms

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2021/271042021 ◽

2021 ◽

Vol 10 (4) ◽

pp. 2821-2829

Keyword(s):

Neural Network ◽

Deep Learning ◽

Deep Neural Network ◽

State Of The Art ◽

Ddos Attacks ◽

Problem Statement ◽

Neural Network Approach ◽

Learning Techniques ◽

Attack Data ◽

Deep Learning Neural Network

Recently, DDoS attacks have become the most significant threat in network security. Both industry and academia are currently debating how to detect and protect against DDoS attacks. Many studies have been proposed to detect these types of attacks. Deep learning techniques are the most suitable and efficient algorithms for categorizing normal and attack data. Hence, a deep neural network approach is proposed in this study to mitigate DDoS attacks effectively. We used a deep learning neural network to identify and classify traffic as benign or one of four different DDoS attacks. We concentrate on four different DDoS types: Slowloris, Slowhttptest, DDoS Hulk, and GoldenEye. The rest of the paper is organized as follows: first, we introduce the work; Section 2 reviews related work; Section 3 presents the problem statement; Section 4 describes the proposed methodology; Section 5 illustrates the results of the proposed methodology and shows how it outperforms state-of-the-art work; and finally, Section 6 concludes the paper.

An Analysis of State-of-the-art Activation Functions For Supervised Deep Neural Network

10.31219/osf.io/2zk6a ◽

2021 ◽

Author(s):

Anh Nguyen ◽

Khoa Pham ◽

Dat Ngo ◽

Thanh Ngo ◽

Lam Pham

Keyword(s):

Neural Network ◽

Supervised Classification ◽

Deep Neural Network ◽

State Of The Art ◽

Network Architectures ◽

Activation Functions ◽

Scene Classification ◽

Learning Network ◽

Deep Learning Network

This paper provides an analysis of state-of-the-art activation functions with respect to supervised classification with deep neural networks. These activation functions comprise the Rectified Linear Unit (ReLU), Exponential Linear Unit (ELU), Scaled Exponential Linear Unit (SELU), Gaussian Error Linear Unit (GELU), and Inverse Square Root Linear Unit (ISRLU). To evaluate them, experiments are conducted over two deep learning network architectures integrating these activation functions. The first model, based on a Multilayer Perceptron (MLP), is evaluated on the MNIST dataset to assess the performance of these activation functions. The second model, a VGGish-like architecture, is applied to Acoustic Scene Classification (ASC) Task 1A in the DCASE 2018 challenge, to evaluate whether these activation functions work well across different datasets as well as different network architectures.
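For reference, the five activation functions named above can be written as scalar functions in a few lines. This is a minimal sketch using their commonly cited definitions; the default `alpha` values and the SELU constants are the usual published defaults and are assumptions here, since the abstract does not state which hyperparameters the experiments used.

```python
import math

# Commonly cited SELU constants (assumed defaults, not taken from the abstract).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def relu(x):
    """Rectified Linear Unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def elu(x, alpha=1.0):
    """Exponential Linear Unit: smooth exponential saturation for x <= 0."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def selu(x):
    """Scaled ELU: an ELU with fixed alpha, multiplied by a fixed scale."""
    return SELU_SCALE * elu(x, SELU_ALPHA)


def gelu(x):
    """Gaussian Error Linear Unit: x times the standard normal CDF at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def isrlu(x, alpha=1.0):
    """Inverse Square Root Linear Unit: identity for x >= 0, bounded below."""
    return x if x >= 0 else x / math.sqrt(1.0 + alpha * x * x)
```

All five agree closely with the identity for large positive inputs (SELU up to its fixed scale) and differ mainly in how they treat negative inputs: ReLU clips to zero, while ELU, SELU, and ISRLU saturate at a negative value and GELU dips slightly below zero before returning to it.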

Regularized Training and Tight Certification for Randomized Smoothed Classifier with Provable Robustness

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5798 ◽

2020 ◽

Vol 34 (04) ◽

pp. 3858-3865

Author(s):

Huijie Feng ◽

Chunpeng Wu ◽

Guoyang Chen ◽

Weifeng Zhang ◽

Yang Ning

Keyword(s):

Neural Network ◽

High Probability ◽

Deep Neural Network ◽

State Of The Art ◽

Computationally Efficient ◽

Base Classifier ◽

Training Scheme ◽

Adversarial Training ◽

Gaussian Perturbation ◽

Probabilistic Robustness

Recently, smoothing deep neural network based classifiers via isotropic Gaussian perturbation has been shown to be an effective and scalable way to provide state-of-the-art probabilistic robustness guarantees against ℓ2 norm bounded adversarial perturbations. However, how to train a good base classifier that is accurate and robust when smoothed has not been fully investigated. In this work, we derive a new regularized risk, in which the regularizer can adaptively encourage the accuracy and robustness of the smoothed counterpart when training the base classifier. It is computationally efficient and can be implemented in parallel with other empirical defense methods. We discuss how to implement it under both standard (non-adversarial) and adversarial training schemes. At the same time, we also design a new certification algorithm, which can leverage the regularization effect to provide a tighter robustness lower bound that holds with high probability. Our extensive experimentation demonstrates the effectiveness of the proposed training and certification approaches on the CIFAR-10 and ImageNet datasets.
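As background for the smoothing setup described above, the sketch below shows the basic randomized smoothing procedure: the smoothed classifier predicts by majority vote of a base classifier over Gaussian-perturbed copies of the input. It also includes the widely used ℓ2 radius σ·Φ⁻¹(p) from the randomized smoothing literature, where p is a lower bound on the probability of the top class. This is the standard baseline, not the regularized training or the tighter certification algorithm proposed in this paper, and the function names and the list-of-floats input format are assumptions made for the example.

```python
import random
from collections import Counter
from statistics import NormalDist


def smoothed_predict(base_classifier, x, sigma, n_samples, rng):
    """Monte Carlo estimate of the smoothed classifier's prediction.

    base_classifier: callable mapping a list of floats to a hashable label.
    x: input as a list of floats; sigma: Gaussian noise standard deviation.
    rng: a random.Random instance, passed in for reproducibility.
    Returns the majority label and its empirical vote frequency.
    """
    votes = Counter()
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy)] += 1
    label, count = votes.most_common(1)[0]
    return label, count / n_samples


def l2_radius(p_lower, sigma):
    """Standard smoothing radius sigma * Phi^{-1}(p_lower).

    p_lower must be a valid lower confidence bound on the top-class
    probability (strictly below 1); the raw vote frequency is not a
    certificate on its own. Returns 0.0 when p_lower <= 0.5.
    """
    if not 0.0 <= p_lower < 1.0:
        raise ValueError("p_lower must lie in [0, 1)")
    if p_lower <= 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(p_lower)
```

A real certification pipeline would turn the vote counts into a high-probability lower bound (for example with a binomial confidence interval) before calling `l2_radius`; that step is left out here to keep the sketch short.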

Covid-19 detection via deep neural network and occlusion sensitivity maps

10.36227/techrxiv.14100890 ◽

2021 ◽

Author(s):

Noor Ahmad ◽

Muhammad Aminu ◽

Mohd Halim Mohd Noor

Keyword(s):

Neural Network ◽

Deep Learning ◽

Deep Neural Network ◽

State Of The Art ◽

Color Images ◽

Fine Tuning ◽

Training Dataset ◽

Learning Approaches ◽

Learning Models ◽

Sensitivity Maps

Deep learning approaches have attracted a lot of attention in the automatic detection of Covid-19, and transfer learning is the most common approach. However, the majority of pre-trained models are trained on color images, which can cause inefficiencies when fine-tuning the models on Covid-19 images, which are often grayscale. To address this issue, we propose a deep learning architecture called CovidNet, which requires a relatively small number of parameters. CovidNet accepts grayscale images as inputs and is suitable for training with a limited training dataset. Experimental results show that CovidNet outperforms other state-of-the-art deep learning models for Covid-19 detection.
