Adaptive Activation Network and Functional Regularization for Efficient and Flexible Deep Multi-Task Learning

Multi-task learning (MTL) is a common paradigm that seeks to improve generalization performance by training related tasks simultaneously. However, searching for a flexible and accurate architecture that can be shared among multiple tasks remains a challenging problem. In this paper, we propose a novel deep learning model called Task Adaptive Activation Network (TAAN) that can automatically learn the optimal network architecture for MTL. The main principle of TAAN is to derive flexible activation functions for different tasks from the data, with all other parameters of the network fully shared. We further propose two functional regularization methods that improve the MTL performance of TAAN. The improved performance of both TAAN and the regularization methods is demonstrated by comprehensive experiments.
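
The idea of task-specific activation functions on top of fully shared weights can be illustrated with a minimal sketch. The abstract does not say how TAAN parameterizes its activations, so the weighted sum of basis functions used here, the choice of bases, and all names and values are assumptions made for illustration only.

```python
import math

# Hypothetical basis functions; the abstract does not say which ones TAAN uses.
BASIS = (
    lambda x: max(0.0, x),  # ReLU
    math.tanh,              # tanh
    lambda x: x,            # identity
)


def task_activation(x, coeffs):
    """Task-specific activation: a weighted sum of the shared basis functions."""
    return sum(c * f(x) for c, f in zip(coeffs, BASIS))


def forward(inputs, shared_weights, shared_bias, coeffs):
    """One dense layer: weights and bias are shared, only coeffs differ per task."""
    outputs = []
    for row, bias in zip(shared_weights, shared_bias):
        pre_activation = sum(w * v for w, v in zip(row, inputs)) + bias
        outputs.append(task_activation(pre_activation, coeffs))
    return outputs


# Shared parameters (illustrative values, not taken from the paper).
W = [[1.0, -1.0], [0.5, 0.5]]
b = [0.0, -1.0]

# Per-task coefficients; in training these would be learned from data.
task_coeffs = {
    "task_a": [1.0, 0.0, 0.0],  # behaves like ReLU
    "task_b": [0.0, 0.0, 1.0],  # behaves like the identity
}

x = [1.0, 2.0]
out_a = forward(x, W, b, task_coeffs["task_a"])
out_b = forward(x, W, b, task_coeffs["task_b"])
print(out_a, out_b)
```

Both tasks see the same pre-activations because `W` and `b` are shared; only the small coefficient vectors differ, which is what keeps the per-task parameter cost low in this sketch.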

An Experimental Study on State Representation Extraction for Vision-Based Deep Reinforcement Learning

Applied Sciences ◽

10.3390/app112110337 ◽

2021 ◽

Vol 11 (21) ◽

pp. 10337

Author(s):

Junkai Ren ◽

Yujun Zeng ◽

Sihang Zhou ◽

Yichuan Zhang

Keyword(s):

Experimental Study ◽

Reinforcement Learning ◽

Network Architecture ◽

Representation Learning ◽

Evaluation Metrics ◽

High Dimensional ◽

Regularization Methods ◽

Challenging Problem ◽

State Representation ◽

Sample Quality

Scaling end-to-end learning to control robots with vision inputs is a challenging problem in the field of deep reinforcement learning (DRL). While achieving remarkable success in complex sequential tasks, vision-based DRL remains extremely data-inefficient, especially when dealing with high-dimensional pixel inputs. Many recent studies have tried to leverage state representation learning (SRL) to break through this barrier. Some of them could even help the agent learn from pixels as efficiently as from states. Reproducing existing work, accurately judging the improvements offered by novel methods, and applying these approaches to new tasks are vital for sustaining this progress. However, meeting the demands of these three aspects is seldom straightforward. Without significant criteria and tighter standardization of experimental reporting, it is difficult to determine whether improvements over previous methods are meaningful. For this reason, we conducted ablation studies on hyperparameters, embedding network architecture, embedded dimension, regularization methods, sample quality and SRL methods to systematically compare and analyze their effects on representation learning and reinforcement learning. Three evaluation metrics are summarized, and five baseline algorithms (both value-based and policy-based) and eight tasks are adopted to avoid dependence on any particular experimental setting. Based on a large number of experimental analyses, we highlight the variability in reported methods and suggest guidelines to make future results in SRL more reproducible and stable. We aim to spur discussion about how to assure continued progress in the field by minimizing wasted effort stemming from results that are non-reproducible and easily misinterpreted.

Regional stabilization and H∞ congestion control with input saturation

Transactions of the Institute of Measurement and Control ◽

10.1177/0142331221992739 ◽

2021 ◽

pp. 014233122199273

Author(s):

Sadek Belamfedel Alaoui ◽

El Houssaine Tissir ◽

Noreddine Chaibi ◽

Fatima El Haoussi

Keyword(s):

Linear Systems ◽

Controller Design ◽

Input Saturation ◽

Domain Of Attraction ◽

Queue Management ◽

Variable Delay ◽

Challenging Problem ◽

Regional Stabilization ◽

Small Gain ◽

Improved Performance

Designing robust active queue management subject to network imperfections is a challenging problem. Motivated by this topic, we address the problem of controller design for linear systems with variable delay and unsymmetrical constraints using the scaled small gain theorem. We design two mechanisms: robust enhanced proportional derivative, and robust enhanced proportional derivative subject to input saturation. A discussion of their practical implementation, along with extensive comparisons in MATLAB and NS3, illustrates the improved performance and the enlargement of the domain of attraction relative to some results in the literature.

An edge based hybrid intrusion detection framework for mobile edge computing

Complex & Intelligent Systems ◽

10.1007/s40747-021-00498-4 ◽

2021 ◽

Author(s):

Ashish Singh ◽

Kakali Chatterjee ◽

Suresh Chandra Satapathy

Keyword(s):

Intrusion Detection ◽

Real Time ◽

Network Architecture ◽

Denial Of Service ◽

Edge Computing ◽

Mobile Edge Computing ◽

Hybrid Detection ◽

Game Theoretical Approach ◽

Improved Performance ◽

Edge Based

The Mobile Edge Computing (MEC) model attracts more users to its services due to its characteristics and rapid delivery approach. This network architecture enables users to access information from the edge of the network. However, the security of this edge network architecture is a major challenge. All MEC services are shared and accessed by users via the Internet. Attacks such as user-to-root, remote login, Denial of Service (DoS), snooping, and port scanning are possible in this computing environment because the service is remote and Internet-based. Intrusion detection is an approach to protect the network by detecting attacks. Existing detection models can detect only known attacks, and their efficiency in monitoring real-time network traffic is low. Hence, there is a need for an Edge-based Hybrid Intrusion Detection Framework (EHIDF) that not only detects known attacks but is also capable of detecting unknown attacks in real time with a low False Alarm Rate (FAR). This paper proposes an EHIDF that relies mainly on a Machine Learning (ML) approach for detecting intrusive traffic in the MEC environment. The proposed framework consists of three intrusion detection modules with three different classifiers. The Signature Detection Module (SDM) uses a C4.5 classifier, the Anomaly Detection Module (ADM) uses a Naive-based classifier, and the Hybrid Detection Module (HDM) uses the Meta-AdaboostM1 algorithm. The developed EHIDF can solve the present detection problems by detecting new unknown attacks with low FAR. The implementation results show that the EHIDF accuracy is 90.25% and the FAR is 1.1%. Compared with previous works, these results show improved performance: the accuracy is improved by up to 10.78% and the FAR is reduced by up to 93%. A game-theoretical approach is also discussed to analyze the security strength of the proposed framework.
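
The two reported metrics, accuracy and False Alarm Rate, can be computed from confusion-matrix counts as sketched below. The formulas are the definitions commonly used in intrusion detection work; the abstract does not state its exact definitions, and the counts are hypothetical values, not data from the paper.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_alarm_rate(fp, tn):
    """Fraction of benign samples that were wrongly flagged as attacks."""
    return fp / (fp + tn)


# Hypothetical counts: attacks detected (tp), benign passed (tn),
# benign flagged (fp), attacks missed (fn).
tp, tn, fp, fn = 80, 90, 10, 20

acc = accuracy(tp, tn, fp, fn)
far = false_alarm_rate(fp, tn)
print(f"accuracy={acc:.2%}, FAR={far:.2%}")
```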

A TOPSIS-ELM framework for stock index price movement prediction

Intelligent Decision Technologies ◽

10.3233/idt-200013 ◽

2021 ◽

pp. 1-19

Author(s):

Sidharth Samal ◽

Rajashree Dash

Keyword(s):

Network Architecture ◽

Binary Classification ◽

Evaluation Criteria ◽

Multiple Criteria ◽

Stock Index ◽

Activation Functions ◽

Price Movement ◽

Movement Prediction ◽

Learning Machine ◽

Approximation Capability

In recent years, the Extreme Learning Machine (ELM) has gained the interest of various researchers due to its superior generalization and approximation capability. The network architecture and the type of activation function are the two important factors that influence the performance of an ELM. Hence, in this study, a Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) oriented multi-criteria decision making (MCDM) framework is suggested for analyzing various ELM models developed with distinct activation functions with respect to sixteen evaluation criteria. Evaluating the performance of the ELM with respect to multiple criteria instead of a single criterion can help in designing a more robust network. The proposed framework is used as a binary classification system for addressing the problem of stock index price movement prediction. The model is empirically evaluated using historical data of three stock indices: BSE SENSEX, S&P 500 and NIFTY 50. The empirical study has disclosed promising results by evaluating ELM with different activation functions as well as multiple criteria.
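
A minimal sketch of the standard TOPSIS procedure shows how candidate models can be ranked across several criteria at once. The three candidate rows, the two criteria, and the equal weights are hypothetical; the paper uses sixteen criteria that the abstract does not list.

```python
import math


def topsis(matrix, weights, benefit):
    """Return one closeness score in [0, 1] per alternative; higher is better.

    matrix:  rows are alternatives, columns are criteria.
    benefit: benefit[j] is True when larger values of criterion j are better.
    Assumes no column is all zeros and the alternatives are not all identical.
    """
    n_criteria = len(weights)
    # Vector-normalize each column, then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_criteria)] for row in matrix]
    # Ideal and anti-ideal points, taking the criterion direction into account.
    columns = list(zip(*v))
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    anti = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal solution
        d_neg = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical candidates: [accuracy (benefit), training time in seconds (cost)].
candidates = [[0.90, 1.0], [0.60, 3.0], [0.75, 2.0]]
scores = topsis(candidates, weights=[0.5, 0.5], benefit=[True, False])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```

In this toy input the first candidate is best on both criteria and the second is worst on both, so they land at the two ends of the score range and the third falls in between.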

Grammar Guided Genetic Programming for Network Architecture Search and Road Detection on Aerial Orthophotography

10.20944/preprints202005.0002.v1 ◽

2020 ◽

Author(s):

Víctor de la Fuente Castillo ◽

Alberto Díaz-Álvarez ◽

Miguel-Ángel Manso-Callejo ◽

Francisco Serradilla García

Keyword(s):

Genetic Programming ◽

Network Architecture ◽

Structural Diversity ◽

Network Architectures ◽

Complex Structures ◽

Road Detection ◽

Learning Techniques ◽

Candidate Network ◽

Optimal Network ◽

Hidden Layer

Photogrammetry involves aerial photography of the earth’s surface and subsequent processing of the images to provide a more accurate depiction of the area (orthophotography). It is used by the Spanish Instituto Geográfico Nacional to update road cartography but requires a significant amount of manual labor due to the need to visually inspect all tiled images. Deep learning techniques (artificial neural networks with more than one hidden layer) can perform road detection, but it is still unclear how to find the optimal network architecture. Our system applies grammar guided genetic programming to the search for deep neural network architectures. In this kind of evolutionary algorithm, all individuals in the population (here, candidate network architectures) are constrained by rules specified in a grammar that defines valid and useful structural patterns to guide the search process. The grammar used includes well-known complex structures (e.g. Inception-like modules) and is combined with a custom-designed mutation operator that dynamically links the mutation probability to structural diversity. Pilot results show that the system is able to design models for road detection that obtain test accuracies similar to those reached by state-of-the-art models when evaluated on a dataset from the Spanish National Aerial Orthophotography Plan.

A Survey on Intrusion Detection System for Software Defined Networks (SDN)

Research Anthology on Artificial Intelligence Applications in Security ◽

10.4018/978-1-7998-7705-9.ch023 ◽

2021 ◽

pp. 467-489

Author(s):

Yogita Hande ◽

Akkalashmi Muddana

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Intrusion Detection ◽

Network Architecture ◽

Detection System ◽

Learning Approach ◽

Distinctive Features ◽

Detection Systems ◽

Security Challenges ◽

Deep Learning Model

Presently, given the widespread growth of the internet, traditional networks, with their static nature, have limited capacity to cope with organizational business needs. The new network architecture, software defined networking (SDN), appeared to address these challenges and provides distinctive features. However, the programmable and centralized approach of SDN faces new security challenges, which demand innovative security mechanisms such as intrusion detection systems (IDSs). IDSs for SDN are currently designed with a machine learning approach; however, a deep learning approach is also being explored to achieve better efficiency and accuracy. In this article, an overview of SDN with its security concerns, and of IDS as a security solution, is given. A survey of existing security solutions designed to secure SDN, and a comparative study of various IDS approaches based on deep learning models and machine learning methods, are presented. Finally, we describe future directions for SDN security.

Investigation of Optimal Network Architecture for Asparagus Spear Detection in Robotic Harvesting

IFAC-PapersOnLine ◽

10.1016/j.ifacol.2019.12.535 ◽

2019 ◽

Vol 52 (30) ◽

pp. 283-287 ◽

Cited By ~ 4

Author(s):

M. Peebles ◽

S.H. Lim ◽

M. Duke ◽

B. McGuinness

Keyword(s):

Network Architecture ◽

Optimal Network ◽

Robotic Harvesting

Transitive Topic Modeling with Conversational Structure Context: Discovering Topics that are Most Popular in Online Discussions

International Journal of Semantic Computing ◽

10.1142/s1793351x20400103 ◽

2020 ◽

Vol 14 (02) ◽

pp. 273-293

Author(s):

Yingcheng Sun ◽

Richard Kolacinski ◽

Kenneth Loparo

Keyword(s):

Social Media ◽

Topic Modeling ◽

Topic Model ◽

Online Discussions ◽

Challenging Problem ◽

Topic Extraction ◽

Limited Success ◽

Social Media Platforms ◽

Improved Performance ◽

Conversational Structure

With the explosive growth of online discussions published every day on social media platforms, comprehension and discovery of the most popular topics have become a challenging problem. Conventional topic models have had limited success in online discussions because the corpus is extremely sparse and noisy. To overcome their limitations, we use the discussion thread tree structure and propose a “popularity” metric, which quantifies the number of replies to a comment and is used to extend the frequency of word occurrences, and a “transitivity” concept to characterize topic dependency among nodes in a nested discussion thread. We build a Conversational Structure Aware Topic Model (CSATM) based on popularity and transitivity to infer topics and their assignments to comments. Experiments on real forum datasets demonstrate improved performance for topic extraction under six different measurements of coherence, and impressive accuracy for topic assignments.
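
One way to picture the popularity idea is to let the words of a comment count more when that comment attracts replies. The sketch below is an assumed weighting (one count for the comment itself plus one per direct reply) on a hypothetical thread; the abstract does not give the actual formula used by CSATM.

```python
from collections import Counter

# Hypothetical thread: comment id -> (parent id or None, text).
thread = {
    1: (None, "gpu prices rising"),
    2: (1, "gpu shortage again"),
    3: (1, "prices will drop"),
    4: (2, "shortage is real"),
}


def reply_counts(thread):
    """Number of direct replies each comment received."""
    counts = {cid: 0 for cid in thread}
    for parent, _ in thread.values():
        if parent is not None:
            counts[parent] += 1
    return counts


def weighted_word_counts(thread):
    """Count each word once for its comment plus once per direct reply (assumed weighting)."""
    replies = reply_counts(thread)
    totals = Counter()
    for cid, (_, text) in thread.items():
        weight = 1 + replies[cid]
        for word in text.split():
            totals[word] += weight
    return totals


counts = weighted_word_counts(thread)
print(counts.most_common(3))
```

Words from the heavily replied root comment end up with larger counts than the same words would get from plain frequency, which is one way to counter the sparsity of short comments.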

An Efficient Perpetual Learning Algorithm

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213019500222 ◽

2019 ◽

Vol 28 (07) ◽

pp. 1950022 ◽

Cited By ~ 1

Author(s):

Haiou Qin ◽

Du Zhang ◽

Xibin Sun ◽

Jiahua Tang ◽

Jun Peng

Keyword(s):

Machine Learning ◽

Real World ◽

Efficient Algorithm ◽

Learning Algorithm ◽

Small Data ◽

Computing Systems ◽

Agent Systems ◽

Multiple Tasks ◽

Improved Performance ◽

Over Time

One of the emerging research opportunities in machine learning is to develop computing systems that learn many tasks continuously and improve the performance of learned tasks incrementally over time. In the real world, learners have to adapt to labeled and unlabeled samples from various tasks, which arrive randomly. In this paper, we propose an efficient algorithm called Efficient Perpetual Learning Algorithm (EPLA), which is suitable for learning multiple tasks in both offline and online settings. The algorithm, which is an extension of ELLA, is part of what we call perpetual learning: it can learn new tasks or refine knowledge of learned tasks for improved performance with newly arrived labeled samples in an incremental fashion. EPLA has several salient features. The learning episodes are triggered via either extrinsic or intrinsic stimuli. Agent systems based on the proposed algorithm can be engaged in an open-ended and alternating sequence of learning episodes and working episodes. Unlabeled samples can be used to self-train the learner in small-data settings. Compared with ELLA, EPLA shows almost equivalent performance without memorizing any previously learned labeled samples.

A Hybrid Wavelet Transform Based Short-Term Wind Speed Forecasting Approach

The Scientific World JOURNAL ◽

10.1155/2014/914127 ◽

2014 ◽

Vol 2014 ◽

pp. 1-12 ◽

Cited By ~ 3

Author(s):

Jujie Wang

Keyword(s):

Wavelet Transform ◽

Wind Speed ◽

Network Architecture ◽

Hybrid Approach ◽

Experimental Simulation ◽

Wind Speed Forecasting ◽

Optimal Network ◽

Partial Autocorrelation Function ◽

Parks Management ◽

Hidden Layer

It is important to improve the accuracy of wind speed forecasting for wind park management and wind power utilization. In this paper, a novel hybrid approach known as WTT-TNN is proposed for wind speed forecasting. In the first step of the approach, a wavelet transform technique (WTT) is used to decompose the wind speed series into an approximate scale and several detailed scales. In the second step, a two-hidden-layer neural network (TNN) is used to predict the approximate scale and each of the detailed scales separately. To find the optimal network architecture, the partial autocorrelation function is adopted to determine the number of neurons in the input layer, and an experimental simulation is carried out to determine the number of neurons within each hidden layer in the modeling process of the TNN. Afterwards, the final prediction value is obtained as the sum of these prediction results. In this study, the WTT is employed to extract the different patterns of the wind speed, making them easier to forecast. To evaluate the performance of the proposed approach, it is applied to forecast wind speed in the Hexi Corridor of China. Simulation results in four different cases show that the proposed method increases wind speed forecasting accuracy.
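
The decompose, predict, and sum pipeline can be sketched in a few lines. This is a simplified illustration under stated assumptions: it uses a single-level Haar split (the abstract does not name the wavelet family or the number of levels), and a repeat-the-last-value placeholder stands in for the two-hidden-layer neural network. The wind speed values are hypothetical.

```python
def haar_level1(series):
    """Split an even-length series into two additive components of the same length."""
    approx, detail = [], []
    for i in range(0, len(series), 2):
        a = (series[i] + series[i + 1]) / 2  # local average (approximate scale)
        d = (series[i] - series[i + 1]) / 2  # local difference (detailed scale)
        approx += [a, a]
        detail += [d, -d]
    return approx, detail


def persistence_forecast(component):
    """Placeholder predictor: repeat the last value (the paper uses a neural network)."""
    return component[-1]


# Hypothetical wind speed readings in m/s.
wind = [4.0, 6.0, 5.0, 9.0]

approx, detail = haar_level1(wind)
reconstructed = [a + d for a, d in zip(approx, detail)]

# Final forecast is the sum of the per-component forecasts.
forecast = persistence_forecast(approx) + persistence_forecast(detail)
print(approx, detail, reconstructed, forecast)
```

Because the two components add back up to the original series, forecasting each one separately and summing the results yields a forecast on the original scale.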
