Online Clustering Algorithms and Reinforcement Learning

A Reinforcement Learning Approach to Online Clustering

Neural Computation ◽

10.1162/089976699300016025 ◽

1999 ◽

Vol 11 (8) ◽

pp. 1915-1932 ◽

Cited By ~ 19

Author(s):

Aristidis Likas

Keyword(s):

Reinforcement Learning ◽

Clustering Algorithms ◽

Experimental Tests ◽

Competitive Learning ◽

Learning System ◽

Learning Approach ◽

Data Sets ◽

General Technique ◽

Online Clustering ◽

Learning Framework

A general technique is proposed for embedding online clustering algorithms based on competitive learning in a reinforcement learning framework. The basic idea is that the clustering system can be viewed as a reinforcement learning system that learns through reinforcements to follow the clustering strategy we wish to implement. Following this idea, we propose the reinforcement-guided competitive learning (RGCL) algorithm, a reinforcement-based adaptation of learning vector quantization (LVQ) with enhanced clustering capabilities. In addition, we suggest extensions of RGCL and LVQ that are characterized by the property of sustained exploration and significantly improve the performance of those algorithms, as indicated by experimental tests on well-known data sets.
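
For orientation, the competitive-learning baseline that this abstract builds on can be sketched as a winner-take-all update: each incoming sample moves only its nearest prototype a small step toward itself. This is a minimal sketch of plain online competitive learning, not the RGCL algorithm from the paper; the function names, the learning rate `lr`, and the toy two-blob stream are all assumptions made for illustration.

```python
import random

def nearest(prototypes, x):
    """Return the index of the prototype closest to x (squared Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(prototypes[i], x)))

def online_competitive_learning(stream, prototypes, lr=0.05):
    """Process samples one at a time; only the winning prototype is updated."""
    protos = [list(p) for p in prototypes]
    for x in stream:
        i = nearest(protos, x)
        # Move the winner a fraction lr of the way toward the sample.
        protos[i] = [w + lr * (v - w) for w, v in zip(protos[i], x)]
    return protos

rng = random.Random(0)
# Hypothetical toy stream: alternating samples from blobs around (0, 0) and (5, 5).
stream = [(rng.gauss(c, 0.3), rng.gauss(c, 0.3))
          for _ in range(500) for c in (0.0, 5.0)]
protos = online_competitive_learning(stream, [(1.0, 1.0), (4.0, 4.0)])
print(protos)
```

With these well-separated blobs and reasonable starting points, each prototype settles near one blob centre. Poorer initializations can leave a prototype unused, which is the kind of weakness that exploration-based extensions aim to reduce.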

Online Clustering Algorithms for Semantic-Rich Network Trajectories

Journal of Computing Science and Engineering ◽

10.5626/jcse.2011.5.4.346 ◽

2011 ◽

Vol 5 (4) ◽

pp. 346-353 ◽

Cited By ~ 2

Author(s):

Gook-Pil Roh ◽

Seung-Won Hwang

Keyword(s):

Clustering Algorithms ◽

Online Clustering

Catastrophic Interference in Reinforcement Learning: A Solution Based on Context Division and Knowledge Distillation

10.36227/techrxiv.15105492.v1 ◽

2021 ◽

Author(s):

Tiantian Zhang ◽

Xueqian Wang ◽

Bin Liang ◽

Bo Yuan

Keyword(s):

Neural Networks ◽

Reinforcement Learning ◽

Training Data ◽

Learning Ability ◽

High Dimensional ◽

Online Clustering ◽

Computational Overhead ◽

Knowledge Distillation ◽

The Stability ◽

Catastrophic Interference

The powerful learning ability of deep neural networks enables reinforcement learning (RL) agents to learn competent control policies directly from high-dimensional and continuous environments. In theory, to achieve stable performance, neural networks assume i.i.d. inputs, which unfortunately does not hold in the general RL paradigm, where the training data is temporally correlated and non-stationary. This issue may lead to the phenomenon of "catastrophic interference" (a.k.a. "catastrophic forgetting") and a collapse in performance, as later training is likely to overwrite and interfere with previously learned good policies. In this paper, we introduce the concept of "context" into single-task RL and develop a novel scheme, termed Context Division and Knowledge Distillation (CDaKD) driven RL, to divide all states experienced during training into a series of contexts. Its motivation is to mitigate the aforementioned catastrophic interference in deep RL, thereby improving the stability and plasticity of RL models. At the heart of CDaKD is a value function, parameterized by a neural network feature extractor shared across all contexts, and a set of output heads, each specializing in an individual context. In CDaKD, we exploit online clustering to achieve context division, and interference is further alleviated by a knowledge distillation regularization term on the output layers for learned contexts. In addition, to effectively obtain the context division in high-dimensional state spaces (e.g., image inputs), we perform clustering in the lower-dimensional representation space of a randomly initialized convolutional encoder, which is fixed throughout training. Our results show that, with various replay memory capacities, CDaKD can consistently improve the performance of existing RL algorithms on classic OpenAI Gym tasks and the more complex high-dimensional Atari tasks, incurring only moderate computational overhead.

Automatic hyperparameter optimization for clustering algorithms with reinforcement learning

Scientific and technical journal of information technologies mechanics and optics ◽

10.17586/2226-1494-2019-19-3-508-515 ◽

2019 ◽

Vol 19 (3) ◽

pp. 508-515

Author(s):

S.B. Muravyov ◽

V.A. Efimova ◽

V.V. Shalamov ◽

A.A. Filchenkov ◽

I.B. Smetannikov

Keyword(s):

Reinforcement Learning ◽

Clustering Algorithms ◽

Hyperparameter Optimization

Botnet Detection Using On-line Clustering with Pursuit Reinforcement Competitive Learning (PRCL)

EMITTER International Journal of Engineering Technology ◽

10.24003/emitter.v6i1.207 ◽

2018 ◽

Vol 6 (1) ◽

pp. 1-21

Author(s):

Yesta Medya Mahardhika ◽

Amang Sudarsono ◽

Ali Ridho Barakbah

Keyword(s):

Reinforcement Learning ◽

Personal Information ◽

Competitive Learning ◽

Classification Model ◽

Experimental Result ◽

Classification Methods ◽

New Paradigm ◽

Botnet Detection ◽

Online Clustering ◽

On Line

Botnets are a prevalent form of malicious software that can perform malicious activities such as DDoS, spamming, phishing, keylogging, click fraud, and theft of personal information and important data. Botnets can replicate themselves without user consent. Several botnet detection systems have been built using classification methods. Classification methods have high precision, but determining an appropriate classification model takes considerable effort. In this paper, we propose a reinforced approach to detecting botnets with online clustering using reinforcement learning. Reinforcement learning involves interaction with the environment and has become a new paradigm in machine learning. The reinforcement learning is implemented together with rule-based detection, because the ISCX botnet dataset is unbalanced, with class sizes that vary widely. We therefore detect botnets using Pursuit Reinforcement Competitive Learning (PRCL) with additional detection rules that include reward and punishment rules. Based on the experimental results, PRCL can detect botnets in real time with high accuracy (100% for Neris, 99.9% for Rbot, 78% for SMTP_Spam, 80.9% for Nsis, 80.7% for Virut, and 96.0% for Zeus) and a fast processing time of up to 176 ms. CPU and memory usage are 78% and 4.3 GB for pre-processing, 34% and 3.18 GB for online clustering with PRCL, and 23% and 3.11 GB for evaluation. The proposed method offers network administrators one solution for detecting botnets, whose behavior in network traffic is unpredictable.

Online clustering algorithms for radar emitter classification

IEEE Transactions on Pattern Analysis and Machine Intelligence ◽

10.1109/tpami.2005.166 ◽

2005 ◽

Vol 27 (8) ◽

pp. 1185-1196 ◽

Cited By ~ 56

Author(s):

Jun Liu ◽

J.P.Y. Lee ◽

Lingjie Li ◽

Zhi-Quan Luo ◽

K.M. Wong

Keyword(s):

Clustering Algorithms ◽

Online Clustering

Online Clustering Algorithms

International Journal of Neural Systems ◽

10.1142/s0129065708001518 ◽

2008 ◽

Vol 18 (03) ◽

pp. 185-194 ◽

Cited By ~ 44

Author(s):

Wesam Barbakh ◽

Colin Fyfe

Keyword(s):

Initial Conditions ◽

Clustering Algorithms ◽

Global Optimum ◽

Local Optimum ◽

Data Sets ◽

Performance Function ◽

Online Clustering ◽

Standard Data ◽

Latent Space ◽

Online Learning Algorithms

We introduce a set of clustering algorithms whose performance function is such that the algorithms overcome one of the weaknesses of K-means: its sensitivity to initial conditions, which leads it to converge to a local optimum rather than the global optimum. We derive online learning algorithms and illustrate their convergence to optimal solutions which K-means fails to find. We then extend the algorithm by underpinning it with a latent space, which enables a topology-preserving mapping to be found. We show visualisation results on some standard data sets.

Combination of online Clustering and Q-value based genetic reinforcement learning for fuzzy network design

Proceedings of the International Joint Conference on Neural Networks, 2003. ◽

10.1109/ijcnn.2003.1223695 ◽

2004 ◽

Author(s):

Chia-Feng Juang ◽

Chun-Feng Lu

Keyword(s):

Reinforcement Learning ◽

Network Design ◽

Q Value ◽

Online Clustering ◽

Fuzzy Network

Analysis of Incremental Cluster Validity for Big Data Applications

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽

10.1142/s0218488518400111 ◽

2018 ◽

Vol 26 (Suppl. 2) ◽

pp. 47-62 ◽

Cited By ~ 2

Author(s):

Omar A. Ibrahim ◽

Yiqing Wang ◽

James M. Keller

Keyword(s):

Clustering Algorithm ◽

Distance Measure ◽

Dimensional Space ◽

Clustering Algorithms ◽

Internal Validity ◽

Streaming Data ◽

Cluster Validity ◽

Cluster Validity Index ◽

Online Clustering ◽

Validity Indices

Online clustering has attracted attention due to the explosion of ubiquitous continuous sensing. Streaming clustering algorithms need to look for new structures and adapt as the data evolves, so that outliers are detected and newly emerging clusters are formed automatically. The performance of a streaming clustering algorithm needs to be monitored over time to understand the behavior of the streaming data in terms of newly emerging clusters and the number of outlier data points. Small datasets with 2 or 3 dimensions can be monitored by plotting the clustering results as the data evolves. However, as the size and dimensionality of streaming data increase, plotting the clustering result becomes infeasible. Therefore, incremental internal cluster validity indices (iCVIs) could be applied for monitoring the performance of an online clustering algorithm. In this paper, we study the internal incremental Davies-Bouldin (iDB) cluster validity index in the context of big streaming data analysis. We also study the effect of a large number of samples on the values of the iCVI (iDB). Finally, we propose a way to project streaming data into a lower-dimensional space for cases where the distance measure does not perform as expected in the high-dimensional space.
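
As background for the index named in this abstract, the standard (batch) Davies-Bouldin index averages, over clusters, the worst-case ratio of within-cluster scatter to between-centroid distance; lower values indicate more compact, better-separated clusters. The sketch below computes that batch form only and is not the incremental iDB variant studied in the paper; the function names and the toy clusters are assumptions for illustration, and the code assumes at least two non-empty clusters with distinct centroids.

```python
import math

def centroid(points):
    """Mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def davies_bouldin(clusters):
    """Batch Davies-Bouldin index for a list of clusters (lists of points)."""
    cents = [centroid(c) for c in clusters]
    # Scatter S_i: average distance from each point to its cluster centroid.
    scatter = [sum(math.dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, cents)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster j.
        total += max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                     for j in range(k) if j != i)
    return total / k

# Hypothetical toy data: two clusters with scatter 1 whose centroids are 4 apart.
score = davies_bouldin([[(0.0, 0.0), (0.0, 2.0)], [(4.0, 0.0), (4.0, 2.0)]])
print(score)  # 0.5
```

An incremental version would have to maintain the centroids and scatter values as samples arrive rather than recomputing them from stored points.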

Online data clustering algorithms in an RTLS system

Acta Universitatis Sapientiae Informatica ◽

10.2478/ausi-2014-0001 ◽

2013 ◽

Vol 5 (1) ◽

pp. 5-15

Author(s):

Tamás Németh ◽

Sándor Nagy ◽

Csanád Imreh

Keyword(s):

Mathematical Model ◽

Clustering Algorithms ◽

Optimal Solution ◽

Online Data ◽

Online Clustering ◽

Clustering Problem ◽

Processing Step ◽

Incoming Information ◽

General Mathematical Model ◽

Special Case

This paper proposes and evaluates solutions for an online clustering problem and gives a mathematical model for it. The problem often occurs in the fusion of data streams, for example in real-time locating systems (RTLS). The goal is to gather as much incoming information from several sources as possible while minimizing the delay before the next processing step can be executed. The key characteristic is that the data arrives in bursts; in the special case of an RTLS, according to the locating cycles. After an introduction to the background, a general mathematical model for the problem is given, and then two basic algorithms, referred to as NWT and CWT, are analyzed by the method of competitive analysis, each turning out to deliver an optimal solution under different constraints. An experimental evaluation follows, based on a simulation involving CWT and an algorithm referred to as VWT. The latter gives a configuration-free solution to the problem.
