Robust anomaly detection algorithms for real-time big data: Comparison of algorithms

Author(s):  
Zirije Hasani
2020 ◽  
Vol 1 (1) ◽  
pp. 35-42
Author(s):  
Péter Ekler ◽  
Dániel Pásztor

Abstract. Artificial intelligence has developed enormously in recent years; as a result it now appears in some form in many different fields and has become an integral part of a great deal of research. This is mainly due to ever-improving learning algorithms and to the Big Data environment, which can supply huge amounts of training data. The aim of the article is to summarize the current state of the technology. It presents the history of artificial intelligence and a large share of the application areas in which artificial intelligence is the central element. It also points out various security vulnerabilities of artificial intelligence and its usability in the field of cybersecurity. The article presents a slice of current artificial intelligence applications that illustrate the wide range of uses.
Summary. In the past years artificial intelligence has seen several improvements, which drove its usage to grow in many different areas and made it the focus of much research. This can be attributed to improvements in learning algorithms and Big Data techniques, which can provide tremendous amounts of training data. The goal of this paper is to summarize the current state of artificial intelligence. We present its history, introduce the terminology used, and show technological areas that use artificial intelligence as a core part of their applications. The paper also introduces the security concerns related to artificial intelligence solutions and highlights how the technology can be used to enhance security in different applications. Finally, we present future opportunities and possible improvements. The paper shows some general artificial intelligence applications that demonstrate the wide range of uses of the technology. Many applications are built around artificial intelligence technologies, and there are many services that a developer can use to achieve intelligent behavior.
The foundation of the different approaches is a well-designed learning algorithm, while the key to every learning algorithm is the quality of the data set used during the learning phase. Some applications focus on image processing, such as face detection or gesture detection, to identify a person. Other solutions compare signatures, while others perform object or license plate detection (for example, the automatic parking system of an office building). Artificial intelligence and accurate data handling can also be used for anomaly detection in a real-time system. For example, there is ongoing research on anomaly detection at the ZalaZone autonomous car test field based on the collected sensor data. There are also more general applications, such as user profiling and automatic content recommendation using behavior analysis techniques. However, artificial intelligence technology also carries security risks that need to be eliminated before an application is deployed publicly. One concern is the generation of fake content, which must be detected with other algorithms that focus on small but noticeable differences. It is also essential to protect the data used by the learning algorithm and the logic flow of the solution. Network security can help to protect these applications. Artificial intelligence can also help strengthen the security of a solution, as it is able to detect network anomalies and signs of a security issue. Therefore, the technology is widely used in IT security to prevent different types of attacks. As Big Data technologies, computational power, and storage capacity increase over time, there is room for improved artificial intelligence solutions that can learn from large, real-time data sets. Advancements in sensors can also help to provide more precise data for different solutions. Finally, advanced natural language processing can help with communication between humans and computer-based solutions.


2019 ◽  
Vol 17 (2) ◽  
pp. 272-280
Author(s):  
Adeel Hashmi ◽  
Tanvir Ahmad

Anomaly/outlier detection is the process of finding abnormal data points in a dataset or data stream. Most anomaly detection algorithms require some parameters to be set, and these significantly affect the performance of the algorithm. The parameters are generally set by trial and error; hence performance is compromised by default or random values. In this paper, the authors propose a self-optimizing algorithm for anomaly detection based on the firefly meta-heuristic, named Firefly Algorithm for Anomaly Detection (FAAD). The proposed solution is a non-clustering, unsupervised learning approach to anomaly detection. The algorithm is implemented on Apache Spark for scalability, so the solution can handle big data as well. Experiments were conducted on various datasets, and the results show that the proposed solution is much more accurate than the standard anomaly detection algorithms.


Author(s):  
Dhanya Sudhakaran ◽  
Shini Renjith

Community detection is a common problem in graph and big data analytics. It consists of finding groups of densely connected nodes with few connections to nodes outside the group. In particular, identifying communities in large-scale networks is an important task in many scientific domains. Community detection algorithms in the literature prove to be less efficient, as they lead to the generation of communities with noisy interactions. To address this limitation, there is a need to develop a system that identifies the best community among multi-dimensional networks based on relevant selection criteria and the dimensionality of entities, thereby eliminating the noisy interactions in a real-time environment.
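The definition above (many connections inside a group, few to the outside) can be checked for a candidate group by counting edges. The following is only a minimal sketch of that check, not the system the abstract calls for; the function name `edge_counts` and the toy graph are hypothetical and chosen for illustration.

```python
def edge_counts(edges, community):
    """Count edges inside a candidate community and edges crossing its boundary."""
    members = set(community)
    internal = 0
    boundary = 0
    for u, v in edges:
        # Number of endpoints of this edge that belong to the community (0, 1 or 2).
        inside = (u in members) + (v in members)
        if inside == 2:
            internal += 1
        elif inside == 1:
            boundary += 1
    return internal, boundary


# Toy undirected graph: two triangles joined by the single edge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]

internal, boundary = edge_counts(edges, {"a", "b", "c"})
print(internal, boundary)
```

A group with a high internal count and a low boundary count matches the informal definition; real community detection methods optimize a score of this kind over all possible groupings.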


2020 ◽  
Author(s):  
Zirije Hasani ◽  
Jakup Fondaj

Abstract. Most of today's data are streaming time-series data, in which anomaly detection gives significant information about possible critical situations. Yet detecting anomalies in big streaming data is a difficult task: detectors must acquire and process data in real time, as they occur, even before they are stored, and raise an alarm instantly on potential threats. Given the need for real-time alarms and unsupervised procedures for anomaly detection in massive streaming data, algorithms have to be robust, with low processing time, possibly at the cost of accuracy. In this work we compare the performance of our proposed anomaly detection algorithm HW-GA [1] with other existing methods such as ARIMA [10], Moving Average [11] and Holt Winters [12]. The algorithms are tested, and the results visualized, in R on three Numenta datasets with known anomalies and on our own e-dnevnik dataset with unknown anomalies. Evaluation is done by comparing the achieved results (algorithm execution time and CPU usage). Our interest is in monitoring the streaming log data generated in the national educational network (e-dnevnik), which receives a massive number of online queries, and in detecting anomalies in order to scale up performance, prevent network outages, raise alarms on possible attacks, and the like.
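To make the Moving Average baseline mentioned above concrete, here is a minimal sketch of a streaming detector that flags a point when it deviates strongly from the trailing window. It is not the authors' implementation (their experiments were run in R); the function name, the window size `window`, the threshold `k`, and the sample readings are all assumptions chosen for illustration.

```python
from collections import deque
from math import sqrt


def moving_average_anomalies(stream, window=5, k=3.0):
    """Yield (index, value) for points far from the trailing moving average."""
    recent = deque(maxlen=window)  # keeps only the last `window` points
    for i, x in enumerate(stream):
        if len(recent) == window:
            mean = sum(recent) / window
            std = sqrt(sum((v - mean) ** 2 for v in recent) / window)
            # Flag the point if it lies more than k standard deviations
            # from the mean of the preceding window.
            if std > 0 and abs(x - mean) > k * std:
                yield i, x
        recent.append(x)  # the point joins the window whether flagged or not


# Hypothetical readings with one obvious spike at index 6.
readings = [10, 11, 10, 12, 11, 10, 50, 11, 10, 12]
flagged = list(moving_average_anomalies(readings))
print(flagged)
```

Each point is processed once with a fixed-size window, so memory use stays constant, which fits the abstract's emphasis on low processing time. Note that in this sketch a flagged spike still enters the window and inflates the statistics for the next few points; a more robust variant might exclude flagged values.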

