Hybrid Approach to Document Anomaly Detection: An Application to Facilitate RPA in Title Insurance

Author(s):  
Abhijit Guha ◽  
Debabrata Samanta
2021 ◽  
pp. 1-15
Author(s):  
Savaridassan Pankajashan ◽  
G. Maragatham ◽  
T. Kirthiga Devi

Anomaly-based detection is concerned with recognizing the uncommon: catching unusual activity and finding the action behind it. It has a wide range of critical applications, from banking application security to the natural sciences, medical systems, and marketing apps. Anomaly-based detection built on Machine Learning techniques is a form of artificial intelligence system. With the ever-expanding volume and new kinds of data, for example sensor data from an enormous number of IoT devices and network flow data from cloud computing, it is no surprise that there is growing interest in handling more decisions automatically by means of AI and ML applications. With respect to anomaly detection, however, many applications of the scheme are motivated simply by the interest in detection itself. In this paper, the Machine Learning (ML) techniques SVM and Isolation Forest are evaluated experimentally, and, among Deep Learning (DL) techniques, the proposed DA-LSTM (Deep Auto-Encoder LSTM) model is adopted for preprocessing log data and for anomaly-based detection to obtain better detection performance. An enhanced LSTM (long short-term memory) model, with its parameters optimized by a genetic algorithm (GA), is used to better recognize anomalies in log data that has been filtered by a Deep Auto-Encoder (DA). The deep neural network models convert unstructured log data into training-ready features suitable for log classification in anomaly detection. These models are assessed on two benchmark datasets, the OpenStack logs and the CIDDS-001 intrusion detection OpenStack server dataset. The results show that the DA-LSTM model performs better than other notable ML techniques.
We further investigated the performance of the ML and DL models using well-known metrics, specifically F-measure, Accuracy, Recall, and Precision. The experimental results show that the Isolation Forest and Support Vector Machine classifiers reach roughly 81% and 79% accuracy on the CIDDS-001 OpenStack server dataset, while the proposed DA-LSTM classifier achieves around 99.1% accuracy, improving on the familiar ML algorithms. Further, the DA-LSTM results on the OpenStack log datasets show better anomaly detection compared with other notable machine learning models.
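The four indicators named in this abstract can be computed from binary labels as in the minimal sketch below. The function name `detection_metrics`, the convention that `1` marks an anomaly, and the sample labels are assumptions made for illustration; they are not taken from the paper.

```python
def detection_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F-measure for binary labels.

    Assumption for this sketch: label 1 means "anomaly", label 0 means "normal".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normals passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed anomalies

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical labels, not data from the paper.
y_true = [0, 0, 0, 0, 1, 1, 1, 0, 0, 1]
y_pred = [0, 0, 1, 0, 1, 1, 0, 0, 0, 1]
metrics = detection_metrics(y_true, y_pred)
```

On datasets where anomalies are rare, accuracy alone can look high even when most anomalies are missed, which is one reason to read it alongside precision, recall, and F-measure.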


Author(s):  
Stevan Novakov ◽  
Chung-Horng Lung ◽  
Ioannis Lambadaris ◽  
Nabil Seddigh

Research into network anomaly detection has become crucial as a result of a significant increase in the number of computer attacks. Many approaches to network anomaly detection have been reported in the literature, but data or solutions typically are not freely available. Recently, a labeled network traffic flow dataset, Kyoto2006+, has been created and is publicly available. Most existing approaches using Kyoto2006+ for network anomaly detection apply various clustering techniques. This paper leverages existing well-known statistical analysis and spectral analysis techniques for network anomaly detection. The first is a popular statistical analysis technique called Principal Component Analysis (PCA). PCA describes data in a new dimension to unlock otherwise hidden characteristics. The other, a well-known spectral analysis technique, is Haar Wavelet filtering analysis. It measures the amount and magnitude of abrupt changes in data. Both approaches have strengths and limitations. In response, this paper proposes a Hybrid PCA–Haar Wavelet Analysis. The hybrid approach first applies PCA to describe the data and then Haar Wavelet filtering for analysis. Based on prototyping and measurement, an investigation of the Hybrid PCA–Haar Wavelet Analysis technique is performed using the Kyoto2006+ dataset. The authors consider a number of parameters and present experimental results to demonstrate the effectiveness of the hybrid approach as compared to the two algorithms individually.


2020 ◽  
Vol 10 (15) ◽  
pp. 5191
Author(s):  
Yıldız Karadayı ◽  
Mehmet N. Aydin ◽  
A. Selçuk Öğrenci

Multivariate time-series data with a contextual spatial attribute are used extensively to find anomalous patterns in a wide variety of application domains such as earth science, hurricane tracking, fraud, and disease outbreak detection. In most settings, spatial context is expressed in terms of ZIP code or region coordinates such as latitude and longitude. However, traditional anomaly detection techniques cannot handle more than one contextual attribute in a unified way. In this paper, a new hybrid approach based on deep learning is proposed to solve the anomaly detection problem in multivariate spatio-temporal datasets. It works under the assumption that no prior knowledge about the dataset or the anomalies is available. The architecture of the proposed hybrid framework is based on an autoencoder scheme, and it is more efficient in extracting features from spatio-temporal multivariate datasets than traditional spatio-temporal anomaly detection techniques. We conducted extensive experiments using 2005 buoy data from the National Data Buoy Center, with Hurricane Katrina as ground truth. The experiments demonstrate that the proposed model, which jointly processes the spatial and temporal dimensions of the contextual data to extract features for anomaly detection, achieves more than 10% improvement in accuracy over the methods used in the comparison.


Author(s):  
Yang Shi ◽  
Shupei Wang ◽  
Qinpei Zhao ◽  
Jiangfeng Li

2021 ◽  
pp. 757-767
Author(s):  
Sonika Dahiya ◽  
Priyansh Soni ◽  
Hridya Shiju Nadappattel ◽  
Mohammad Fraz
