Database Security Enhancement by Eliminating the Redundant and Incorrect Spelled Data Entries

Author(s):  
Rupali Chopade ◽  
Vinod Pachghare

A database stores data in an organized, efficiently accessible format. In recent years, large volumes of data have been generated by numerous applications and stored in databases. Given the importance of data in every sector of the digitized world, securing it is paramount, and database security is therefore a prime concern in every organization. Redundant data entries can disrupt the functioning of the database; they may be inserted because a primary key is absent or because data is incorrectly spelled. This article addresses database security by protecting the database from redundant data entries based on the concept of the Bloom filter: incorrectly spelled values in queries are corrected with an edit distance algorithm, followed by a data redundancy check. The article also presents a performance comparison between the proposed technique and the MongoDB database for document search functionality.
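The two-stage check the abstract describes — correct likely misspellings in query values with an edit distance algorithm, then reject redundant entries via a Bloom filter membership test — can be sketched as follows. This is a minimal illustration under assumed parameters (`max_dist`, filter size, hash scheme), not the authors' implementation:

```python
import hashlib

class BloomFilter:
    """Simple Bloom filter deriving k bit positions from one SHA-256 digest."""
    def __init__(self, size=1024, k=3):
        self.size, self.k, self.bits = size, k, [False] * size

    def _positions(self, item):
        digest = hashlib.sha256(item.encode()).hexdigest()
        return [int(digest[i*8:(i+1)*8], 16) % self.size for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = True

    def __contains__(self, item):
        return all(self.bits[p] for p in self._positions(item))

def edit_distance(a, b):
    """Classic Levenshtein distance via single-row dynamic programming."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j-1] + 1, prev + (ca != cb))
    return dp[-1]

def insert_entry(bloom, known_values, value, max_dist=1):
    """Snap a likely misspelling to a known value, then reject duplicates."""
    for known in known_values:
        if edit_distance(value, known) <= max_dist:
            value = known  # treat as the same logical entry
            break
    if value in bloom:        # possible duplicate (Bloom filters can false-positive)
        return None
    bloom.add(value)
    known_values.add(value)
    return value
```

Here `insert_entry("londom")` after `insert_entry("london")` is corrected to `"london"` and then flagged as redundant.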

2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Siye Wang ◽  
Ziwen Cao ◽  
Yanfang Zhang ◽  
Weiqing Huang ◽  
Jianguo Jiang

The Radio Frequency Identification (RFID) data acquisition rate used for monitoring is so high that the RFID data stream contains a large amount of redundant data, which increases system overhead. To balance the accuracy and real-time performance of monitoring, redundant RFID data must be filtered out. We propose an algorithm called Time-Distance Bloom Filter (TDBF), which takes into account the read time and read distance of RFID tags and thereby greatly reduces data redundancy. In addition, we propose indicators for evaluating filter performance. In experiments, the TDBF algorithm achieved a performance score of 5.2, while the Time Bloom Filter (TBF) scored only 0.03, indicating that TDBF achieves a lower false negative rate, a lower false positive rate, and a higher data compression rate. Furthermore, in dynamic scenarios, TDBF can filter the data stream according to the actual scenario requirements.
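A rough sketch of the idea behind a time- and distance-aware duplicate filter follows. This is illustrative only — the slot layout, time window, and distance tolerance are assumptions, not the paper's actual TDBF structure or scoring:

```python
import hashlib, time

class TimeDistanceFilter:
    """Each slot stores the last-seen timestamp and read distance, so a
    repeated tag counts as redundant only if it was seen recently AND at a
    similar distance; a tag that has moved is kept as a fresh reading."""
    def __init__(self, size=512, k=3, window=5.0, dist_tol=0.5):
        self.size, self.k = size, k
        self.window, self.dist_tol = window, dist_tol
        self.slots = [None] * size  # (timestamp, distance) or None

    def _positions(self, tag):
        h = hashlib.sha256(tag.encode()).hexdigest()
        return [int(h[i*8:(i+1)*8], 16) % self.size for i in range(self.k)]

    def is_redundant(self, tag, distance, now=None):
        now = time.time() if now is None else now
        pos = self._positions(tag)
        redundant = all(
            self.slots[p] is not None
            and now - self.slots[p][0] <= self.window
            and abs(distance - self.slots[p][1]) <= self.dist_tol
            for p in pos
        )
        for p in pos:  # record this reading either way
            self.slots[p] = (now, distance)
        return redundant
```

A reading of the same tag within the window at a similar distance is dropped; the same tag at a clearly different distance (the tag moved) is retained.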


2021 ◽  
Vol 17 (4) ◽  
pp. 1-38
Author(s):  
Takayuki Fukatani ◽  
Hieu Hanh Le ◽  
Haruo Yokota

With the recent performance improvements in commodity hardware, low-cost commodity-server-based storage has become a practical alternative to dedicated storage appliances. Because of the high failure rate of commodity servers, data redundancy across multiple servers is required in a server-based storage system. However, the extra storage capacity for this redundancy significantly increases the system cost. Although erasure coding (EC) is a promising method for reducing the amount of redundant data, it requires distributing and encoding data among servers, and there remains a need to reduce the performance impact of these processes, which involve considerable network traffic and processing overhead. The performance impact is especially significant for random-access-intensive applications. In this article, we propose a new lightweight redundancy control for server-based storage. Our proposed method uses a new local-filesystem-based approach that avoids distributing data by adding redundancy to locally stored user data. Our method switches the redundancy method of user data between replication and EC according to workload, improving capacity efficiency while achieving higher performance. Our experiments show up to 230% better online-transaction-processing performance for our method compared with CephFS, a widely used alternative system. We also confirmed that our proposed method prevents unexpected performance degradation while achieving better capacity efficiency.
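The capacity trade-off that motivates switching between replication and EC can be illustrated with a toy calculation. The `random_write_ratio` policy and the 3-replica / 4+2 EC parameters below are assumptions for illustration, not the paper's algorithm:

```python
def storage_overhead(mode, replicas=3, data_shards=4, parity_shards=2):
    """Capacity multiplier relative to raw user data."""
    if mode == "replication":
        return replicas                                   # e.g. 3x for 3 replicas
    if mode == "erasure_coding":
        return (data_shards + parity_shards) / data_shards  # e.g. 6/4 = 1.5x
    raise ValueError(mode)

def choose_redundancy(random_write_ratio, threshold=0.3):
    """Hypothetical policy: keep write-hot data replicated (small updates stay
    cheap), demote cold or sequential data to EC for capacity efficiency."""
    return "replication" if random_write_ratio > threshold else "erasure_coding"
```

With these parameters, demoting cold data from replication to EC cuts its capacity overhead from 3x to 1.5x.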


2013 ◽  
Vol 397-400 ◽  
pp. 2536-2539 ◽  
Author(s):  
Hai Yan Zhao ◽  
Xiang Yang Liu ◽  
Jing Zhao

In current multi-level secure database systems, the BLP model is the most widely used security model. To address the BLP model's problems of data redundancy, the primary-key loophole, and the inference channel, an improved method is proposed. The proposed method refines the read-write levels and read-write ranges of users, adds an audit mechanism, eliminates the primary-key loophole, and avoids the inference channel to some extent. The proposed method improves the security of the BLP model and makes the security model more practical.
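For context, the two core BLP rules the paper builds on — the simple security property ("no read up") and the *-property ("no write down") — look like this in miniature. The paper's refinements (read-write ranges, auditing) are not shown:

```python
from enum import IntEnum

class Level(IntEnum):
    """Security levels ordered low to high."""
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3

def can_read(subject: Level, obj: Level) -> bool:
    """BLP simple security property: no read up."""
    return subject >= obj

def can_write(subject: Level, obj: Level) -> bool:
    """BLP *-property: no write down (prevents leaking high data to low objects)."""
    return subject <= obj
```

A SECRET subject may read CONFIDENTIAL data but not write to it, which is exactly the combination that blocks downward information flow.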


2020 ◽  
Vol 17 (5) ◽  
pp. 769-777
Author(s):  
Shiwei Che ◽  
Wu Yang ◽  
Wei Wang

The unprecedented development and popularization of the Internet, combined with the emergence of a variety of modern applications such as search engines, online transactions, and climate warning systems, has caused worldwide data storage to grow at an unprecedented rate. Efficient storage, management, and processing of such huge amounts of data has become an important academic research topic. Detecting and removing duplicate and redundant data from such multi-trillion-record collections, while ensuring resource and computational efficiency, is a challenging area of research. Because all the data of a potentially unbounded data stream cannot be stored, and duplicated data must be deleted as accurately as possible, intelligent approximate duplicate detection algorithms are urgently required. Many well-known methods based on the bitmap structure, the Bloom Filter, and its variants appear in the literature. In this paper, we propose a new data structure, the Improved Streaming Quotient Filter (ISQF), to efficiently detect and remove duplicate data in a data stream. ISQF intelligently stores the signatures of elements in a data stream while using an eviction strategy to provide near-zero error rates. We show that ISQF achieves near-optimal performance with fairly low memory requirements, making it an ideal and efficient method for duplicate detection with a very low error rate. Empirically, we compared ISQF with some existing methods, especially the Streaming Quotient Filter (SQF). The results show that our proposed method outperforms the existing methods in terms of memory usage and accuracy. We also discuss a parallel implementation of ISQF.
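The general idea of a signature-storing filter with eviction can be sketched as follows. This is a simplified illustration — bucket count, fingerprint width, and FIFO eviction are assumptions, not the paper's ISQF layout or eviction strategy:

```python
from collections import deque
import hashlib

class SignatureFilter:
    """Approximate stream deduplication: each element hashes to a bucket
    holding short fingerprints; a full bucket evicts its oldest fingerprint,
    bounding memory at the cost of missing very old duplicates."""
    def __init__(self, buckets=1024, bucket_cap=4, fp_bits=16):
        self.table = [deque(maxlen=bucket_cap) for _ in range(buckets)]
        self.fp_mask = (1 << fp_bits) - 1
        self.buckets = buckets

    def _hash(self, item):
        h = int(hashlib.sha256(item.encode()).hexdigest(), 16)
        return h % self.buckets, (h >> 32) & self.fp_mask

    def seen_before(self, item):
        idx, fp = self._hash(item)
        if fp in self.table[idx]:
            return True
        self.table[idx].append(fp)  # deque(maxlen=...) evicts the oldest entry
        return False
```

Storing short fingerprints instead of full elements is what keeps memory low; two distinct elements sharing a bucket and fingerprint would cause a (rare) false positive.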


Author(s):  
Young-Dal Jang ◽  
Ji-Hong Kim

With regard to database management, the DAS (Database as a Service) model is one solution for outsourcing. However, data protection mechanisms are needed to maintain database security. The most effective way to secure a database against the threat of third-party attackers is to encrypt the sensitive data within it. However, once the sensitive data is encrypted, query execution on the encrypted database becomes difficult. In this paper, we focus on the search process on an encrypted database. We propose a selective tuple encryption method using a Bloom Filter, which can indicate the existence of the data. Finally, we compare the search performance of the proposed method with that of other known encryption methods.
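One common way to pair a Bloom filter with encrypted tuples — letting the server test likely keyword membership without decrypting — can be sketched as follows. This is illustrative only: encryption is stubbed out, and the filter parameters are assumptions, not the authors' scheme:

```python
import hashlib
from typing import List

def bloom_bits(keyword: str, size: int = 64, k: int = 3) -> set:
    """Bit positions a keyword sets in a size-bit Bloom filter."""
    h = hashlib.sha256(keyword.encode()).hexdigest()
    return {int(h[i*8:(i+1)*8], 16) % size for i in range(k)}

class EncryptedTuple:
    """The tuple body is opaque ciphertext (stubbed here); a per-tuple Bloom
    filter over its keywords lets a server skip tuples that cannot match a
    search term, so only candidate tuples need decryption."""
    def __init__(self, keywords: List[str], ciphertext: bytes):
        self.ciphertext = ciphertext
        self.filter = set()
        for kw in keywords:
            self.filter |= bloom_bits(kw)

    def may_contain(self, keyword: str) -> bool:
        """True means 'possibly present' (Bloom filters can false-positive);
        False means definitely absent."""
        return bloom_bits(keyword) <= self.filter
```

The asymmetry is the point: a negative answer is exact, so most non-matching tuples are never decrypted, while the occasional false positive only costs one wasted decryption.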


2016 ◽  
Vol 2016 ◽  
pp. 1-7 ◽  
Author(s):  
Hazalila Kamaludin ◽  
Hairulnizam Mahdin ◽  
Jemal H. Abawajy

Radio Frequency Identification (RFID)-enabled systems are evolving in many applications that need to know the physical location of objects, such as supply chain management. Naturally, RFID systems create large volumes of duplicate data. As duplicate data wastes communication, processing, and storage resources and delays decision-making, filtering duplicate data from an RFID data stream is an important and challenging problem. Existing Bloom Filter-based approaches for filtering duplicate RFID data streams are complex and slow because they use multiple hash functions. In this paper, we propose an approach for filtering duplicate data from RFID data streams based on a modified Bloom Filter that uses only a single hash function. We performed an extensive empirical study of the proposed approach and compared it against the Bloom Filter, d-Left Time Bloom Filter, and Counting Bloom Filter approaches. The results show that the proposed approach outperforms the baseline approaches in terms of false positive rate, execution time, and true positive rate.
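The single-hash idea can be illustrated with a direct-mapped table that remembers the last tag seen per slot — one hash computation per reading. This is an assumption-laden sketch, not the paper's modified Bloom Filter:

```python
import hashlib

class SingleHashFilter:
    """One-hash duplicate filter: each slot stores the last tag that mapped
    to it. A repeat of the same tag hits its slot and is dropped; two tags
    colliding on a slot simply overwrite each other (an occasional miss,
    traded for doing only one hash per reading)."""
    def __init__(self, size=1024):
        self.size = size
        self.slots = [None] * size

    def is_duplicate(self, tag):
        idx = int(hashlib.sha256(tag.encode()).hexdigest(), 16) % self.size
        if self.slots[idx] == tag:
            return True
        self.slots[idx] = tag
        return False
```

Because the slot stores the tag itself rather than a bare bit, this variant never reports a false positive; collisions can only cause a duplicate to slip through.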


2018 ◽  
Vol 7 (3.1) ◽  
pp. 90 ◽  
Author(s):  
S P Godlin Jasil ◽  
V Ulagamuthalvi

Big Data analytics is the process of collecting huge, heterogeneous sets of data for analysis. The data are fetched from different sources, can be in heterogeneous forms, and arrive in the big data system at gigabytes per second. Because the data are of such huge volume, redundant data may be present, which degrades network performance. This article reviews the filtering methods and algorithms used for duplicate elimination, such as the Bloom Filter, Stable Bloom Filter, multi-layer Bloom Filter, and Counting Bloom Filter, along with their disadvantages, such as false positives and false negatives. The aim of this paper is to propose an algorithm for eliminating duplicate data in a large data set using big data analytics.
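Of the variants listed, the Counting Bloom Filter is the one that supports deletion — counters replace bits, so expired stream elements can be removed. A minimal sketch (parameters assumed for illustration):

```python
import hashlib

class CountingBloomFilter:
    """Bloom filter with per-position counters instead of bits, so elements
    can be removed — useful when duplicates should expire from a window."""
    def __init__(self, size=1024, k=3):
        self.size, self.k = size, k
        self.counts = [0] * size

    def _positions(self, item):
        h = hashlib.sha256(item.encode()).hexdigest()
        return [int(h[i*8:(i+1)*8], 16) % self.size for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.counts[p] += 1

    def remove(self, item):
        for p in self._positions(item):
            if self.counts[p] > 0:
                self.counts[p] -= 1

    def __contains__(self, item):
        return all(self.counts[p] > 0 for p in self._positions(item))
```

The cost of supporting deletion is memory: each position needs a small counter rather than a single bit.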

