Database Security Enhancement by Eliminating the Redundant and Incorrect Spelled Data Entries

Author(s):  
Rupali Chopade ◽  
Vinod Pachghare

A database stores data in an organized, efficiently accessible format. In recent years, large volumes of data have been generated by numerous applications and stored in databases. Given the importance of data in every sector of the digitized world, securing it is paramount, and database security is therefore a prime concern in every organization. Redundant data entries can disrupt the functioning of the database; they may be inserted because a primary key is absent or because data is incorrectly spelled. This article addresses database security by protecting the database from redundant data entries based on the concept of the Bloom filter: incorrectly spelled values in queries are corrected with an edit distance algorithm, followed by a data redundancy check. The article also presents a performance comparison between the proposed technique and the MongoDB database for document search functionality.
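The two-stage check the abstract describes — correct likely misspellings in query values with an edit distance algorithm, then reject redundant entries via a Bloom filter membership test — can be sketched as follows. This is a minimal illustration under assumed parameters (`max_dist`, filter size, hash scheme), not the authors' implementation:

```python
import hashlib

class BloomFilter:
    """Simple Bloom filter deriving k bit positions from one SHA-256 digest."""
    def __init__(self, size=1024, k=3):
        self.size, self.k, self.bits = size, k, [False] * size

    def _positions(self, item):
        digest = hashlib.sha256(item.encode()).hexdigest()
        return [int(digest[i*8:(i+1)*8], 16) % self.size for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = True

    def __contains__(self, item):
        return all(self.bits[p] for p in self._positions(item))

def edit_distance(a, b):
    """Classic Levenshtein distance via single-row dynamic programming."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j-1] + 1, prev + (ca != cb))
    return dp[-1]

def insert_entry(bloom, known_values, value, max_dist=1):
    """Snap a likely misspelling to a known value, then reject duplicates."""
    for known in known_values:
        if edit_distance(value, known) <= max_dist:
            value = known  # treat as the same logical entry
            break
    if value in bloom:        # possible duplicate (Bloom filters can false-positive)
        return None
    bloom.add(value)
    known_values.add(value)
    return value
```

Here `insert_entry("londom")` after `insert_entry("london")` is corrected to `"london"` and then flagged as redundant.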

2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Siye Wang ◽  
Ziwen Cao ◽  
Yanfang Zhang ◽  
Weiqing Huang ◽  
Jianguo Jiang

The Radio Frequency Identification (RFID) data acquisition rate used for monitoring is so high that the RFID data stream contains a large amount of redundant data, which increases system overhead. To balance the accuracy and real-time performance of monitoring, redundant RFID data must be filtered out. We propose an algorithm called Time-Distance Bloom Filter (TDBF), which takes into account the read time and read distance of RFID tags and thereby greatly reduces data redundancy. In addition, we propose indicators for evaluating filter performance. In experiments, the TDBF algorithm achieved a performance score of 5.2, while the Time Bloom Filter (TBF) scored only 0.03, indicating that TDBF achieves a lower false negative rate, a lower false positive rate, and a higher data compression rate. Furthermore, in dynamic scenarios, TDBF can filter the data stream according to the actual scenario requirements.
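A rough sketch of the idea behind a time- and distance-aware duplicate filter follows. This is illustrative only — the slot layout, time window, and distance tolerance are assumptions, not the paper's actual TDBF structure or scoring:

```python
import hashlib, time

class TimeDistanceFilter:
    """Each slot stores the last-seen timestamp and read distance, so a
    repeated tag counts as redundant only if it was seen recently AND at a
    similar distance; a tag that has moved is kept as a fresh reading."""
    def __init__(self, size=512, k=3, window=5.0, dist_tol=0.5):
        self.size, self.k = size, k
        self.window, self.dist_tol = window, dist_tol
        self.slots = [None] * size  # (timestamp, distance) or None

    def _positions(self, tag):
        h = hashlib.sha256(tag.encode()).hexdigest()
        return [int(h[i*8:(i+1)*8], 16) % self.size for i in range(self.k)]

    def is_redundant(self, tag, distance, now=None):
        now = time.time() if now is None else now
        pos = self._positions(tag)
        redundant = all(
            self.slots[p] is not None
            and now - self.slots[p][0] <= self.window
            and abs(distance - self.slots[p][1]) <= self.dist_tol
            for p in pos
        )
        for p in pos:  # record this reading either way
            self.slots[p] = (now, distance)
        return redundant
```

A reading of the same tag within the window at a similar distance is dropped; the same tag at a clearly different distance (the tag moved) is retained.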


2021 ◽  
Vol 17 (4) ◽  
pp. 1-38
Author(s):  
Takayuki Fukatani ◽  
Hieu Hanh Le ◽  
Haruo Yokota

With the recent performance improvements in commodity hardware, low-cost commodity-server-based storage has become a practical alternative to dedicated storage appliances. Because of the high failure rate of commodity servers, data redundancy across multiple servers is required in a server-based storage system. However, the extra storage capacity for this redundancy significantly increases the system cost. Although erasure coding (EC) is a promising method for reducing the amount of redundant data, it requires distributing and encoding data among servers, and there remains a need to reduce the performance impact of these processes, which involve considerable network traffic and processing overhead. The performance impact is especially significant for random-access-intensive applications. In this article, we propose a new lightweight redundancy control for server-based storage. Our proposed method uses a new local-filesystem-based approach that avoids distributing data by adding redundancy to locally stored user data. Our method switches the redundancy method of user data between replication and EC according to workload, improving capacity efficiency while achieving higher performance. Our experiments show up to 230% better online-transaction-processing performance for our method compared with CephFS, a widely used alternative system. We also confirmed that our proposed method prevents unexpected performance degradation while achieving better capacity efficiency.
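The capacity trade-off that motivates switching between replication and EC can be illustrated with a toy calculation. The `random_write_ratio` policy and the 3-replica / 4+2 EC parameters below are assumptions for illustration, not the paper's algorithm:

```python
def storage_overhead(mode, replicas=3, data_shards=4, parity_shards=2):
    """Capacity multiplier relative to raw user data."""
    if mode == "replication":
        return replicas                                   # e.g. 3x for 3 replicas
    if mode == "erasure_coding":
        return (data_shards + parity_shards) / data_shards  # e.g. 6/4 = 1.5x
    raise ValueError(mode)

def choose_redundancy(random_write_ratio, threshold=0.3):
    """Hypothetical policy: keep write-hot data replicated (small updates stay
    cheap), demote cold or sequential data to EC for capacity efficiency."""
    return "replication" if random_write_ratio > threshold else "erasure_coding"
```

With these parameters, demoting cold data from replication to EC cuts its capacity overhead from 3x to 1.5x.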


2013 ◽  
Vol 397-400 ◽  
pp. 2536-2539 ◽  
Author(s):  
Hai Yan Zhao ◽  
Xiang Yang Liu ◽  
Jing Zhao

In current multi-level secure database systems, the BLP model is the most widely used security model. To address the BLP model's problems of data redundancy, the primary-key loophole, and the inference channel, an improved method is proposed. The proposed method refines the read-write levels and read-write ranges of users, adds an audit mechanism, eliminates the primary-key loophole, and avoids the inference channel to some extent. The proposed method improves the security of the BLP model and makes the security model more practical.
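For context, the two core BLP rules the paper builds on — the simple security property ("no read up") and the *-property ("no write down") — look like this in miniature. The paper's refinements (read-write ranges, auditing) are not shown:

```python
from enum import IntEnum

class Level(IntEnum):
    """Security levels ordered low to high."""
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3

def can_read(subject: Level, obj: Level) -> bool:
    """BLP simple security property: no read up."""
    return subject >= obj

def can_write(subject: Level, obj: Level) -> bool:
    """BLP *-property: no write down (prevents leaking high data to low objects)."""
    return subject <= obj
```

A SECRET subject may read CONFIDENTIAL data but not write to it, which is exactly the combination that blocks downward information flow.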


2020 ◽  
Vol 17 (5) ◽  
pp. 769-777
Author(s):  
Shiwei Che ◽  
Wu Yang ◽  
Wei Wang

The unprecedented development and popularization of the Internet, combined with the emergence of a variety of modern applications such as search engines, online transactions, and climate warning systems, has caused worldwide data storage to grow at an unprecedented rate. Efficient storage, management, and processing of such huge amounts of data has become an important academic research topic. Detecting and removing duplicate and redundant data from such multi-trillion-record collections, while ensuring resource and computational efficiency, is a challenging area of research. Because all the data of a potentially unbounded data stream cannot be stored, and duplicated data must be deleted as accurately as possible, intelligent approximate duplicate detection algorithms are urgently required. Many well-known methods based on the bitmap structure, the Bloom Filter, and its variants appear in the literature. In this paper, we propose a new data structure, the Improved Streaming Quotient Filter (ISQF), to efficiently detect and remove duplicate data in a data stream. ISQF intelligently stores the signatures of elements in a data stream while using an eviction strategy to provide near-zero error rates. We show that ISQF achieves near-optimal performance with fairly low memory requirements, making it an ideal and efficient method for duplicate detection with a very low error rate. Empirically, we compared ISQF with some existing methods, especially the Streaming Quotient Filter (SQF). The results show that our proposed method outperforms the existing methods in terms of memory usage and accuracy. We also discuss a parallel implementation of ISQF.
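The general idea of a signature-storing filter with eviction can be sketched as follows. This is a simplified illustration — bucket count, fingerprint width, and FIFO eviction are assumptions, not the paper's ISQF layout or eviction strategy:

```python
from collections import deque
import hashlib

class SignatureFilter:
    """Approximate stream deduplication: each element hashes to a bucket
    holding short fingerprints; a full bucket evicts its oldest fingerprint,
    bounding memory at the cost of missing very old duplicates."""
    def __init__(self, buckets=1024, bucket_cap=4, fp_bits=16):
        self.table = [deque(maxlen=bucket_cap) for _ in range(buckets)]
        self.fp_mask = (1 << fp_bits) - 1
        self.buckets = buckets

    def _hash(self, item):
        h = int(hashlib.sha256(item.encode()).hexdigest(), 16)
        return h % self.buckets, (h >> 32) & self.fp_mask

    def seen_before(self, item):
        idx, fp = self._hash(item)
        if fp in self.table[idx]:
            return True
        self.table[idx].append(fp)  # deque(maxlen=...) evicts the oldest entry
        return False
```

Storing short fingerprints instead of full elements is what keeps memory low; two distinct elements sharing a bucket and fingerprint would cause a (rare) false positive.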


Author(s):  
Young-Dal Jang ◽  
Ji-Hong Kim

With regard to database management, the DAS (Database as a Service) model is one solution for outsourcing. However, data protection mechanisms are needed to maintain database security. The most effective way to secure a database against the threat of third-party attackers is to encrypt the sensitive data within it. However, once the sensitive data is encrypted, query execution on the encrypted database becomes difficult. In this paper, we focus on the search process on an encrypted database. We propose a selective tuple encryption method using a Bloom Filter, which can indicate the existence of the data. Finally, we compare the search performance of the proposed method with that of other known encryption methods.
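One common way to pair a Bloom filter with encrypted tuples — letting the server test likely keyword membership without decrypting — can be sketched as follows. This is illustrative only: encryption is stubbed out, and the filter parameters are assumptions, not the authors' scheme:

```python
import hashlib
from typing import List

def bloom_bits(keyword: str, size: int = 64, k: int = 3) -> set:
    """Bit positions a keyword sets in a size-bit Bloom filter."""
    h = hashlib.sha256(keyword.encode()).hexdigest()
    return {int(h[i*8:(i+1)*8], 16) % size for i in range(k)}

class EncryptedTuple:
    """The tuple body is opaque ciphertext (stubbed here); a per-tuple Bloom
    filter over its keywords lets a server skip tuples that cannot match a
    search term, so only candidate tuples need decryption."""
    def __init__(self, keywords: List[str], ciphertext: bytes):
        self.ciphertext = ciphertext
        self.filter = set()
        for kw in keywords:
            self.filter |= bloom_bits(kw)

    def may_contain(self, keyword: str) -> bool:
        """True means 'possibly present' (Bloom filters can false-positive);
        False means definitely absent."""
        return bloom_bits(keyword) <= self.filter
```

The asymmetry is the point: a negative answer is exact, so most non-matching tuples are never decrypted, while the occasional false positive only costs one wasted decryption.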


2016 ◽  
Vol 2016 ◽  
pp. 1-7 ◽  
Author(s):  
Hazalila Kamaludin ◽  
Hairulnizam Mahdin ◽  
Jemal H. Abawajy

Radio Frequency Identification (RFID)-enabled systems are evolving in many applications that need to know the physical location of objects, such as supply chain management. Naturally, RFID systems create large volumes of duplicate data. As duplicate data wastes communication, processing, and storage resources and delays decision-making, filtering duplicate data from an RFID data stream is an important and challenging problem. Existing Bloom Filter-based approaches for filtering duplicate RFID data streams are complex and slow because they use multiple hash functions. In this paper, we propose an approach for filtering duplicate data from RFID data streams based on a modified Bloom Filter that uses only a single hash function. We performed an extensive empirical study of the proposed approach and compared it against the Bloom Filter, d-Left Time Bloom Filter, and Counting Bloom Filter approaches. The results show that the proposed approach outperforms the baseline approaches in terms of false positive rate, execution time, and true positive rate.
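The single-hash idea can be illustrated with a direct-mapped table that remembers the last tag seen per slot — one hash computation per reading. This is an assumption-laden sketch, not the paper's modified Bloom Filter:

```python
import hashlib

class SingleHashFilter:
    """One-hash duplicate filter: each slot stores the last tag that mapped
    to it. A repeat of the same tag hits its slot and is dropped; two tags
    colliding on a slot simply overwrite each other (an occasional miss,
    traded for doing only one hash per reading)."""
    def __init__(self, size=1024):
        self.size = size
        self.slots = [None] * size

    def is_duplicate(self, tag):
        idx = int(hashlib.sha256(tag.encode()).hexdigest(), 16) % self.size
        if self.slots[idx] == tag:
            return True
        self.slots[idx] = tag
        return False
```

Because the slot stores the tag itself rather than a bare bit, this variant never reports a false positive; collisions can only cause a duplicate to slip through.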


2018 ◽  
Vol 7 (3.1) ◽  
pp. 90 ◽  
Author(s):  
S P Godlin Jasil ◽  
V Ulagamuthalvi

Big Data analytics is the process of collecting huge, heterogeneous sets of data for analysis. The data are fetched from different sources, can be in heterogeneous forms, and arrive in the big data system at gigabytes per second. Because the data are of such huge volume, redundant data may be present, which degrades network performance. This article reviews the filtering methods and algorithms used for duplicate elimination, such as the Bloom Filter, Stable Bloom Filter, multi-layer Bloom Filter, and Counting Bloom Filter, along with their disadvantages, such as false positives and false negatives. The aim of this paper is to propose an algorithm for eliminating duplicate data in a large data set using big data analytics.
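Of the variants listed, the Counting Bloom Filter is the one that supports deletion — counters replace bits, so expired stream elements can be removed. A minimal sketch (parameters assumed for illustration):

```python
import hashlib

class CountingBloomFilter:
    """Bloom filter with per-position counters instead of bits, so elements
    can be removed — useful when duplicates should expire from a window."""
    def __init__(self, size=1024, k=3):
        self.size, self.k = size, k
        self.counts = [0] * size

    def _positions(self, item):
        h = hashlib.sha256(item.encode()).hexdigest()
        return [int(h[i*8:(i+1)*8], 16) % self.size for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.counts[p] += 1

    def remove(self, item):
        for p in self._positions(item):
            if self.counts[p] > 0:
                self.counts[p] -= 1

    def __contains__(self, item):
        return all(self.counts[p] > 0 for p in self._positions(item))
```

The cost of supporting deletion is memory: each position needs a small counter rather than a single bit.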

