Deduplication in Cloud Storage

Cloud computing is widely used today because it offers large-scale data storage and fast access to information over the network. It gives individual users virtually unlimited storage space, with data available and accessible anytime and anywhere. A cloud service provider can make better use of its storage by incorporating data deduplication, which removes the redundant and replicated data that accumulate in a cloud environment. This paper presents a literature survey of deduplication techniques designed for cloud data storage. To better ensure secure deduplication in the cloud, it examines file-level and block-level data deduplication.
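The difference between the two granularities named in the abstract can be illustrated with a minimal sketch. It assumes SHA-256 fingerprints and fixed-size blocks; the function names and the tiny block size are hypothetical choices for illustration and are not taken from the surveyed schemes.

```python
import hashlib


def file_level_dedup(files):
    """Store each distinct file once, keyed by the SHA-256 of its whole content."""
    store = {}   # digest -> file bytes (one physical copy per distinct file)
    index = {}   # file name -> digest (logical reference to the stored copy)
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)
        index[name] = digest
    return store, index


def block_level_dedup(files, block_size=4):
    """Split files into fixed-size blocks and store each distinct block once."""
    store = {}   # digest -> block bytes
    index = {}   # file name -> ordered list of block digests
    for name, data in files.items():
        digests = []
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            digest = hashlib.sha256(block).hexdigest()
            store.setdefault(digest, block)
            digests.append(digest)
        index[name] = digests
    return store, index


def restore(name, store, index):
    """Rebuild a file from a block-level index by concatenating its blocks."""
    return b"".join(store[d] for d in index[name])


# Hypothetical sample data: block size 4 is unrealistically small on purpose.
files = {
    "a.txt": b"AAAABBBBCCCC",
    "b.txt": b"AAAABBBBCCCC",   # exact duplicate of a.txt
    "c.txt": b"AAAABBBBDDDD",   # shares two of its three blocks with a.txt
}
file_store, file_index = file_level_dedup(files)
block_store, block_index = block_level_dedup(files)
```

In this example file-level deduplication keeps two of the three files, while block-level deduplication keeps only four distinct blocks, because the partially overlapping file shares blocks with the others.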


2014 ◽  
Vol 513-517 ◽  
pp. 999-1004 ◽  
Author(s):  
Hong Wei Liu ◽  
Shu Lan Wang ◽  
Peng Zhang

Cloud storage provides a flexible, on-demand data storage service to users anywhere and at any time. However, users' data is physically held by the cloud service provider, and the physical boundary between two users' data is fuzzy. Moreover, the cloud storage provider stores multiple replicas of the data to increase its robustness, and the user is charged according to the number of replicas. Yet there is little evidence that the provider actually spends the additional storage space. In this setting, a method for verifying multi-replica integrity is required. To avoid retrieving large volumes of stored data and to spare users from performing the checks themselves, a multi-replica public auditing protocol is proposed, based on the BLS short signature scheme and a homomorphic hash function. Under the computational Diffie-Hellman assumption, the protocol is secure against loss and tampering attacks by the cloud service provider. Because blocks and block signatures are independent of one another, the protocol supports block-level dynamic updates, including insertion, modification and deletion. The protocol is thus secure and efficient, and supports multiple replicas, public verification, dynamic updates and privacy preservation.
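The homomorphic property that lets an auditor check a combination of blocks without retrieving them can be shown with a toy sketch. This is not the authors' protocol: it omits BLS signatures, pairings and replica handling entirely, and the modulus, generator and block values below are assumptions chosen only to keep the arithmetic visible. The parameters are not secure.

```python
# Toy homomorphic hash: h(m) = G^m mod P, so h(a + b) = h(a) * h(b) mod P.
P = 2**127 - 1   # a Mersenne prime, used here only as a convenient modulus
G = 3            # assumed generator for illustration


def hhash(block: int) -> int:
    """Hash an integer-encoded block."""
    return pow(G, block, P)


def combine(hashes, coeffs):
    """Combine per-block hashes using the auditor's challenge coefficients."""
    result = 1
    for h, c in zip(hashes, coeffs):
        result = (result * pow(h, c, P)) % P
    return result


def respond(blocks, coeffs):
    """Server side: return the challenged linear combination of the blocks."""
    return sum(c * b for c, b in zip(coeffs, blocks))


blocks = [1234, 5678, 91011]              # hypothetical stored blocks
block_hashes = [hhash(b) for b in blocks]  # kept by the verifier
coeffs = [7, 3, 5]                         # hypothetical audit challenge

# An honest server's response matches the combined hashes.
honest_ok = hhash(respond(blocks, coeffs)) == combine(block_hashes, coeffs)

# A server holding a modified first block fails the same check.
tampered = [1235, 5678, 91011]
tampered_ok = hhash(respond(tampered, coeffs)) == combine(block_hashes, coeffs)
```

The verifier needs only the per-block hashes and the single combined response, which is why schemes of this kind avoid downloading the stored data during an audit.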



Cloud storage is one of the key features of cloud computing, allowing cloud users to outsource large amounts of data without upgrading their devices. However, the data storage of Cloud Service Providers (CSPs) suffers from data redundancy. Data deduplication aims to eliminate redundant data segments and maintain a single instance of a data set, even if any number of users own the same data set. Because blocks of data are spread across many servers, every block of a file has to be downloaded before the file can be restored, which reduces system output. We propose a data recovery module for cloud storage servers that improves file access efficiency and reduces the time and network bandwidth consumed. The proposed method uses device coding to store blocks in distributed cloud storage and MD5 (Message Digest 5) for data integrity. Running the recovery algorithm lets the user retrieve a file directly from the cloud servers without downloading every block. The proposed scheme improves time efficiency and allows stored data to be accessed quickly. It also reduces bandwidth consumption and the user's processing overhead when downloading a data file.
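The per-block integrity check mentioned in the abstract might look like the following minimal sketch. It covers only the MD5 comparison, not the coding or recovery algorithm, and the helper names, block size and sample data are assumptions. MD5 is used because the abstract names it; it detects accidental corruption but is not collision-resistant against deliberate attacks.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of a block as a hex string."""
    return hashlib.md5(data).hexdigest()


def split_blocks(data: bytes, block_size: int):
    """Split data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def corrupted_block_indices(blocks, expected_digests):
    """Return the indices of blocks whose MD5 differs from the recorded digest."""
    return [
        i for i, (block, digest) in enumerate(zip(blocks, expected_digests))
        if md5_hex(block) != digest
    ]


# Hypothetical file content and block size.
data = b"cloud storage block integrity demo"
blocks = split_blocks(data, 8)
digests = [md5_hex(b) for b in blocks]   # recorded at upload time

# Simulate a block that was damaged on one storage server.
corrupted = list(blocks)
corrupted[1] = b"XXXXXXXX"
bad = corrupted_block_indices(corrupted, digests)
```

Only the blocks reported in `bad` would need to be fetched again or recovered, rather than the whole file.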



Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 563
Author(s):  
Babu Rajendiran ◽  
Jayashree Kanniappan

Nowadays, many business organizations operate in the cloud in order to reduce their operating costs and to select the best service from among many cloud providers. The growing number of cloud services on the market requires cloud consumers to take care in selecting the most suitable Cloud Service Provider, one that satisfies both functional requirements and QoS parameters. Many computer-based application domains use standardized ontologies to represent information in their fields, which indicates the need for an ontology-based representation. The proposed generic model can help service consumers identify the interrelations among QoS parameters in the cloud services selection ontology at run-time, and can help service providers enhance their business by interpreting the various relations. The ontology has been developed using the intended QoS attributes from various service providers. A generic model has been developed and tested with the developed ontology.



Author(s):  
Vladimir Meikshan ◽  
Natalia Teslya ◽  

The benefits of cloud technology are evident and its use is expanding, which drives steady growth in demand. Cloud computing has become particularly relevant for large companies in Internet services, retailing and logistics, which generate large volumes of business and other information. Cloud technologies make it possible to share resources and to solve the problems of storing and transferring significant amounts of data. Russian consumer cooperation comprises large, geographically distributed organizations that are actively building their own digital ecosystem, so the storage and processing of data is a pressing issue for consumer cooperation organizations. At the same time, prices differ significantly among cloud service providers, which makes it necessary to minimize the cost of storing and transferring significant amounts of data. Linear programming is applied to select the optimal data storage scheme across several cloud service providers whose packages have different technical and economic parameters (maximum amount of storage, cost of allocated resources). The mathematical model includes a cost equation for data storage and transfer and constraints on the amount of storage, the amount of data and its safety. Microsoft Excel, in combination with the "search for solutions" add-on, is chosen as the software tool for the numerical calculations. In accordance with the mathematical model, the conditions for minimizing cloud storage costs and the necessary constraints are established. Initial data are set for three data-forming centers, storage of a given size at each of five cloud service providers, and nominal prices for information storage and transmission. Expenses are calculated in several variants: without optimization, with the optimization problem solved, and with a price increase by the cloud service providers. The results of the calculations confirm the need to solve the problem of minimizing the cost of cloud services for corporate clients. The presented model can be extended to any cost conditions as well as to other areas of cloud applications.
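One plausible form of such a model, given here only as a sketch, is a transportation-style linear program. The symbols are assumptions introduced for illustration and do not come from the abstract: $x_{ij}$ is the amount of data from data-forming center $i$ placed with provider $j$, $s_j$ and $t_j$ are the provider's unit prices for storage and transfer, $d_i$ is the amount of data produced by center $i$, and $C_j$ is the maximum storage in provider $j$'s package. The data safety constraint mentioned in the abstract is not specified there and is therefore omitted.

```latex
\[
\begin{aligned}
\min_{x}\quad & \sum_{i=1}^{3}\sum_{j=1}^{5} \left(s_j + t_j\right) x_{ij} \\
\text{s.t.}\quad & \sum_{j=1}^{5} x_{ij} = d_i, && i = 1,\dots,3 \\
& \sum_{i=1}^{3} x_{ij} \le C_j, && j = 1,\dots,5 \\
& x_{ij} \ge 0 .
\end{aligned}
\]
```

The first constraint places all of each center's data, and the second keeps each provider within its package limit. A price increase by a provider corresponds to raising $s_j$ or $t_j$ and solving again.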



A cloud service provider (CSP) provisions resources on demand, as and when the user requires them to execute jobs in the cloud environment. The CSP does this in both a static and a dynamic manner. In providing resources, the CSP must also consider various other factors, chief among them the Service Level Agreement (SLA), which is normally signed by the user and the cloud service provider at the inception of the service. Many algorithms are used to allocate resources to users in a cloud environment. The proposed algorithm aims to reduce the energy consumed in executing jobs in the cloud environment. It accounts for the energy used to execute the various jobs by increasing the number of virtual machines that run on a single physical host. There is no rule of thumb for the number of virtual machines to run on a single host; it can be derived from the amount of space and speed required, along with the time needed to execute the job on a virtual machine. A single physical system might host 10 virtual machines, or even 20. If the number is calculated by the equation, however, the result matches the threshold capacity of the physical system exactly [1]. If more physical systems are used, each executing fewer virtual machines, energy consumption will be very high. The algorithm therefore not only helps to calculate the number of virtual machines per physical system but also reduces energy consumption, since fewer physical systems are needed [2].
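The abstract does not give its equation, so the following is only a hedged sketch of one way to bound the number of virtual machines per host: take the tightest of the CPU, memory and disk limits. The function names, resource dimensions and sample figures are all assumptions, and execution time is left out.

```python
import math


def vms_per_host(host_mips, host_ram_gb, host_disk_gb,
                 vm_mips, vm_ram_gb, vm_disk_gb):
    """Largest VM count that fits within every resource limit of one host."""
    return min(host_mips // vm_mips,
               host_ram_gb // vm_ram_gb,
               host_disk_gb // vm_disk_gb)


def hosts_needed(total_vms, per_host):
    """Number of physical hosts required when each is packed to its limit."""
    if per_host <= 0:
        raise ValueError("a host cannot fit even one VM of this size")
    return math.ceil(total_vms / per_host)


# Hypothetical host: 40000 MIPS, 64 GB RAM, 2000 GB disk.
# Hypothetical VM:    2000 MIPS,  4 GB RAM,  100 GB disk.
per_host = vms_per_host(40000, 64, 2000, 2000, 4, 100)
packed_hosts = hosts_needed(40, per_host)   # hosts filled to capacity
sparse_hosts = hosts_needed(40, 10)         # only 10 VMs placed per host
```

With these figures memory is the limiting resource, and packing hosts to capacity uses fewer machines than placing 10 VMs on each, which is the effect the abstract relies on to reduce energy use.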



2016 ◽  
pp. 2076-2095
Author(s):  
Abhishek Majumder ◽  
Sudipta Roy ◽  
Satarupa Biswas

Cloud is considered the future of Information Technology. Users can utilize the cloud on a pay-as-you-use basis. However, many organizations are cautious about adopting cloud computing because of concerns about the security of the stored data, so issues related to data security in the cloud have become vital. Data security involves encrypting the data and ensuring that suitable policies are imposed for sharing it. Several data security issues need to be addressed: data integrity, data intrusion, service availability, confidentiality and non-repudiation. Many schemes have been proposed for ensuring data security in the cloud environment, but the existing schemes fall short of addressing all of these issues. In this chapter, a new Third Party Auditor based scheme is proposed for secure storage and retrieval of a client's data to and from the cloud service provider. The scheme is analysed and compared with some of the existing schemes with respect to these security issues. The analysis and comparison show that the proposed scheme performs better than the existing schemes.



Author(s):  
Minakshi Sharma ◽  
Rajneesh Kumar ◽  
Anurag Jain

Cloud load balancing is performed to sustain services in the cloud environment while meeting quality of service (QoS) parameters. An efficient load balancing algorithm should be based on better optimization of these QoS parameters, which results in efficient scheduling. Most existing load balancing algorithms consider either response time or resource utilization constraints, but an efficient algorithm must consider the perspectives of both the user and the cloud service provider. This article presents a load balancing strategy that efficiently allocates tasks to virtualized resources to achieve maximum resource utilization in minimum response time. The proposed approach, join minimum loaded queue (JMLQ), is based on the existing join idle queue (JIQ) model, modified by replacing the idle servers in the I-queues with servers that have one task in their execution list. Simulation results in CloudSim verify that the proposed approach maximizes resource utilization and reduces response time in comparison with its other variants.
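The general idea of sending each task to the least-loaded server can be sketched as follows. This is a deliberate simplification and not the authors' JMLQ: it has no I-queues or dispatchers, tasks never complete, and the function and task names are hypothetical.

```python
import heapq


def dispatch_least_loaded(tasks, num_servers):
    """Assign each task to the server that currently has the fewest tasks.

    Ties are broken by the lower server id. Task completion is not modelled,
    so server loads only ever grow in this sketch.
    """
    # Min-heap of (current load, server id).
    heap = [(0, server) for server in range(num_servers)]
    heapq.heapify(heap)
    assignment = {}
    for task in tasks:
        load, server = heapq.heappop(heap)
        assignment[task] = server
        heapq.heappush(heap, (load + 1, server))
    return assignment


# Five hypothetical tasks spread over three servers.
tasks = ["t1", "t2", "t3", "t4", "t5"]
assignment = dispatch_least_loaded(tasks, 3)
```

A faithful JIQ or JMLQ model would instead have servers report themselves to a dispatcher's I-queue when their load drops to the qualifying level, which avoids querying every server on each arrival.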



2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Ohmin Kwon ◽  
Dongyoung Koo ◽  
Yongjoo Shin ◽  
Hyunsoo Yoon

With the popularization of cloud services, multiple users can easily share and update their data through cloud storage. Audit mechanisms have been proposed to ensure data integrity and consistency in cloud storage. However, existing approaches have security vulnerabilities and incur high computational overhead. This paper proposes a secure and efficient audit mechanism for dynamic shared data in cloud storage. The proposed scheme prevents a malicious cloud service provider from deceiving an auditor. Moreover, it introduces a new index table management method and reduces the auditing cost by employing less complex operations. We prove resistance against several attacks and show lower computation cost and shorter auditing time compared with conventional approaches. The results show that the proposed scheme is secure and efficient for cloud storage services that manage dynamic shared data.


