dc.contributor.advisor: Tan, Chiu C.
dc.creator: Wang, Yufeng
dc.date.accessioned: 2020-11-05T19:50:35Z
dc.date.available: 2020-11-05T19:50:35Z
dc.date.issued: 2015
dc.identifier.other: 931912260
dc.identifier.uri: http://hdl.handle.net/20.500.12613/4011
dc.description.abstract: Cloud-based backup storage is becoming increasingly popular. Deduplication is a key technique that reduces the transmission and storage overhead of backing up large datasets by identifying multiple copies of redundant data. Elasticity is the ability to scale computing resources such as memory on demand, and is one of the main advantages of cloud computing services. With the increasing popularity of cloud-based storage, it is natural that more deduplication-based storage systems will migrate to the cloud. Existing deduplication systems, however, do not adequately take advantage of elasticity. In this thesis, we illustrate how to use elasticity to improve deduplication-based systems, and propose EAD (elasticity-aware deduplication), an indexing algorithm that uses the ability to dynamically increase memory resources to improve overall deduplication performance. Our experimental results indicate that EAD detects more than 98% of all duplicate data while consuming less than 5% of the expected memory space. It also achieves four times the deduplication efficiency of the state-of-the-art sampling technique while using less than half the memory. Furthermore, as data grows rapidly in data centers, a single storage node can no longer provide the expected throughput and capacity. Building deduplication clusters is considered a promising strategy for relieving this single-node bottleneck. However, deduplication depends on how much the system knows about previously stored data. A single-node system has all of this information and can detect duplicate data locally, whereas a storage node in a cluster-based system has no knowledge of the data held on other nodes.
It is nontrivial to route data intelligently enough that the system achieves deduplication performance comparable to that of a single-node system at a trivial cost. We therefore propose an elastic data routing strategy that aims to achieve deduplication performance comparable to the state of the art while requiring far fewer computational resources. Going a step further, deduplication as currently adopted by cloud backup providers is vulnerable to side-channel attacks. Traditional defenses in cloud computing can prevent such attacks, but cannot be used together with deduplication. We therefore explore the impact of encryption on data uploads to the cloud and propose a solution for cloud-based backup services that combines deduplication and encryption to provide both security and high bandwidth and efficiency. Extensive experiments on a real-world dataset show that our solution incurs a small overhead compared to native deduplication while offering strong security protections.
dc.format.extent: 87 pages
dc.language.iso: eng
dc.publisher: Temple University. Libraries
dc.relation.ispartof: Theses and Dissertations
dc.rights: IN COPYRIGHT- This Rights Statement can be used for an Item that is in copyright. Using this statement implies that the organization making this Item available has determined that the Item is in copyright and either is the rights-holder, has obtained permission from the rights-holder(s) to make their Work(s) available, or makes the Item available under an exception or limitation to copyright (including Fair Use) that entitles it to make the Item available.
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Computer Science
dc.title: Efficient and Secure Deduplication for Cloud-based Backups
dc.type: Text
dc.type.genre: Thesis/Dissertation
dc.contributor.committeemember: Wu, Jie, 1961-
dc.contributor.committeemember: Shi, Justin Y.
dc.description.department: Computer and Information Science
dc.relation.doi: http://dx.doi.org/10.34944/dspace/3993
dc.ada.note: For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu
dc.description.degree: M.S.
refterms.dateFOA: 2020-11-05T19:50:35Z


Files in this item

Name: TETDEDXWang-temple-0225M-12053.pdf
Size: 1.326Mb
Format: PDF
