    Efficient and Secure Deduplication for Cloud-based Backups

    Name:
    TETDEDXWang-temple-0225M-12053.pdf
    Size:
    1.326Mb
    Format:
    PDF
    Genre
    Thesis/Dissertation
    Date
    2015
    Author
    Wang, Yufeng
    Advisor
    Tan, Chiu C.
    Committee member
    Wu, Jie, 1961-
    Shi, Justin Y.
    Department
    Computer and Information Science
    Subject
    Computer Science
    Permanent link to this record
    http://hdl.handle.net/20.500.12613/4011
    
    DOI
    http://dx.doi.org/10.34944/dspace/3993
    Abstract
    Backup storage based on cloud services is becoming increasingly popular. Deduplication is a key technique that reduces the transmission and storage overhead of backing up large datasets by identifying multiple copies of redundant data. Elasticity is the ability to scale computing resources such as memory on demand, and is one of the main advantages of using cloud computing services. With the increasing popularity of cloud-based storage, it is natural that more deduplication-based storage systems will be migrated to the cloud. Existing deduplication systems, however, do not adequately take advantage of elasticity. In this thesis, we illustrate how to use elasticity to improve deduplication-based systems, and propose EAD (elasticity-aware deduplication), an indexing algorithm that uses the ability to dynamically increase memory resources to improve overall deduplication performance. Our experimental results indicate that EAD detects more than 98% of all duplicate data while consuming less than 5% of the expected memory space. It also achieves four times the deduplication efficiency of the state-of-the-art sampling technique at less than half the memory cost. Furthermore, as data grows rapidly in data centers, a single storage node can no longer provide the expected throughput and capacity. Building deduplication clusters is considered a promising strategy to relieve this single-node bottleneck. However, deduplication depends on how much the system knows about previously stored data. A single-node system holds all such information and can detect duplicate data locally, whereas a storage node in a cluster-based system has no knowledge of the data stored on other nodes. It is nontrivial to route data intelligently enough that the system delivers deduplication performance comparable to that of a single-node system at low cost. Thus, we propose an elastic data routing strategy that aims to achieve deduplication performance comparable to the state of the art while requiring far fewer computational resources. Going a step further, deduplication as currently adopted by cloud backup providers is vulnerable to side-channel attacks. Traditional defenses in cloud computing can prevent such attacks, but cannot be used together with deduplication. Therefore, we explore the impact of encryption on data uploads to the cloud and propose a solution for cloud-based backup services that combines deduplication and encryption to provide security as well as high bandwidth and efficiency. Extensive experiments on a real-world dataset show that our solution incurs a small overhead compared to native deduplication while offering strong security protections.
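    As a rough illustration of the general deduplication idea the abstract refers to (identifying redundant copies of data so they are stored only once), the following minimal Python sketch splits input into fixed-size chunks and keeps a full in-memory fingerprint index. It is not the EAD algorithm or the routing strategy from the thesis; the chunk size, the function names `deduplicate` and `restore`, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4096  # hypothetical fixed chunk size in bytes (an assumption)


def deduplicate(data: bytes, index: Dict[str, bytes],
                chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Split data into fixed-size chunks and store each unseen chunk once.

    Returns a "recipe": the ordered list of chunk fingerprints needed
    to rebuild the original data from the index.
    """
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint not in index:
            # First time this content is seen: store it.
            index[fingerprint] = chunk
        # Duplicate chunks only add a reference, not another copy.
        recipe.append(fingerprint)
    return recipe


def restore(recipe: List[str], index: Dict[str, bytes]) -> bytes:
    """Rebuild the original data by concatenating the referenced chunks."""
    return b"".join(index[fp] for fp in recipe)


if __name__ == "__main__":
    index: Dict[str, bytes] = {}
    data = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE + b"A" * CHUNK_SIZE
    recipe = deduplicate(data, index)
    print(len(recipe), "chunks referenced,", len(index), "chunks stored")
    assert restore(recipe, index) == data
```

    In this sketch every fingerprint is held in memory, so the index grows with the amount of unique data. That memory cost is the kind of pressure the abstract describes sampling techniques and an elasticity-aware index as trying to reduce.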
    Collections
    Theses and Dissertations
