    ACCELERATING DNN INFERENCE AND TRAINING IN DISTRIBUTED SYSTEMS

Name: Duan_temple_0225E_14951.pdf
Embargo: 2023-08-11
Size: 2.34 MB
Format: PDF
Genre: Thesis/Dissertation
Date: 2022
Author: Duan, Yubin
Advisor: Wu, Jie, 1961-
Committee members: Gao, Hongchang; Wang, Ning; Ji, Bo, 1982-
Department: Computer and Information Science
Subject: Computer science
Permanent link: http://hdl.handle.net/20.500.12613/7993
DOI: http://dx.doi.org/10.34944/dspace/7965
    Abstract
Deep Neural Network (DNN) models have been widely deployed in a variety of applications. To achieve better performance, DNN models have become increasingly complex, which leads to extremely long training times. Although DNN inference typically runs only a single round of forward propagation, the inference time of large DNN models is still intolerable for time-sensitive applications on mobile devices. We propose to reduce both DNN inference and training time in mobile distributed systems.

To accelerate DNN inference for mobile cloud computing, existing research follows the cooperative inference approach, in which part of the inference workload is offloaded from mobile devices to cloud servers. This cloud/edge offloading approach exploits the collaboration between local and remote computation resources to speed up DNN inference. We propose to jointly consider the partition and scheduling of multiple DNNs. Our objective is to reduce the makespan, which includes both communication and computation latencies. In our DNN offloading pipeline, communication operations run in parallel with computation operations. To reduce the overall inference latency, we formulate the offloading pipeline scheduling problem and discuss DNN partition and/or scheduling strategies for chain-, tree-, and DAG-structured DNN models. A prototype of our offloading scheme is implemented and evaluated on a real-world testbed.

We also study distributed training of DNN models with the objective of minimizing the overall training time, and investigate several distributed training frameworks. First, we investigate the parameter server framework and improve its scheduler design to reduce inter-machine communication costs and training time; a two-step heuristic algorithm that balances the data and parameter allocation among training devices is presented. Second, we analyze the scheduler design for Spark, a unified data analytics engine. We formulate a DAG shop scheduling problem to optimize the execution pipeline of Spark jobs, and propose a heuristic contention-free scheduler and a reinforcement-learning-based scheduler. Finally, we extend pipeline parallelism to train DNN models on mobile devices with heterogeneous computation resources, and present a clustering algorithm to organize the heterogeneous training devices. Evaluation results on real-world datasets show that our proposed methods efficiently speed up the training process.
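The offloading pipeline described above overlaps communication with computation, which for chain-structured models resembles a two-stage flow shop: stage one is local compute plus uploading the cut-point activation, stage two is cloud compute. Below is a minimal Python sketch of that idea under stated assumptions; the toy DNNs, their per-layer timings, and the use of Johnson's rule for job ordering are all illustrative, not the dissertation's actual partition and scheduling algorithms.

from itertools import product

# Hypothetical per-layer timings for two toy chain DNNs (all numbers assumed):
# local compute per layer, cloud compute per layer, and upload time of the
# activation if we cut after that layer.
DNNS = {
    "dnn_a": {"local": [4, 3, 5], "cloud": [1, 1, 2], "upload": [6, 2, 1]},
    "dnn_b": {"local": [2, 6, 4], "cloud": [1, 2, 1], "upload": [5, 3, 2]},
}

def stage_times(dnn, cut):
    # Stage 1: compute layers [0, cut) locally, then upload the cut activation.
    s1 = sum(dnn["local"][:cut]) + dnn["upload"][cut - 1]
    # Stage 2: compute the remaining layers [cut, n) on the cloud.
    s2 = sum(dnn["cloud"][cut:])
    return s1, s2

def makespan(jobs):
    # Two-stage pipeline: the uplink and the cloud each serve one job at a time.
    t1 = t2 = 0.0
    for s1, s2 in jobs:
        t1 += s1                 # upload of this job finishes
        t2 = max(t2, t1) + s2    # cloud starts once data arrived and it is free
    return t2

names = list(DNNS)
best = None
for cuts in product(*(range(1, len(DNNS[n]["local"]) + 1) for n in names)):
    jobs = [stage_times(DNNS[n], c) for n, c in zip(names, cuts)]
    # Johnson's rule, optimal for two-machine flow shops: jobs with s1 < s2
    # first by increasing s1, then jobs with s1 >= s2 by decreasing s2.
    order = sorted(jobs, key=lambda j: (j[0] >= j[1], j[0] if j[0] < j[1] else -j[1]))
    ms = makespan(order)
    if best is None or ms < best[0]:
        best = (ms, dict(zip(names, cuts)))

print(f"best makespan: {best[0]} at partition points {best[1]}")

With the toy numbers above, the search trades extra local compute against a smaller upload at each candidate cut, then orders the jobs so the uplink and the cloud stay busy at the same time.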
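The final contribution organizes heterogeneous mobile devices for pipeline-parallel training. One plausible flavor of such clustering, sketched below, is a greedy longest-processing-time heuristic that balances aggregate throughput across stages, since the slowest stage bounds end-to-end pipeline throughput. The device names, throughput figures, and the greedy heuristic itself are assumptions for illustration, not the dissertation's clustering algorithm.

# Assumed measured throughput of each device (samples/sec).
devices = {"phone_a": 12.0, "phone_b": 9.5, "tablet_c": 20.0,
           "phone_d": 7.0, "laptop_e": 35.0, "phone_f": 11.0}

def cluster(devices, num_stages):
    """Greedily assign the fastest remaining device to the slowest stage."""
    stages = [{"members": [], "throughput": 0.0} for _ in range(num_stages)]
    for name, tput in sorted(devices.items(), key=lambda kv: -kv[1]):
        slowest = min(stages, key=lambda s: s["throughput"])
        slowest["members"].append(name)
        slowest["throughput"] += tput
    return stages

for i, stage in enumerate(cluster(devices, num_stages=3)):
    print(f"stage {i}: {stage['members']} (total {stage['throughput']:.1f} samples/s)")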
    ADA compliance
For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu.
Collections: Theses and Dissertations

