Show simple item record

dc.contributor.advisor: Wu, Jie, 1961-
dc.creator: Duan, Yubin
dc.date.accessioned: 2022-08-15T18:57:16Z
dc.date.available: 2022-08-15T18:57:16Z
dc.date.issued: 2022
dc.identifier.uri: http://hdl.handle.net/20.500.12613/7993
dc.description.abstract: Deep Neural Network (DNN) models have been widely deployed in a variety of applications. To achieve better performance, DNN models have become increasingly complex, which leads to extremely long training times. Although DNN inference typically runs only a single round of forward propagation through the model, the inference time of large DNN models is still intolerable for time-sensitive applications on mobile devices. We propose to reduce DNN inference and training time in mobile distributed systems. To accelerate DNN inference for mobile cloud computing, existing research follows the cooperative inference approach, in which part of the inference workload is offloaded from mobile devices to cloud servers. This cloud/edge offloading approach exploits the collaboration between local and remote computation resources, and the collaboration between mobile devices and the cloud server can speed up DNN inference. We propose to jointly consider the partition and scheduling of multiple DNNs. Our objective is to reduce the makespan, which includes both communication and computation latencies. In our DNN offloading pipeline, communication operations run in parallel with computation operations. To reduce the overall inference latency, we formulate the offloading pipeline scheduling problem and discuss DNN partition and/or scheduling strategies for chain-, tree-, and DAG-structured DNN models. A prototype of our offloading scheme is implemented and evaluated on a real-world testbed. Moreover, we discuss distributed training of DNN models with the objective of minimizing the overall training time, and investigate several distributed training frameworks. First, we investigate the parameter server framework and improve its scheduler design to reduce inter-machine communication costs and training time; a two-step heuristic algorithm that balances the data and parameter allocation among training devices is presented. Second, we analyze the scheduler design for Spark, a unified data analytics engine. We formulate a DAG shop scheduling problem to optimize the execution pipeline of Spark jobs, and propose a heuristic contention-free scheduler and a reinforcement-learning-based scheduler. Finally, we extend pipeline parallelism to train DNN models in a distributed manner on mobile devices with heterogeneous computation resources; a clustering algorithm is presented to organize the heterogeneous training devices. Evaluation results on real-world datasets show that our proposed methods can efficiently speed up the training process.
dc.format.extent: 195 pages
dc.language.iso: eng
dc.publisher: Temple University. Libraries
dc.relation.ispartof: Theses and Dissertations
dc.rights: IN COPYRIGHT - This Rights Statement can be used for an Item that is in copyright. Using this statement implies that the organization making this Item available has determined that the Item is in copyright and either is the rights-holder, has obtained permission from the rights-holder(s) to make their Work(s) available, or makes the Item available under an exception or limitation to copyright (including Fair Use) that entitles it to make the Item available.
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Computer science
dc.title: ACCELERATING DNN INFERENCE AND TRAINING IN DISTRIBUTED SYSTEMS
dc.type: Text
dc.type.genre: Thesis/Dissertation
dc.contributor.committeemember: Gao, Hongchang
dc.contributor.committeemember: Wang, Ning
dc.contributor.committeemember: Ji, Bo, 1982-
dc.description.department: Computer and Information Science
dc.relation.doi: http://dx.doi.org/10.34944/dspace/7965
dc.ada.note: For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu
dc.description.degree: Ph.D.
dc.identifier.proqst: 14951
dc.date.updated: 2022-08-11T22:09:04Z
dc.embargo.lift: 08/11/2023
dc.identifier.filename: Duan_temple_0225E_14951.pdf
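
The abstract above describes an offloading pipeline in which each DNN is split between a mobile device and a cloud server, and communication overlaps with computation across jobs to reduce the makespan. Below is a minimal, illustrative sketch of that pipelining idea for chain-structured DNNs only, assuming a simple three-stage model (local compute, upload, server compute) and a brute-force search over split points. The timing model, function names, and example numbers are assumptions made for illustration; they are not the dissertation's algorithms or measurements.

```python
# Illustrative sketch (assumed model, not the dissertation's method):
# each chain DNN job is split into (local compute, upload, server compute),
# and jobs are pipelined so the link and the server stay busy while the
# device works on later jobs.

from itertools import product

def makespan(jobs):
    """jobs: list of (local_time, upload_time, server_time) tuples.
    The device, the wireless link, and the server each handle one job at a
    time, so the schedule follows a three-stage flow-shop recurrence."""
    device = link = server = 0.0
    for local, upload, remote in jobs:
        device = device + local              # local layers run back to back on the device
        link = max(device, link) + upload    # upload waits for the local part and a free link
        server = max(link, server) + remote  # server waits for the upload and the previous job
    return server

def job_profile(local_cost, server_cost, out_size, bandwidth, split):
    """Split a chain DNN after layer `split`: layers [0, split) run locally,
    the activation out_size[split] is uploaded, the rest runs on the server."""
    return (sum(local_cost[:split]),
            out_size[split] / bandwidth,
            sum(server_cost[split:]))

def best_schedule(profiles, bandwidth):
    """Brute-force every combination of split points (fine for tiny examples)."""
    choices = [range(len(p["local"]) + 1) for p in profiles]
    best = None
    for splits in product(*choices):
        jobs = [job_profile(p["local"], p["server"], p["out"], bandwidth, k)
                for p, k in zip(profiles, splits)]
        span = makespan(jobs)
        if best is None or span < best[0]:
            best = (span, splits)
    return best

# Tiny made-up example: two 3-layer chain DNNs. Costs are in ms; out[k] is the
# data (KB) uploaded if the split is placed after layer k (out[0] = raw input).
profiles = [
    {"local": [4, 6, 9], "server": [1, 2, 3], "out": [80, 40, 20, 1]},
    {"local": [3, 5, 7], "server": [1, 1, 2], "out": [60, 30, 15, 1]},
]
print(best_schedule(profiles, bandwidth=10.0))  # -> (makespan, split point per DNN)
```

The flow-shop recurrence in makespan() captures the effect the abstract highlights: while the server processes one job, the link can upload the next job's activation and the device can compute a later job's local layers, so communication and computation overlap instead of adding up.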


Files in this item

Name: Duan_temple_0225E_14951.pdf
Size: 2.340Mb
Format: PDF

This item appears in the following Collection(s): Theses and Dissertations
