Show simple item record

dc.contributor.advisor: Ling, Haibin
dc.creator: Yang, Fan
dc.date.accessioned: 2021-01-18T20:12:14Z
dc.date.available: 2021-01-18T20:12:14Z
dc.date.issued: 2020
dc.identifier.uri: http://hdl.handle.net/20.500.12613/4716
dc.description.abstract: Object perception, a fundamental task in computer vision, has a broad range of real-world applications, such as self-driving, industrial defect inspection, and intelligent agriculture. Numerous studies have sought to advance object perception. In particular, owing to the powerful feature learning and representation ability of deep learning, object perception algorithms have achieved significant progress. In this dissertation, I first introduce the definition of object perception and its three subtasks: object detection, pose estimation, and object segmentation. I then present our work on each of the three subtasks. Object detection: we present two works, clustered object detection in aerial images (ClusDet) and dually supervised feature pyramid for object detection and segmentation (DSFPN). ClusDet is designed to leverage the prior that objects in aerial images (especially in traffic scenarios) tend to cluster at different scales. Compared with the even-cropping method, ClusDet achieves superior precision with a lower computational load. DSFPN is proposed to alleviate the gradient degradation or vanishing problem in the feature pyramid network (FPN) for object detection and segmentation. In particular, we note that the performance of two-stage detectors does not consistently increase with the growing complexity of the backbone network, which is consistent with the conclusion in "Deep Residual Learning for Image Recognition". To mitigate the problem, we propose adding an extra supervision signal on the bottom-up path of the FPN during training to enhance the gradient information and thereby facilitate model training. Pose estimation: a robust dynamic fusion (RDF) algorithm is proposed to deal with noisy modalities in patient body modeling. In particular, for patient body modeling, an RGB camera cannot provide sufficient information because the body is covered by a blanket or loose clothing. In this case, multi-modality (e.g., RGB, thermal, depth) sensors are required to acquire complementary information. However, different applications may need different sensors, and training a model per application is labor-intensive and time-consuming. In addition, multi-modality images may be subject to various kinds of noise in deployment, so that the trained model fails to work precisely. To deal with these issues, we propose RDF in conjunction with a dynamic training strategy to adaptively suppress the features from noisy modalities, so that the model can be trained once and deployed with any of the modalities. Object segmentation: the object here is a crack. We propose a feature pyramid and hierarchical boosting network (FPHBN) for pavement crack detection. Specifically, cracks in pavement have various scales (widths). Based on this characteristic, we introduce a feature pyramid architecture that uses the inherent hierarchy of deep convolutional networks (DConvNets) to construct multi-scale features for multi-scale cracks. Besides, the layers of the DConvNets are not independent. To leverage this dependency, we design a hierarchical boosting module that reweights samples via the prediction from the adjacent layer. With the benefit of the boosting module, the proposed network can dynamically pay more attention to hard samples.
dc.format.extent: 127 pages
dc.language.iso: eng
dc.publisher: Temple University. Libraries
dc.relation.ispartof: Theses and Dissertations
dc.rights: IN COPYRIGHT- This Rights Statement can be used for an Item that is in copyright. Using this statement implies that the organization making this Item available has determined that the Item is in copyright and either is the rights-holder, has obtained permission from the rights-holder(s) to make their Work(s) available, or makes the Item available under an exception or limitation to copyright (including Fair Use) that entitles it to make the Item available.
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Computer science
dc.title: DEEP LEARNING BASED OBJECT PERCEPTION ALGORITHM AND APPLICATION
dc.type: Text
dc.type.genre: Thesis/Dissertation
dc.contributor.committeemember: Shi, Justin Y.
dc.contributor.committeemember: Ji, Bo
dc.contributor.committeemember: Wu, Ziyan
dc.contributor.committeemember: Ling, Haibin
dc.description.department: Computer and Information Science
dc.relation.doi: http://dx.doi.org/10.34944/dspace/4698
dc.ada.note: For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu
dc.description.degree: Ph.D.
dc.identifier.proqst: 14284
dc.date.updated: 2021-01-14T17:05:42Z
refterms.dateFOA: 2021-01-18T20:12:15Z
dc.identifier.filename: yang_temple_0225E_14284.pdf


Files in this item

Name: yang_temple_0225E_14284.pdf
Size: 59.55Mb
Format: PDF
