DEEP LEARNING BASED OBJECT PERCEPTION ALGORITHM AND APPLICATION
Yang, Fan
Genre
Thesis/Dissertation
Date
2020
Department
Computer and Information Science
DOI
http://dx.doi.org/10.34944/dspace/4698
Abstract
Object perception, a fundamental task in computer vision, has a broad range of
applications in the real world, such as self-driving, industrial defect
inspection, and intelligent agriculture. Numerous works have sought to advance
object perception. In particular, owing to the powerful feature learning and
representation ability of deep learning, object perception algorithms have
achieved significant progress. In this dissertation, I first introduce the
definition of object perception and its three subtasks: object detection, pose
estimation, and object segmentation; I then present our work on each of the
three subtasks in turn.
Object detection: we present two works, clustered object detection in
aerial images (ClusDet) and a dually supervised feature pyramid for object
detection and segmentation (DSFPN). ClusDet is designed to leverage the prior
that objects in aerial images (especially in traffic scenarios) tend to cluster
at different scales. Compared with the even cropping method, ClusDet
achieves superior precision with a lower computational load. DSFPN is
proposed to alleviate the gradient degradation or vanishing problem in the
feature pyramid network (FPN) for object detection and segmentation. In
particular, we note that the performance of two-stage detectors does not
consistently increase with the growing complexity of the backbone network,
which is consistent with the conclusion in "Deep Residual Learning for Image
Recognition". To mitigate the problem, we propose adding extra supervision
signals on the bottom-up path of the FPN in the training phase to enhance the
gradient information and thereby facilitate model training.
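
The clustering prior behind ClusDet can be illustrated with a small sketch. This is not the dissertation's method: it groups already-known boxes with a hand-written proximity rule, whereas the abstract does not say how ClusDet finds clusters. The names `cluster_regions`, `uniform_tile_count`, and the `gap` parameter are hypothetical and chosen only for illustration. The point is that cropping around clusters can produce far fewer crops than evenly tiling the image.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def union(a: Box, b: Box) -> Box:
    """Smallest box that contains both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def near(a: Box, b: Box, gap: float) -> bool:
    """True if the boxes overlap or lie within `gap` pixels of each other."""
    return (a[0] - gap <= b[2] and b[0] - gap <= a[2]
            and a[1] - gap <= b[3] and b[1] - gap <= a[3])


def cluster_regions(boxes: List[Box], gap: float) -> List[Box]:
    """Merge nearby boxes until no two remaining regions are near each other.

    Hypothetical stand-in for a cluster proposal step (assumption, not the
    author's algorithm).
    """
    regions = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(regions)):
            for j in range(i + 1, len(regions)):
                if near(regions[i], regions[j], gap):
                    regions[i] = union(regions[i], regions[j])
                    del regions[j]
                    changed = True
                    break
            if changed:
                break
    return regions


def uniform_tile_count(width: int, height: int, tile: int) -> int:
    """Number of crops produced by evenly tiling the image."""
    return math.ceil(width / tile) * math.ceil(height / tile)


boxes = [(10, 10, 20, 20), (22, 12, 30, 25), (200, 200, 210, 215)]
regions = cluster_regions(boxes, gap=5)
print(len(regions), "cluster crops vs", uniform_tile_count(1000, 1000, 250), "even crops")
```

In this toy example the three boxes collapse into two cluster crops, while even tiling of a 1000 x 1000 image with 250-pixel tiles would run the detector on sixteen crops.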
Pose estimation: a robust dynamic fusion (RDF) algorithm is proposed to
deal with noisy modalities in patient body modeling. In particular, for patient
body modeling, an RGB camera cannot provide sufficient information because
the body is covered with a blanket or loose clothing. In this case,
multi-modality (e.g., RGB, thermal, depth) sensors are required to acquire
complementary information. However, different applications may need different
sensors, and it is labor-intensive and time-consuming to train a model for each
application. In addition, multi-modality images may be corrupted by various
kinds of noise in deployment, so that the trained model fails to work precisely.
To deal with the aforementioned issues, we propose RDF in conjunction with a
dynamic training strategy to adaptively suppress the features from noisy
modalities, such that the model can be trained once and deployed with any of
the modalities.
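
One way to picture adaptive suppression of noisy modalities is a weighted fusion in which each available modality receives a reliability score and the scores are normalized with a softmax. The sketch below is only an assumption about the general shape of such a fusion, not the RDF algorithm itself: the function name `fuse`, the score values, and the use of a softmax are all hypothetical, and in a trained model the scores would be predicted rather than supplied by hand.

```python
import math
from typing import Dict, List, Tuple


def fuse(features: Dict[str, List[float]],
         scores: Dict[str, float]) -> Tuple[List[float], Dict[str, float]]:
    """Fuse per-modality feature vectors with softmax-normalized weights.

    `features` holds only the modalities that are available at run time, so
    the same function works with one, two, or three sensors (illustrative
    assumption). A low score pushes a modality's weight toward zero.
    """
    names = sorted(features)
    top = max(scores[n] for n in names)           # subtract max for stability
    exps = {n: math.exp(scores[n] - top) for n in names}
    total = sum(exps.values())
    weights = {n: exps[n] / total for n in names}
    dim = len(features[names[0]])
    fused = [sum(weights[n] * features[n][i] for n in names) for i in range(dim)]
    return fused, weights


features = {"rgb": [1.0, 0.0], "depth": [0.0, 1.0], "thermal": [1.0, 1.0]}
scores = {"rgb": 2.0, "depth": 2.0, "thermal": -5.0}  # thermal treated as noisy
fused, weights = fuse(features, scores)
print(weights)
```

Because the weights are computed only over the modalities passed in, calling `fuse` with a single modality gives that modality a weight of 1, which mirrors the goal of training once and deploying with whichever sensors are present.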
Object segmentation: the object here is a crack. We propose a feature
pyramid and hierarchical boosting network (FPHBN) for pavement crack
detection. Specifically, cracks in pavement have various scales (widths). Based
on this characteristic, we introduce a feature pyramid architecture that
utilizes the inherent hierarchy of deep convolutional networks (DConvNets) to
construct multi-scale features for multi-scale cracks. In addition, the layers
of a DConvNet are not independent. To leverage this dependency, we design a
hierarchical boosting module that reweights samples via the prediction from the
adjacent layer. With the benefit of the boosting module, the proposed network
can dynamically pay more attention to hard samples.
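
The reweighting idea can be sketched as a weighted binary cross-entropy in which each sample's weight comes from how badly the adjacent layer predicted it. The abstract does not give the weighting function, so the absolute-error weight used here, along with the names `bce` and `boosted_loss`, is an assumption made purely for illustration.

```python
import math
from typing import List, Tuple


def bce(p: float, y: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one prediction p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def boosted_loss(pred: List[float],
                 adjacent_pred: List[float],
                 target: List[float]) -> Tuple[float, List[float]]:
    """Mean BCE of `pred`, reweighted by the adjacent layer's error.

    Weight = |adjacent prediction - label| (hypothetical choice): samples the
    adjacent layer got wrong are "hard" and contribute more to this layer's
    loss.
    """
    weights = [abs(a - y) for a, y in zip(adjacent_pred, target)]
    losses = [w * bce(p, y) for w, p, y in zip(weights, pred, target)]
    return sum(losses) / len(losses), weights


target = [1.0, 0.0, 1.0]
adjacent_pred = [0.9, 0.1, 0.2]   # third sample is hard for the adjacent layer
pred = [0.8, 0.2, 0.5]
loss, weights = boosted_loss(pred, adjacent_pred, target)
print(loss, weights)
```

Here the third sample, which the adjacent layer misclassified, receives the largest weight, so the current layer's loss is dominated by the hard sample.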
ADA compliance
For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu