
dc.contributor.advisor: Tan, Chiu C.
dc.creator: Li, Xinyi
dc.date.accessioned: 2021-05-24T18:37:16Z
dc.date.available: 2021-05-24T18:37:16Z
dc.date.issued: 2021
dc.identifier.uri: http://hdl.handle.net/20.500.12613/6450
dc.description.abstract: Visual localization, or camera pose estimation, is the core of many computer vision and robotics tasks, with broad applications including robot navigation, autonomous driving and augmented reality. Camera pose estimation is the process of self-determining orientation and position with the aid of sequential information via image retrieval. As the key component in standard camera pose estimation pipelines, pose-graph optimization (PGO) involves iterative estimation of pair-wise relative camera poses and progressive optimization of the noisy global view-graph. Despite the proliferation of research addressing back-end optimization in multi-view structure-from-motion (SfM) systems, many challenges remain open. Firstly, canonical solvers carry a complexity of cubic order with regard to the input size and gradually slow down, failing to meet real-time requirements. Secondly, measurements of pair-wise relative camera poses are often noisy, yielding corrupted and erroneous edges in the view-graph and hence impairing the performance of both conventional and learning-based methods. Thirdly, direct regression of structure and motion with deep learning networks is prone to overfitting, hindering robustness and generality in real-world applications. In this dissertation, I will first introduce the framework, evaluation metrics and challenges for camera pose estimation, followed by a discussion of our work in which we propose a hybrid pipeline to dynamically update the 6-DOF camera poses on-the-fly. I will then present several works in which we address the robustness issues arising in existing multiple rotation averaging (MRA) approaches and demonstrate that, with a given connectivity on a view-graph, the error induced by outlier/noisy measurements is guaranteed to have a theoretical upper bound. I will conclude the discussion by presenting one of our recent works, in which we propose a GNN-based PGO scheme such that the MRA is conducted by adopting MPNN layers with a novel loss function embedding the $l_1$ MRA formulation. Finally, I will summarize our work and discuss potential future work on camera pose estimation in monocular SfM systems.
dc.format.extent: 149 pages
dc.language.iso: eng
dc.publisher: Temple University. Libraries
dc.relation.ispartof: Theses and Dissertations
dc.rights: IN COPYRIGHT - This Rights Statement can be used for an Item that is in copyright. Using this statement implies that the organization making this Item available has determined that the Item is in copyright and either is the rights-holder, has obtained permission from the rights-holder(s) to make their Work(s) available, or makes the Item available under an exception or limitation to copyright (including Fair Use) that entitles it to make the Item available.
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Computer science
dc.title: Robust 6-DOF Camera Relocalization in Multi-view Structure from Motion
dc.type: Text
dc.type.genre: Thesis/Dissertation
dc.contributor.committeemember: Ling, Haibin
dc.contributor.committeemember: Ji, Bo, 1982-
dc.contributor.committeemember: Souvenir, Richard
dc.contributor.committeemember: Li, Bin
dc.description.department: Computer and Information Science
dc.relation.doi: http://dx.doi.org/10.34944/dspace/6432
dc.ada.note: For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu
dc.description.degree: Ph.D.
dc.identifier.proqst: 14383
dc.date.updated: 2021-05-19T16:08:15Z
refterms.dateFOA: 2021-05-24T18:37:16Z
dc.identifier.filename: Li_temple_0225E_14383.pdf


Files in this item

Name: Li_temple_0225E_14383.pdf
Size: 13.72Mb
Format: PDF

