Robust 6-DOF Camera Relocalization in Multi-view Structure from Motion
Li, Xinyi
Genre
Thesis/Dissertation
Date
2021
Department
Computer and Information Science
DOI
http://dx.doi.org/10.34944/dspace/6432
Abstract
Visual localization, or camera pose estimation, is at the core of many computer vision and robotics tasks, with broad applications including robot navigation, autonomous driving, and augmented reality. Camera pose estimation is the process by which a camera determines its own orientation and position with the aid of sequential information obtained via image retrieval. As the key component of standard camera pose estimation pipelines, pose-graph optimization (PGO) involves iterative estimation of pairwise relative camera poses and progressive optimization of the noisy global view-graph. Despite the proliferation of research addressing back-end optimization in multi-view structure-from-motion (SfM) systems, many challenges remain open. First, canonical solvers have complexity of cubic order with respect to the input size, so they slow down as the input grows and fail to meet real-time requirements. Second, measurements of pairwise relative camera poses are often noisy, yielding corrupted and erroneous edges in the view-graph and hence impairing the performance of both conventional and learning-based methods. Third, direct regression of structure and motion with deep learning networks is prone to overfitting, hindering robustness and generality in real-world applications.
In this dissertation, I will first introduce the framework, evaluation metrics, and challenges of camera pose estimation, followed by a discussion of our work in which we propose a hybrid pipeline that dynamically updates the 6-DOF camera poses on the fly. I will then present several works in which we address the robustness issues arising in existing multiple rotation averaging (MRA) approaches and demonstrate that, for a given connectivity of the view-graph, the error caused by outlier and noisy measurements is guaranteed to have a theoretical upper bound. I will conclude the discussion by presenting one of our recent works, in which we propose a graph neural network (GNN)-based PGO scheme where MRA is conducted by message-passing neural network (MPNN) layers with a novel loss function that embeds the $l_1$ MRA formulation. Finally, I will summarize our work and discuss potential future work on camera pose estimation in monocular SfM systems.
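As a rough illustration of what an $l_1$ rotation averaging objective looks like, the following minimal sketch sums the unsquared angular residuals over the edges of a small view-graph. It is not taken from the dissertation: the function names, the convention that a relative measurement on edge (i, j) approximates R_j R_i^T, and the use of the geodesic angle as the distance are all assumptions made for this example, and the formulation used in the thesis may differ.

```python
import math

def rot_z(theta):
    """Rotation matrix about the z-axis by theta radians (toy helper)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Transpose of a 3x3 matrix (the inverse, for a rotation)."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def geodesic_angle(r):
    """Rotation angle of r in radians, i.e. its geodesic distance to identity."""
    trace = r[0][0] + r[1][1] + r[2][2]
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_angle)

def l1_mra_cost(absolute, relative):
    """Sum of angular residuals over all view-graph edges.

    absolute: dict mapping node i to its absolute rotation R_i.
    relative: dict mapping edge (i, j) to the measured relative rotation,
              assumed here to approximate R_j R_i^T.
    """
    total = 0.0
    for (i, j), measured in relative.items():
        predicted = matmul(absolute[j], transpose(absolute[i]))
        residual = matmul(transpose(predicted), measured)
        total += geodesic_angle(residual)
    return total

# Toy view-graph: three cameras rotated about z, with one corrupted edge.
absolute = {0: rot_z(0.0), 1: rot_z(0.1), 2: rot_z(0.3)}
relative = {
    (0, 1): rot_z(0.1),  # consistent measurement
    (1, 2): rot_z(0.2),  # consistent measurement
    (0, 2): rot_z(0.8),  # outlier: off by 0.5 rad from the true 0.3
}
cost = l1_mra_cost(absolute, relative)
```

In this toy graph the two consistent edges contribute essentially nothing and the corrupted edge contributes its full 0.5 rad error. Because the residual enters linearly rather than squared, a single outlier of this kind has less influence on the objective than it would under a least-squares cost, which is the usual motivation for $l_1$-style formulations.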
ADA compliance
For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu