• Graph-based Inference with Constraints for Object Detection and Segmentation

      Latecki, Longin; Ling, Haibin; Vucetic, Slobodan; Huang, Xiaolei (Temple University. Libraries, 2013)
      For many fundamental problems of computer vision, adopting a graph-based framework can be straightforward and very effective. In this thesis, I propose several graph-based inference methods tailored for different computer vision applications. The thesis begins with contour-based object detection. In particular, we propose a novel framework for contour-based object detection that replaces the Hough voting framework with dense subgraph inference. Compared to previous work, we propose a novel shape matching scheme suitable for partial matching of edge fragments. The shape descriptor has the same geometric units as shape context, but our shape representation is not histogram based. The key contribution is that the grouping of partial matching hypotheses into object detection hypotheses is formulated as maximum clique inference on a weighted graph. Consequently, each detection result not only identifies the location of the target object in the image, but also provides a precise location of its contours, since we transform a complete model contour to the image. Very competitive results on the ETHZ dataset, obtained in a purely shape-based framework, demonstrate that our method achieves not only accurate object detection but also precise contour localization on cluttered backgrounds. As with the grouping of partial matches in the contour-based method, many computer vision problems require discovering certain patterns in a large amount of data. One example is unsupervised video object segmentation, where we need to automatically identify the primary object and segment it out in every frame. We propose a novel formulation in which selecting object region candidates simultaneously in all frames is cast as finding a maximum weight clique in a weighted region graph. The selected regions are expected to have high objectness scores (unary potential) and to share similar appearance (binary potential). 
Since both unary and binary potentials are unreliable, we introduce two types of mutex (mutual exclusion) constraints on regions in the same clique: intra-frame and inter-frame constraints. Both types of constraints are expressed in a single quadratic form. An efficient algorithm is applied to compute the maximal weight cliques that satisfy the constraints. We apply our method to challenging benchmark videos and obtain very competitive results that outperform state-of-the-art methods. We also show that the same formulation, a maximum weight subgraph with mutex constraints, can be used to solve various computer vision problems, such as point matching, solving image jigsaw puzzles, and detecting objects using 3D contours.
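
To make the maximum weight subgraph formulation concrete, the following is a minimal sketch on a toy region graph. It is not the efficient algorithm referred to in the abstract: it enumerates subsets exhaustively, which is only feasible for a handful of nodes. The objective x^T A x (unary potentials on the diagonal of A, binary potentials off the diagonal) and the constraint x^T M x = 0 are assumptions about how the quadratic form might be written, and the function name, scores, and mutex pairs are hypothetical illustration values.

```python
from itertools import combinations

import numpy as np


def max_weight_subgraph_with_mutex(A, M):
    """Brute-force search over binary indicator vectors x.

    Maximizes x^T A x subject to x^T M x == 0, where M[i, j] = 1
    marks nodes i and j as mutually exclusive (assumed encoding).
    """
    n = A.shape[0]
    best_set, best_score = (), float("-inf")
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            x = np.zeros(n)
            x[list(subset)] = 1.0
            if x @ M @ x != 0:  # subset contains a mutex pair
                continue
            score = float(x @ A @ x)
            if score > best_score:
                best_set, best_score = subset, score
    return best_set, best_score


# Toy data (hypothetical): regions 0 and 1 lie in frame 0, regions 2 and 3 in frame 1.
unary = np.array([0.9, 0.5, 0.8, 0.6])  # objectness scores
A = np.diag(unary)
for i, j, sim in [(0, 2, 0.7), (0, 3, 0.1), (1, 2, 0.2), (1, 3, 0.9)]:
    A[i, j] = A[j, i] = sim  # appearance similarity across frames

# Intra-frame mutex: at most one region may be selected per frame.
M = np.zeros((4, 4))
for i, j in [(0, 1), (2, 3)]:
    M[i, j] = M[j, i] = 1.0

selected, score = max_weight_subgraph_with_mutex(A, M)
print(selected, round(score, 2))
```

In this toy setting the mutex matrix removes any subset that picks two regions from the same frame, so the search trades off objectness against cross-frame similarity among the remaining candidates. Because A is symmetric, each pairwise term is counted twice in x^T A x.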