The SVN address for the paper is:
svn://cessna.cs.umass.edu/Hough Object Tracking Paper
Abstract: This paper presents a method for extracting distinctive invariant features from
images that can be used to perform reliable matching between different views of
an object or scene. The features are invariant to image scale and rotation, and
are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination.
The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from
many images. This paper also describes an approach to using these features
for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor
algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for
consistent pose parameters. This approach to recognition can robustly identify
objects among clutter and occlusion while achieving near real-time performance.
Download
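The Hough-transform clustering step described in the abstract above can be sketched briefly: each feature match implies a similarity pose (translation, scale, rotation) of the model, and coarse binning of these poses lets geometrically consistent matches vote for the same object hypothesis. The bin widths and vote threshold below are illustrative choices, not the paper's exact values.

```python
import numpy as np
from collections import defaultdict

def pose_bin(model_feat, image_feat, loc_bin=50.0, ori_bin=np.pi / 6):
    """Quantise the similarity pose implied by one match into a hash key."""
    mx, my, ms, mo = model_feat          # model feature: x, y, scale, orientation
    ix, iy, is_, io = image_feat         # matching image feature
    scale = is_ / ms                     # predicted model-to-image scale
    rot = (io - mo) % (2 * np.pi)        # predicted rotation
    # Predicted translation of the model origin in image coordinates.
    tx, ty = ix - scale * mx, iy - scale * my
    return (int(tx // loc_bin), int(ty // loc_bin),
            int(np.round(np.log2(scale))), int(rot // ori_bin))

def hough_cluster(matches, min_votes=3):
    """Group matches whose implied poses fall into the same coarse bin."""
    bins = defaultdict(list)
    for m_feat, i_feat in matches:
        bins[pose_bin(m_feat, i_feat)].append((m_feat, i_feat))
    return [v for v in bins.values() if len(v) >= min_votes]

# Three matches consistent with a near-identity pose plus one gross outlier.
matches = [((10, 10, 1.0, 0.0), (12, 11, 1.0, 0.05)),
           ((40, 20, 1.0, 0.0), (41, 22, 1.0, 0.02)),
           ((25, 60, 1.0, 0.0), (26, 61, 1.0, 0.01)),
           ((30, 30, 1.0, 0.0), (300, 400, 4.0, 2.5))]
clusters = hough_cluster(matches)
```

The three consistent matches land in one bin and survive the vote threshold; the outlier's bin collects a single vote and is discarded.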
Abstract: An object recognition system has been developed that uses a
new class of local image features. The features are invariant
to image scaling, translation, and rotation, and partially invariant
to illumination changes and affine or 3D projection.
These features share similar properties with neurons in inferior
temporal cortex that are used for object recognition
in primate vision. Features are efficiently detected through
a staged filtering approach that identifies stable points in
scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales.
The keys are used as input to a nearest-neighbor indexing
method that identifies candidate object matches. Final verification of each match is achieved by finding a low-residual
least-squares solution for the unknown model parameters.
Experimental results show that robust object recognition
can be achieved in cluttered partially-occluded images with
a computation time of under 2 seconds.
Download
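The final verification step mentioned above can be sketched as a small linear least-squares problem: each matched point pair contributes two rows to an overdetermined system in the six affine parameters, and a low residual confirms the match set. The point values below are invented for illustration.

```python
import numpy as np

# Affine model: u = a*x + b*y + c,  v = d*x + e*y + f
def fit_affine(model_pts, image_pts):
    """Solve the overdetermined system A p = b for the six affine parameters."""
    A, b = [], []
    for (x, y), (u, v) in zip(model_pts, image_pts):
        A.append([x, y, 1, 0, 0, 0]); b.append(u)
        A.append([0, 0, 0, x, y, 1]); b.append(v)
    A, b = np.array(A, float), np.array(b, float)
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    residual = np.linalg.norm(A @ p - b)   # low residual => consistent matches
    return p, residual

# Points related by a pure translation of (5, -2): a low residual is expected.
model = [(0, 0), (10, 0), (0, 10), (10, 10)]
image = [(5, -2), (15, -2), (5, 8), (15, 8)]
params, res = fit_affine(model, image)
```

For this consistent toy input the solver recovers a = 1, c = 5, f = -2 with residual near machine precision; an inconsistent match set would show up as a large residual.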
Abstract: This paper approaches the problem of finding correspondences between
images in which there are large changes in viewpoint, scale and illumination. Recent work has shown that scale-space ‘interest points’ may
be found with good repeatability in spite of such changes. Furthermore,
the high entropy of the surrounding image regions means that
local descriptors are highly discriminative for matching. For descriptors at interest points to be robustly matched between images, they
must be as far as possible invariant to the imaging process.
In this work we introduce a family of features which use groups
of interest points to form geometrically invariant descriptors of image
regions. Feature descriptors are formed by resampling the image
relative to canonical frames defined by the points. In addition to robust
matching, a key advantage of this approach is that each match implies
a hypothesis of the local 2D (projective) transformation. This allows
us to immediately reject most of the false matches using a Hough
transform. We reject remaining outliers using RANSAC and the epipolar
constraint. Results show that dense feature matching can be achieved
in a few seconds of computation on 1GHz Pentium III machines.
Download
Abstract: The goal of this article is to review the state-of-the-art tracking methods, classify them into different
categories, and identify new trends. Object tracking, in general, is a challenging problem. Difficulties in tracking
objects can arise due to abrupt object motion, changing appearance patterns of both the object and the scene,
nonrigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is
usually performed in the context of higher-level applications that require the location and/or shape of the
object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of
a particular application. In this survey, we categorize the tracking methods on the basis of the object and
motion representations used, provide detailed descriptions of representative methods in each category, and
examine their pros and cons. Moreover, we discuss the important issues related to tracking including the use
of appropriate image features, selection of motion models, and detection of objects.
Download
Abstract: No feature-based vision system can work unless good
features can be identified and tracked from frame to
frame. Although tracking itself is by and large a solved
problem, selecting features that can be tracked well and
correspond to physical points in the world is still hard.
We propose a feature selection criterion that is optimal
by construction because it is based on how the tracker
works, and a feature monitoring method that can detect occlusions, disocclusions, and features that do not
correspond to points in the world. These methods are
based on a new tracking algorithm that extends previous Newton-Raphson style search methods to work
under affine image transformations. We test performance with several simulations and experiments.
Download
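The selection criterion this abstract refers to is widely known as the minimum-eigenvalue test: a window is trackable only if both eigenvalues of its 2x2 gradient matrix G = sum of [Ix^2, IxIy; IxIy, Iy^2] are large, since that matrix must be well conditioned for the Newton-Raphson style tracker to converge. A minimal numpy sketch; the window size and synthetic test image are illustrative assumptions.

```python
import numpy as np

def min_eig_score(img, x, y, half=2):
    """Smaller eigenvalue of the gradient matrix over a window at (x, y)."""
    # Take one extra pixel on each side so central differences are valid.
    win = img[y - half - 1:y + half + 2, x - half - 1:x + half + 2].astype(float)
    Iy, Ix = np.gradient(win)
    Ix, Iy = Ix[1:-1, 1:-1], Iy[1:-1, 1:-1]   # drop the one-sided border
    G = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    return np.linalg.eigvalsh(G)[0]           # eigenvalues come back ascending

# Synthetic image: a bright square on a black background.
img = np.zeros((30, 30))
img[10:20, 10:20] = 1.0

corner_score = min_eig_score(img, 10, 10)   # corner: both eigenvalues large
edge_score = min_eig_score(img, 15, 10)     # edge: one eigenvalue near zero
flat_score = min_eig_score(img, 3, 3)       # flat region: both near zero
```

Only the corner scores well: an edge has gradient energy in one direction only, so its gradient matrix is rank deficient and the tracker cannot localise along the edge.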
Abstract: This paper describes a novel multi-view matching frame-
work based on a new type of invariant feature. Our features are located at Harris corners in discrete scale-space
and oriented using a blurred local gradient. This defines a
rotationally invariant frame in which we sample a feature
descriptor, which consists of an 8×8 patch of bias/gain
normalised intensity values. The density of features in the
image is controlled using a novel adaptive non-maximal
suppression algorithm, which gives a better spatial distribution of features than previous approaches. Matching is
achieved using a fast nearest neighbour algorithm that in-
dexes features based on their low frequency Haar wavelet
coefficients. We also introduce a novel outlier rejection pro-
cedure that verifies a pairwise feature match based on a
background distribution of incorrect feature matches. Feature matches are refined using RANSAC and used in an
automatic 2D panorama stitcher that has been extensively
tested on hundreds of sample inputs.
Download
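The adaptive non-maximal suppression idea above is commonly formulated like this: each keypoint is given a suppression radius equal to its distance to the nearest clearly stronger keypoint, and the n keypoints with the largest radii are kept, which spreads features evenly instead of letting strong responses crowd one region. The robustness constant and the toy keypoints below are invented for illustration.

```python
import numpy as np

def anms(points, strengths, n, c_robust=0.9):
    """Return indices of the n keypoints with the largest suppression radii."""
    pts = np.asarray(points, float)
    s = np.asarray(strengths, float)
    radii = np.full(len(pts), np.inf)      # global maximum keeps radius = inf
    for i in range(len(pts)):
        stronger = s > s[i] / c_robust     # neighbours that clearly dominate i
        if np.any(stronger):
            d = np.linalg.norm(pts[stronger] - pts[i], axis=1)
            radii[i] = d.min()
    return np.argsort(-radii)[:n]

# A tight cluster of strong corners plus two isolated weaker ones:
# ANMS keeps the cluster's best point and both isolated points.
points = [(0, 0), (1, 0), (0, 1), (100, 100), (200, 0)]
strengths = [10.0, 8.0, 8.5, 5.0, 4.0]
kept = anms(points, strengths, 3)
```

Plain top-n selection by strength would keep all three clustered points and discard the isolated ones; the radius criterion reverses that.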
Abstract: In this paper, we describe the application of the novel SURF
(Speeded Up Robust Features) algorithm [1] for the recognition of objects
of art. For this purpose, we developed a prototype of a mobile interactive
museum guide consisting of a tablet PC that features a touchscreen and
a webcam. This guide recognises objects in museums based on images
taken by the visitor. Using different image sets of real museum objects,
we demonstrate that both the object recognition performance and
the speed of the SURF algorithm surpass the results obtained with
SIFT, its main contender.
Download
Abstract: This paper addresses the problem of automatic
construction of a hierarchical map from images. Our approach
starts from a large collection of omnidirectional images taken
at many locations in a building. First a low-level map is built
that consists of a graph in which relations between images are
represented. For this we use a metric based on visual landmarks
(SIFT features) and geometrical constraints. Then we use a
graph partitioning method to cluster nodes and in this way
construct the high-level map. Experiments on real data show
that meaningful higher and lower level maps are obtained,
which can be used for accurate localization and planning.
Download
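The abstract does not name the graph partitioning method used to build the high-level map; a standard choice for splitting a similarity graph in two is spectral bisection, thresholding the Fiedler vector (the eigenvector of the graph Laplacian with the second-smallest eigenvalue). A small numpy sketch on an invented adjacency matrix of SIFT-match counts.

```python
import numpy as np

def spectral_bisect(W):
    """Split nodes into two clusters by the sign of the Fiedler vector."""
    W = np.asarray(W, float)
    L = np.diag(W.sum(axis=1)) - W      # unnormalised graph Laplacian
    vals, vecs = np.linalg.eigh(L)      # eigenvalues in ascending order
    fiedler = vecs[:, 1]                # second-smallest eigenvalue's vector
    return fiedler >= 0                 # boolean cluster assignment

# Two "rooms": images 0-2 match each other strongly, images 3-5 likewise,
# with only one weak link (a doorway view) between the groups.
W = np.array([[0, 9, 8, 0, 0, 0],
              [9, 0, 7, 1, 0, 0],
              [8, 7, 0, 0, 0, 0],
              [0, 1, 0, 0, 9, 8],
              [0, 0, 0, 9, 0, 7],
              [0, 0, 0, 8, 7, 0]], float)
side = spectral_bisect(W)
```

The two densely connected groups end up on opposite sides of the cut, which is exactly the room-level structure a high-level map needs.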
Abstract: Face identification systems relying on local descriptors are
increasingly used because of their perceived robustness with
respect to occlusions and to global geometrical deformations.
Descriptors of this type – based on a set of oriented Gaussian
derivative filters – are used in our identification system. In this
paper, we explore a pose-invariant multiview face identification
system that does not use explicit geometrical information. The
basic idea of the approach is to find discriminant features to
describe a face across different views. A boosting procedure is
used to select features out of a large feature pool of local features
collected from the positive training examples. We describe
experiments on well-known, though small, face databases with
excellent recognition rate.
Download
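The boosting-based feature selection described above can be sketched in an AdaBoost style: each round picks the single feature (a decision stump on one column) with the lowest weighted error, then reweights the examples so later rounds focus on the mistakes. The stump threshold, data, and labels below are toy assumptions, not the paper's setup.

```python
import numpy as np

def select_features(X, y, rounds=2):
    """Return the feature indices chosen greedily by boosting."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                 # uniform example weights
    chosen = []
    for _ in range(rounds):
        best = None
        for j in range(d):
            if j in chosen:                 # each feature picked at most once
                continue
            thr = X[:, j].mean()            # crude stump threshold
            for sign in (1, -1):
                pred = np.where(sign * (X[:, j] - thr) > 0, 1, -1)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, pred)
        err, j, pred = best
        chosen.append(j)
        alpha = 0.5 * np.log((1 - err + 1e-9) / (err + 1e-9))
        w = w * np.exp(-alpha * y * pred)   # upweight misclassified examples
        w = w / w.sum()
    return chosen

# Feature 0 separates the two classes perfectly; feature 1 is noise.
X = np.array([[0.9, 0.2], [0.8, 0.9], [0.1, 0.8], [0.2, 0.1]])
y = np.array([1, 1, -1, -1])
picked = select_features(X, y)
```

The discriminative feature is selected first; the noisy one only enters once the better option is exhausted, which is the behaviour that makes boosting useful as a feature selector.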
Abstract: This paper proposes a novel method for estimating the
geospatial trajectory of a moving camera. The proposed
method uses a set of reference images with known GPS
(global positioning system) locations to recover the trajectory of a moving camera using geometric constraints. The
proposed method has three main steps. First, scale invariant feature transform (SIFT) features are detected and matched between the reference images and the video frames to calculate a weighted adjacency matrix (WAM) based on the number of SIFT matches. Second, using the estimated WAM, the
maximum matching reference image is selected for the current video frame, which is then used to estimate the relative
position (rotation and translation) of the video frame using
the fundamental matrix constraint. The relative position is
recovered up to a scale factor and a triangulation among
the video frame and two reference images is performed to
resolve the scale ambiguity. Third, an outlier rejection and
trajectory smoothing (using b-spline) post processing step
is employed. This is because the estimated camera locations may be noisy due to bad point correspondences or degenerate estimates of fundamental matrices. Results of recovering the camera trajectory are reported for real sequences.
Download
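The second step above, selecting the maximum-matching reference image per video frame, reduces to a row-wise argmax once the weighted adjacency matrix is a frames-by-references array of SIFT match counts. The counts and the reliability threshold below are invented for illustration.

```python
import numpy as np

# WAM[f, r] = number of SIFT matches between video frame f and reference r.
wam = np.array([[120,  15,   3],
                [ 40,  95,  10],
                [  5,  60,  80],
                [  2,  12, 140],
                [  4,   6,   3]])

best_ref = wam.argmax(axis=1)   # best-matching reference for each frame
best_cnt = wam.max(axis=1)      # how strong that best match is

# Frames whose best reference has few matches are flagged as unreliable;
# the threshold of 50 is an assumed tuning parameter, not from the paper.
reliable = best_cnt >= 50
```

The last frame is flagged: with only 6 matches its pose estimate would likely be degenerate, which is what the outlier rejection step in the abstract guards against.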
Abstract: The SIFT operator developed by David Lowe (Lowe 1999, Lowe 2004) is an algorithm
for object recognition in images.
This dissertation is an exploration of the SIFT operator, with the goal of identifying and
exploring areas of possible improvement. These might be either in performance characteristics of the implementation of the algorithm or general improvements to the stability or
robustness of the algorithm in analysing different images and detecting objects.
First the algorithm will be implemented in C++ on a Windows operating system, then
once it has been successfully implemented, avenues of improvement will be identified and
explored. The areas will be identified by experimentation and further research during the
course of the project.
Download
Abstract: This paper introduces a new operator to characterize a point in an
image in a distinctive and invariant way. The robust recognition of points is a
key technique in computer vision: algorithms for stereo correspondence, motion
tracking and object recognition rely heavily on this type of operator. The goal in
this paper is to describe the salient point to be characterized by a constellation
of surrounding anchor points. Salient points are the most reliably localized
points extracted by an interest point operator. The anchor points are multiple
interest points in a visually homogeneous segment surrounding the salient point.
Because of its appearance, this constellation is called a spider. With a prototype
of the spider operator, results in this paper demonstrate how a point can be recognized in spite of significant image noise, inhomogeneous change in illumination and altered perspective. For an example that requires a high performance
close to object / background boundaries, the prototype yields better results than
David Lowe’s SIFT operator.
Download
Abstract: Photo-realistic 3D modeling is a challenging problem and
has been a research topic for many years. Quick generation of photo-realistic three-dimensional calibrated models
using a hand-held device is highly desirable for applications ranging from forensic investigation, mining, to mobile
robotics. In this paper, we present the instant Scene Modeler (iSM), a 3D imaging system that automatically creates
3D models using an off-the-shelf hand-held stereo camera.
The user points the camera at a scene of interest and the
system will create a photo-realistic 3D calibrated model automatically within minutes. Field tests in various environments have been carried out with promising results.
Download
Abstract: This paper presents a new approach to full automatic relative orientation of several digital images taken with a calibrated camera.
This approach uses new algorithms for feature extraction and relative orientation developed in the last few years. There is no need
for special markers in the scene nor for approximate values for the parameters of the exterior orientation. We use the point operator
developed by D. G. Lowe (Lowe, 2004), which extracts points with scale- and rotation-invariant descriptors (SIFT-features). These
descriptors allow a successful matching of image points even in situations with highly convergent images. The approach consists of
the following steps: After extracting image points on all images each image pair is matched using the SIFT parameters only. No
prior information about the pose of the images or the overlapping parts of the images is needed. For every image pair a relative
orientation is computed using a RANSAC procedure. Here we use the new 5-point algorithm developed by D. Nister (Nister, 2004).
Based on these orientations approximate values for the orientation parameters and the object coordinates are calculated. This is
achieved by computing the relative scale and transforming into a common coordinate system. Several tests are carried out to ensure
reliable inputs for the currently final step: a bundle block adjustment. The paper discusses the practical impacts of the algorithms
involved. Examples of different indoor- and outdoor-scenes including a dataset of tilted aerial images are presented and the results of
the approach are evaluated. These results show that the approach can be used for a wide range of scenes with different types of
image geometry and taken with different types of cameras including inexpensive consumer cameras. In particular we investigate
the robustness of the algorithms, e.g. in geometric tests on image triplets. In the outlook further developments like the use of image
pyramids with a modified matching are discussed.
Download
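The RANSAC procedure used above for relative orientation has a generic structure worth sketching: repeatedly fit a model to a minimal random sample, count inliers, and keep the best hypothesis. For readability the minimal solver here is a deliberately simple stand-in, a 2D translation estimated from a single correspondence; the real system would plug in Nister's 5-point algorithm and an epipolar error instead.

```python
import random

def ransac(pairs, fit, error, min_samples, thresh, iters=200, seed=0):
    """Return the model with the most inliers, and its inlier list."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iters):
        sample = rng.sample(pairs, min_samples)   # minimal random sample
        model = fit(sample)
        inliers = [p for p in pairs if error(model, p) < thresh]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

# Stand-in minimal solver: one point pair fixes a translation (tx, ty).
fit = lambda s: (s[0][1][0] - s[0][0][0], s[0][1][1] - s[0][0][1])
err = lambda t, p: abs(p[1][0] - p[0][0] - t[0]) + abs(p[1][1] - p[0][1] - t[1])

# Four pairs consistent with translation (5, 5) plus two gross outliers.
pairs = [((0, 0), (5, 5)), ((1, 2), (6, 7)), ((3, 1), (8, 6)),
         ((4, 4), (9, 9)), ((2, 2), (40, -3)), ((5, 0), (0, 50))]
model, inliers = ransac(pairs, fit, err, min_samples=1, thresh=0.5)
```

Any iteration that happens to sample an inlier pair recovers the true translation and collects all four consistent pairs, so the mismatches never contaminate the final estimate.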
Abstract: A new method for motion segmentation is presented for clustering features that belong to independently moving objects. It is based on the geometric constraints imposed on the image positions of points and lines arising from rigidly moving objects in the world. The motion of points and lines in the image over three views is linked by the trilinear (or trifocal) constraint, which plays a similar role in three views to that played by the fundamental matrix in two. The fundamental matrix only imposes a one dimensional constraint on the location of features in the second image given its location in the first, whereas the trilinear constraint gives the exact location of a feature in a third image given its location in the other two. The trilinear constraint discriminates a wider range of motion than the epipolar geometry. Furthermore the trilinear constraint has the advantage that it constrains the location of lines as well as points. The segmentation problem is transformed into that of grouping the features in the image consistent with different trilinear constraints. Feasible clusters are generated using robust techniques. It is essential that the method be robust due to the prevalence of mismatches generated by state of the art feature matchers. Degenerate cases are explored, with specific emphasis on the three view constraint on points and lines imposed by the affine camera.
Download
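The two-view side of the comparison in the abstract above is easy to show concretely: for a fundamental matrix F, a point x1 in the first image only constrains its match x2 to lie on the epipolar line l = F x1, a one-dimensional constraint checked by x2 . l = 0. The F and points below are a contrived consistent toy (a rectified pair translating along x), not from a real camera.

```python
import numpy as np

# Fundamental matrix of a rectified stereo pair moving along the x axis:
# corresponding points must share the same image row.
F = np.array([[0, 0,  0],
              [0, 0, -1],
              [0, 1,  0]], float)

x1 = np.array([3.0, 2.0, 1.0])         # homogeneous point in image one
x2_good = np.array([5.0, 2.0, 1.0])    # same row: a valid match
x2_bad = np.array([5.0, 7.0, 1.0])     # off the epipolar line

line = F @ x1                          # epipolar line in image two
res_good = abs(x2_good @ line)         # ~0: satisfies the constraint
res_bad = abs(x2_bad @ line)           # large: rejected as a mismatch
```

Note what the residual does not tell you: any point on the line passes, which is exactly the one-dimensional ambiguity that the trilinear constraint over three views removes.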