Computer Vision ultimately aims to mimic human visual perception. Although this ambitious goal has inspired research efforts towards the design of automatic systems that can effectively analyze and extract information from visual data, there is still an unfulfilled need for compact, abstract representations of this content in order to bridge the semantic gap that separates automatic perception from human understanding.
The collaboration between the Department of Mathematics of Università degli Studi di Milano and the DIEGM of Università degli Studi di Udine (Italy) takes some steps in this direction by developing a geometric multi-model fitting technique, namely T-Linkage, which extracts parametric models from unstructured data in order to organize and aggregate visual content into suitable higher-level geometric structures.
The key idea of T-Linkage is to build a conceptual space in which each point is represented by the preferences it accords to a set of random provisional models. Visual data are then segmented by exploiting the agreement between point preferences through a greedy bottom-up clustering approach.
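The two steps described above can be illustrated with a minimal sketch for fitting lines to 2D points. Several details are not given in this text and should be read as assumptions: the soft preference exp(-r / tau) truncated at 5 * tau, the Tanimoto distance between preference vectors, the component-wise minimum used as the preference of a merged cluster, and the rule of stopping once all remaining clusters have non-overlapping preferences. All function names and the parameter tau are hypothetical and chosen for illustration only; outlier rejection is omitted.

```python
import math
import random


def line_through(p, q):
    """Normalised line (a, b, c), with a*x + b*y + c = 0, through p and q."""
    a, b = q[1] - p[1], p[0] - q[0]
    norm = math.hypot(a, b)
    return (a / norm, b / norm, -(a * p[0] + b * p[1]) / norm)


def random_line_hypotheses(points, n_models, seed=0):
    """Provisional models: lines through randomly drawn pairs of points."""
    rng = random.Random(seed)
    return [line_through(*rng.sample(points, 2)) for _ in range(n_models)]


def preference_vectors(points, models, tau):
    """One vector per point: a soft vote for each provisional model.

    Assumption: vote = exp(-r / tau) for residual r < 5 * tau, else 0.
    """
    prefs = []
    for x, y in points:
        row = []
        for a, b, c in models:
            r = abs(a * x + b * y + c)  # point-to-line distance
            row.append(math.exp(-r / tau) if r < 5 * tau else 0.0)
        prefs.append(row)
    return prefs


def tanimoto_distance(p, q):
    """0 for identical preferences, 1 for orthogonal (disjoint) ones."""
    dot = sum(u * v for u, v in zip(p, q))
    denom = sum(u * u for u in p) + sum(v * v for v in q) - dot
    return 1.0 - dot / denom if denom > 0 else 1.0


def t_linkage(prefs):
    """Greedy bottom-up clustering of points in preference space."""
    # Each cluster is (member indices, cluster preference vector).
    clusters = [([i], list(p)) for i, p in enumerate(prefs)]
    while len(clusters) > 1:
        # Find the closest pair of clusters.
        dist, i, j = min(
            (tanimoto_distance(clusters[i][1], clusters[j][1]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist >= 1.0:  # no two clusters share a preferred model
            break
        members = clusters[i][0] + clusters[j][0]
        # A merged cluster prefers only what all of its points prefer.
        merged = [min(u, v) for u, v in zip(clusters[i][1], clusters[j][1])]
        del clusters[j]  # j > i, so index i is unaffected
        clusters[i] = (members, merged)
    return [sorted(members) for members, _ in clusters]
```

A typical call would chain the three steps, for example `t_linkage(preference_vectors(points, random_line_hypotheses(points, 200), tau=0.05))`, where `points` is a list of `(x, y)` tuples. The pairwise search makes each merge quadratic in the number of clusters, which keeps the sketch short but is only practical for small inputs.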
A straightforward application of this technique can be found in 3D reconstruction, where multi-model fitting is adopted to organize unstructured point clouds into geometric primitives, for example planar patches, in order to produce intermediate geometric interpretations of the reconstructed 3D points. Many other Computer Vision tasks also benefit from multi-model fitting. Still within 3D reconstruction, T-Linkage can be employed to estimate multiple rigidly moving objects and hence to initialize multi-body Structure from Motion. Other scenarios in which the estimation of multiple geometric models plays a primary role include face clustering, body-pose estimation and video motion segmentation; in all these cases visual data are geometrically described as linear subspaces of suitable dimensions.