Feature Extraction Research
Feature extraction is a fundamental part of data processing which focuses on converting raw data into compact and useful representations, and has a wide applicability to many different types of problems. The research here focuses on extracting useful, compact, and discriminative features for a diversity of problems such as patch matching, or finding similar styles to certain images.
-
Deep Convolutional Feature Point Descriptors
We learn compact discriminative feature point descriptors using a convolutional neural network. We directly optimize for using L2 distance by training with a pair of corresponding and non-corresponding patches correspond to small and large distances respectively using a Siamese architecture. We deal with the large number of potential pairs with the combination of a stochastic sampling of the training set and an aggressive mining strategy biased towards patches that are hard to classify. The resulting descriptor is 128 dimensions that can be used as a drop-in replacement for any task involving SIFT. We show that this descriptor generalizes well to various datasets.
-
Geodesic Finite Mixture Models
There are many cases in which data is found to be distributed on a Riemannian manifold. In these cases, Euclidean metrics are not applicable and one needs to resort to geodesic distances consistent with the manifold geometry. For this purpose, we draw inspiration on a variant of the expectation-maximization algorithm, that uses a minimum message length criterion to automatically estimate the optimal number of components from multivariate data lying on an Euclidean space. In order to use this approach on Riemannian manifolds, we propose a formulation in which each component is defined on a different tangent space, thus avoiding the problems associated with the loss of accuracy produced when linearizing the manifold with a single tangent space. Our approach can be applied to any type of manifold for which it is possible to estimate its tangent space.
-
Deformation and Light Invariant Descriptor
DaLI descriptors are local image patch representations that have been shown to be robust to deformation and strong illumination changes. These descriptors are constructed by treating the image patch as a 3D surface and then simulating the diffusion of heat along the surface for different intervals of time. Small time intervals represent local deformation properties while large time intervals represent global deformation properties. Additionally, by performing a logarithmic sampling and then a Fast Fourier Transform, it is possible to obtain robustness against non-linear illumination changes. We have created the first feature point dataset that focuses on deformation and illumination changes of real world objects in order to perform evaluation, where we show the DaLI descriptors outperform all the widely used descriptors.
Publications
@Article{SimoSerraIJCV2016, author = {Edgar Simo-Serra and Carme Torras and Francesc Moreno Noguer}, title = {{3D Human Pose Tracking Priors using Geodesic Mixture Models}}, journal = {International Journal of Computer Vision (IJCV)}, volume = {122}, number = {2}, pages = {388--408}, year = 2016, }
@InProceedings{SimoSerraCVPR2016, author = {Edgar Simo-Serra and Hiroshi Ishikawa}, title = {{Fashion Style in 128 Floats: Joint Ranking and Classification using Weak Data for Feature Extraction}}, booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR)", year = 2016, }
@InProceedings{SimoSerraICCV2015, author = {Edgar Simo-Serra and Eduard Trulls and Luis Ferraz and Iasonas Kokkinos and Pascal Fua and Francesc Moreno-Noguer}, title = {{Discriminative Learning of Deep Convolutional Feature Point Descriptors}}, booktitle = "Proceedings of the International Conference on Computer Vision (ICCV)", year = 2015, }
@InProceedings{SimoSerraMVA2015, author = {Edgar Simo-Serra and Carme Torras and Francesc Moreno-Noguer}, title = {{Lie Algebra-Based Kinematic Prior for 3D Human Pose Tracking}}, booktitle = "International Conference on Machine Vision Applications (MVA)", year = 2015, }
@Article{SimoSerraIJCV2015, author = {Edgar Simo-Serra and Carme Torras and Francesc Moreno Noguer}, title = {{DaLI: Deformation and Light Invariant Descriptor}}, journal = {International Journal of Computer Vision (IJCV)}, volume = {115}, number = {2}, pages = {136--154}, year = 2015, }
@InProceedings{SimoSerraBMVC2014, author = {Edgar Simo-Serra and Carme Torras and Francesc Moreno-Noguer}, title = {{Geodesic Finite Mixture Models}}, booktitle = "Proceedings of the British Machine Vision Conference (BMVC)", year = 2014, }