Image Processing Research

Image processing consists of extracting useful information from images. The research here covers a wide variety of tasks, from estimating the 3D pose of humans from a single image to automatically colorizing black-and-white images.

  • Globally and Locally Consistent Image Completion

    We present a novel approach for image completion that results in images that are both locally and globally consistent. With a fully convolutional neural network, we can complete images of arbitrary resolution by filling in missing regions of any shape. To train this image completion network to be consistent, we use global and local context discriminators that are trained to distinguish real images from completed ones. The global discriminator looks at the entire image to assess whether it is coherent as a whole, while the local discriminator looks only at a small area centered on the completed region to ensure the local consistency of the generated patches. The image completion network is then trained to fool both context discriminator networks, which requires it to generate images that are indistinguishable from real ones with regard to overall consistency as well as in details. We show that our approach can be used to complete a wide variety of scenes. Furthermore, in contrast with patch-based approaches such as PatchMatch, our approach can generate fragments that do not appear elsewhere in the image, which allows us to naturally complete images of objects with familiar and highly specific structures, such as faces.
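
    As a minimal sketch of how the two discriminator inputs described above could be prepared (this is not the released implementation), the code below composites a generated image into the missing region and crops a fixed-size window centered on that region. The helper names (`masked_bbox`, `local_window`, `composite`, `crop`), the 0/1 mask convention, the window size, and the clamping of the window to the image bounds are all assumptions made for illustration.

```python
def masked_bbox(mask):
    """Inclusive bounding box (top, left, bottom, right) of pixels marked 1."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows:
        raise ValueError("mask marks no missing pixels")
    return rows[0], cols[0], rows[-1], cols[-1]


def local_window(mask, patch):
    """Top-left corner of a patch x patch window centred on the masked region.

    The window is shifted where needed so that it stays inside the image
    (an assumption of this sketch).
    """
    height, width = len(mask), len(mask[0])
    if patch > height or patch > width:
        raise ValueError("patch is larger than the image")
    top, left, bottom, right = masked_bbox(mask)
    centre_y, centre_x = (top + bottom) // 2, (left + right) // 2
    win_top = min(max(centre_y - patch // 2, 0), height - patch)
    win_left = min(max(centre_x - patch // 2, 0), width - patch)
    return win_top, win_left


def composite(original, generated, mask):
    """Keep original pixels where mask is 0, generated pixels where it is 1."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


def crop(image, top, left, patch):
    """Cut a patch x patch window out of a 2D image stored as nested lists."""
    return [row[left:left + patch] for row in image[top:top + patch]]


# Toy 8x8 example: a 2x2 hole near the bottom-right corner.
size, patch = 8, 4
mask = [[1 if 5 <= y <= 6 and 6 <= x <= 7 else 0 for x in range(size)]
        for y in range(size)]
original = [[0] * size for _ in range(size)]
generated = [[9] * size for _ in range(size)]   # stand-in for network output

completed = composite(original, generated, mask)
top, left = local_window(mask, patch)
global_input = completed                          # whole image
local_input = crop(completed, top, left, patch)   # area around the hole
```

    In this toy setup the global discriminator would receive `global_input` and the local discriminator would receive `local_input`; the discriminators themselves and the adversarial training loop are outside the scope of the sketch.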

  • Colorization of Black and White Images

    We present a novel technique to automatically colorize grayscale images that combines both global priors and local image features. Based on Convolutional Neural Networks (CNNs), our deep network features a fusion layer that allows us to elegantly merge local information dependent on small image patches with global priors computed using the entire image. The entire framework, including the global and local priors as well as the colorization model, is trained in an end-to-end fashion. Furthermore, our architecture can process images of any resolution, unlike most existing CNN-based approaches. We leverage an existing large-scale scene classification database to train our model, exploiting the class labels of the dataset to more efficiently and discriminatively learn the global priors. We validate our approach with a user study and compare against the state of the art, where we show significant improvements. We also demonstrate our method extensively on many different types of images, including black-and-white photography from over a hundred years ago, and show realistic colorizations.
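
    One way to picture the fusion of local and global information is sketched below. It is an illustration under stated assumptions, not the published layer: the global feature vector is appended to the local feature vector at every spatial location, and a shared linear map followed by a ReLU is applied. The function name `fuse`, the weights, the bias, and the choice of ReLU are all hypothetical. Because the same global vector is reused at every location, the sketch works for a feature map of any height and width.

```python
def fuse(local_feats, global_feat, weights, bias):
    """Fuse per-location local features with one global feature vector.

    local_feats: H x W grid (nested lists) of local vectors of length C_l
    global_feat: one vector of length C_g computed from the whole image
    weights:     C_out rows, each of length C_l + C_g (hypothetical values)
    bias:        vector of length C_out (hypothetical values)
    """
    fused = []
    for row in local_feats:
        fused_row = []
        for local_vec in row:
            joined = list(local_vec) + list(global_feat)  # concatenate
            fused_row.append([
                max(0.0, sum(w * v for w, v in zip(w_row, joined)) + b)  # ReLU
                for w_row, b in zip(weights, bias)
            ])
        fused.append(fused_row)
    return fused


# Hypothetical parameters: C_l = 2, C_g = 1, C_out = 1.
weights = [[1.0, 1.0, 2.0]]
bias = [-1.0]
global_feat = [3.0]

# The same layer applied to a 1x1 and to a 2x3 local feature map.
small = [[[1.0, 2.0]]]
large = [[[float(x), float(y)] for x in range(3)] for y in range(2)]

fused_small = fuse(small, global_feat, weights, bias)
fused_large = fuse(large, global_feat, weights, bias)
```

    The output grid always has the same height and width as the local feature map, which is what lets a design of this kind handle inputs of different resolutions.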

  • Monocular Single Image 3D Human Pose Estimation

    This line of research focuses on estimating the 3D pose of humans from single monocular images. This is an extremely difficult problem due to the large number of ambiguities that arise from the projection of 3D objects onto the image plane. We consider image evidence derived from the use of different detectors for the different parts of the body, which results in noisy 2D estimations whose uncertainty must be compensated for. To deal with these issues, we propose different approaches using discriminative and generative models to enforce learnt anthropomorphic constraints. We show that by exploiting prior knowledge of human kinematics it is possible to overcome these ambiguities and obtain good pose estimation performance.
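
    The projection ambiguity can be made concrete with a small sketch. Assuming an ideal pinhole camera (an assumption for illustration; the camera model is not specified here), any 3D point moved along its viewing ray projects to the same 2D location, so two different poses can produce identical image evidence. The bone-length check at the end is one hypothetical example of a kinematic constraint that tells the two poses apart; the function names and the toy poses are made up for this example.

```python
import math


def project(point, focal=1.0):
    """Pinhole projection of a 3D point (x, y, z) onto the image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    return (focal * x / z, focal * y / z)


def project_pose(pose, focal=1.0):
    """Project every joint of a pose (a list of 3D points)."""
    return [project(joint, focal) for joint in pose]


def bone_length(pose, a, b):
    """Euclidean distance between joints a and b of a pose."""
    return math.dist(pose[a], pose[b])


# Two toy two-joint poses: the second joint of pose_b is the second joint
# of pose_a pushed twice as far along its viewing ray.
pose_a = [(0.0, 0.0, 4.0), (1.0, 2.0, 4.0)]
pose_b = [(0.0, 0.0, 4.0), (2.0, 4.0, 8.0)]

same_image = project_pose(pose_a, focal=2.0) == project_pose(pose_b, focal=2.0)
length_a = bone_length(pose_a, 0, 1)
length_b = bone_length(pose_b, 0, 1)
```

    Both poses yield the same 2D joints, yet the bone connecting the two joints has a very different length in each, so a prior on plausible bone lengths would favor one pose over the other.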

Publications

  • Adaptive Energy Selection For Content-Aware Image Resizing
    • Kazuma Sasaki, Yuya Nagahama, Zheng Ze, Satoshi Iizuka, Edgar Simo-Serra, Yoshihiko Mochizuki, Hiroshi Ishikawa
    • Asian Conference on Pattern Recognition (ACPR), 2017
  • Globally and Locally Consistent Image Completion
  • Detection by Classification of Buildings in Multispectral Satellite Imagery
    • Tomohiro Ishii, Edgar Simo-Serra, Satoshi Iizuka, Yoshihiko Mochizuki, Akihiro Sugimoto, Hiroshi Ishikawa, Ryosuke Nakamura
    • International Conference on Pattern Recognition (ICPR), 2016
  • BASS: Boundary-Aware Superpixel Segmentation
    • Antonio Rubio, Longlong Yu, Edgar Simo-Serra, Francesc Moreno-Noguer
    • International Conference on Pattern Recognition (ICPR), 2016
  • Let there be Color!: Joint End-to-end Learning of Global and Local Image Priors for Automatic Image Colorization with Simultaneous Classification
    • Satoshi Iizuka*, Edgar Simo-Serra*, Hiroshi Ishikawa (* equal contribution)
    • ACM Transactions on Graphics (SIGGRAPH), 2016
  • A Joint Model for 2D and 3D Pose Estimation from a Single Image
    • Edgar Simo-Serra, Ariadna Quattoni, Carme Torras, Francesc Moreno-Noguer
    • Conference on Computer Vision and Pattern Recognition (CVPR), 2013
  • Single Image 3D Human Pose Estimation from Noisy Observations
    • Edgar Simo-Serra, Arnau Ramisa, Guillem Alenyà, Carme Torras, Francesc Moreno-Noguer
    • Conference on Computer Vision and Pattern Recognition (CVPR), 2012

Source Code

  • Inpainting Network
    • Inpainting Network, 1.0 (Feb, 2018)
    • Globally and locally consistent image completion network
    • Satoshi Iizuka and Edgar Simo-Serra
  • Colorization Network
  • bttc