Image Processing Research
Image processing consists of extracting useful information from images. The research here covers a wide variety of tasks, from estimating the 3D pose of humans from a single image to automatically colorizing black and white images.
-
Globally and Locally Consistent Image Completion
We present a novel approach for image completion that results in images that are both locally and globally consistent. With a fully-convolutional neural network, we can complete images of arbitrary resolutions by filling in missing regions of any shape. To train this image completion network to be consistent, we use global and local context discriminators that are trained to distinguish real images from completed ones. The global discriminator looks at the entire image to assess whether it is coherent as a whole, while the local discriminator looks only at a small area centered at the completed region to ensure the local consistency of the generated patches. The image completion network is then trained to fool both context discriminator networks, which requires it to generate images that are indistinguishable from real ones both in overall consistency and in details. We show that our approach can be used to complete a wide variety of scenes. Furthermore, in contrast with patch-based approaches such as PatchMatch, our approach can generate fragments that do not appear elsewhere in the image, which allows us to naturally complete images of objects with familiar and highly specific structures, such as faces.
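As a rough illustration of the "small area centered at the completed region" seen by the local discriminator, the following sketch computes a fixed-size window around the bounding box of the completed region and shifts it so that it stays inside the image. The function name `local_crop_box`, the bounding-box convention, and the 128-pixel window in the usage lines are assumptions made for this example, not details taken from the description above.

```python
def local_crop_box(image_h, image_w, mask_box, crop_size):
    """Return (top, left, bottom, right) of a square window centered on the mask.

    mask_box is the (top, left, bottom, right) bounding box of the completed
    region, with bottom and right exclusive (a convention assumed here).
    """
    if crop_size > image_h or crop_size > image_w:
        raise ValueError("crop_size must fit inside the image")
    top, left, bottom, right = mask_box
    center_y = (top + bottom) // 2
    center_x = (left + right) // 2
    half = crop_size // 2
    # Clamp the top-left corner so the whole window lies within the image.
    crop_top = min(max(center_y - half, 0), image_h - crop_size)
    crop_left = min(max(center_x - half, 0), image_w - crop_size)
    return crop_top, crop_left, crop_top + crop_size, crop_left + crop_size


# Hypothetical usage: a hole in the middle and a hole near the corner.
print(local_crop_box(256, 256, (96, 96, 160, 160), 128))
print(local_crop_box(256, 256, (224, 200, 256, 256), 128))
```

The crop would be passed to the local discriminator while the full image goes to the global one; how the two outputs are combined is not shown here.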
-
Colorization of Black and White Images
We present a novel technique to automatically colorize grayscale images that combines both global priors and local image features. Based on Convolutional Neural Networks (CNNs), our deep network features a fusion layer that allows us to elegantly merge local information dependent on small image patches with global priors computed using the entire image. The entire framework, including the global and local priors as well as the colorization model, is trained in an end-to-end fashion. Furthermore, our architecture can process images of any resolution, unlike most existing CNN-based approaches. We leverage an existing large-scale scene classification database to train our model, exploiting the class labels of the dataset to more efficiently and discriminatively learn the global priors. We validate our approach with a user study and compare against the state of the art, where we show significant improvements. We also demonstrate our method extensively on many different types of images, including black-and-white photography from over a hundred years ago, and show realistic colorizations.
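One simple way to picture such a fusion step is to copy the global feature vector to every spatial position of the local feature map and concatenate the two along the channel axis. The sketch below does this with plain nested lists. The function name `fuse`, the list-based layout (rows, columns, channels), and the toy sizes are assumptions for illustration, and any learned layer applied after the concatenation is omitted.

```python
def fuse(local_features, global_vector):
    """Concatenate a global feature vector onto every spatial position.

    local_features: H x W x C_local nested lists (one vector per position).
    global_vector:  flat list of C_global values computed from the whole image.
    Returns an H x W x (C_local + C_global) nested list.
    """
    return [
        [list(pixel) + list(global_vector) for pixel in row]
        for row in local_features
    ]


# Toy example: a 2 x 3 local feature map with 2 channels, 3 global features.
local = [[[r, c] for c in range(3)] for r in range(2)]
fused = fuse(local, [0.1, 0.2, 0.3])
print(len(fused), len(fused[0]), len(fused[0][0]))
```

Because the same global vector is attached independently at each position, the operation does not depend on the height or width of the local feature map, which is consistent with processing images of any resolution.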
-
Monocular Single Image 3D Human Pose Estimation
This line of research focuses on the estimation of the 3D pose of humans from single monocular images. This is an extremely difficult problem due to the large number of ambiguities that arise from the projection of 3D objects to the image plane. We consider image evidence derived from different detectors for the different parts of the body, which results in noisy 2D estimations whose uncertainty must be compensated for. To deal with these issues, we propose different approaches using discriminative and generative models to enforce learnt anthropomorphism constraints. We show that by exploiting prior knowledge of human kinematics it is possible to overcome these ambiguities and obtain good pose estimation performance.
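The projection ambiguity can be seen with a minimal pinhole camera sketch: every 3D point along the same ray through the camera center lands on the same 2D image point, so depth cannot be recovered from a single projected joint without further constraints. The pinhole model, the function name `project`, and the unit focal length are assumptions chosen for illustration; they are not a description of the camera model used in the publications.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (x, y, z) in camera coordinates onto the image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two hypothetical joint positions at different depths along the same ray.
near_joint = (1.0, 2.0, 4.0)
far_joint = (2.0, 4.0, 8.0)
print(project(near_joint), project(far_joint))
```

Both points map to the same image coordinates, which is why prior knowledge such as limb lengths and kinematic limits is needed to choose among the candidate 3D poses.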
Publications
@InProceedings{LinWACV2024, author = {Shan Lin and Edgar Simo-Serra}, title = {{Restoring Degraded Old Films with Recursive Recurrent Transformer Networks}}, booktitle = "Proceedings of the Winter Conference on Applications of Computer Vision (WACV)", year = 2024, }
@Inproceedings{HaoSIGGRAPHASIA2023, author = {Guoqing Hao and Satoshi Iizuka and Kensho Hara and Edgar Simo-Serra and Hirokatsu Kataoka and Kazuhiro Fukui}, title = {{Diffusion-based Holistic Texture Rectification and Synthesis}}, booktitle = "ACM SIGGRAPH Asia 2023 Conference Papers", year = 2023, }
@InProceedings{CarrilloCVPRW2023, author = {Hernan Carrillo and Micha\"el Cl\'ement and Aur\'elie Bugeau and Edgar Simo-Serra}, title = {{Diffusart: Enhancing Line Art Colorization with Conditional Diffusion Models}}, booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)", year = 2023, }
@InProceedings{SasugaMICCAIW2022, author = {Saeko Sasuga and Akira Kudo and Yoshiro Kitamura and Satoshi Iizuka and Edgar Simo-Serra and Atsushi Hamabe and Masayuki Ishii and Ichiro Takemasa}, title = {{Image Synthesis-based Late Stage Cancer Augmentation and Semi-Supervised Segmentation for MRI Rectal Cancer Staging}}, booktitle = "Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention Workshops (MICCAIW)", year = 2022, }
@InProceedings{YuanCVPRW2021, author = {Mingcheng Yuan and Edgar Simo-Serra}, title = {{Line Art Colorization with Concatenated Spatial Attention}}, booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)", year = 2021, }
@InProceedings{TanakaCVPRW2021, author = {Tsunehiko Tanaka and Edgar Simo-Serra}, title = {{LoL-V2T: Large-Scale Esports Video Description Dataset}}, booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)", year = 2021, }
@InProceedings{HoriuchiCVPRW2021, author = {Yusuke Horiuchi and Edgar Simo-Serra and Satoshi Iizuka and Hiroshi Ishikawa}, title = {{Differentiable Rendering-based Pose-Conditioned Human Image Generation}}, booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)", year = 2021, }
@InProceedings{MasuzawaMICCAI2020, author = {Naoto Masuzawa and Yoshiro Kitamura and Keigo Nakamura and Satoshi Iizuka and Edgar Simo-Serra}, title = {{Automatic Segmentation, Localization and Identification of Vertebrae in 3D CT Images Using Cascaded Convolutional Neural Networks}}, booktitle = "Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI)", year = 2020, }
@InProceedings{KeshwaniMICCAI2020, author = {Deepak Keshwani and Yoshiro Kitamura and Satoshi Ihara and Satoshi Iizuka and Edgar Simo-Serra}, title = {{TopNet: Topology Preserving Metric Learning for Vessel Tree Reconstruction and Labelling}}, booktitle = "Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI)", year = 2020, }
@InProceedings{YokooCVPRW2020, author = {Shuhei Yokoo and Kohei Ozaki and Edgar Simo-Serra and Satoshi Iizuka}, title = {{Two-stage Discriminative Re-ranking for Large-scale Landmark Retrieval}}, booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)", year = 2020, }
@Article{IizukaSIGGRAPHASIA2019, author = {Satoshi Iizuka and Edgar Simo-Serra}, title = {{DeepRemaster: Temporal Source-Reference Attention Networks for Comprehensive Video Enhancement}}, journal = "ACM Transactions on Graphics (SIGGRAPH Asia)", year = 2019, volume = 38, number = 6, }
@InProceedings{ShinyaICCVW2019, author = {Yosuke Shinya and Edgar Simo-Serra and Taiji Suzuki}, title = {{Understanding the Effects of Pre-training for Object Detectors via Eigenspectrum}}, booktitle = "Proceedings of the International Conference on Computer Vision Workshops (ICCVW)", year = 2019, }
@InProceedings{KudoMICCAIW2019, author = {Akira Kudo and Yoshiro Kitamura and Yuanzhong Li and Satoshi Iizuka and Edgar Simo-Serra}, title = {{Virtual Thin Slice: 3D Conditional GAN-based Super-resolution for CT Slice Interval}}, booktitle = "Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention Workshops (MICCAIW)", year = 2019, }
@InProceedings{OmiyaCVPRW2019, author = {Mayu Omiya and Yusuke Horiuchi and Edgar Simo-Serra and Satoshi Iizuka and Hiroshi Ishikawa}, title = {{Optimization-Based Data Generation for Photo Enhancement}}, booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)", year = 2019, }
@InProceedings{OmiyaSIGGRAPASIABRIEF2018, author = {Mayu Omiya and Edgar Simo-Serra and Satoshi Iizuka and Hiroshi Ishikawa}, title = {{Learning Photo Enhancement by Black-Box Model Optimization Data Generation}}, booktitle = "SIGGRAPH Asia 2018 Technical Briefs", year = 2018, }
@InProceedings{SasakiACPR2017, author = {Kazuma Sasaki and Yuya Nagahama and Zheng Ze and Satoshi Iizuka and Edgar Simo-Serra and Yoshihiko Mochizuki and Hiroshi Ishikawa}, title = {{Adaptive Energy Selection For Content-Aware Image Resizing}}, booktitle = "Proceedings of the Asian Conference on Pattern Recognition (ACPR)", year = 2017, }
@Article{IizukaSIGGRAPH2017, author = {Satoshi Iizuka and Edgar Simo-Serra and Hiroshi Ishikawa}, title = {{Globally and Locally Consistent Image Completion}}, journal = "ACM Transactions on Graphics (SIGGRAPH)", year = 2017, volume = 36, number = 4, }
@InProceedings{RubioICPR2016, author = {Antonio Rubio and Longlong Yu and Edgar Simo-Serra and Francesc Moreno-Noguer}, title = {{BASS: Boundary-Aware Superpixel Segmentation}}, booktitle = "Proceedings of the International Conference on Pattern Recognition (ICPR)", year = 2016, }
@InProceedings{IshiiICPR2016, author = {Tomohiro Ishii and Edgar Simo-Serra and Satoshi Iizuka and Yoshihiko Mochizuki and Akihiro Sugimoto and Hiroshi Ishikawa and Ryosuke Nakamura}, title = {{Detection by Classification of Buildings in Multispectral Satellite Imagery}}, booktitle = "Proceedings of the International Conference on Pattern Recognition (ICPR)", year = 2016, }
@Article{IizukaSIGGRAPH2016, author = {Satoshi Iizuka and Edgar Simo-Serra and Hiroshi Ishikawa}, title = {{Let there be Color!: Joint End-to-end Learning of Global and Local Image Priors for Automatic Image Colorization with Simultaneous Classification}}, journal = "ACM Transactions on Graphics (SIGGRAPH)", year = 2016, volume = 35, number = 4, }
@InProceedings{SimoSerraCVPR2013, author = {Edgar Simo-Serra and Ariadna Quattoni and Carme Torras and Francesc Moreno-Noguer}, title = {{A Joint Model for 2D and 3D Pose Estimation from a Single Image}}, booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR)", year = 2013, }
@InProceedings{SimoSerraCVPR2012, author = {Edgar Simo-Serra and Arnau Ramisa and Guillem Aleny\`a and Carme Torras and Francesc Moreno-Noguer}, title = {{Single Image 3D Human Pose Estimation from Noisy Observations}}, booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR)", year = 2012, }