Sketch Simplification Network

This code is the implementation of the “Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup” and “Mastering Sketching: Adversarial Augmentation for Structured Prediction” papers. It contains pre-trained models and example usage code.

  • Type: library
  • Version: Dec, 2017
  • Language: python
  • License: CC-by-sa-nc 4.0
  • Dependencies: pytorch, torchvision, pillow


This code provides an implementation of the research papers:

Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup
Edgar Simo-Serra, Satoshi Iizuka, Kazuma Sasaki, Hiroshi Ishikawa
ACM Transactions on Graphics (SIGGRAPH), 2016


Mastering Sketching: Adversarial Augmentation for Structured Prediction
Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa
ACM Transactions on Graphics (TOG), 2018

We learn to automatically simplify rough sketches into clean line drawings with a fully convolutional network. Because the network is fully convolutional, our approach can be used on images of any resolution. The second paper adds adversarial training, which lets the model learn from both supervised and unsupervised data.

See our project page for more detailed information.


Copyright (C) <2017> <Edgar Simo-Serra and Satoshi Iizuka>

This work is licensed under the Creative Commons
Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy
of this license, visit or
send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

Satoshi Iizuka, Waseda University
Edgar Simo-Serra, Waseda University


All packages should be part of a standard PyTorch install. For information on how to install PyTorch, please refer to the PyTorch website.
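As a quick sanity check before running anything, the sketch below reports which of the listed dependencies cannot be found in the current Python environment. The import names (`torch`, `torchvision`, `PIL`) are the usual ones for these packages, and the helper `missing_dependencies` is a hypothetical name used only for illustration; it is not part of this code base.

```python
from importlib.util import find_spec

# Package name -> module name it is imported as (pillow is imported as PIL).
REQUIRED_MODULES = {"pytorch": "torch", "torchvision": "torchvision", "pillow": "PIL"}


def missing_dependencies(modules=REQUIRED_MODULES):
    """Return the package names whose import module cannot be located."""
    return [pkg for pkg, mod in modules.items() if find_spec(mod) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```

This only checks that the modules can be located; it does not verify versions or that the legacy Torch7 loading code is available in the installed PyTorch.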


Before the first usage, the models have to be downloaded with:


Next test the models with:


You should see a file called out.png created with the output of the model.

Application options can be seen with:

python --help


Models

  • model_mse.t7: Model trained using only MSE loss (SIGGRAPH 2016 model).
  • model_gan.t7: Model trained with MSE and GAN loss using both supervised and unsupervised training data (TOG 2018 model).
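To confirm that the download step succeeded, a small check along the following lines can list which of the two model files are present. The file names come from the list above; the assumption that they sit in the current directory, and the helper name `available_models`, are illustrative only.

```python
from pathlib import Path

# File names as listed above; descriptions are shortened for display.
MODELS = {
    "model_mse.t7": "MSE loss only (SIGGRAPH 2016 model)",
    "model_gan.t7": "MSE and GAN loss (TOG 2018 model)",
}


def available_models(model_dir="."):
    """Return the known model files that exist in model_dir (assumed location)."""
    root = Path(model_dir)
    return [name for name in MODELS if (root / name).is_file()]


if __name__ == "__main__":
    found = available_models()
    for name, description in MODELS.items():
        status = "found" if name in found else "missing"
        print(f"{name}: {status} ({description})")
```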

Reproducing Paper Figures

For reproducibility, we include code to reproduce the figures in the papers. After downloading the models you can run it with:


This will convert the input images in figs/ and save the output in out/. We note that there are small differences with the results in the paper due to hardware differences and small differences in the torch/pytorch implementations. Furthermore, results are shown without the post-processing mentioned in the notes at the bottom of this document.
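The mapping from figs/ to out/ described above can be sketched in a few lines. This is not the provided script, only an illustration of the directory convention; the function name `plan_outputs` and the assumption that every input is a PNG whose output keeps the same file name are hypothetical.

```python
from pathlib import Path


def plan_outputs(figs_dir="figs", out_dir="out", pattern="*.png"):
    """Pair every input image in figs_dir with an output path in out_dir."""
    src, dst = Path(figs_dir), Path(out_dir)
    return [(p, dst / p.name) for p in sorted(src.glob(pattern))]


if __name__ == "__main__":
    # Print the planned conversions without running any model.
    for inp, out in plan_outputs():
        print(f"{inp} -> {out}")
```

Listing the pairs first makes it easy to see which figures would be processed before spending time on inference.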

Please note that we do not hold the copyright for all of these images and, in general, only non-commercial research usage is permitted. In particular, fig16_eisaku.png, fig06_eisaku_robo.png, fig06_eisaku_joshi.png, and fig01_eisaku.png are copyrighted by Eisaku Kubonouchi (@EISAKUSAKU) and only non-commercial research usage is allowed. The images fig14_pepper.png and fig06_pepper.png are licensed by David Revoy under CC-by 4.0.


Notes

  • Models are in Torch7 format and are loaded using the PyTorch legacy code.
  • This code was developed and tested on various machines from late 2015 to the end of 2016.
  • Provided models are under a non-commercial creative commons license.
  • Post-processing is not performed. You can perform it manually with convert out.png bmp:- | mkbitmap - -t 0.3 -o - | potrace --svg --group -t 15 -o - > out.svg.
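The post-processing command from the last note can also be driven from Python. The sketch below runs the same three tools with the same flags (`convert`, `mkbitmap -t 0.3`, `potrace --svg --group -t 15`) and assumes that ImageMagick, mkbitmap, and potrace are installed and on the PATH. The function names and the keyword names `threshold` and `turdsize` are chosen here for illustration and are not part of this code base.

```python
import subprocess


def vectorize_commands(png_path="out.png", threshold=0.3, turdsize=15):
    """Build the three commands of the post-processing pipeline."""
    return [
        ["convert", png_path, "bmp:-"],                      # PNG -> BMP on stdout
        ["mkbitmap", "-", "-t", str(threshold), "-o", "-"],  # threshold to a bitmap
        ["potrace", "--svg", "--group", "-t", str(turdsize), "-o", "-"],  # trace to SVG
    ]


def vectorize(png_path="out.png", svg_path="out.svg", **kwargs):
    """Run convert | mkbitmap | potrace and write the resulting SVG to svg_path."""
    data = None
    for cmd in vectorize_commands(png_path, **kwargs):
        # Feed the previous stage's output to the next stage's stdin.
        result = subprocess.run(cmd, input=data, stdout=subprocess.PIPE, check=True)
        data = result.stdout
    with open(svg_path, "wb") as f:
        f.write(data)
```

Calling `vectorize("out.png", "out.svg")` should be equivalent to the shell pipeline above. Each stage is buffered in memory rather than streamed, which is fine for single sketches.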


If you use these models please cite:

  author    = {Edgar Simo-Serra and Satoshi Iizuka and Kazuma Sasaki and Hiroshi Ishikawa},
  title     = {{Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup}},
  journal   = "ACM Transactions on Graphics (SIGGRAPH)",
  year      = 2016,
  volume    = 35,
  number    = 4,


  author    = {Edgar Simo-Serra and Satoshi Iizuka and Hiroshi Ishikawa},
  title     = {{Mastering Sketching: Adversarial Augmentation for Structured Prediction}},
  journal   = "ACM Transactions on Graphics (TOG)",
  year      = 2018,
  volume    = 37,
  number    = 1,


This work was partially supported by JST CREST Grant Number JPMJCR14D1 and JST ACT-I Grant Numbers JPMJPR16UD and JPMJPR16U3.