Model
Our model is a fully convolutional neural network. It takes a rough sketch image as input and outputs a clean, simplified sketch. The image is processed by convolutional layers, which can be seen as banks of filters that are run over the input. While the input is a grayscale image, the model internally uses a much larger representation. We build the model from three types of convolution: down-convolution, which halves the resolution by using a stride of two; flat-convolution, which processes the image without changing the resolution; and up-convolution, which doubles the resolution by using a stride of one half. This allows the model to first compress the image into a smaller representation, then process the small image, and finally expand it into the simplified clean output image, which can easily be vectorized.
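To make the effect of the three convolution types on resolution concrete, the following minimal sketch traces the spatial size of an image through a compress, process, expand pipeline. It computes output sizes only and is not the model itself. The kernel sizes, paddings, the number of levels, and all function names are illustrative assumptions that do not come from the description above; the up-convolution is modelled as a transposed convolution with integer stride two, which is one common way to realise a "stride of one half".

```python
def conv_out(size, kernel, stride, pad):
    """Output length of a standard convolution along one spatial dimension."""
    return (size + 2 * pad - kernel) // stride + 1


def up_conv_out(size, kernel, stride, pad):
    """Output length of a transposed convolution along one spatial dimension.

    An integer stride s here plays the role of a "stride of 1/s".
    """
    return (size - 1) * stride - 2 * pad + kernel


# Hyperparameters below are assumptions chosen so that the sizes
# halve, stay the same, and double exactly.
def down_convolution(size):
    return conv_out(size, kernel=3, stride=2, pad=1)


def flat_convolution(size):
    return conv_out(size, kernel=3, stride=1, pad=1)


def up_convolution(size):
    return up_conv_out(size, kernel=4, stride=2, pad=1)


def trace_resolution(size, levels=3):
    """Return the list of sizes seen while compressing, processing, and expanding.

    `levels` (the number of down- and up-convolutions) is a hypothetical value.
    """
    sizes = [size]
    for _ in range(levels):          # compress
        size = down_convolution(size)
        sizes.append(size)
    size = flat_convolution(size)    # process at low resolution
    sizes.append(size)
    for _ in range(levels):          # expand back to the input resolution
        size = up_convolution(size)
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    print(trace_resolution(256))
```

Under these assumed settings, an input whose side length is divisible by 2 raised to the number of levels returns to its original size; other sizes may come back slightly larger because of rounding in the down-convolutions.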
Results
We evaluate our approach extensively on complicated real scanned sketches and show that it significantly outperforms the state of the art. We corroborate these results with a user study, in which our model significantly outperforms vectorization approaches. Images (a), (b), and (d) are part of our test set, while images (c) and (e) were taken from Flickr. Image (c) is courtesy of Anna Anjos and image (e) is courtesy of Yama Q, both under Creative Commons licensing.
Comparison
We perform a user study and compare against vectorization tools that work directly on raster images. In particular, we consider the open-source Potrace and the commercial Adobe Live Trace. Users prefer our approach to either of the two tools more than 97% of the time.
For more details and results, please consult the full paper.
This research was partially funded by JST CREST.