Illustration Research

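Illustrations such as line art and rough sketches are not only sparse but also come in many different drawing styles, which makes them difficult to process as images. Furthermore, because precise output is required, even small mistakes can be fatal. This research aims to reduce repetitive work and support illustrators.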

  • Smart Inker: Assisted Inking of Rough Sketches

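    In this work, we propose the Smart Inker, a tool that applies deep learning to ink rough sketches interactively. The Smart Inker provides "smart" tools that naturally connect broken lines and efficiently erase unnecessary lines, so that automatically generated line art can be corrected effectively. To realize these functions, we take a data-driven approach. The Smart Inker is based on fully convolutional neural networks, which are trained to take both the user edits and the rough sketch as input and to output accurate line art. This enables accurate, real-time editing of many kinds of complex rough sketches. To train these tools, we devise two key techniques: a data augmentation method that creates training data by simulating user edits, and a line normalization method that uses a thinning network trained on vector line art data. Combining these techniques with sketch-specific data augmentation makes it possible to create large amounts of training data covering a variety of editing patterns without collecting actual user edits, and to train each network effectively. In a user test in which participants inked rough sketches with the proposed tool, the tool made line art creation easier and faster than commercial illustration software, and even users with almost no illustration experience were able to produce clean line art.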

  • Automatic Line Art Conversion with Adversarial Data Augmentation

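    In this work, we propose an integrated framework for effectively learning the automatic conversion of rough sketches to line art. We build a line art conversion network and a line art discriminator network. The discriminator network is trained to distinguish real line art from line art produced by the conversion network, while the conversion network is trained so that the discriminator cannot tell its outputs apart from real line art. This approach has two advantages. First, because the discriminator network can learn the "structure" of line art, the conversion network learns to output finer, more realistic line art. Second, rough sketches and line art without correspondences can be included in training, so the conversion network can learn from diverse unsupervised real-world data. With this training framework, finer and more diverse line art conversion becomes possible than with state-of-the-art methods. In addition, by further training on the input image alone, the method can optimize the conversion network for that input image. We also show that the method can learn the inverse problem, namely converting line art into pencil drawings.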

  • Automatic Conversion of Rough Sketches to Line Art

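    In this work, we propose a method that automatically converts rough sketches into line art using a convolutional neural network. Most existing sketch simplification methods target only vector images of simple rough sketches, and converting complex raster rough sketches, such as scanned pencil drawings, to line art has been difficult. In contrast, the proposed method learns the correspondence between complex rough sketches and line art with a neural network model composed of three types of convolutional layers, and can therefore convert a wide variety of raster rough sketches into line art well. The model accepts images of any size and aspect ratio as input, and the output line art has the same size as the input image. To train this multi-layer model, we built a new dataset of paired rough sketches and line art and proposed a way to train the model effectively. A user test on the results confirmed that the proposed method greatly outperforms existing methods.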
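
The size-preserving property described in the last item can be illustrated with simple shape arithmetic. The sketch below assumes that the three layer types are a stride-2 down-convolution, a stride-1 flat convolution, and a stride-2 up-convolution; the text only says there are three types, so this split, the kernel sizes, the padding, and the layer stack are all hypothetical choices made for illustration.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a standard convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def upconv_out(size, kernel, stride, padding):
    """Output length of a transposed (up-) convolution along one axis."""
    return (size - 1) * stride - 2 * padding + kernel


# Hypothetical layer stack: (type, kernel, stride, padding).
# "down" halves the resolution, "flat" keeps it, "up" doubles it.
LAYERS = [
    ("down", 3, 2, 1),
    ("flat", 3, 1, 1),
    ("down", 3, 2, 1),
    ("flat", 3, 1, 1),
    ("up", 4, 2, 1),
    ("flat", 3, 1, 1),
    ("up", 4, 2, 1),
    ("flat", 3, 1, 1),
]


def output_size(height, width, layers=LAYERS):
    """Propagate an input size through the stack and return the output size."""
    for kind, kernel, stride, padding in layers:
        step = upconv_out if kind == "up" else conv_out
        height = step(height, kernel, stride, padding)
        width = step(width, kernel, stride, padding)
    return height, width
```

With this stack, any input whose height and width are multiples of 4 comes out at the same size, regardless of aspect ratio. Other sizes would need padding before the first layer, since the two down-convolutions round odd sizes up.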

Publications

Data-Driven Ink Painting Brushstroke Rendering
Koki Madono, Edgar Simo-Serra
Computer Graphics Forum (Pacific Graphics), 2023
Although digital painting has advanced much in recent years, there is still a significant divide between physically drawn paintings and purely digitally drawn paintings. These differences arise due to the physical interactions between the brush, ink, and paper, which are hard to emulate in the digital domain. Most ink painting approaches have focused on using either heuristics or physical simulation to attempt to bridge the gap between digital and analog; however, these approaches are still unable to capture the diversity of painting effects, such as ink fading or blotting, found in the real world. In this work, we propose a data-driven approach to generate ink paintings based on a semi-automatically collected high-quality real-world ink painting dataset. We use a multi-camera robot-based setup to automatically create a diversity of ink paintings, which allows for capturing the entire process in high resolution, including capturing detailed brush motions and drawing results. To ensure high-quality capture of the painting process, we calibrate the setup and perform occlusion-aware blending to capture all the strokes in high resolution in a robust and efficient way. Using our new dataset, we propose a recursive deep learning-based model to reproduce the ink paintings stroke by stroke while capturing complex ink painting effects such as bleeding and mixing. Our results corroborate the fidelity of the proposed approach to real hand-drawn ink paintings in comparison with existing approaches. We hope the availability of our dataset will encourage new research on digital realistic ink painting techniques.
@Article{MadonoPG2023,
   author    = {Koki Madono and Edgar Simo-Serra},
   title     = {{Data-Driven Ink Painting Brushstroke Rendering}},
   journal   = {Computer Graphics Forum (Pacific Graphics)},
   year      = 2023,
}
Diffusart: Enhancing Line Art Colorization with Conditional Diffusion Models
Hernan Carrillo, Michaël Clément, Aurélie Bugeau, Edgar Simo-Serra
Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2023
Colorization of line art drawings is an important task in illustration and animation workflows. However, this highly laborious process is mainly done manually, limiting creative productivity. This paper presents a novel interactive approach for line art colorization using conditional Diffusion Probabilistic Models (DPMs). In our proposed approach, the user provides initial color strokes for colorizing the line art. The strokes are then integrated into the conditional DPM-based colorization process by means of a coupled implicit and explicit conditioning strategy to generate diverse and high-quality colorized images. We evaluate our proposal and show it outperforms existing state-of-the-art approaches using the FID, LPIPS and SSIM metrics.
@InProceedings{CarrilloCVPRW2023,
   author    = {Hernan Carrillo and Micha\"el Cl\'ement and Aur\'elie Bugeau and Edgar Simo-Serra},
   title     = {{Diffusart: Enhancing Line Art Colorization with Conditional Diffusion Models}},
   booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)",
   year      = 2023,
}
Controllable Multi-domain Semantic Artwork Synthesis
Yuantian Huang, Satoshi Iizuka, Edgar Simo-Serra, Kazuhiro Fukui
Computational Visual Media, 2023
We present a novel framework for multi-domain synthesis of artwork from semantic layouts. One of the main limitations of this challenging task is the lack of publicly available segmentation datasets for art synthesis. To address this problem, we propose a dataset, which we call ArtSem, that contains 40,000 images of artwork from 4 different domains with their corresponding semantic label maps. We generate the dataset by first extracting semantic maps from landscape photography and then propose a conditional Generative Adversarial Network (GAN)-based approach to generate high-quality artwork from the semantic maps without necessitating paired training data. Furthermore, we propose an artwork synthesis model that uses domain-dependent variational encoders for high-quality multi-domain synthesis. The model is improved and complemented with a simple but effective normalization method, based on normalizing both the semantic and style jointly, which we call Spatially STyle-Adaptive Normalization (SSTAN). In contrast to previous methods that only take semantic layout as input, our model is able to learn a joint representation of both style and semantic information, which leads to better generation quality for synthesizing artistic images. Results indicate that our model learns to separate the domains in the latent space, and thus, by identifying the hyperplanes that separate the different domains, we can also perform fine-grained control of the synthesized artwork. By combining our proposed dataset and approach, we are able to generate user-controllable artwork that is of higher quality than existing approaches, as corroborated by both quantitative metrics and a user study.
@Article{HuangCVM2023,
   title     = {{Controllable Multi-domain Semantic Artwork Synthesis}},
   author    = {Yuantian Huang and Satoshi Iizuka and Edgar Simo-Serra and Kazuhiro Fukui},
   journal   = "Computational Visual Media",
   year      = 2023,
   volume    = 39,
   number    = 2,
}
General Virtual Sketching Framework for Vector Line Art
Haoran Mo, Edgar Simo-Serra, Chengying Gao, Changqing Zou, Ruomei Wang
ACM Transactions on Graphics (SIGGRAPH), 2021
Vector line art plays an important role in graphic design; however, it is tedious to create manually. We introduce a general framework to produce line drawings from a wide variety of images, by learning a mapping from raster image space to vector image space. Our approach is based on a recurrent neural network that draws the lines one by one. A differentiable rasterization module allows for training with only supervised raster data. We use a dynamic window around a virtual pen while drawing lines, implemented with a proposed aligned cropping and differentiable pasting modules. Furthermore, we develop a stroke regularization loss that encourages the model to use fewer and longer strokes to simplify the resulting vector image. Ablation studies and comparisons with existing methods corroborate the efficiency of our approach which is able to generate visually better results in less computation time, while generalizing better to a diversity of images and applications.
@Article{HaoranSIGGRAPH2021,
   author    = {Haoran Mo and Edgar Simo-Serra and Chengying Gao and Changqing Zou and Ruomei Wang},
   title     = {{General Virtual Sketching Framework for Vector Line Art}},
   journal   = "ACM Transactions on Graphics (SIGGRAPH)",
   year      = 2021,
   volume    = 40,
   number    = 4,
}
User-Guided Line Art Flat Filling with Split Filling Mechanism
Lvmin Zhang, Chengze Li, Edgar Simo-Serra, Yi Ji, Tien-Tsin Wong, Chunping Liu
Conference on Computer Vision and Pattern Recognition (CVPR), 2021
Flat filling is a critical step in digital artistic content creation with the objective of filling line arts with flat colours. We present a deep learning framework for user-guided line art flat filling that can compute the "influence areas" of the user colour scribbles, i.e., the areas where the user scribbles should propagate and influence. This framework explicitly controls such scribble influence areas for artists to manipulate the colours of image details and avoid colour leakage/contamination between scribbles, and simultaneously, leverages data-driven colour generation to facilitate content creation. This framework is based on a Split Filling Mechanism (SFM), which first splits the user scribbles into individual groups and then independently processes the colours and influence areas of each group with a Convolutional Neural Network (CNN). Learned from more than a million illustrations, the framework can estimate the scribble influence areas in a content-aware manner, and can smartly generate visually pleasing colours to assist the daily works of artists. We show that our proposed framework is easy to use, allowing even amateurs to obtain professional-quality results on a wide variety of line arts.
@InProceedings{ZhangCVPR2021,
   author    = {Lvmin Zhang and Chengze Li and Edgar Simo-Serra and Yi Ji and Tien-Tsin Wong and Chunping Liu},
   title     = {{User-Guided Line Art Flat Filling with Split Filling Mechanism}},
   booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR)",
   year      = 2021,
}
Line Art Colorization with Concatenated Spatial Attention
Mingcheng Yuan, Edgar Simo-Serra
Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2021
Line art plays a fundamental role in illustration and design, and allows for iteratively polishing designs. However, as it lacks color, it can have issues in conveying final designs. In this work, we propose an interactive colorization approach based on a conditional generative adversarial network that takes both the line art and color hints as inputs to produce a high-quality colorized image. Our approach is based on a U-net architecture with a multi-discriminator framework. We propose a Concatenation and Spatial Attention module that is able to generate more consistent and higher-quality line art colorization from user-given hints. We evaluate on a large-scale illustration dataset, and comparisons with existing approaches corroborate the effectiveness of our approach.
@InProceedings{YuanCVPRW2021,
   author    = {Mingcheng Yuan and Edgar Simo-Serra},
   title     = {{Line Art Colorization with Concatenated Spatial Attention}},
   booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)",
   year      = 2021,
}
Generating Digital Painting Lighting Effects via RGB-space Geometry
Lvmin Zhang, Edgar Simo-Serra, Yi Ji, Chunping Liu
ACM Transactions on Graphics (Presented at SIGGRAPH), 2020
We present an algorithm to generate digital painting lighting effects from a single image. Our algorithm is based on a key observation: artists use many overlapping strokes to paint lighting effects, i.e., pixels with dense stroke history tend to gather more illumination strokes. Based on this observation, we design an algorithm to both estimate the density of strokes in a digital painting using color geometry, and then generate novel lighting effects by mimicking artists' coarse-to-fine workflow. Coarse lighting effects are first generated using a wave transform, and then retouched according to the stroke density of the original illustrations into usable lighting effects.
Our algorithm is content-aware, with generated lighting effects naturally adapting to image structures, and can be used as an interactive tool to simplify current labor-intensive workflows for generating lighting effects for digital and matte paintings. In addition, our algorithm can also produce usable lighting effects for photographs or 3D rendered images. We evaluate our approach with both an in-depth qualitative and a quantitative analysis which includes a perceptual user study. Results show that our proposed approach is not only able to produce favorable lighting effects with respect to existing approaches, but also that it is able to significantly reduce the needed interaction time.
@Article{ZhangTOG2020,
   author    = {Lvmin Zhang and Edgar Simo-Serra and Yi Ji and Chunping Liu},
   title     = {{Generating Digital Painting Lighting Effects via RGB-space Geometry}},
   journal   = "ACM Transactions on Graphics (Presented at SIGGRAPH)",
   year      = 2020,
   volume    = 39,
   number    = 2,
}
Real-Time Data-Driven Interactive Rough Sketch Inking
Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa
ACM Transactions on Graphics (SIGGRAPH), 2018
We present an interactive approach for inking, which is the process of turning a pencil rough sketch into a clean line drawing. The approach, which we call the Smart Inker, consists of several "smart" tools that intuitively react to user input, while guided by the input rough sketch, to efficiently and naturally connect lines, erase shading, and fine-tune the line drawing output. Our approach is data-driven: the tools are based on fully convolutional networks, which we train to exploit both the user edits and inaccurate rough sketch to produce accurate line drawings, allowing high-performance interactive editing in real-time on a variety of challenging rough sketch images. For the training of the tools, we developed two key techniques: one is the creation of training data by simulation of vague and quick user edits; the other is a line normalization based on learning from vector data. These techniques, in combination with our sketch-specific data augmentation, allow us to train the tools on heterogeneous data without actual user interaction. We validate our approach with an in-depth user study, comparing it with professional illustration software, and show that our approach is able to reduce inking time by a factor of 1.8x while improving the results of amateur users.
@Article{SimoSerraSIGGRAPH2018,
   author    = {Edgar Simo-Serra and Satoshi Iizuka and Hiroshi Ishikawa},
   title     = {{Real-Time Data-Driven Interactive Rough Sketch Inking}},
   journal   = "ACM Transactions on Graphics (SIGGRAPH)",
   year      = 2018,
   volume    = 37,
   number    = 4,
}
Learning to Restore Deteriorated Line Drawing
Kazuma Sasaki, Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa
The Visual Computer (Proc. of Computer Graphics International), 2018
We propose a fully automatic approach to restore aged old line drawings. We decompose the task into two subtasks: the line extraction subtask, which aims to extract line fragments and remove the paper texture background, and the restoration subtask, which fills in possible gaps and deterioration of the lines to produce a clean line drawing. Our approach is based on a convolutional neural network that consists of two sub-networks corresponding to the two subtasks. They are trained as part of a single framework in an end-to-end fashion. We also introduce a new dataset consisting of manually annotated sketches by Leonardo da Vinci which, in combination with a synthetic data generation approach, allows training the network to restore deteriorated line drawings. We evaluate our method on challenging 500-year-old sketches and compare with existing approaches with a user study, in which it is found that our approach is preferred 72.7% of the time.
@Article{SasakiCGI2018,
   author    = {Kazuma Sasaki and Satoshi Iizuka and Edgar Simo-Serra and Hiroshi Ishikawa},
   title     = {{Learning to Restore Deteriorated Line Drawing}},
   journal   = "The Visual Computer (Proc. of Computer Graphics International)",
   year      = {2018},
   volume    = {34},
   pages     = {1077--1085},
}
Mastering Sketching: Adversarial Augmentation for Structured Prediction
Edgar Simo-Serra*, Satoshi Iizuka*, Hiroshi Ishikawa (* equal contribution)
ACM Transactions on Graphics (Presented at SIGGRAPH), 2018
We present an integral framework for training sketch simplification networks that convert challenging rough sketches into clean line drawings. Our approach augments a simplification network with a discriminator network, training both networks jointly so that the discriminator network discerns whether a line drawing is a real training data or the output of the simplification network, which in turn tries to fool it. This approach has two major advantages. First, because the discriminator network learns the structure in line drawings, it encourages the output sketches of the simplification network to be more similar in appearance to the training sketches. Second, we can also train the simplification network with additional unsupervised data, using the discriminator network as a substitute teacher. Thus, by adding only rough sketches without simplified line drawings, or only line drawings without the original rough sketches, we can improve the quality of the sketch simplification. We show how our framework can be used to train models that significantly outperform the state of the art in the sketch simplification task, despite using the same architecture for inference. We additionally present an approach to optimize for a single image, which improves accuracy at the cost of additional computation time. Finally, we show that, using the same framework, it is possible to train the network to perform the inverse problem, i.e., convert simple line sketches into pencil drawings, which is not possible using the standard mean squared error loss. We validate our framework with two user tests, where our approach is preferred to the state of the art in sketch simplification 92.3% of the time and obtains 1.2 more points on a scale of 1 to 5.
@Article{SimoSerraTOG2018,
   author    = {Edgar Simo-Serra and Satoshi Iizuka and Hiroshi Ishikawa},
   title     = {{Mastering Sketching: Adversarial Augmentation for Structured Prediction}},
   journal   = "ACM Transactions on Graphics (Presented at SIGGRAPH)",
   year      = 2018,
   volume    = 37,
   number    = 1,
}
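
The joint training described in the abstract above can be sketched numerically. The following is a minimal illustration using the common cross-entropy adversarial formulation, not necessarily the exact objective used in the paper: the weight `alpha`, the function names, and the use of plain Python lists for images are all assumptions made for this example.

```python
import math


def mse(pred, target):
    """Mean squared error between two equally sized flat pixel lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def discriminator_loss(d_real, d_fake):
    """Cross-entropy loss for the discriminator.

    d_real: discriminator output in (0, 1) for a real line drawing.
    d_fake: discriminator output in (0, 1) for a generated line drawing.
    The loss is small when d_real is near 1 and d_fake is near 0.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def simplifier_loss(pred, target, d_fake, alpha=1.0):
    """Supervised MSE plus an adversarial term that rewards fooling D.

    `target` may be None for rough sketches that have no paired line
    drawing; in that case only the adversarial term is used, with the
    discriminator acting as the only source of supervision.
    """
    adversarial = -math.log(d_fake)
    if target is None:
        return alpha * adversarial
    return mse(pred, target) + alpha * adversarial
```

Passing `target=None` corresponds to the unsupervised case in the abstract, where rough sketches without matching line drawings still contribute a training signal through the discriminator.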
Joint Gap Detection and Inpainting of Line Drawings
Kazuma Sasaki, Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa
Conference on Computer Vision and Pattern Recognition (CVPR), 2017
We propose a novel data-driven approach for automatically detecting and completing gaps in line drawings with a Convolutional Neural Network. In the case of existing inpainting approaches for natural images, masks indicating the missing regions are generally required as input. Here, we show that line drawings have enough structures that can be learned by the CNN to allow automatic detection and completion of the gaps without any such input. Thus, our method can find the gaps in line drawings and complete them without user interaction. Furthermore, the completion realistically conserves thickness and curvature of the line segments. All the necessary heuristics for such realistic line completion are learned naturally from a dataset of line drawings, where various patterns of line completion are generated on the fly as training pairs to improve the model generalization. We evaluate our method qualitatively on a diverse set of challenging line drawings and also provide quantitative results with a user study, where it significantly outperforms the state of the art.
@InProceedings{SasakiCVPR2017,
   author    = {Kazuma Sasaki and Satoshi Iizuka and Edgar Simo-Serra and Hiroshi Ishikawa},
   title     = {{Joint Gap Detection and Inpainting of Line Drawings}},
   booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR)",
   year      = 2017,
}
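
The abstract above mentions that training pairs are generated on the fly from a dataset of line drawings. One simple way to do this is sketched below, assuming gaps are made by erasing small square patches at random positions. The abstract does not specify the actual gap patterns, so the patch shape, the parameters `max_gaps` and `max_size`, and the function name are hypothetical.

```python
import random


def make_gap_pair(drawing, max_gaps=3, max_size=4, rng=random):
    """Return (input_with_gaps, target) from a binary line drawing.

    drawing: list of rows, with 1 for line pixels and 0 for background.
    Square patches are erased at random positions to simulate gaps
    (an assumed gap pattern). The original drawing is left unmodified
    and serves as the training target.
    """
    height, width = len(drawing), len(drawing[0])
    damaged = [row[:] for row in drawing]  # copy so the target is untouched
    for _ in range(rng.randint(1, max_gaps)):
        size = rng.randint(1, max_size)
        top = rng.randrange(height)
        left = rng.randrange(width)
        for y in range(top, min(top + size, height)):
            for x in range(left, min(left + size, width)):
                damaged[y][x] = 0
    return damaged, drawing
```

Because a new random pair is drawn every time the function is called, the same line drawing can yield many different training examples. Passing a seeded `random.Random` instance as `rng` makes the generation reproducible.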
Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup
Edgar Simo-Serra*, Satoshi Iizuka*, Kazuma Sasaki, Hiroshi Ishikawa (* equal contribution)
ACM Transactions on Graphics (SIGGRAPH), 2016
In this paper, we present a novel technique to simplify sketch drawings based on learning a series of convolution operators. In contrast to existing approaches that require vector images as input, we allow the more general and challenging input of rough raster sketches such as those obtained from scanning pencil sketches. We convert the rough sketch into a simplified version which is then amenable to vectorization. This is all done in a fully automatic way without user intervention. Our model consists of a fully convolutional neural network which, unlike most existing convolutional neural networks, is able to process images of any dimensions and aspect ratio as input, and outputs a simplified sketch which has the same dimensions as the input image. In order to teach our model to simplify, we present a new dataset of pairs of rough and simplified sketch drawings. By leveraging convolution operators in combination with efficient use of our proposed dataset, we are able to train our sketch simplification model. Our approach naturally overcomes the limitations of existing methods, e.g., vector images as input and long computation time; and we show that meaningful simplifications can be obtained for many different test cases. Finally, we validate our results with a user study in which we greatly outperform similar approaches and establish the state of the art in sketch simplification of raster images.
@Article{SimoSerraSIGGRAPH2016,
   author    = {Edgar Simo-Serra and Satoshi Iizuka and Kazuma Sasaki and Hiroshi Ishikawa},
   title     = {{Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup}},
   journal   = "ACM Transactions on Graphics (SIGGRAPH)",
   year      = 2016,
   volume    = 35,
   number    = 4,
}

Software

Sketch Simplification Network, 1.0 (December 2017)
Sketch Simplification Convolutional Neural Network
This code is the implementation of the "Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup" and "Mastering Sketching: Adversarial Augmentation for Structured Prediction" papers. It contains pre-trained models and example usage code.

Datasets

Da Vinci Dataset
Annotated line drawing sketches drawn by Leonardo Da Vinci.
We present a line drawing restoration dataset which consists of 71 line drawing sketches by Leonardo Da Vinci.