A Neural Representation of Sketch Drawings

From statwiki
Revision as of 23:07, 4 March 2018 by Tadenoud (talk | contribs) (Tadenoud moved page neural sketch drawings to A Neural Representation of Sketch Drawings: More accurately capture paper title)
Jump to navigation Jump to search

Introduction

There have been many recent advances in neural generative models for low resolution pixel-based images. Humans however, do not see the world in a grid of pixels and more typically communicate drawings of the things we see using a series of pen strokes that represent components of objects. These pen strokes are similar to the way vector-based images store data. This paper proposes a new method for creating conditional and unconditional generative models for creating these kinds of vector sketch drawings based on recurrent neural networks (RNNs). The paper explores many applications of these kinds of models, especially creative applications and makes available their unique dataset of vector images.

Related Work

Previous work related to sketch drawing generation includes methods that focussed primarily on converting input photographs into equivalent vector line drawings. Image generating models using neural networks also exist but focussed more on generation of pixel-based imagery. Some recent work has focussed on handwritten character generation using RNNs and Mixture Density Networks to generate continuous data points. This work has been extended somewhat recently to conditionally and unconditionally generate handwritten vectorized Chinese Kanji characters by modelling them as a series of pen strokes. Furthermore, this paper builds on work that employed Sequence-to-Sequence models with Variational Autencoders to model English sentences in latent vector space. One of the limiting factors for creating models that operate on vector datasets has been the dearth of publicly available data. Previously available datasets include: Sketch, a set of 20K vector drawings; Sketchy, a set of 70K vector drawings; and ShadowDraw, a set of 30K raster images with extracted vector drawings.

Methodology

Dataset

The “QuickDraw” dataset used in this research was assembled from 75K user drawings extracted from the game “Quick, Draw!” where users drew objects from one of hundreds of classes in 20 seconds or less. The dataset is split into 70K training samples and 2.5K validation and test samples each and represents each sketch a set of “pen stroke actions”. Each action is provided as a vector in the form [math]\displaystyle{ (\Delta x, \Delta y, p_{1}, p_{2}, p_{3}) }[/math]. For each vector, [math]\displaystyle{ \Delta x }[/math] and [math]\displaystyle{ \Delta y }[/math] give the movement of the pen from the previous point, with the initial location being the origin. The last three vector elements are a one-hot representation of pen states; [math]\displaystyle{ p_{1} }[/math] indicates that the pen is down and a line should be drawn between the current point and the next point, [math]\displaystyle{ p_{2} }[/math] indicates that the pen is up and no line should be drawn between the current point and the next point, and [math]\displaystyle{ p_{3} }[/math] indicates that the drawing is finished and subsequent points and the current point should not be drawn.

Sketch-RNN

Unconditional Generation

Training

Experiments

Conditional Reconstruction

Latent Space Interpolation

Sketch Drawing Analogies

Predicting Different Endings of Incomplete Sketches

Applications and Future Work

Conclusion

Criticisms

Source

Ha, D., & Eck, D. A neural representation of sketch drawings. In Proc. International Conference on Learning Representations (2018).