A Neural Representation of Sketch Drawings

From statwiki
Revision as of 02:30, 16 November 2018 by S498chen (talk | contribs)

Introduction

In this paper, the authors present a recurrent neural network, sketch-rnn, that constructs stroke-based drawings. Alongside new robust training methods, they also outline a framework for both conditional and unconditional sketch generation.

Neural networks have been heavily used as image generation tools, for example Generative Adversarial Networks, Variational Inference, and Autoregressive models. Most of these models focus on modelling the pixels of an image. However, people learn to draw using sequences of strokes from a very young age. The authors exploit this observation to create a new model that works with the strokes of an image, as a new approach to vector image generation and abstract concept generalization.
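To make the stroke-based representation concrete, the paper encodes a drawing as a sequence of pen offsets with pen-state flags: each point is (Δx, Δy, p1, p2, p3), where the one-hot (p1, p2, p3) marks pen-down, pen-up, or end-of-drawing. The minimal sketch below (function names and the toy data are my own) converts the simpler three-value format into this five-element format:

```python
# Minimal sketch (names are my own) of sketch-rnn's stroke-based format:
# each point is (dx, dy, p1, p2, p3), where (dx, dy) is the pen offset
# from the previous point and (p1, p2, p3) is a one-hot pen state:
# pen touching paper, pen lifted, end of drawing.

def to_stroke5(stroke3):
    """Convert stroke-3 points (dx, dy, pen_lifted) to stroke-5 format."""
    out = []
    for dx, dy, lifted in stroke3:
        out.append((dx, dy, 0 if lifted else 1, 1 if lifted else 0, 0))
    out.append((0, 0, 0, 0, 1))  # terminal "end of sketch" point
    return out

# A tiny example: draw right, draw down, then lift the pen.
points = [(10, 0, 0), (0, 10, 0), (0, 0, 1)]
print(to_stroke5(points))
```

Modelling this short, low-dimensional sequence is what lets an RNN replace a pixel-grid generator.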

Related Work

Methodology

Dataset

Sketch-RNN

Unconditional Generation

Training

Experiments

Conditional Reconstruction

Latent Space Interpolation

Sketch Drawing Analogies

Predicting Different Endings of Incomplete Sketches

Applications and Future Work

Conclusion

References




The unsupervised translation scheme has the following outline:

  • The word-vector embeddings of the source and target languages are aligned in an unsupervised manner.
  • Sentences from the source and target language are mapped to a common latent vector space by an encoder, and then mapped by a decoder to probability distributions over sentences in either language.
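The alignment step in the first bullet can be illustrated with the classical orthogonal Procrustes solution (a supervised simplification of the unsupervised alignment actually used; the toy data below is invented): given source embeddings X and target embeddings Y with rows in correspondence, the orthogonal map W minimizing ||XW − Y||_F is UVᵀ, where USVᵀ is the SVD of XᵀY.

```python
import numpy as np

# Toy illustration (data invented) of orthogonal Procrustes alignment:
# rows of X are "source language" word vectors, rows of Y are "target
# language" word vectors related to X by a hidden rotation.

rng = np.random.default_rng(0)
d = 4
X = rng.normal(size=(50, d))                   # source embeddings
R, _ = np.linalg.qr(rng.normal(size=(d, d)))   # hidden true rotation
Y = X @ R                                      # target embeddings

U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt                                     # recovered alignment

print(np.allclose(X @ W, Y))  # recovers the hidden rotation
```

In the unsupervised setting no row correspondence is available, so the alignment is instead learned adversarially and then refined, but the Procrustes step above is still used as the refinement.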

The objective function is the sum of:

  1. The de-noising auto-encoder loss,
  2. the cross-domain (translation) loss, and
  3. the adversarial loss on the latent representations.

I shall describe these in the following sections.
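As a toy illustration of the de-noising term: the auto-encoder is trained to reconstruct a sentence from a corrupted version of it, where the corruption drops words at random and lightly shuffles the survivors. The sketch below is my own hedged rendering of such a corruption function, not the paper's code; parameter names are invented.

```python
import random

# Hedged sketch (not the paper's code) of a corruption function C(x):
# each word is dropped with probability p_drop, and the remaining words
# are jittered by at most k positions. The de-noising auto-encoder is
# trained to reconstruct the original x from C(x).

def corrupt(words, p_drop=0.1, k=3, rng=random):
    kept = [w for w in words if rng.random() >= p_drop]
    # add a random offset in [0, k) to each index, then sort by it,
    # so words only move a bounded distance from their original slot
    keys = [i + rng.uniform(0, k) for i in range(len(kept))]
    return [w for _, w in sorted(zip(keys, kept))]

random.seed(0)
sentence = "the cat sat on the mat".split()
print(corrupt(sentence))
```

With p_drop=0 and k=0 the function is the identity, so the corruption strength is fully controlled by these two knobs.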

[Figure: table from Conneau et al. (2017). The final row shows the performance of the alignment method used in the present paper. Note the degradation in performance for more distant languages.]

[Figure: table from the present paper showing the results of an ablation study. Of note are the first, third, and fourth rows, which demonstrate that while the translation component of the loss is relatively unimportant, the word-vector alignment scheme and the de-noising auto-encoder matter a great deal.]