Wasserstein Auto-Encoders


= Introduction =

Recent years have seen a convergence of two previously distinct approaches: representation learning from high-dimensional data and unsupervised generative modeling. In the field that formed at their intersection, Variational Auto-Encoders (VAEs) and Generative Adversarial Networks (GANs) have become well established. VAEs are theoretically elegant but have the drawback that they tend to generate blurry samples when applied to natural images. GANs, on the other hand, produce samples of better visual quality, but they come without an encoder, are harder to train, and suffer from the mode-collapse problem, in which the trained model fails to capture all the variability in the true data distribution. There has therefore been a push to combine the strengths of the two, but a principled unifying framework has yet to be discovered.

This work proposes a new family of regularized auto-encoders called Wasserstein Auto-Encoders (WAEs) and proceeds to show that they generalize the previously investigated adversarial auto-encoders. The authors provide novel theoretical insights into setting up an objective function for auto-encoders from the point of view of optimal transport (OT). Their theoretical formulation leads them to examine adversarial and maximum mean discrepancy (MMD) based regularizers for matching a prior to the distribution of encoded data points in the latent space.
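
As a brief sketch of the resulting objective (notation follows the WAE paper; these symbols are not defined elsewhere in this summary), the OT-derived formulation combines a reconstruction cost with a penalty on the divergence between the aggregated posterior <math>Q_Z</math> and the prior <math>P_Z</math>:

<math>
D_{\mathrm{WAE}}(P_X, P_G) := \inf_{Q(Z|X) \in \mathcal{Q}} \; \mathbb{E}_{P_X} \, \mathbb{E}_{Q(Z|X)} \big[ c(X, G(Z)) \big] \;+\; \lambda \, \mathcal{D}_Z(Q_Z, P_Z),
</math>

where <math>c</math> is a cost function on the data space, <math>G</math> is the decoder, <math>\lambda > 0</math> is a regularization coefficient, and <math>\mathcal{D}_Z</math> is the latent-space penalty, instantiated either adversarially (WAE-GAN) or with MMD (WAE-MMD).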

= Motivation =

= Proposed Method =

= Conclusion =