conditional neural process: Difference between revisions

Revision as of 23:42, 18 November 2018

Introduction

Deep neural networks typically require large datasets to train effectively. One approach to mitigating this data inefficiency is to learn in two phases: the first phase learns the statistics of a generic domain without committing to a specific learning task; the second phase learns a function for a specific task, using only a small number of data points by exploiting the domain-wide statistics already learned.

For example, consider a data set [math]\displaystyle{ \{x_i, y_i\} }[/math] with evaluations [math]\displaystyle{ y_i = f(x_i) }[/math] for some unknown function [math]\displaystyle{ f }[/math]. Let [math]\displaystyle{ g }[/math] be a function that approximates [math]\displaystyle{ f }[/math]. The aim is to minimize the loss between [math]\displaystyle{ f }[/math] and [math]\displaystyle{ g }[/math] over the entire space [math]\displaystyle{ X }[/math]. In practice, the loss is evaluated on a finite set of observations.
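As a rough illustration of evaluating the loss on a finite set of observations, the sketch below computes a mean squared error between a target and an approximating function. The choice of `math.sin` as a stand-in for the unknown [math]\displaystyle{ f }[/math], the linear form of `g`, the squared-error loss, and the helper name `empirical_loss` are all assumptions made for this example; none of them come from the paper.

```python
import math


def f(x):
    # Stand-in for the unknown target function (assumption for illustration).
    return math.sin(x)


def g(x, a=0.8, b=0.0):
    # Hypothetical approximating function: a straight line a * x + b.
    return a * x + b


def empirical_loss(xs, target, approx):
    # Mean squared error between target and approx over a finite set of inputs.
    return sum((target(x) - approx(x)) ** 2 for x in xs) / len(xs)


# A finite set of observation points standing in for the whole space X.
xs = [i / 10 for i in range(-10, 11)]
loss = empirical_loss(xs, f, g)
```

The loss is zero only when `g` matches `f` at every observed point; it says nothing about how the two functions differ elsewhere in [math]\displaystyle{ X }[/math].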

In this work, the authors propose a family of models that represent solutions to the supervised problem, together with an end-to-end training approach for learning them. These models combine neural networks with features reminiscent of Gaussian processes. The authors call this family of models Conditional Neural Processes.

Model

Let the training set be [math]\displaystyle{ O = \{x_i, y_i\}_{i = 0}^{n-1} }[/math], and the test set be [math]\displaystyle{ T = \{x_i, y_i\}_{i = n}^{n + m - 1} }[/math].
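The indexing above can be sketched as a simple split of a list of pairs: the first [math]\displaystyle{ n }[/math] pairs form [math]\displaystyle{ O }[/math] and the remaining [math]\displaystyle{ m }[/math] pairs form [math]\displaystyle{ T }[/math]. The function name `split_observations` and the toy data are hypothetical and only serve to show the index ranges.

```python
def split_observations(pairs, n):
    """Split (x, y) pairs into O (indices 0..n-1) and T (indices n..n+m-1)."""
    if not 0 <= n <= len(pairs):
        raise ValueError("n must lie between 0 and the number of pairs")
    return pairs[:n], pairs[n:]


# Toy data: five pairs, so n = 3 leaves m = 2 pairs for T.
pairs = [(x, 2 * x) for x in range(5)]
O, T = split_observations(pairs, n=3)
```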

We assume the outputs are obtained by the following steps:

Let [math]\displaystyle{ P }[/math] be a probability distribution over functions [math]\displaystyle{ F : X \to Y }[/math].
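To make the idea of a distribution over functions concrete, the sketch below draws one function from a toy distribution and evaluates it at a few inputs, giving pairs with [math]\displaystyle{ y_i = f(x_i) }[/math]. The family of random sinusoids, the parameter ranges, and the name `sample_function` are assumptions chosen for illustration; the source does not specify [math]\displaystyle{ P }[/math].

```python
import math
import random


def sample_function(rng):
    # Draw one function from a toy distribution over functions:
    # sinusoids with random amplitude and phase (assumption for illustration).
    amplitude = rng.uniform(0.5, 2.0)
    phase = rng.uniform(0.0, math.pi)
    return lambda x: amplitude * math.sin(x + phase)


rng = random.Random(0)          # fixed seed so the sketch is reproducible
f = sample_function(rng)        # one draw from the toy distribution
xs = [rng.uniform(-3.0, 3.0) for _ in range(8)]
observations = [(x, f(x)) for x in xs]  # pairs (x_i, y_i) with y_i = f(x_i)
```

Each call to `sample_function` yields a different function, so repeated draws produce different data sets from the same underlying distribution.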

Revision as of 23:42, 18 November 2018 by S366chen (→ Model)
