Conditional Neural Process

Introduction

To train a model effectively, deep neural networks require large datasets. One approach to mitigating this data-efficiency problem is to learn in two phases: the first phase learns the statistics of a generic domain without committing to a specific learning task; the second phase learns a function for a specific task, but does so using only a small number of data points by exploiting the domain-wide statistics already learned.

For example, consider a data set [math]\displaystyle{ \{x_i, y_i\} }[/math] with evaluations [math]\displaystyle{ y_i = f(x_i) }[/math] for some unknown function [math]\displaystyle{ f }[/math]. Assume [math]\displaystyle{ g }[/math] is an approximating function of [math]\displaystyle{ f }[/math]. The aim is to minimize the loss between [math]\displaystyle{ f }[/math] and [math]\displaystyle{ g }[/math] on the entire space [math]\displaystyle{ X }[/math]. In practice, however, the loss can only be evaluated on a finite set of observations.
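A minimal sketch of this setup is given below; the concrete choice of [math]\displaystyle{ f }[/math], the polynomial family for [math]\displaystyle{ g }[/math], and the squared loss are illustrative assumptions, not part of the paper.

<pre>
import numpy as np

# Unknown target function f (illustrative assumption; the paper treats f as arbitrary).
f = lambda x: np.sin(3 * x)

# Finite set of observations {(x_i, y_i)} available in practice.
rng = np.random.default_rng(0)
x_obs = rng.uniform(-1, 1, size=20)
y_obs = f(x_obs)

# Approximating function g: here a degree-5 polynomial fit by least squares,
# i.e. minimizing the squared loss between f and g on the observations only.
coeffs = np.polyfit(x_obs, y_obs, deg=5)
g = np.poly1d(coeffs)

# The true objective is the loss over all of X; we can only probe it at sampled points.
x_grid = np.linspace(-1, 1, 200)
print("empirical MSE on observations:", np.mean((g(x_obs) - y_obs) ** 2))
print("MSE on a dense grid (proxy for X):", np.mean((g(x_grid) - f(x_grid)) ** 2))
</pre>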

In this work, the authors propose a family of models that represent solutions to the supervised learning problem, together with an end-to-end approach for training them, combining neural networks with features reminiscent of Gaussian Processes. They call this family of models Conditional Neural Processes.


Model

Let the training set (observations) be [math]\displaystyle{ O = \{(x_i, y_i)\}_{i = 0}^{n - 1} }[/math], and the test set of unlabelled target points be [math]\displaystyle{ T = \{x_i\}_{i = n}^{n + m - 1} }[/math].

Let P be a probability distribution over functions [math]\displaystyle{ f : X \to Y }[/math], formally known as a stochastic process. Thus, P defines a joint distribution over the random variables [math]\displaystyle{ \{f(x_i)\}_{i = 0}^{n + m - 1} }[/math]. The task is therefore to predict the output values [math]\displaystyle{ f(x_i) }[/math] for [math]\displaystyle{ x_i \in T }[/math] given [math]\displaystyle{ O }[/math], that is, to model the conditional distribution [math]\displaystyle{ P(f(T) \mid O, T) }[/math].
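As a concrete illustration of this setup, the sketch below samples one task and splits it into observations [math]\displaystyle{ O }[/math] and targets [math]\displaystyle{ T }[/math]; the particular function, the noise level, and the values of n and m are illustrative assumptions.

<pre>
import numpy as np

rng = np.random.default_rng(0)

# One task: n observed points O = {(x_i, y_i)} and m unlabelled target inputs T = {x_i}.
def sample_task(n=10, m=15):
    f = lambda x: np.sin(2 * x) + 0.1 * rng.normal(size=x.shape)  # unknown f (assumed here)
    x = rng.uniform(-2, 2, size=n + m)
    y = f(x)
    O = (x[:n], y[:n])   # observations: inputs and outputs
    T = x[n:]            # targets: inputs only; f(T) is what the model must predict
    return O, T, y[n:]   # y[n:] kept only to score the predictions afterwards

(O_x, O_y), T_x, T_y = sample_task()
print(O_x.shape, O_y.shape, T_x.shape)  # (10,) (10,) (15,)
</pre>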

Conditional Neural Process

Conditional Neural Process models directly parametrize conditional stochastic processes without imposing consistency with respect to some prior process. CNPs parametrize distributions over [math]\displaystyle{ f(T) }[/math] given the observations [math]\displaystyle{ O }[/math], summarized in a representation of fixed dimensionality.
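The sketch below illustrates this parametrization, assuming the encoder–aggregate–decoder structure of a CNP: an MLP encoder h maps each observation to a representation, the representations are averaged into a single fixed-size vector, and an MLP decoder g maps each target input together with that vector to a Gaussian mean and standard deviation. The layer sizes and the variance transform below are illustrative choices, not the authors' exact configuration.

<pre>
import torch
import torch.nn as nn

class CNP(nn.Module):
    """Minimal CNP sketch: encode each (x_i, y_i) in O, aggregate by mean, decode targets."""
    def __init__(self, x_dim=1, y_dim=1, r_dim=128, hidden=128):
        super().__init__()
        # Encoder h: maps each observation pair to a representation r_i.
        self.h = nn.Sequential(
            nn.Linear(x_dim + y_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, r_dim))
        # Decoder g: maps (target input, aggregated representation) to mean and std.
        self.g = nn.Sequential(
            nn.Linear(x_dim + r_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * y_dim))

    def forward(self, x_obs, y_obs, x_tgt):
        # x_obs: (n, x_dim), y_obs: (n, y_dim), x_tgt: (m, x_dim)
        r_i = self.h(torch.cat([x_obs, y_obs], dim=-1))  # (n, r_dim)
        r = r_i.mean(dim=0, keepdim=True)                # permutation-invariant aggregate
        r = r.expand(x_tgt.shape[0], -1)                 # (m, r_dim)
        out = self.g(torch.cat([x_tgt, r], dim=-1))      # (m, 2 * y_dim)
        mu, log_sigma = out.chunk(2, dim=-1)
        sigma = 0.1 + 0.9 * torch.nn.functional.softplus(log_sigma)  # keep std positive (illustrative)
        return mu, sigma

# Training minimizes the negative log-likelihood of the target outputs under N(mu, sigma^2).
model = CNP()
x_obs, y_obs = torch.randn(10, 1), torch.randn(10, 1)
x_tgt, y_tgt = torch.randn(15, 1), torch.randn(15, 1)
mu, sigma = model(x_obs, y_obs, x_tgt)
nll = -torch.distributions.Normal(mu, sigma).log_prob(y_tgt).mean()
nll.backward()
</pre>

Averaging the per-observation representations makes the model invariant to the ordering of the observations in [math]\displaystyle{ O }[/math] and lets the same network condition on any number of observations.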