conditional neural process: Difference between revisions

Revision as of 17:52, 18 November 2018

Introduction

Deep neural networks require large datasets to train effectively. One approach to mitigating this data-inefficiency problem is to learn in two phases: the first phase learns the statistics of a generic domain without committing to a specific learning task; the second phase learns a function for a specific task, but does so using only a small number of data points by exploiting the domain-wide statistics already learned.

For example, consider a data set [math]\displaystyle{ \{x_i, y_i\} }[/math] for [math]\displaystyle{ i = 0, \ldots, n-1 }[/math] with evaluations [math]\displaystyle{ y_i = f(x_i) }[/math] for some unknown function [math]\displaystyle{ f }[/math]. Assume [math]\displaystyle{ g }[/math] is an approximating function of [math]\displaystyle{ f }[/math]. The aim is to minimize the loss between [math]\displaystyle{ f }[/math] and [math]\displaystyle{ g }[/math] on the entire space [math]\displaystyle{ X }[/math].
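The setup above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and does not come from the article: the unknown function is stood in for by sin, the space X is taken to be the interval [0, π], g is a piecewise-linear interpolant of the observations, and the loss is a mean squared error approximated on a dense grid. The helper names `make_observations`, `make_g` and `loss_on_space` are hypothetical.

```python
import math


def f(x):
    # Stand-in for the unknown function f (an assumption for illustration).
    return math.sin(x)


def make_observations(n, lo=0.0, hi=math.pi):
    # n evenly spaced points x_0 .. x_{n-1} with evaluations y_i = f(x_i).
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return xs, [f(x) for x in xs]


def make_g(xs, ys):
    # Approximating function g: piecewise-linear interpolation of the data,
    # held constant outside the observed range.
    def g(x):
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        for i in range(len(xs) - 1):
            if x <= xs[i + 1]:
                t = (x - xs[i]) / (xs[i + 1] - xs[i])
                return (1 - t) * ys[i] + t * ys[i + 1]
    return g


def loss_on_space(g, lo=0.0, hi=math.pi, grid=1001):
    # Mean squared error between f and g on a dense grid covering X = [lo, hi].
    pts = [lo + (hi - lo) * k / (grid - 1) for k in range(grid)]
    return sum((f(x) - g(x)) ** 2 for x in pts) / grid


xs5, ys5 = make_observations(5)
xs20, ys20 = make_observations(20)
loss5 = loss_on_space(make_g(xs5, ys5))
loss20 = loss_on_space(make_g(xs20, ys20))
print(loss5, loss20)
```

In this sketch g matches f exactly at the observed points, yet the loss over the whole of X is still non-zero. That loss shrinks as n grows, which is why a small data set alone gives a poor approximation unless prior knowledge about the domain is brought in.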

Older revision: as of 17:34, 18 November 2018, by S366chen (→ Introduction)
Newer revision: as of 17:52, 18 November 2018, by S366chen (→ Introduction)

Line 4 (unchanged context):
of a generic domain without committing to a specific learning task; the second phase learns a function for a specific task, but does so using only a small number of data points by exploiting the domain-wide statistics already learned.

Line 4 (older revision):
For example, consider a data set <math display="inline"> \{x_i, y_i\} </math> for <math display="inline">i = 0,..., n-1</math>

Line 4 (newer revision):
For example, consider a data set <math display="inline"> \{x_i, y_i\} </math> for <math display="inline">i = 0,..., n-1</math> with evaluations <math display="inline">y_i = f(x_i) </math> for some unknown function <math display="inline">f</math>. Assume <math display="inline">g</math> is an approximating function of <math display="inline">f</math>. The aim is to minimize the loss between <math display="inline">f</math> and <math display="inline">g</math> on the entire space <math display="inline">X</math>
