Gradient Episodic Memory for Continual Learning

Group Member

Yu Xuan Lee, Tsen Yee Heng
== Background and Introduction ==
Supervised learning consists of a training set <math>D_{tr}=\{(x_i,y_i)\}^n_{i=1}</math>, where <math>x_i \in \mathcal{X}</math> and <math>y_i \in \mathcal{Y}</math>. Empirical Risk Minimization (ERM) is a common supervised learning method that minimizes a loss function by making multiple passes over the training set:
<center>
<math>
\frac{1}{|D_{tr}|} \sum_{(x_i,y_i) \in D_{tr}} \ell (f(x_i),y_i)
</math>
</center>
where <math>\ell : \mathcal{Y} \times \mathcal{Y} \to [0, \infty)</math> is a loss function penalizing prediction errors.
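
As a concrete illustration, the following minimal sketch (not from the paper; it assumes a linear model with squared loss on synthetic data) minimizes this empirical risk by making multiple passes over the training set:

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical setup: linear model f(x) = x @ w, squared loss.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                 # inputs x_i
true_w = rng.normal(size=5)
y = X @ true_w + 0.1 * rng.normal(size=100)   # targets y_i

w = np.zeros(5)
lr = 0.01
for epoch in range(50):                       # multiple passes over D_tr
    grad = 2 * X.T @ (X @ w - y) / len(X)     # gradient of the empirical risk
    w -= lr * grad

# (1/|D_tr|) * sum of per-example losses, as in the formula above
empirical_risk = np.mean((X @ w - y) ** 2)
print(f"empirical risk after training: {empirical_risk:.4f}")
</syntaxhighlight>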
 
Gradient Episodic Memory (GEM) is a continual learning model that alleviates forgetting of previously acquired knowledge while solving new problems more efficiently.
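
The core idea, as described in the original GEM paper (Lopez-Paz and Ranzato, 2017), is to treat the losses on episodic memories of past tasks as inequality constraints: a gradient update is only applied if it does not increase those losses. The sketch below (illustrative names, single previous task, not the authors' reference code) shows the projection that enforces this constraint:

<syntaxhighlight lang="python">
import numpy as np

def gem_project(g: np.ndarray, g_ref: np.ndarray) -> np.ndarray:
    """Project the proposed gradient g so it does not conflict with g_ref.

    g     : gradient of the loss on the current task
    g_ref : gradient of the loss on the episodic memory of a previous task
    """
    dot = g @ g_ref
    if dot >= 0:
        # the update does not increase the memory loss; keep it as-is
        return g
    # otherwise project g onto the half-space where <g, g_ref> >= 0
    return g - (dot / (g_ref @ g_ref)) * g_ref
</syntaxhighlight>

With several previous tasks, the paper solves a small quadratic program so that the projected gradient satisfies this inequality against the memory gradient of every past task at once.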
