Gradient Episodic Memory for Continual Learning: Difference between revisions

From statwiki
Jump to navigation Jump to search
No edit summary
No edit summary
Line 14: Line 14:
where <math>\ell :\mathcal {Y} \times \mathcal {Y} \to [0, \infty)</math>
where <math>\ell :\mathcal {Y} \times \mathcal {Y} \to [0, \infty)</math>


Different to machine learning, datas are being observed sequentially ,occurred recurrently and stored limitedly for learning humans. Thus, the iid assumption is not applicable to ERM.   
Different to machine learning, datas are being observed sequentially, occurred recurrently, and stored limitedly for learning humans. Thus, the iid assumption is not applicable to ERM.   


The major downside of ERM is "catastrophic forgetting", which is the problem of recalling past knowledge upon acquiring new ones. Hence, Gradient Episodic Memory (GEM) is introduced to alleviates forgetting on previous acquired knowledge, while solving new problems more efficiently.
ERM also has "catastrophic forgetting", which is the problem of recalling past knowledge upon acquiring new ones. Hence, Gradient Episodic Memory (GEM) is introduced to alleviates forgetting on previous acquired knowledge, while solving new problems more efficiently.

Revision as of 02:44, 17 November 2018

Group Member

Yu Xuan Lee, Tsen Yee Heng

Background and Introduction

Supervised learning consist of a training set [math]\displaystyle{ D_{tx}=(x_i,y_i)^n_{i=1} }[/math], where [math]\displaystyle{ x_i \in X }[/math] and [math]\displaystyle{ y_i \in Y }[/math]. Empirical Risk Minimization (ERM) is one of the common supervised learning method used to minimize a loss function by having multiple passes over the training set.

[math]\displaystyle{ \frac{1}{|D_{tr}|}\textstyle \sum_{x_i,y_i} \in D_{tr} \ell (f(x_i),y_i) }[/math]

where [math]\displaystyle{ \ell :\mathcal {Y} \times \mathcal {Y} \to [0, \infty) }[/math]

Different to machine learning, datas are being observed sequentially, occurred recurrently, and stored limitedly for learning humans. Thus, the iid assumption is not applicable to ERM.

ERM also has "catastrophic forgetting", which is the problem of recalling past knowledge upon acquiring new ones. Hence, Gradient Episodic Memory (GEM) is introduced to alleviates forgetting on previous acquired knowledge, while solving new problems more efficiently.