Multi-Task Feature Learning

=Introduction=

It has been shown, both empirically and theoretically, that learning multiple related tasks simultaneously often significantly improves performance compared to learning each task independently. This is especially beneficial when only a few examples per task are available, since the benefit comes from pooling data across many related tasks. One way in which tasks can be related is that they share a common underlying representation; for example, people make product choices (e.g. of books, music CDs, etc.) using a common set of features describing these products. In this paper, the authors explore a way to learn a low-dimensional representation that is shared across multiple related tasks.

=Learning sparse multi-task representations=

Let <math>\,R</math> denote the set of real numbers and <math>\,R_{+}</math> (<math>\,R_{++}</math>) the non-negative (respectively, positive) ones. Let <math>\,T</math> be the number of tasks and denote the set of tasks by <math>\,N_T := \{1,\dots,T\}</math>. For each task <math>\,t \in N_T</math>, we have <math>\,m</math> input/output examples <math>\,(x_{t1}; y_{t1}),\dots,(x_{tm}; y_{tm}) \in R^d \times R</math>. The goal is then to estimate <math>\,T</math> functions <math>\,f_t : R^d \mapsto R, t \in N_T</math>, that approximate the training data well and are also statistically predictive for new data.
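
To make the notation concrete, the following is a minimal sketch in Python of this multi-task setup. The data here are synthetic, the variable names (<code>T</code>, <code>m</code>, <code>d</code>, <code>X</code>, <code>Y</code>) are only illustrative, and the per-task least-squares fit is just an independent-task baseline for comparison, not the shared-representation method developed in the rest of the paper.

<syntaxhighlight lang="python">
import numpy as np

# Minimal sketch of the multi-task setup (synthetic data, illustrative only):
# T tasks, each with m examples (x_ti, y_ti) in R^d x R.
T, m, d = 5, 20, 10          # number of tasks, examples per task, input dimension
rng = np.random.default_rng(0)

X = rng.standard_normal((T, m, d))      # X[t, i] = x_ti
w_true = rng.standard_normal((T, d))    # hidden per-task parameters (for simulation only)
Y = np.einsum('tmd,td->tm', X, w_true) + 0.1 * rng.standard_normal((T, m))  # Y[t, i] = y_ti

# Independent-task baseline: estimate each f_t(x) = <w_t, x> by least squares on
# that task's own data. Multi-task feature learning instead couples the tasks
# through a shared low-dimensional representation, as described below.
W_hat = np.stack([np.linalg.lstsq(X[t], Y[t], rcond=None)[0] for t in range(T)])

def f(t, x):
    """Estimated predictor for task t evaluated at a new input x in R^d."""
    return W_hat[t] @ x
</syntaxhighlight>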