Task Understanding from Confusing Multi-task Data
Presented By
Qianlin Song, William Loh, Junyue Bai, Phoebe Choi
Introduction
Related Work
Confusing Supervised Learning
Confusing supervised learning (CSL) addresses the setting where samples from multiple tasks are mixed together without task labels. The key modification lies in the choice of risk measure. In traditional supervised learning, assuming the risk measure is mean squared error (MSE), the expected risk functional is
$$ R(g) = \int_x (f(x) - g(x))^2 p(x) \; \mathrm{d}x $$
where [math]\displaystyle{ p(x) }[/math] is the prior distribution of the input variable [math]\displaystyle{ x }[/math]. In practice, model optimizations are performed using the empirical risk
$$ R_e(g) = \sum_{i=1}^n (y_i - g(x_i))^2 $$
When the data mix several tasks, the model should fit each sample according to the task that generated it. Let [math]\displaystyle{ f_j(x) }[/math] be the ground-truth function of task [math]\displaystyle{ j }[/math]. Then for an input variable [math]\displaystyle{ x_i }[/math] generated by task [math]\displaystyle{ j }[/math], an ideal model [math]\displaystyle{ g }[/math] would predict [math]\displaystyle{ g(x_i) = f_j(x_i) }[/math]. Accordingly, the risk functional of traditional supervised learning is modified to
$$ R(g) = \int_x \sum_{j=1}^n (f_j(x) - g(x))^2 p(f_j) p(x) \; \mathrm{d}x $$
We call [math]\displaystyle{ (f_j(x) - g(x))^2 p(f_j) }[/math] the confusing multiple mappings. Under this risk functional, the optimal solution is [math]\displaystyle{ g^*(x) = \bar{f}(x) = \sum_{j=1}^n p(f_j) f_j(x) }[/math]. However, this optimum does not depend on the task that generated each sample; it is merely the prior-weighted average of all ground-truth functions. Therefore, for every non-trivial set of tasks where [math]\displaystyle{ f_k(x) \neq f_\ell(x) }[/math] for some input [math]\displaystyle{ x }[/math], we have [math]\displaystyle{ R(g^*) \gt 0 }[/math], i.e., an unavoidable confusion risk remains.
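This unavoidable confusion risk can be seen numerically. The sketch below is an illustrative example, not from the paper: the two task functions, the equal priors, and the sample size are all assumed. A single least-squares model fit on pooled confusing data converges to the mixture average [math]\displaystyle{ \bar{f}(x) }[/math], and its residual risk stays bounded away from zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two assumed ground-truth tasks with equal prior p(f_j) = 0.5.
f1 = lambda x: x    # task 1
f2 = lambda x: -x   # task 2

# Confusing data: each label comes from a randomly chosen task,
# with no task identifier attached to the sample.
x = rng.uniform(-1, 1, 10_000)
task = rng.integers(0, 2, x.size)
y = np.where(task == 0, f1(x), f2(x))

# A single least-squares linear fit g(x) = a*x + b on the pooled data.
a, b = np.polyfit(x, y, 1)

# The pooled-MSE minimizer is the mixture mean
# f_bar(x) = 0.5*f1(x) + 0.5*f2(x) = 0, so a and b are both near 0 ...
print(a, b)

# ... yet the residual (confusion) risk is E[(y - f_bar(x))^2]
# = E[x^2] = 1/3 for x ~ Uniform(-1, 1).
confusion_risk = np.mean((y - (a * x + b)) ** 2)
print(confusion_risk)  # ≈ 1/3
```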
To overcome this issue, the authors introduce two types of learning functions:
- Deconfusing function — assigns samples to the tasks that generated them, i.e., determines which samples come from the same task
- Mapping function — the mapping relation from input to output for each learned task
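The interplay of the two functions can be sketched with an alternating scheme: a deconfusing step assigns each sample to the mapping that currently fits it best, and a mapping step refits each mapping on its assigned samples. This is a minimal illustration of the idea, not the paper's CSL-Net; the hard assignments, linear mappings, noise-free data, and random restarts are all assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Confusing data from two hidden linear tasks (illustrative, noise-free).
x = rng.uniform(-1, 1, 2000)
task = rng.integers(0, 2, x.size)
y = np.where(task == 0, 2 * x + 1, -x)

K = 2  # number of tasks, assumed known here

def fit_csl(x, y, n_iter=30):
    """Alternate a deconfusing step (hard task assignment) with a
    mapping step (per-task linear refit)."""
    W = rng.normal(size=K)  # slopes of the K mapping functions
    C = rng.normal(size=K)  # intercepts of the K mapping functions
    for _ in range(n_iter):
        preds = W[:, None] * x + C[:, None]               # (K, n) predictions
        assign = np.argmin((preds - y) ** 2, axis=0)      # deconfusing step
        for j in range(K):                                # mapping step
            mask = assign == j
            if mask.sum() >= 2:
                W[j], C[j] = np.polyfit(x[mask], y[mask], 1)
    preds = W[:, None] * x + C[:, None]
    mse = np.min((preds - y) ** 2, axis=0).mean()
    return W, C, mse

# Random restarts guard against poor local optima of the alternating scheme.
W, C, mse = min((fit_csl(x, y) for _ in range(5)), key=lambda t: t[2])

# Each recovered (slope, intercept) pair should lie near (2, 1) or (-1, 0).
print(sorted(zip(np.round(W, 2), np.round(C, 2))), mse)
```

Each sample is assigned by residual alone, so no task labels are ever consulted; the assignments and the per-task fits are recovered jointly from the confusing data.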
CSL-Net
Experiment
Conclusion
Critique
References
Su, Xin, et al. "Task Understanding from Confusing Multi-task Data."