Task Understanding from Confushing Multitask Data: Difference between revisions
Line 58: | Line 58: | ||
There are two measures of accuracy used to evaluate and compare CSL to other methods, corresponding respectively to the accuracy of the task labelling and the accuracy of the learned mapping function. | There are two measures of accuracy used to evaluate and compare CSL to other methods, corresponding respectively to the accuracy of the task labelling and the accuracy of the learned mapping function. | ||
<math>\alpha_T(j)</math> is the average number of times the learned task-assignment function <math>h</math> agrees with the task-assignment ability of humans <math>\tilde h</math> on whether each observation in the data "is" or "is not" in task <math>j</math>. | '''Label Assignment''' Accuracy: <math>\alpha_T(j)</math> is the average number of times the learned task-assignment function <math>h</math> agrees with the task-assignment ability of humans <math>\tilde h</math> on whether each observation in the data "is" or "is not" in task <math>j</math>. | ||
$$ \alpha_T(j) = \operatorname{max}_k\frac{1}{m}\sum_{i=1}^m I[h(x_i,y_i;f_k),\tilde h(x_i,y_i;f_j)]$$ | $$ \alpha_T(j) = \operatorname{max}_k\frac{1}{m}\sum_{i=1}^m I[h(x_i,y_i;f_k),\tilde h(x_i,y_i;f_j)]$$ | ||
Line 64: | Line 64: | ||
The max over <math>k</math> is taken because we need to determine which learned task corresponds to which ground-truth task. | The max over <math>k</math> is taken because we need to determine which learned task corresponds to which ground-truth task. | ||
<math>\alpha_T(j)</math> again chooses <math>f_k</math>, the learned mapping function that is closest to the ground-truth of task <math>j</math>, and measures its average absolute accuracy compared to the ground-truth of task <math>j</math>, <math>f_j</math>, across all <math>m</math> observations. | '''Mapping Function''' Accuracy: <math>\alpha_T(j)</math> again chooses <math>f_k</math>, the learned mapping function that is closest to the ground-truth of task <math>j</math>, and measures its average absolute accuracy compared to the ground-truth of task <math>j</math>, <math>f_j</math>, across all <math>m</math> observations. | ||
$$ \alpha_L(j) = \operatorname{max}_k\frac{1}{m}\sum_{i=1}^m 1-\dfrac{|g_k(x_i)-f_j(x_i)|}{|f_j(x_i)|}$$ | $$ \alpha_L(j) = \operatorname{max}_k\frac{1}{m}\sum_{i=1}^m 1-\dfrac{|g_k(x_i)-f_j(x_i)|}{|f_j(x_i)|}$$ |
Revision as of 17:41, 15 November 2020
Task Understanding from Confusing Multi-task Data
Presented By aslkdfj;awekrf
1. Introduction
hialll
Hello
[math]\displaystyle{ \begin{align*} e & = \pi = \sqrt{g} \end{align*} }[/math]
2. Related Work
How does formatting of paragraphs work? hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi
[math]\displaystyle{ \begin{align*} e & = \text{Hellow}\\ & = \dfrac{123}{4}\\ \end{align*} }[/math]
[math]\displaystyle{
\begin{align*}
h+1 & = \dfrac{abc}{\text{def}}\\
& = \dfrac{123}{4}\\
\end{align*}
}[/math]
Experiment
Setup
3 data sets are used to compare CSL to existing methods, 1 function regression task and 2 image classification tasks.
Function Regression: The function regression data comes in the form of [math]\displaystyle{ (x_i,y_i),i=1,...,m }[/math] pairs. However, unlike typical regression problems, there are multiple [math]\displaystyle{ f_j(x),j=1,...,n }[/math] mapping functions, so the goal is to recover both the mapping functions [math]\displaystyle{ f_j }[/math] as well as determine which mapping function corresponds to each of the [math]\displaystyle{ m }[/math] observations.
Colorful-MNIST: The first image classification data set consists of the MNIST digit data that has been colored. Each observation in this modified set consists of a colored image ([math]\displaystyle{ x_i }[/math]) and either the color, or the digit it represents ([math]\displaystyle{ y_i }[/math]). The goal is to recover the classification task ("color" or "digit") for each observation and construct the 2 classifiers for both tasks.
Kaggle Fashion Product: This data set has more observations than the "colored-MNIST" data and consists of pictures labelled with either the “Gender”, “Category”, and “Color” of the clothing item.
Use of Pre-Trained CNN Feature Layers
In the Kaggle Fashion Product experiment, each of the 3 classification algorithms [math]\displaystyle{ f_j }[/math] consist of fully-connected layers that have been attached to feature-identifying layers from pre-trained Convolutional Neural Networks.
Metrics of Confusing Supervised Learning
There are two measures of accuracy used to evaluate and compare CSL to other methods, corresponding respectively to the accuracy of the task labelling and the accuracy of the learned mapping function.
Label Assignment Accuracy: [math]\displaystyle{ \alpha_T(j) }[/math] is the average number of times the learned task-assignment function [math]\displaystyle{ h }[/math] agrees with the task-assignment ability of humans [math]\displaystyle{ \tilde h }[/math] on whether each observation in the data "is" or "is not" in task [math]\displaystyle{ j }[/math].
$$ \alpha_T(j) = \operatorname{max}_k\frac{1}{m}\sum_{i=1}^m I[h(x_i,y_i;f_k),\tilde h(x_i,y_i;f_j)]$$
The max over [math]\displaystyle{ k }[/math] is taken because we need to determine which learned task corresponds to which ground-truth task.
Mapping Function Accuracy: [math]\displaystyle{ \alpha_T(j) }[/math] again chooses [math]\displaystyle{ f_k }[/math], the learned mapping function that is closest to the ground-truth of task [math]\displaystyle{ j }[/math], and measures its average absolute accuracy compared to the ground-truth of task [math]\displaystyle{ j }[/math], [math]\displaystyle{ f_j }[/math], across all [math]\displaystyle{ m }[/math] observations.
$$ \alpha_L(j) = \operatorname{max}_k\frac{1}{m}\sum_{i=1}^m 1-\dfrac{|g_k(x_i)-f_j(x_i)|}{|f_j(x_i)|}$$