Task Understanding from Confusing Multi-task Data


Revision as of 18:39, 15 November 2020


Presented By aslkdfj;awekrf

1. Introduction



2. Related Work



Experiment

Setup

Three data sets are used to compare CSL to existing methods: one function regression task and two image classification tasks.

Function Regression: The function regression data comes in the form of [math]\displaystyle{ (x_i,y_i),i=1,...,m }[/math] pairs. However, unlike typical regression problems, there are multiple [math]\displaystyle{ f_j(x),j=1,...,n }[/math] mapping functions, so the goal is to recover both the mapping functions [math]\displaystyle{ f_j }[/math] as well as determine which mapping function corresponds to each of the [math]\displaystyle{ m }[/math] observations.
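To make the setup concrete, here is a minimal sketch (not from the paper) of generating such confusing regression data; the two mapping functions, the sampling range, and the variable names are all illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical ground-truth mapping functions f_1, f_2 (so n = 2 here).
mapping_functions = [np.sin, np.cos]

m = 1000  # number of observations
x = rng.uniform(-3, 3, size=m)

# Each observation is generated by one task, but the task index is NOT
# stored: the learner only ever sees the (x_i, y_i) pairs.
task = rng.integers(0, len(mapping_functions), size=m)
y = np.where(task == 0, mapping_functions[0](x), mapping_functions[1](x))

pairs = list(zip(x, y))  # the "confusing" multi-task data
print(len(pairs))        # 1000 observations, task assignments hidden
```

Recovering both the functions and the hidden per-observation task indices from `pairs` alone is exactly the problem CSL addresses.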

Colorful-MNIST: The first image classification data set consists of the MNIST digit data that has been colored. Each observation in this modified set consists of a colored image ([math]\displaystyle{ x_i }[/math]) and either the color or the digit it represents ([math]\displaystyle{ y_i }[/math]). The goal is to recover the classification task ("color" or "digit") for each observation and construct the two classifiers, one for each task.

Kaggle Fashion Product: This data set has more observations than the Colorful-MNIST data and consists of pictures labelled with one of the "Gender", "Category", or "Color" of the clothing item.

Use of Pre-Trained CNN Feature Layers

In the Kaggle Fashion Product experiment, each of the 3 classification algorithms [math]\displaystyle{ f_j }[/math] consists of fully-connected layers attached to feature-identifying layers from pre-trained Convolutional Neural Networks.
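The architecture can be sketched as follows. This is a stand-in illustration only: a fixed random projection plays the role of the frozen pre-trained CNN feature layers, and the head sizes (2, 5, 10 classes) are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def pretrained_features(images):
    """Stand-in for frozen pre-trained CNN feature layers.
    In the experiment these come from a real pre-trained network;
    here a fixed random projection plus ReLU plays that role."""
    W = np.random.default_rng(42).standard_normal((images.shape[1], 64))
    return np.maximum(images @ W, 0.0)  # weights are never updated

class LinearHead:
    """Trainable fully-connected head f_j on top of the shared features."""
    def __init__(self, in_dim, n_classes):
        self.W = rng.standard_normal((in_dim, n_classes)) * 0.01

    def __call__(self, feats):
        return (feats @ self.W).argmax(axis=1)  # predicted class per image

# Three heads, one per labelling task (e.g. gender / category / color).
heads = [LinearHead(64, c) for c in (2, 5, 10)]
images = rng.standard_normal((8, 784))   # 8 dummy flattened images
feats = pretrained_features(images)      # computed once, shared by all heads
preds = [head(feats) for head in heads]
```

Sharing the frozen feature layers means only the small fully-connected heads need training, which is the usual motivation for this transfer-learning setup.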

Metrics of Confusing Supervised Learning

There are two measures of accuracy used to evaluate and compare CSL to other methods, corresponding respectively to the accuracy of the task labelling and the accuracy of the learned mapping function.

[math]\displaystyle{ \alpha_T(j) }[/math] is the proportion of observations on which the learned task-assignment function [math]\displaystyle{ h }[/math] agrees with the human task assignment [math]\displaystyle{ \tilde h }[/math] about whether each observation in the data "is" or "is not" in task [math]\displaystyle{ j }[/math].

$$ \alpha_T(j) = \operatorname{max}_k\frac{1}{m}\sum_{i=1}^m I[h(x_i,y_i;f_k),\tilde h(x_i,y_i;f_j)]$$

The max over [math]\displaystyle{ k }[/math] is taken because we need to determine which learned task corresponds to which ground-truth task.
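Treating [math]\displaystyle{ h }[/math] and [math]\displaystyle{ \tilde h }[/math] as the boolean "in task j or not" decisions they produce for each observation, the metric can be computed as in this unofficial sketch (the array layout is an assumption, not the paper's code):

```python
import numpy as np

def alpha_T(learned_membership, human_membership_j):
    """learned_membership: (n_learned_tasks, m) boolean array; row k holds
    h(x_i, y_i; f_k) for each observation i. human_membership_j: (m,)
    boolean array of h~(x_i, y_i; f_j). Returns the best agreement rate
    over k, matching each learned task against ground-truth task j."""
    agree = learned_membership == human_membership_j  # broadcasts over rows
    return agree.mean(axis=1).max()

# Toy example: 2 learned tasks, 5 observations, one ground-truth task j.
h = np.array([[1, 0, 1, 1, 0],
              [0, 1, 0, 0, 1]], dtype=bool)
h_tilde_j = np.array([1, 0, 1, 0, 0], dtype=bool)
print(alpha_T(h, h_tilde_j))  # row 0 agrees on 4/5 observations -> 0.8
```

The `max` over rows is the same max over [math]\displaystyle{ k }[/math] in the formula: it picks whichever learned task best matches ground-truth task [math]\displaystyle{ j }[/math].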

[math]\displaystyle{ \alpha_L(j) }[/math] similarly chooses [math]\displaystyle{ g_k }[/math], the learned mapping function that is closest to the ground-truth of task [math]\displaystyle{ j }[/math], and measures its average relative accuracy against the ground-truth of task [math]\displaystyle{ j }[/math], [math]\displaystyle{ f_j }[/math], across all [math]\displaystyle{ m }[/math] observations.

$$ \alpha_L(j) = \operatorname{max}_k\frac{1}{m}\sum_{i=1}^m 1-\dfrac{|g_k(x_i)-f_j(x_i)|}{|f_j(x_i)|}$$
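Reading this metric as one minus the mean relative error of the best-matching learned function, a minimal unofficial implementation looks like this (array layout and toy values are assumptions):

```python
import numpy as np

def alpha_L(g_preds, f_j_vals):
    """g_preds: (n_learned_tasks, m) array; row k holds g_k(x_i).
    f_j_vals: (m,) ground-truth values f_j(x_i), assumed nonzero.
    Returns the best mean relative accuracy over learned tasks k."""
    rel_acc = 1.0 - np.abs(g_preds - f_j_vals) / np.abs(f_j_vals)
    return rel_acc.mean(axis=1).max()

g = np.array([[1.0, 2.2, 2.9],    # close to f_j -> high relative accuracy
              [5.0, 0.1, -2.0]])  # far from f_j -> low (even negative)
f_j = np.array([1.0, 2.0, 3.0])
print(alpha_L(g, f_j))  # the close row wins the max over k
```

As with [math]\displaystyle{ \alpha_T }[/math], the max over rows matches the learned functions to ground-truth task [math]\displaystyle{ j }[/math] before scoring.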

Results

Application of Multi-label Learning