CRITICAL ANALYSIS OF SELF-SUPERVISION


== Presented by ==

Maral Rasoolijaberi

== Introduction ==

== Previous Work ==

== Method ==

In self-supervised methods, a hypothesis function is defined without target labels. Let <math> x </math> be a sample from the unlabeled dataset. The weights of the CNN are learned so as to minimize <math> ||h(x)-x|| </math>, where <math> h(x) </math> is the hypothesis function of a self-supervised method, e.g. BiGAN, RotNet, or DeepCluster. The paper uses AlexNet as the CNN, together with various data-augmentation methods including cropping, rotation, scaling, contrast changes, and adding noise. To measure the quality of the learned features, the authors train a linear classifier on top of each convolutional layer of AlexNet to check whether the features are linearly separable; in general, the main purpose of a CNN is to reach a linearly separable representation of images. Finally, they compare the results of training on a million images from the ImageNet dataset with training on a million augmented images generated from a single image.
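
The linear-probe evaluation described above can be sketched in a few lines of PyTorch. This is only an illustration, not the authors' implementation: cutting AlexNet after conv3, the 6×6 average pooling, the SGD hyper-parameters, and the 1000-class output are all assumed for the example, and in the actual study the backbone weights would come from self-supervised pre-training rather than random initialization.

<pre>
import torch
import torch.nn as nn
from torchvision import models

class LinearProbe(nn.Module):
    """Frozen backbone features + a single trainable linear classifier (illustrative sketch)."""
    def __init__(self, backbone, cut, feat_dim, num_classes):
        super().__init__()
        # Keep only the first `cut` modules of AlexNet's feature extractor.
        self.features = nn.Sequential(*list(backbone.features.children())[:cut])
        for p in self.features.parameters():
            p.requires_grad = False                  # backbone stays frozen
        self.pool = nn.AdaptiveAvgPool2d((6, 6))     # fixed-size feature map
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        with torch.no_grad():                        # no gradients through the frozen backbone
            f = self.pool(self.features(x)).flatten(1)
        return self.classifier(f)

# Probe the features after conv3 of AlexNet (modules 0-7 of `features`).
# Here the backbone is randomly initialized; in the paper it would be
# pre-trained with a self-supervised method such as RotNet or DeepCluster.
alexnet = models.alexnet()
probe = LinearProbe(alexnet, cut=8, feat_dim=384 * 6 * 6, num_classes=1000)

optimizer = torch.optim.SGD(probe.classifier.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch of images.
images = torch.randn(4, 3, 224, 224)
labels = torch.randint(0, 1000, (4,))
loss = criterion(probe(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"linear-probe loss: {loss.item():.4f}")
</pre>

The same probe can be attached after any convolutional layer by changing the assumed `cut` index and `feat_dim`, which is how the per-layer linear separability comparison is carried out.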

== Results ==

== Conclusion ==

== Critiques ==

== References ==