Learning Hierarchical Features for Scene Labeling

From statwiki
Revision as of 15:00, 2 November 2015 by Seanny123 (talk | contribs) (Methodology)


Test input: The input into the network was a static image such as the one below:

File:cows in field.png

Training data and desired result: The desired result, which has the same format as the training data given to the network for supervised learning, is an image in which the large features are labelled.


Below we can see a flow of the overall approach.


Before being fed into the Convolutional Neural Network (CNN), the image is first passed through a Laplacian pyramid to obtain maps at different scales. Three scaled versions of the image are produced.
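The pyramid step can be sketched as follows. This is a minimal NumPy illustration and not the authors' implementation: the 5-tap binomial blur, the factor-of-two subsampling, and the function names (blur, downsample, upsample, laplacian_pyramid) are all assumptions made for the example.

```python
import numpy as np

def blur(img):
    """Smooth with a separable 5-tap binomial kernel (assumed filter), edge-padded."""
    k = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0
    padded = np.pad(img, 2, mode="edge")
    # Filter along the columns axis first, then along the rows axis.
    rows = sum(k[i] * padded[:, i:i + img.shape[1]] for i in range(5))
    return sum(k[i] * rows[i:i + img.shape[0], :] for i in range(5))

def downsample(img):
    """Blur, then keep every second pixel in each direction."""
    return blur(img)[::2, ::2]

def upsample(img, shape):
    """Double the resolution by pixel repetition, crop to `shape`, then blur."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return blur(up[:shape[0], :shape[1]])

def laplacian_pyramid(img, levels=3):
    """Return `levels` maps, finest first; the last one is the low-pass residual."""
    pyramid = []
    current = img.astype(float)
    for _ in range(levels - 1):
        smaller = downsample(current)
        # Band-pass detail lost when moving to the coarser scale.
        pyramid.append(current - upsample(smaller, current.shape))
        current = smaller
    pyramid.append(current)
    return pyramid

image = np.random.default_rng(0).random((64, 64))  # stand-in for a grayscale image
scales = laplacian_pyramid(image, levels=3)
print([s.shape for s in scales])
```

With levels=3 this yields three maps whose sides halve at each level, matching the three scale outputs described above.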

Network Architecture

A typical three-layer (convolution of kernel with feature map, non-linearity, pooling) CNN architecture was used. The tanh function served as the non-linearity. The kernels used were Toeplitz matrices. Pooling was performed by the max-pool operator.
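The three operations can be illustrated with a short NumPy sketch. It is only an illustration under assumptions: it handles a single feature map and a single kernel, uses a pooling size of 2, and the function names are hypothetical. The helper correlation_as_toeplitz shows, in one dimension, why a sliding-kernel operation can be written as multiplication by a Toeplitz matrix (a matrix that is constant along each diagonal).

```python
import numpy as np

def conv2d_valid(x, kernel):
    """Slide `kernel` over feature map `x` (valid mode, no kernel flipping)."""
    kh, kw = kernel.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * x[i:i + out_h, j:j + out_w]
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling; trailing rows/columns that do not fit are dropped."""
    h, w = x.shape[0] // size, x.shape[1] // size
    x = x[:h * size, :w * size]
    return x.reshape(h, size, w, size).max(axis=(1, 3))

def stage(x, kernel, bias=0.0):
    """Convolution, then tanh non-linearity, then max pooling."""
    return max_pool(np.tanh(conv2d_valid(x, kernel) + bias))

def correlation_as_toeplitz(kernel, n):
    """Matrix T such that T @ x equals sliding `kernel` over a length-n signal x."""
    k = len(kernel)
    T = np.zeros((n - k + 1, n))
    for i in range(n - k + 1):
        T[i, i:i + k] = kernel  # same values shifted one column per row
    return T

rng = np.random.default_rng(0)
feature_map = rng.standard_normal((12, 12))
kernel = rng.standard_normal((5, 5))
out = stage(feature_map, kernel)
print(out.shape)  # 12x12 -> 8x8 after the convolution -> 4x4 after pooling
```

Stacking several such stages, each with many kernels and feature maps, gives the multi-layer architecture described above.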

The same connection weights were applied to all three scaled images, which allows the network to detect scale-invariant features.
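Weight sharing across scales can be sketched as below. This is a hedged illustration, not the network from the paper: plain subsampling stands in for the pyramid, a single 5x5 kernel stands in for the full set of weights, and conv_tanh and multiscale_features are hypothetical names.

```python
import numpy as np

def conv_tanh(x, kernel):
    """Valid-mode sliding-kernel filter followed by tanh."""
    kh, kw = kernel.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * x[i:i + out_h, j:j + out_w]
    return np.tanh(out)

def multiscale_features(scales, kernel):
    """Apply the *same* kernel (shared weights) to every scaled image."""
    return [conv_tanh(s, kernel) for s in scales]

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))
# Crude stand-in for the three pyramid outputs (assumption: plain subsampling).
scales = [image, image[::2, ::2], image[::4, ::4]]
kernel = rng.standard_normal((5, 5))  # one set of weights for all scales

features = multiscale_features(scales, kernel)
print([f.shape for f in features])
```

Because one kernel is reused, the number of learned parameters does not grow with the number of scales, and a 5x5 kernel covers a four times wider region of the original image at the coarsest scale than at the finest.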