Learning Hierarchical Features for Scene Labeling

Introduction

Test input: The input into the network was a static image such as the one below:

File:cows in field.png

Training data and desired result: The desired result, which has the same format as the training data given to the network for supervised learning, is an image with its large features labelled.

Methodology

Below we can see a flow of the overall approach.

Pre-processing

Before being fed into the Convolutional Neural Network (CNN), the image was first passed through a Laplacian pyramid to obtain maps at different scales. Three scaled versions of the image were created.
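As a rough illustration of the pre-processing step, the following is a minimal sketch of a three-level Laplacian pyramid for a single-channel image. The 5-tap binomial blur, the factor-of-two subsampling, and the helper names (blur, downsample, upsample, laplacian_pyramid) are assumptions made for this example and are not taken from the original work.

```python
import numpy as np

def blur(img):
    """Separable 5-tap binomial blur with edge padding (assumed filter)."""
    k = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    padded = np.pad(img, 2, mode="edge")
    # Horizontal pass: result has shape (H + 4, W).
    rows = sum(k[i] * padded[:, i:i + img.shape[1]] for i in range(5))
    # Vertical pass: result has shape (H, W).
    return sum(k[i] * rows[i:i + img.shape[0], :] for i in range(5))

def downsample(img):
    """Blur, then keep every second row and column."""
    return blur(img)[::2, ::2]

def upsample(img, shape):
    """Repeat pixels to double the size, crop to `shape`, then blur."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return blur(up[:shape[0], :shape[1]])

def laplacian_pyramid(img, levels=3):
    """Return `levels` maps, finest first; the last one is the low-pass residual."""
    pyramid = []
    current = img.astype(float)
    for _ in range(levels - 1):
        smaller = downsample(current)
        # Band-pass detail lost when moving to the coarser scale.
        pyramid.append(current - upsample(smaller, current.shape))
        current = smaller
    pyramid.append(current)
    return pyramid

image = np.arange(256, dtype=float).reshape(16, 16)  # stand-in for a real image
scales = laplacian_pyramid(image, levels=3)
print([s.shape for s in scales])  # [(16, 16), (8, 8), (4, 4)]
```

Each level stores the detail that is lost when moving to the next coarser scale, so adding the levels back together from coarsest to finest recovers the original image.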

Network Architecture

A typical three-layer CNN architecture (convolution of a kernel with the feature map, a non-linearity, and pooling) was used. The tanh function served as the non-linearity. The kernels were represented as Toeplitz matrices (the matrix form of a convolution). Pooling was performed by the max-pool operator.
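The three operations named above can be sketched for a single feature map as follows. This is only an illustration: the 3x3 kernel size, the 2x2 pooling window, the bias term, and the function names (conv2d_valid, max_pool, stage) are assumptions, and the sliding-window product is written directly rather than as a Toeplitz matrix multiplication.

```python
import numpy as np

def conv2d_valid(x, kernel):
    """Slide the kernel over x without padding (cross-correlation, as is usual in CNNs)."""
    kh, kw = kernel.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * x[i:i + out_h, j:j + out_w]
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling; rows and columns that do not fill a window are dropped."""
    h = x.shape[0] // size * size
    w = x.shape[1] // size * size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

def stage(x, kernel, bias=0.0):
    """Convolution, then tanh non-linearity, then max pooling."""
    return max_pool(np.tanh(conv2d_valid(x, kernel) + bias))

x = np.arange(36, dtype=float).reshape(6, 6)  # stand-in feature map
kernel = np.ones((3, 3)) / 9.0                # hypothetical averaging kernel
print(stage(x, kernel).shape)  # (2, 2)
```

A 6x6 input shrinks to 4x4 after the unpadded 3x3 convolution and to 2x2 after 2x2 pooling, which shows how each layer trades spatial resolution for a larger receptive field.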

The same connection weights were applied to all of the scaled images, thus allowing for the detection of scale-invariant features.
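Weight sharing across scales can be illustrated with a short sketch in which one kernel is applied to every scale. The random inputs standing in for the three pyramid levels, the 3x3 kernel, and the names conv_tanh and apply_shared are hypothetical and chosen only for this example.

```python
import numpy as np

def conv_tanh(x, kernel):
    """Unpadded cross-correlation followed by tanh."""
    kh, kw = kernel.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * x[i:i + out_h, j:j + out_w]
    return np.tanh(out)

def apply_shared(scales, kernel):
    """Run every scale through the same kernel; there are no per-scale parameters."""
    return [conv_tanh(s, kernel) for s in scales]

rng = np.random.default_rng(0)
kernel = rng.standard_normal((3, 3))                        # one shared kernel (assumed size)
scales = [rng.standard_normal((n, n)) for n in (16, 8, 4)]  # stand-ins for the three scale maps
features = apply_shared(scales, kernel)
print([f.shape for f in features])  # [(14, 14), (6, 6), (2, 2)]
```

Because the parameters are identical at every scale, a pattern that occupies the same number of pixels in two different scale maps produces the same response in both, which is the sense in which the learned features do not depend on the size of the object in the original image.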

Post-Processing

Results