Difference between revisions of "STAT946F17/Cognitive Psychology For Deep Neural Networks: A Shape Bias Case Study"


Revision as of 17:43, 6 November 2017


The recent surge in the use of Deep Neural Networks (DNNs) has produced large gains in prediction accuracy. DNNs are also being used to solve a variety of complex tasks on which earlier methods struggled.

While it is encouraging to see such high accuracy from DNNs, we must begin to ask why they perform so well. Visualizing the features or feature maps, and interpreting the values learnt in a DNN's hidden layers, has become an active field of study. Currently we treat DNN models as black boxes and tune their adjustable hyperparameters empirically: the number of layers, the number of units in each layer, the number and size of feature maps (in the case of CNNs), and so on. The opacity created by the lack of an intuitive representation of the internal learnt parameters of DNNs hinders both basic research and application to real-world problems.

Recent pushes have aimed to better understand DNNs: tailor-made loss functions and architectures produce more interpretable features (Higgins et al., 2016; Raposo et al., 2017), while output-behavior analyses unveil previously opaque operations of these networks (Karpathy et al., 2015). Parallel to this work, neuroscience-inspired methods such as activation visualization (Li et al., 2015), ablation analysis (Zeiler & Fergus, 2014) and activation maximization (Yosinski et al., 2015) have also been applied.

This paper provides another methodology for deciphering and better understanding how DNNs solve a particular task. The methodology is inspired by cognitive psychology: it tests whether DNNs make their predictions with biases similar to those of the human mind.

Research in developmental psychology shows that when learning new words, humans tend to assign the same name to similarly shaped items rather than to items with similar color, texture, or size. This bias tends to become ingrained, and humans then use it to associate familiar shapes with new objects they have not seen before.

The authors test whether DNNs behave similarly in one-shot learning applications. They aim to show that when state-of-the-art DNNs learn objects from images, they exhibit a stronger shape bias than color bias. To emulate the human setting, they take the parameters of pre-trained DNN models and use them to perform one-shot learning on a new data set with different labels.


One Shot Learning

One-shot learning is an object categorization problem in computer vision. Whereas most machine learning based object categorization algorithms require training on hundreds or thousands of images and very large datasets, one-shot learning aims to learn information about object categories from one, or only a few, training images.

The one-shot word learning task is to label a novel data example $\hat{x}$ (e.g. a novel probe image) with a novel class label $\hat{y}$ (e.g. a new word) after only a single example.

More specifically, given a support set $S = \{(x_i, y_i) , i \in [1, k]\}$ of images $x_i$ and their associated labels $y_i$, and an unlabeled probe image $\hat{x}$, the one-shot learning task is to identify the true label of the probe image, $\hat{y}$, from the support set labels $\{y_i , i \in [1, k]\}$:

$\displaystyle \hat{y} = \arg\max_{y} P(y | \hat{x}, S)$

We assume that the image labels $y_i$ are represented using a one-hot encoding and that $P(y|\hat{x}, S)$ is parameterised by a DNN, allowing us to leverage the ability of deep networks to learn powerful representations.

Inception Networks

A probe image $\hat{x}$ is given the label of the nearest neighbour from the support set:

$\hat{y} = y$

$\displaystyle (x, y) = \arg\min_{(x_i,y_i) \in S} d(h(x_i), h(\hat{x}))$

where d is a distance function.

The function h is parameterized by Inception, one of the best performing ImageNet classification models. Specifically, h returns features from the last layer (the softmax input) of a pre-trained Inception classifier. With these features as input and cosine distance as the distance function, the classifier achieves 87.6% accuracy on one-shot classification on the ImageNet dataset (Vinyals et al., 2016). We call the Inception classifier together with the nearest-neighbor component the Inception Baseline (IB) model.
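The nearest-neighbour rule above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the feature vectors are tiny hand-written stand-ins for the Inception features $h(x)$, and the labels `"dax"` and `"zup"` and the function names are hypothetical.

```python
import math

def cosine_distance(u, v):
    """Cosine distance 1 - cos(u, v) between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def nearest_neighbour_label(probe_features, support):
    """Return the label of the support item whose features are closest to the probe.

    `support` is a list of (features, label) pairs; the features stand in for h(x_i).
    """
    _, label = min(support, key=lambda item: cosine_distance(item[0], probe_features))
    return label

# Toy 3-dimensional features (hypothetical, not real Inception outputs).
support_set = [([1.0, 0.0, 0.2], "dax"), ([0.0, 1.0, 0.1], "zup")]
probe = [0.9, 0.1, 0.3]

predicted = nearest_neighbour_label(probe, support_set)
print(predicted)
```

In a real pipeline the feature vectors would come from a pre-trained network rather than being written by hand; only the distance computation and the arg-min step are shown here.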

Matching Networks

MNs (Vinyals et al., 2016) are neural network architectures with state-of-the-art one-shot learning performance on ImageNet (93.2% one-shot labelling accuracy). MNs are trained to assign label $\hat{y}$ to probe image $\hat{x}$ using an attention mechanism $a$ acting on image embeddings stored in the support set $S$:

$\displaystyle a(\hat{x}, x_i) = \frac{e^{d(f(\hat{x}, S), g(x_i, S))}}{\sum_{j} e^{d(f(\hat{x}, S), g(x_j, S))}}$

where d is a cosine distance and where f and g provide context-dependent embeddings of $\hat{x}$ and $x_i$ (with context S). The embedding $g(x_i, S)$ is a bi-directional LSTM (Hochreiter & Schmidhuber, 1997) with the support set S provided as an input sequence. The embedding $f(\hat{x}, S)$ is an LSTM with a read-attention mechanism operating over the entire embedded support set. The input to the LSTM is given by the penultimate layer features of a pre-trained deep convolutional network, specifically Inception.
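A rough sketch of the attention step may help. It assumes the attention is a softmax over the cosine score between the embedded probe and each embedded support image, that the score is treated as a similarity (larger means closer), and that the prediction is the attention-weighted sum of the one-hot support labels as in Vinyals et al. (2016). The LSTM embeddings f and g are replaced by fixed, hypothetical vectors, and all function names are illustrative.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def attention_weights(probe_embedding, support_embeddings):
    """Softmax over the similarity between the probe and each support item."""
    scores = [cosine_similarity(probe_embedding, g) for g in support_embeddings]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label_distribution(weights, one_hot_labels):
    """Attention-weighted sum of the one-hot support labels."""
    num_classes = len(one_hot_labels[0])
    return [sum(w * y[c] for w, y in zip(weights, one_hot_labels))
            for c in range(num_classes)]

# Hypothetical embeddings standing in for f(x_hat, S) and g(x_i, S).
f_probe = [0.9, 0.1]
g_support = [[1.0, 0.0], [0.0, 1.0]]
labels = [[1, 0], [0, 1]]

weights = attention_weights(f_probe, g_support)
distribution = predict_label_distribution(weights, labels)
predicted_class = distribution.index(max(distribution))
```

Because the weights sum to one and the labels are one-hot, `distribution` is itself a probability distribution over the support labels.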

To train MNs we proceed as follows:

Training MN

  • Step 1: At each step of training, the model is given a small support set of images and associated labels. In addition to the support set, the model is fed an unlabeled probe image $\hat{x}$.
  • Step 2: The model parameters are then updated to improve classification accuracy of the probe image $\hat{x}$ given the support set. Parameters are updated using stochastic gradient descent with a learning rate of 0.1.
  • Step 3: After each update, the labels $\{y_i, i \in [1, k]\}$ in the training set are randomly re-assigned to new image classes (the label indices are randomly permuted, but the image labels are not changed).

Step 3 is critical: it prevents MNs from learning a consistent mapping between a category and a label. In ordinary classification such a mapping is exactly what we want, but in one-shot learning the model must learn to classify after viewing a single in-class example from the support set.
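The label re-assignment step can be illustrated with a small sketch. It assumes each episode is a list of `(image_id, class_index)` pairs and that re-assignment means applying one random permutation of the label indices to the whole episode; the function name and the data layout are hypothetical, not taken from the paper.

```python
import random

def permute_episode_labels(support, probe_label, rng):
    """Randomly re-assign label indices for one training episode.

    `support` is a list of (image_id, class_index) pairs. The mapping from
    class to label index is shuffled, but images that shared a class before
    the shuffle still share one afterwards.
    """
    classes = sorted({c for _, c in support})
    shuffled = classes[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(classes, shuffled))
    new_support = [(image, mapping[c]) for image, c in support]
    return new_support, mapping[probe_label], mapping

rng = random.Random(0)  # fixed seed only so the sketch is reproducible
episode = [("img_a", 0), ("img_b", 1), ("img_c", 2)]
new_episode, new_probe_label, mapping = permute_episode_labels(episode, 1, rng)
```

Since the index attached to a class changes from episode to episode, the only reliable way for the model to label the probe is to compare it against the support images.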

The objective function used is:

$\displaystyle L = E_{C \sim T} \left[ E_{S \sim C, B \sim C} \left[ \sum_{(x, y) \in B} \log P(y | x, S) \right] \right]$

where T is the set of all possible labelings of our classes, S is a support set sampled with a class labeling C ~ T and B is a batch of probe images and labels, also with the same randomly chosen class labeling as the support set.

Cognitive Biases

Cognitive bias is a concept from developmental psychology that attempts to explain how children can extract the meanings of words from very few examples, much like the one-shot learning discussed above. The theory, as explained by the authors, is that humans form biases that allow them to eliminate many potential hypotheses about word meaning when the available data alone is insufficient to do so. These include:

  • Whole object bias
  • Taxonomic bias
  • Mutual exclusivity bias
  • Shape bias

A more complete list of cognitive biases is given by Bloom (2000). The bias the authors investigate in this paper is the shape bias, which denotes a tendency to assign the same name to similarly shaped items rather than to items with similar color, texture, or size.


Inductive Biases & Probe Data

Inductive biases are the criteria, either built in by design or learnt by the network, that it uses as classifying or distinguishing properties. It has been observed that the biases DNNs learn are complex composite features. As researchers, we can take advantage of this by constructing probe data sets that specifically target and expose a particular bias that a DNN might have.

  • Step 1: Take a known composite feature towards which we suspect the DNNs are biased
  • Step 2: Train the target model with an appropriate dataset
  • Step 3: Transfer Learning: Use the pre-trained model with a new data set which is curated to contain data to prove/disprove the existence of the bias
  • Step 4: Model/Decide on a function which quantifies the bias under study
  • Step 5: Measure the bias with the bias function

Data Sets Used

  • Training Set: ImageNet
  • Test Set:
    • The Cognitive Psychology Probe Data (CogPsyc data) consists of 150 images of objects. The images are arranged in triples consisting of a probe image, a shape-match image (that matches the probe in shape but not colour), and a colour-match image (that matches the probe in colour but not shape). In the dataset there are 10 triples, each shown on 5 different backgrounds, giving a total of 50 triples.
    • A real-world dataset consisting of 90 images of objects (30 triples) collected using Google Image Search. The images are arranged in triples consisting of a probe, a shape-match and a colour-match.


Evaluation Criteria

  • For a given probe image $\hat{x}$, load the shape-match image $x_s$ and its label $y_s$, along with the colour-match image $x_c$ and its label $y_c$, into memory as the support set $S = \{(x_s, y_s), (x_c, y_c)\}$
  • Calculate $\hat{y}$
  • The model assigns either $y_c$ or $y_s$ to the probe image.
  • To estimate the shape bias $B_s$, calculate the proportion of shape labels assigned to the probe: $B_s = E(\delta(\hat{y} - y_s))$

where E is an expectation across probe images and $\delta$ is the Dirac delta function, which here equals 1 when $\hat{y} = y_s$ and 0 otherwise.
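Computed over a finite probe set, the shape bias is simply the fraction of probes that receive the shape-match label. A minimal sketch follows; the function name and the toy label strings are assumptions made for illustration.

```python
def shape_bias(predicted_labels, shape_labels):
    """Fraction of probes whose predicted label equals the shape-match label."""
    if len(predicted_labels) != len(shape_labels):
        raise ValueError("expected one shape-match label per prediction")
    matches = sum(1 for y_hat, y_s in zip(predicted_labels, shape_labels)
                  if y_hat == y_s)
    return matches / len(predicted_labels)

# Hypothetical predictions for four probes: three pick the shape match.
predictions = ["shape", "colour", "shape", "shape"]
shape_match_labels = ["shape"] * 4

bias = shape_bias(predictions, shape_match_labels)
print(bias)  # 3 of 4 probes matched on shape, so 0.75
```

A value of 0.5 would indicate no preference between shape and colour, while values near 1 indicate a strong shape bias.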

Experiment 1: Shape bias statistics in Inception Baseline:

  • On the CogPsyc dataset, the shape bias of IB was found to be $B_s = 0.68$. Similarly, the shape bias of IB on the real-world dataset was $B_s = 0.97$. Together, these results strongly suggest that IB trained on ImageNet has a stronger bias towards shape than colour.

Experiment 2: Shape bias statistics in Matching Network:

  • The authors found that MNs have a shape bias of $B_s = 0.7$ on the CogPsyc dataset and a bias of $B_s = 1$ on the real-world dataset. Once again, these results suggest that MNs seeded with Inception embeddings and trained on ImageNet have a stronger bias towards shape than colour.

Experiment 3: Shape bias statistics between and across models:

The authors extended the shape bias analysis to a population of IB models and a population of MN models, each trained from different random initializations.

Dependence on the initialization of parameters:

Strong variability in shape bias was observed across different initial parameter values. For the CogPsyc dataset, the average shape bias was $B_s = 0.628$ with standard deviation $\sigma_{B_s} = 0.049$ at the end of training, and for the real-world dataset the average shape bias was $B_s = 0.958$ with $\sigma_{B_s} = 0.037$.
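Summary statistics of this kind can be computed with Python's standard `statistics` module. The per-seed values below are hypothetical placeholders, not the paper's data, and the choice of the population standard deviation is an assumption, since the source does not state which estimator was used.

```python
import statistics

# Hypothetical shape-bias values, one per random seed (not the paper's data).
bias_per_seed = [0.60, 0.65, 0.70]

mean_bias = statistics.mean(bias_per_seed)
# Population standard deviation; the estimator used in the paper is not stated.
std_bias = statistics.pstdev(bias_per_seed)

print(f"B_s = {mean_bias:.3f}, sigma = {std_bias:.3f}")
```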

Dependence of shape bias on model performance:

For the CogPsyc dataset, the correlation between bias and classification accuracy was $\rho = 0.15$, and for the real-world dataset it was $\rho = -0.06$. This is consistent with the observation that model accuracy remained nearly constant across initializations while the shape bias varied considerably, which highlights the lack of correlation between the two.
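For reference, a correlation of this kind can be computed as a Pearson coefficient over per-model pairs of shape bias and accuracy. The sketch below assumes that $\rho$ denotes the Pearson correlation, which the text does not state explicitly, and uses made-up per-seed numbers rather than results from the paper.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-seed results (NOT values from the paper).
shape_biases = [0.55, 0.70, 0.62, 0.66, 0.58]
accuracies = [0.871, 0.870, 0.872, 0.871, 0.870]

rho = pearson_correlation(shape_biases, accuracies)
print(f"rho = {rho:.2f}")
```

A coefficient near zero, as reported above, means that knowing a model's accuracy says little about how strongly it prefers shape.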

Emergence of shape bias during training:

The shape bias rose to a large value very early in training.

Variation of shape bias within models & across models:

With different initialization parameters, the shape bias varied considerably within IB during training, whereas it did not fluctuate during the training of MN. The MN was found to inherit the shape bias of the IB that seeded its embeddings, and the bias then remained constant throughout training. It is important to note that the penultimate-layer features of Inception were not fine-tuned when they were fed into the MN. This was to ensure that the MN properties were independent of the IB model properties.

Learnings, Inferences & Implications

  • Both the Inception Baseline and the Matching Network exhibit a strong shape bias when trained on ImageNet. Researchers who use Inception and MN models can take this into account when applying pre-trained models to new datasets. If it is known beforehand that the new data set is best separated by colour, they may want either to avoid the pre-trained models or to explore methods that reduce or remove the strong shape bias.
  • The shape bias varies widely with the initialization parameters. This is an important finding: the same architecture, with similar prediction accuracy, can display a range of shape biases simply because of different initializations. Researchers can explore ways of tuning the random initialization so that models start out with a low shape bias without compromising accuracy.
  • MNs inherit the shape bias seeded by the Inception network's input embedding. Researchers and practitioners should be careful here as well: when using cascaded or pipelined heterogeneous architectures, downstream models tend to inherit the properties and biases of the models upstream. This may be desirable or undesirable depending on the application, but it is important to be aware of it.
  • The biases under consideration are a property of the combination of architecture, dataset and optimization procedure. Hence, to increase or decrease the effect of a particular bias, one or more of these factors must be adjusted.
  • The fact that a strong shape bias emerged in the early epochs, with less variability in later epochs, can be thought of as analogous to the biases that humans develop in infancy and that become reinforced as they age.

Conclusion, Future Work and Open questions

  • Just as cognitive psychology exposes the shape bias observed in this experiment, we should try to uncover other biases as well using multiple approaches
  • Study the underlying mechanisms which cause biases such as shape bias in DNNs
  • Research into various methods of probing and creating probe data sets which can be used to test architectures for various biases
  • Exploration into a research field called Artificial Cognitive Psychology which focuses on probing how DNN architectures can be understood further using known behaviors of the human brain


  • Ritter, Samuel, Barrett, David G. T., Santoro, Adam, and Botvinick, Matthew M. Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study. 2017.
  • Vinyals, Oriol, Blundell, Charles, Lillicrap, Timothy, Kavukcuoglu, Koray, and Wierstra, Daan. Matching networks for one shot learning. arXiv preprint arXiv:1606.04080, 2016.
  • Bloom, P. (2000). How children learn the meanings of words. The MIT Press.