Homework 3

In this homework you will train your first deep network on actual images from SuperTux, extending the non-linear multi-layer perceptron you trained in the previous assignment.

Downloads: starter code · data

Classifying SuperTux

You're given a dataset of 64×64 RGB images of objects cropped from SuperTux. The goal of this assignment is to classify these images into 6 classes.

We will solve the classification problem in a few different ways. Some will work better, some worse.

First, let’s look at the architecture.

Multi-layer perceptron

First you'll define the compute graph of your multi-layer perceptron and the corresponding loss function. In this assignment we treat all images as vectors of size 12288 (64 × 64 × 3) and flatten them first. The network should have one hidden layer of size 100: anything larger than 100 will likely be very slow to train, and anything smaller than 100 might not perform all that well. You can use your solution to homework 2 to help build your network. Here is an overview of the architecture:

(architecture overview figure)
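
As a concrete reference, here is a minimal sketch of such a network in PyTorch; the class and attribute names (MLPClassifier, fc1, fc2) are illustrative, not prescribed by the starter code:

import torch
from torch import nn

class MLPClassifier(nn.Module):
    # minimal sketch: a 12288 -> 100 -> 6 multi-layer perceptron
    def __init__(self, input_size=64 * 64 * 3, hidden_size=100, num_classes=6):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # flatten (B, 3, 64, 64) image batches into (B, 12288) vectors
        x = x.view(x.size(0), -1)
        return self.fc2(torch.relu(self.fc1(x)))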

Loss functions

In this assignment you’ll try four different losses for the classification problem:

  1. L2 regression to the label: $\ell = (o - y)^2$
  2. L2 regression to a one-hot encoding: $\ell = \|\mathbf{o} - \mathbf{1}_y\|^2$
  3. Softmax + log-likelihood: $\ell = -\log\left(\mathrm{softmax}(\mathbf{o})_y\right)$
  4. Softmax + L2-regression to one-hot: $\ell = \|\mathrm{softmax}(\mathbf{o}) - \mathbf{1}_y\|^2$

where $o$ is a one-dimensional regression output (of your network), $\mathbf{o}$ is a 6-dimensional output (of your network), $y$ is the ground-truth label, and $\mathbf{1}_y$ is a one-hot vector with $(\mathbf{1}_y)_y = 1$ and $(\mathbf{1}_y)_i = 0$ for $i \neq y$. Note: use torch.nn.CrossEntropyLoss() for (3) instead of implementing log and softmax yourself.
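
As an illustration of how these four losses could be written with PyTorch functionals (an informal sketch, not the required implementation; pred, logits, and labels are assumed names):

import torch.nn.functional as F

def l2_label_loss(pred, labels):
    # (1) L2 regression of the scalar output to the integer label;
    # assumes pred has shape (B,), matching labels
    return F.mse_loss(pred, labels.float())

def l2_one_hot_loss(pred, labels):
    # (2) L2 regression of the (B, 6) output to the one-hot encoding
    return F.mse_loss(pred, F.one_hot(labels, 6).float())

def softmax_ll_loss(logits, labels):
    # (3) softmax + log-likelihood; cross_entropy fuses both steps
    return F.cross_entropy(logits, labels)

def softmax_l2_loss(logits, labels):
    # (4) L2 distance between softmax probabilities and the one-hot vector
    return F.mse_loss(F.softmax(logits, dim=1), F.one_hot(labels, 6).float())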

Getting Started

We provide you with starter code that loads the image dataset and the corresponding labels from a training and validation set.

  1. Define your model in models.py. You'll need to define a model with a scalar output for regression, and a model with 6 outputs for all other losses.
  2. Implement the different loss functions in train.py
  3. Train your model, e.g. python3 -m homework.train oneHot (for the one-hot loss)
  4. Test your model, e.g. python3 -m homework.test oneHot (for the one-hot loss)

If your model trains slowly or you do not get the accuracy you'd like, you can increase the number of training iterations in homework.train by providing an -i argument with the desired number of iterations, e.g. -i 20000.

While developing your model, you should use fewer iterations, e.g. -i 1000, so that experiments finish quickly.
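
The provided homework.train script already implements the optimization loop, but as a rough illustration of what one training iteration involves, here is a stripped-down sketch; the function signature and the use of SGD are assumptions, not the starter code's actual API:

import torch

def train(model, loss_fn, train_loader, iterations=10000, lr=1e-3):
    # minimal sketch: repeatedly compute the loss and take a gradient step
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    step = 0
    while step < iterations:
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
            step += 1
            if step >= iterations:
                return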

Input example

(image of Tux)

Output example

2
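
To go from the network's outputs to a predicted label like the one above, the usual choice is an argmax over the 6 class scores; for the scalar regression model, one option is to round the output to the nearest label. A short sketch (logits and pred are hypothetical names):

import torch

logits = torch.randn(1, 6)    # stand-in for your model's (B, 6) output
pred = logits.argmax(dim=1)   # predicted class index per image
print(pred)                   # e.g. tensor([2]) for the Tux image above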

Note

The default parameter initialization of PyTorch is not well tuned for small networks, which leads to slow training of your model. In your model's __init__, make sure to set the weights and biases of the second linear layer (called fc2 here) to zero:

from torch import nn

# inside your model's __init__, after creating the second linear layer self.fc2:
nn.init.constant_(self.fc2.weight, 0)
nn.init.constant_(self.fc2.bias, 0)

Grading Policy

  1. 5 pts each for using the correct loss function in each of the 4 models, as specified in the description above (total: 20 pts)
  2. 20 pts for classification accuracies greater than the minimum thresholds specified for each of the 4 models. These accuracies will be computed on our own test set. This is different from the validation set provided to you, so make sure your model doesn't overfit. The minimum classification accuracies are as follows:
    • L2 regression to the label: 20%
    • L2 regression to a one-hot encoding: 75%
    • Softmax + log-likelihood: 80%
    • Softmax + L2-regression to one-hot: 80%

Relevant operations