In this homework you will train your first deep network on actual images from supertux. You will extend the non-linear multi-layer perceptron trained in the previous assignment.
You’re given a dataset of 64x64 RGB images of objects cropped from supertux. The goal of this assignment is to classify these images into 6 classes:
- Objects (0)
- Tiles (1)
- Tux (2)
- Bad guys (3)
- Bonus (4)
- Projectiles (5)
We will solve the classification problem in a few different ways. Some will work better, some worse.
First, let’s look at the architecture.
You’ll start by defining the compute graph of your multi-layer perceptron and the corresponding loss function. In this assignment we will treat all images as vectors of size 12288 (64×64×3) and flatten them first. The network should have one hidden layer of size 100. Anything larger than 100 will likely be very slow to train; anything smaller than 100 might not perform all that well. You can use the solution of homework 2 to help build your network. Here is an overview of the architecture:
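As a rough illustration of the description above, the following is a minimal sketch of a one-hidden-layer perceptron over flattened images. The class name `MLPClassifier`, the ReLU non-linearity, and the channel-first `(batch, 3, 64, 64)` input layout are assumptions made for this example; the handout fixes only the input size (12288), the hidden size (100), and the number of outputs (1 or 6).

```python
import torch
from torch import nn


class MLPClassifier(nn.Module):
    """Hypothetical one-hidden-layer perceptron over flattened 64x64 RGB images."""

    def __init__(self, n_outputs: int = 6, hidden_size: int = 100):
        super().__init__()
        input_size = 3 * 64 * 64  # 12288 values per image
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, n_outputs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Flatten (batch, 3, 64, 64) into (batch, 12288) before the first layer.
        x = x.reshape(x.shape[0], -1)
        # ReLU is an assumed choice of non-linearity for this sketch.
        return self.fc2(torch.relu(self.fc1(x)))


images = torch.rand(4, 3, 64, 64)  # random stand-in for a batch of dataset images
logits = MLPClassifier(n_outputs=6)(images)  # 6 outputs for the one-hot and softmax losses
scalar = MLPClassifier(n_outputs=1)(images)  # scalar output for regression to the label
print(logits.shape, scalar.shape)
```

The same module can serve both variants, since the scalar regression model and the 6-output model differ only in the size of the last layer.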
In this assignment you’ll try four different losses for the classification problem:
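- L2 regression to the label: $\ell(o, y) = (o - y)^2$
- L2 regression to a one-hot encoding: $\ell(\mathbf{o}, \mathbf{y}) = \|\mathbf{o} - \mathbf{y}\|^2$
- Softmax + log-likelihood: $\ell(\mathbf{o}, y) = -\log \mathrm{softmax}(\mathbf{o})_y$
- Softmax + L2-regression to one-hot: $\ell(\mathbf{o}, \mathbf{y}) = \|\mathrm{softmax}(\mathbf{o}) - \mathbf{y}\|^2$
where $o$ is a one-dimensional regression output (of your network), $\mathbf{o}$ is a 6-dimensional output (of your network), $y$ is the ground truth label, and $\mathbf{y}$ is a one-hot vector with $\mathbf{y}_y = 1$ and $\mathbf{y}_i = 0$ for all $i \neq y$.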
Note: use torch.nn.CrossEntropyLoss() for (3) instead of composing the softmax and the log-likelihood yourself.
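To make the four losses concrete, here is a small worked example in plain Python for a single sample. It is only meant to show the arithmetic. The helper names, the sample values, and the choice to sum (rather than average) the squared differences are assumptions of this sketch, and in the homework itself loss (3) should come from torch.nn.CrossEntropyLoss() as noted above.

```python
import math

NUM_CLASSES = 6


def one_hot(label, num_classes=NUM_CLASSES):
    """Vector with 1 at the label index and 0 everywhere else."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def l2(a, b):
    """Sum of squared differences (an assumed convention; a mean is also common)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


label = 2  # ground truth class "Tux"
scalar_output = 1.5  # made-up output of the scalar regression model
vector_output = [0.0, 0.0, 2.0, 0.0, 0.0, 0.0]  # made-up output of the 6-output model

target = one_hot(label)
probs = softmax(vector_output)

loss_label_regression = (scalar_output - label) ** 2  # (1) L2 regression to the label
loss_one_hot_regression = l2(vector_output, target)   # (2) L2 regression to one-hot
loss_log_likelihood = -math.log(probs[label])         # (3) softmax + log-likelihood
loss_softmax_regression = l2(probs, target)           # (4) softmax + L2 to one-hot

print(loss_label_regression, loss_one_hot_regression)
print(loss_log_likelihood, loss_softmax_regression)
```

Loss (1) treats the class index as a number, so predicting 1.5 for class 2 is penalized less than predicting 5, even though the classes have no natural order. That is one reason to expect it to work worse than the other three.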
We provide you with starter code that loads the image dataset and the corresponding labels from a training and validation set.
- Define your model in models.py. You’ll need to define a model with a scalar output for regression, and a model with 6 outputs for all other losses.
- Implement the different loss functions in
- Train your model, e.g. python3 -m homework.train oneHot (for the one-hot loss)
- Test your model, e.g. python3 -m homework.test oneHot (for the one-hot loss)
If your model trains slowly or you do not get the accuracy you’d like, you can increase the number of training iterations in homework.train by providing an -i argument with the desired number of training iterations, e.g.
You should also develop your model using fewer iterations, e.g.
The default parameter initialization of PyTorch is not well tuned for small networks, which leads to slow training of your model. In the __init__ of your model definition, make sure to set the weights and biases of the second linear layer (called fc2 here) to zero:
from torch import nn
nn.init.constant_(fc2.weight, 0)
nn.init.constant_(fc2.bias, 0)
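One way to see what this initialization does is to apply it to a standalone layer and check the starting point. The sketch below assumes a 100-to-6 linear layer named fc2 as in the text; the random hidden activations and labels are stand-ins, not data from the assignment.

```python
import math

import torch
from torch import nn

fc2 = nn.Linear(100, 6)  # second linear layer: hidden size 100 -> 6 classes
nn.init.constant_(fc2.weight, 0)
nn.init.constant_(fc2.bias, 0)

hidden = torch.rand(8, 100)  # stand-in hidden activations for 8 images
logits = fc2(hidden)  # every entry is zero because weights and bias are zero
labels = torch.randint(0, 6, (8,))  # random stand-in labels

# All-zero logits give a uniform softmax, so the loss starts at log(6).
initial_loss = nn.CrossEntropyLoss()(logits, labels)
print(float(initial_loss), math.log(6))
```

With the second layer zeroed, the network starts from a uniform prediction over the 6 classes regardless of the input, instead of from arbitrary random outputs.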
- 5 pts each for using the correct loss function in each of the 4 models as specified in the description above (Total: 20 pts)
- 20 pts for classification accuracies greater than the minimum thresholds specified for each of the 4 models. These accuracies will be computed on our own test set. This is different from the validation set that has been provided to you, so make sure your model doesn’t overfit. The minimum classification accuracies are as follows:
- L2 regression to the label: 20%
- L2 regression to a one-hot encoding: 75%
- Softmax + log-likelihood: 80%
- Softmax + L2-regression to one-hot: 80%
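For checking your own models against these thresholds on the validation set, classification accuracy can be sketched as follows. This is an assumption about how predictions are derived, not a description of the grader: the 6-output models predict the index of the largest output, and the scalar regression model is rounded to the nearest label and clamped to the range 0 to 5. All function names and sample values are made up for illustration.

```python
def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def predict_from_scalar(output, num_classes=6):
    """Assumed rule: round to the nearest label, then clamp into [0, num_classes - 1]."""
    return min(max(int(round(output)), 0), num_classes - 1)


def accuracy(predictions, labels):
    """Fraction of predictions that match the ground truth labels."""
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


labels = [0, 2, 5, 3]  # made-up ground truth labels

vector_outputs = [  # made-up outputs of a 6-output model
    [0.9, 0.1, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.2, 0.7, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.6, 0.4],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
]
scalar_outputs = [0.2, 1.6, 7.0, 1.2]  # made-up outputs of the scalar model

vector_accuracy = accuracy([argmax(o) for o in vector_outputs], labels)
scalar_accuracy = accuracy([predict_from_scalar(o) for o in scalar_outputs], labels)
print(vector_accuracy, scalar_accuracy)
```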