Homework 2


In this homework, we will train a convolutional network to classify images from SuperTuxKart.

This assignment should be solved individually. No collaboration, sharing of solutions, or exchange of models is allowed. Please do not directly copy existing code from anywhere other than your previous solutions or the previous master solution. We will check assignments for duplicates. See below for more details.

Starter code and dataset

The starter code for this assignment can be found here. The starter code contains several useful scripts:

The starter code also contains a data directory where you’ll copy (or symlink) the SuperTuxKart classification dataset. Unzip the data directly into the homework folder, replacing the existing data directory completely. Make sure you see the following directories and files inside your main directory:

homework
grader
bundle.py
data
data/train
data/valid

You will run all scripts from inside this main directory.
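A quick way to confirm the layout listed above is a small standard-library script. This is only a sketch: the helper name missing_entries is made up for illustration and is not part of the starter code.

```python
from pathlib import Path

# Entries the assignment text expects inside the main directory.
EXPECTED = ["homework", "grader", "bundle.py", "data", "data/train", "data/valid"]


def missing_entries(root="."):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries(".")
    print("Layout looks complete." if not missing else f"Missing: {missing}")
```

Run it from the main directory. An empty result means every listed entry is present.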

CNN Model (10pts)

Implement the CNNClassifier in models.py. Similar to homework 1, your model should return a (B,5) torch.Tensor which represents the logits of the classes. Use convolutions this time. Use python -m grader homework -v to grade the first part.
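When stacking convolutions, it helps to track how the spatial size shrinks so the final linear layer receives the number of features you expect. The sketch below uses the standard output-size formula for a convolution with dilation 1. The 64-pixel input size and the three-layer stack are assumptions for illustration only; check the actual image size in the dataset.

```python
def conv_out_size(size, kernel_size, stride=1, padding=0):
    """Spatial size after one convolution or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


# Hypothetical stack: three 3x3 convolutions, each with stride 2 and padding 1.
size = 64  # assumed input height/width, not confirmed by the assignment text
for _ in range(3):
    size = conv_out_size(size, kernel_size=3, stride=2, padding=1)

print(size)  # 8
```

With C output channels, flattening this feature map would give C * 8 * 8 features. Averaging over the spatial dimensions instead gives C features regardless of the input size.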

Relevant Operations

Logging (30pts)

In this part, we learn how to use TensorBoard. We created a dummy training procedure in logging.py and provided you with two tb.SummaryWriter instances as logging utilities. Use those summary writers to log the training loss at every iteration, and the training accuracy and validation accuracy at every epoch. Here is a simple example of how to use the SummaryWriter.

import torch.utils.tensorboard as tb
logger = tb.SummaryWriter('cnn')

logger.add_scalar('train/loss', t_loss, 0)

In logging.py, you should not create your own SummaryWriter, but rather use the one provided. You can test your logger by calling python -m homework.logging log, where log is your favorite directory. Then start up tensorboard: tensorboard --logdir log. Use python -m grader homework -v to grade the logging.
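The main bookkeeping in this part is the global step: the loss is logged once per iteration, while the accuracies are logged once per epoch. The sketch below shows one possible arrangement. RecordingWriter is a stand-in that only mimics the add_scalar(tag, scalar_value, global_step) call of SummaryWriter so the example runs without tensorboard. The tag names, the placeholder values, and the step used for the per-epoch values are all assumptions; the grader may expect something different.

```python
class RecordingWriter:
    """Stand-in for tb.SummaryWriter that only records add_scalar calls."""

    def __init__(self):
        self.records = []

    def add_scalar(self, tag, scalar_value, global_step=None):
        self.records.append((tag, scalar_value, global_step))


def log_dummy_run(train_logger, valid_logger, epochs=2, iterations=3):
    """Log a loss per iteration and accuracies per epoch (placeholder values)."""
    global_step = 0
    for epoch in range(epochs):
        batch_accuracies = []
        for _ in range(iterations):
            dummy_loss = 1.0 / (global_step + 1)   # placeholder, not a real loss
            batch_accuracies.append(0.5)           # placeholder batch accuracy
            train_logger.add_scalar("loss", dummy_loss, global_step=global_step)
            global_step += 1
        # Per-epoch values: average the batch accuracies, then log once.
        epoch_accuracy = sum(batch_accuracies) / len(batch_accuracies)
        train_logger.add_scalar("accuracy", epoch_accuracy, global_step=global_step)
        valid_logger.add_scalar("accuracy", 0.4, global_step=global_step)


train_logger, valid_logger = RecordingWriter(), RecordingWriter()
log_dummy_run(train_logger, valid_logger)
print(len(train_logger.records), len(valid_logger.records))
```

In logging.py itself, the provided summary writers would take the place of the two RecordingWriter objects.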

Relevant Operations

Training your CNN model (60pts)

Train your model and save it as cnn.th. You can reuse some of the training functionality in train.py from homework 1. We highly recommend you incorporate the logging functionality from section 2 into your training routine. Once you have trained your model, you can optionally visualize your model’s predictions using python -m homework.viz_prediction [DATASET_PATH].
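As a rough illustration of one training step, here is a minimal sketch on random data. TinyCNN and train_step are hypothetical names, not the starter code's CNNClassifier or train.py. The 3x64x64 input shape, the SGD settings, and saving the state dict with torch.save are assumptions; the starter code may provide its own saving helper.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyCNN(nn.Module):
    """Illustrative stand-in, not the CNNClassifier from the starter code."""

    def __init__(self, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        z = self.features(x)        # (B, 32, H', W')
        z = z.mean(dim=(2, 3))      # global average pool -> (B, 32)
        return self.classifier(z)   # (B, num_classes) logits


def train_step(model, optimizer, images, labels):
    """Run one optimization step and return (loss, batch accuracy)."""
    model.train()
    logits = model(images)
    loss = nn.functional.cross_entropy(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    accuracy = (logits.argmax(dim=1) == labels).float().mean().item()
    return loss.item(), accuracy


torch.manual_seed(0)
model = TinyCNN()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
images = torch.randn(8, 3, 64, 64)     # random stand-in for a real batch
labels = torch.randint(0, 5, (8,))     # random stand-in labels
loss, acc = train_step(model, optimizer, images, labels)

# Saved to a temporary directory here so the sketch does not overwrite anything.
save_path = os.path.join(tempfile.mkdtemp(), "cnn.th")
torch.save(model.state_dict(), save_path)
```

A real training routine would loop over the training data for several epochs and evaluate on data/valid after each one.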


After implementing everything, you can use python -m grader homework to test your solutions against the validation grader locally. Your model should achieve a 0.85 test accuracy to receive full points. Note that we will use a testing dataset to grade the accuracy part of your model, so your local grades are not guaranteed to be your actual grades. (Don’t overfit!)

Relevant Operations

Grading

You can test your code using

python -m grader homework -v

This will run a subset of test cases we use during the actual testing. The point distributions will be the same, but we will use additional test cases. More importantly, we evaluate your model on the test set. The performance on the test grader may vary. Try not to overfit to the validation set too much.

Submission

Once you have finished the assignment, create a submission bundle using

python bundle.py homework [YOUR UT ID]

and submit the zip file online. If you want to double-check that your zip file was properly created, you can grade it again:

python -m grader [YOUR UT ID].zip
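Besides re-running the grader, you can look inside the archive yourself. The sketch below uses only the standard zipfile module. The helper name list_bundle is made up, and which files the bundle should contain is not specified here, so treat any expectation about its contents as an assumption.

```python
import zipfile


def list_bundle(zip_path):
    """Return the file names stored in the archive after an integrity check."""
    with zipfile.ZipFile(zip_path) as archive:
        bad_member = archive.testzip()  # first corrupt member, or None
        if bad_member is not None:
            raise ValueError(f"Corrupt member in archive: {bad_member}")
        return archive.namelist()
```

Calling list_bundle on your zip file and printing the result shows, for example, whether your trained model file made it into the bundle.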

Running your assignment on Google Colab

You might need a GPU to train your models. You can get a free one on Google Colab. We provide an IPython notebook that can get you started on Colab for each homework. Follow the instructions below to use it.

Honor code

This assignment should be solved individually.

What interaction with classmates is allowed?

What interaction is not allowed?

Ways students failed in past years (do not do this):

Installation and setup

Installing python 3

Go to https://www.python.org/downloads/ to download python 3. Alternatively, you can install a python distribution such as Anaconda. Please select python 3 (not python 2).

Installing the dependencies

Install all dependencies using

pip install -r requirements.txt

Note: On some systems, you might be required to use pip3 instead of pip for python 3.

If you’re using conda, use

conda env create -f environment.yml

Manual installation of pytorch

Go to https://pytorch.org/get-started/locally/, then select the stable PyTorch build, your OS, your package manager (pip if you installed python 3 directly, conda if you installed Anaconda), your python version, and your cuda version. Run the command the page provides. Note that cuda is not required: select cuda = None if you don’t have a GPU or don’t want to train on a GPU locally. We will provide instructions for free remote GPU training on Google Colab.
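After installing, a short check like the following can confirm that PyTorch imports and whether a GPU is visible. This is a sketch; describe_torch is a made-up helper name.

```python
import importlib.util


def describe_torch():
    """Summarize the local PyTorch install without failing if it is absent."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    return f"PyTorch {torch.__version__}, default device: {device}"


print(describe_torch())
```

A result of "cpu" is fine for this assignment if you plan to train on Colab.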

Manual installation of the Python Imaging Library (PIL)

The easiest way to install the PIL is through pip/pip3 or conda.

pip install -U Pillow

There are a few important considerations when using PIL. First, make sure that your OS uses libjpeg-turbo and not the slower libjpeg (all modern Ubuntu versions do by default). Second, if you’re frustrated with slow image transformations in PIL, use Pillow-SIMD instead:

CC="cc -mavx2" pip install -U --force-reinstall Pillow-SIMD

Set CC="cc -mavx2" only if your CPU supports AVX2 instructions; otherwise omit it. pip will most likely complain about missing dependencies. Install them through conda or your favorite package manager (apt, brew, …).
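If you are unsure whether your CPU supports AVX2, one option on Linux is to read the flags in /proc/cpuinfo. The sketch below is Linux-only and assumes the x86 "flags" line format; on other systems you would need a different method.

```python
from pathlib import Path


def has_avx2(cpuinfo_text):
    """True if any 'flags' line in /proc/cpuinfo-style text lists avx2."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = line.split(":", 1)[-1].split()
            if "avx2" in flags:
                return True
    return False


cpuinfo = Path("/proc/cpuinfo")
if cpuinfo.exists():  # only present on Linux
    print("AVX2 supported:", has_avx2(cpuinfo.read_text()))
else:
    print("No /proc/cpuinfo here; check AVX2 support another way.")
```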