CS 395T - Deep learning seminar - Fall 2019

meets MW 2:00 - 3:30pm in GDC 4.304

instructor Philipp Krähenbühl
email philkr (at) utexas.edu
office hours by appointment (send email)

TA Dian Chen
email dchen (at) cs.utexas.edu
TA hours M 4:00-5:00pm, TA station desk 2

Please use Canvas for assignment questions.


Class overview

This year the class focuses on two major themes: “image generation” and “vision and action”. This is meant to be a very interactive class for upper-level students (MS or PhD). For every class we read two recent research papers (most no older than two years) and discuss them in class.

Goals of the class

After this class you should be able to



Wed Aug 28 Course introduction
Mon Sep 02 No class - Labor day
Wed Sep 04 ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky et al. 2012

Deep Residual Learning for Image Recognition, He et al. 2016
Mon Sep 09 ImageNet: A Large-Scale Hierarchical Image Database, Deng et al. 2009

Microsoft COCO: Common Objects in Context, Lin et al. 2014
Wed Sep 11 Learning Deep Features for Scene Recognition using Places Database, Zhou et al. 2014

LVIS: A Dataset for Large Vocabulary Instance Segmentation, Gupta et al. 2019
Fri Sep 13 Project 1 due - 11:59pm
Mon Sep 16 Implementation discussion - best of Project 1
Wed Sep 18 Generative Adversarial Nets, Goodfellow et al. 2014

Implicit Maximum Likelihood Estimation, Li and Malik 2018
Mon Sep 23 Perceptual Losses for Real-Time Style Transfer and Super-Resolution, Johnson et al. 2016

Glow: Generative Flow with Invertible 1x1 Convolutions, Kingma et al. 2018
Wed Sep 25 Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, Zhu et al. 2017

Large Scale GAN Training for High Fidelity Natural Image Synthesis, Brock et al. 2018
Fri Sep 27 Project 2 due - 11:59pm
Mon Sep 30 Implementation discussion - best of Project 2
Wed Oct 02 Full Resolution Image Compression with Recurrent Neural Networks, Toderici et al. 2016

Real-Time Adaptive Image Compression, Rippel and Bourdev 2017
Mon Oct 07 Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations, Agustsson et al. 2017

End-to-end Optimized Image Compression, Ballé et al. 2016
Wed Oct 09 Learned Video Compression, Rippel et al. 2018

DVC: An End-To-End Deep Video Compression Framework, Lu et al. 2019
Fri Oct 11 Project 3 due - 11:59pm
Mon Oct 14 Implementation discussion - best of Project 3
Wed Oct 16 Fully Convolutional Networks for Semantic Segmentation, Long et al. 2015

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, Chen et al. 2018
Mon Oct 21 Deep Layer Aggregation, Yu et al. 2017

U-Net: Convolutional Networks for Biomedical Image Segmentation, Ronneberger et al. 2015
Wed Oct 23 Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, Ren et al. 2015

Mask R-CNN, He et al. 2017
Mon Oct 28 Stacked Hourglass Networks for Human Pose Estimation, Newell et al. 2016

Simple Baselines for Human Pose Estimation and Tracking, Xiao et al. 2018
Wed Oct 30 Proximal Policy Optimization Algorithms, Schulman et al. 2017

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, Haarnoja et al. 2018
Fri Nov 01 Project 4 due - 11:59pm
Mon Nov 04 Implementation discussion - best of Project 4
Wed Nov 06 ALVINN: An Autonomous Land Vehicle in a Neural Network, Pomerleau 1989

A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, Ross et al. 2011
Mon Nov 11 Human-level control through deep reinforcement learning, Mnih et al. 2015

Mastering the game of Go without human knowledge, Silver et al. 2017
Wed Nov 13 Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset, Carreira et al. 2017

Non-local Neural Networks, Wang et al. 2018
Fri Nov 15 Project 5 due - 11:59pm
Mon Nov 18 Implementation discussion - best of Project 5
Wed Nov 20 Habitat: A Platform for Embodied AI Research, Savva et al. 2019

Learning by Cheating, Chen et al. 2019
Mon Nov 25 Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model, Schrittwieser et al. 2019

Momentum Contrast for Unsupervised Visual Representation Learning, He et al. 2019
Wed Nov 27 No class - Thanksgiving
Mon Dec 02 Research presentations I
Wed Dec 04 Research presentations II
Fri Dec 06 Project 6 due - 11:59pm
Mon Dec 09 Implementation discussion - best of Project 6


There are 6 mini-projects in this class. Project 1 implements an auto-encoder and should be a warmup. Projects 2-5 implement specific papers. You’ll have a choice of four papers to implement per project. The best two implementations per project receive extra credit. It is OK to fail at implementing a project, but clearly highlight and document why you failed in a writeup. If you manage to re-implement the paper, your writeup can be just a few sentences (e.g. "it worked as advertised").

Project 6 is a bit more open-ended and uses a very recent simulation environment.

Expected workload

Estimates of required effort to pass the class are:

General tips

Tips for reading a paper


Syllabus subject to change.