MIT EECS 6.S898 Deep Learning
Fall 2022
Description: Fundamentals of deep learning, including both theory and applications. Topics include neural net architectures (MLPs, CNNs, RNNs, transformers), backpropagation and automatic differentiation, learning theory and generalization in high dimensions, and applications to computer vision, natural language processing, and robotics.
Prerequisites: (6.3900 [6.036] or 6.C01 or 6.3720 [6.401]) and (6.3700 [6.041] or 6.3800 [6.008] or 18.05) and (18.C06 or 18.06)
Note: This course is appropriate for advanced undergraduates and graduate students, and is 3-0-9 units. For non-students who want access to Piazza or Canvas, email Aidan Curtis (curtisa@mit.edu) to be added manually. For non-MIT students, refer to cross-registration.
Lectures will be in-person only; if there is an important reason you cannot make class, you may email Aidan Curtis (curtisa@mit.edu) to get a recording.
** class schedule is subject to change **
Date | Topics | Speaker | Course Materials | Assignments |
Week 1 | |||||
Thu 9/8 | Course overview, introduction to deep neural networks and their basic building blocks | Phillip Isola & Stefanie Jegelka |
slides notes |
||
Week 2 | |||||
Tue 9/13 | How to train a neural net: SGD, backprop and autodiff, differentiable programming |
Phillip Isola |
slides notes |
pset 1 out | |
Thu 9/15 | Approximation theory: How well can you approximate a given function by a DNN? We will explore various facets of this question, from universal approximation to Barron's theorem. Does increasing the depth provably help expressivity? |
Stefanie Jegelka |
slides notes |
||
Week 3 | |||||
Tue 9/20 | Generalization theory (IID): We will start by briefly discussing the classical approach to generalization bounds, large-margin theory, and the complexity of neural networks. We then discuss recent interpolation results, the double- (or multiple-) descent phenomenon, and the linear regime in overparametrized neural networks. |
Stefanie Jegelka |
slides double descent |
||
Thu 9/22 | Architectures -- Grids: CNNs |
Phillip Isola |
slides notes |
||
Week 4 | |||||
Tue 9/27 | Architectures -- Graphs: GNNs |
Stefanie Jegelka |
slides notes: GNNs (§5, optional §7.3); representation power of GNNs. Optional notes: GNN intro (part 1, part 2), GCN, GAT, Neural message passing for quantum chemistry (§2), GNN representation theory |
pset 1 due pset 2 out |
|
Thu 9/29 | Geometric deep learning: Inductive biases of architectures, invariances and equivariances |
Stefanie Jegelka |
slides Geometric DL (§3, rest optional) |
||
Week 5 | |||||
Tue 10/4 | Hacker's guide to DL: The practical side of developing deep learning systems. We will focus on best practices, common mistakes to watch for, and evaluation methods for developing deep learning models. While optimization methods and software design practices for deep learning are still evolving, this lecture presents several tried-and-true implementation and debugging strategies for diagnosing failures in model training, to make future training runs less painful. |
Phillip Isola |
slides notes |
||
Thu 10/6 | Architectures -- transformers: Three key ideas: tokens, attention, positional codes. Relationship between transformers and MLPs, GNNs, and CNNs -- they are all variations on the same themes! |
Phillip Isola |
slides notes |
||
Week 6 | |||||
Wed 10/12 | pset 3 out | ||||
Thu 10/13 | Architectures -- memory: RNNs, LSTMs, memory, sequence models |
Phillip Isola |
slides notes |
||
Week 7 | |||||
Tue 10/18 | Representation learning -- reconstruction-based: Intro to representation learning, representations in nets and in the brain, autoencoders, clustering and VQ, self-supervised learning with reconstruction losses |
Phillip Isola |
slides notes notes (optional) |
||
Thu 10/20 | Representation learning -- similarity-based: Unsupervised and weakly supervised learning, primarily through the lens of similarity-driven learning. We'll briefly cover metric learning first, before moving on to self-supervised learning with a focus on contrastive learning (the modern cousin of metric learning). |
Stefanie Jegelka |
notes contrastive feature geometry (align+uniform) contrastive learning |
pset 2 due pset 3 due pset 4 out |
|
Week 8 | |||||
Tue 10/25 | Representation learning -- theory |
Stefanie Jegelka |
slides inductive bias (negative results; optional) simplicity bias (low-rank; optional) pitfalls of simplicity bias (optional) |
||
Thu 10/27 | DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking [Guest Lecture] |
Gabriele Corso |
slides paper (optional) |
||
Week 9 | |||||
Tue 11/1 | Generative models -- basics: Density and energy models, samplers, GANs, autoregressive models, diffusion models |
Phillip Isola |
slides notes denoising diffusion (optional) diffusion blog (optional) |
||
Thu 11/3 | Generative models -- representation learning meets generative modeling: VAEs, latent variables |
Phillip Isola |
slides notes |
pset 4 due pset 5 out project handout |
|
Week 10 | |||||
Tue 11/8 | Generative models -- conditional models: cGAN, cVAE, paired and unpaired translation, image-to-image, text-to-image, world models |
Phillip Isola |
slides notes |
||
Thu 11/10 | Generalization (OOD) |
Stefanie Jegelka |
slides notes (adv. examples) notes (robust opt.) notes (shortcuts in NNs; optional) notes (extrapolation; optional) |
||
Fri 11/11 | project proposal due | ||||
Week 11 | |||||
Tue 11/15 | Transfer learning -- models: Finetuning, linear probes, knowledge distillation, foundation models |
Phillip Isola |
slides notes notes (foundation models; optional) |
||
Thu 11/17 | Transfer learning -- data: Generative models as data++, domain adaptation, prompting |
Phillip Isola |
slides notes (MAML) notes (DatasetGAN) |
pset 5 due | |
Week 12 | |||||
Tue 11/22 | Scaling laws |
Stefanie Jegelka |
slides |
||
Week 13 | |||||
Tue 11/29 | Curiosities about NN optimization and stability |
Stefanie Jegelka |
slides notes (edge of stability; optional) notes (unstable convergence; optional) notes (stability; optional) notes (stability of SGD; optional) notes (convergence to invariant measure; Sec. 1-3; optional) notes (statistical algorithmic stability; Sec. 1-3; optional) |
||
Thu 12/1 | Energy-efficient deep learning |
Vivienne Sze |
slides |
||
Week 14 | |||||
Tue 12/6 | Toward Responsibly-Deployable Deep Learning |
Tom Hartvigsen | |||
Thu 12/8 | No lecture; office hours at the usual lecture location and hour |
||||
Week 15 | |||||
Tue 12/13 | Poster session (1pm to 3pm), Grier Room (34-401) |
Final project (blog + poster) due |