Probabilistic Deep Learning
Main Idea of the course
In this short course on deep learning (DL) we focus on the practical aspects of DL. We understand DL models as probabilistic models that model a (conditional) distribution for the outcome, not only a point estimate. Often this is achieved by modeling the parameters of a probability distribution, where the distribution parameters are controlled by a neural network. The DL approach has the advantage that the input to these neural networks can be all kinds of data: tabular (structured) data, but also unstructured raw data like images or text. From this perspective, DL models are just a complex generalisation of statistical models like linear regression. The parameters of the involved neural networks themselves (often called weights) can be determined by the maximum likelihood principle. You can think of this approach as an extension of generalized linear models to complex data (e.g. images) and non-linear models, providing a distribution for the outcome (like a Poisson distribution for count data or a Gaussian distribution for continuous data).
The basic idea can be sketched as:
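To make this concrete, here is a minimal sketch (our own illustration with toy data, not code from the course notebooks) of a network whose outputs are the parameters of a Gaussian outcome distribution and whose weights are fitted by maximum likelihood, i.e. by minimizing the negative log-likelihood:

```python
import numpy as np
import keras
from keras import layers, ops

# Toy regression data with input-dependent noise (purely illustrative)
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(1000, 1)).astype("float32")
y = (2.0 * x + 0.3 * (1.0 + x) * rng.normal(size=(1000, 1))).astype("float32")

# A small network that outputs the two parameters of a Gaussian:
# the mean mu and the log of the standard deviation sigma.
inputs = keras.Input(shape=(1,))
h = layers.Dense(32, activation="relu")(inputs)
h = layers.Dense(32, activation="relu")(h)
mu = layers.Dense(1)(h)
log_sigma = layers.Dense(1)(h)
outputs = layers.Concatenate()([mu, log_sigma])
model = keras.Model(inputs, outputs)

# Negative log-likelihood of y under N(mu, sigma^2); the constant term is dropped.
def gaussian_nll(y_true, y_pred):
    mu = y_pred[:, 0:1]
    sigma = ops.exp(y_pred[:, 1:2])
    return ops.log(sigma) + 0.5 * ops.square((y_true - mu) / sigma)

model.compile(optimizer="adam", loss=gaussian_nll)
model.fit(x, y, epochs=50, batch_size=64, verbose=0)

# The prediction is a whole distribution per input, not a single number:
params = model.predict(x[:5])
mu_hat, sigma_hat = params[:, 0], np.exp(params[:, 1])
```

Replacing the Gaussian by another distribution (e.g. a Poisson for count data) only changes which parameters the network outputs and the corresponding negative log-likelihood.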
Technicalities
This course is done in Python since the R support for DL is quite limited. We use a high-level API (Keras), which allows defining neural networks in a very intuitive way and runs on top of PyTorch. The course is designed so that you can run the code in the cloud on Colab.
If you want, you can also run the code on your computer; in that case you need to install the required libraries (see local_installation.md). However, we recommend using Colab, since it is easier to set up, you can run the code on a GPU for free, and we can only provide limited support for local installations.
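If you run the code yourself, note that Keras 3 selects its compute backend from the `KERAS_BACKEND` environment variable, which must be set before Keras is imported. A minimal sketch, assuming Keras 3 and PyTorch are installed:

```python
# Select the PyTorch backend; this must happen before the first `import keras`.
import os
os.environ["KERAS_BACKEND"] = "torch"

import keras
print(keras.backend.backend())  # should print "torch"
```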
Other resources
We took inspiration (and sometimes slides / figures) from the following resources.
- Probabilistic Deep Learning: This book, authored by us (the tensorchiefs), explores the probabilistic approach to deep learning using Python, Keras, and TensorFlow. While we reference some content from the book, we will not cover all aspects during the course.
- Deep Learning with Python, Second Edition: Written by François Chollet, the creator of Keras, this book provides an in-depth introduction to deep learning, with a focus on practical implementation using Keras and TensorFlow.
- Keras Documentation: François Chollet initially developed Keras as an easy-to-use high-level API for building and training neural networks. Its simple interface and powerful capabilities make it a cornerstone of deep learning research and application.
Dates
The course is split into 5 lectures with exercises. You will also work on a project with data of your own choosing.
| Date | Lectures |
|---|---|
| 03.02.2025 afternoon | Intro to probabilistic DL and Keras with exercises |
| 10.02.2025 all day | DL with different NN architectures based on images and tabular data with exercises and project time |
| 17.02.2025 all day | Model evaluation, multimodal data models and project time |
| 24.02.2025 morning | Project presentation and generative models |
Syllabus
- Lecture 1
- Topic and Slides: Introduction to probabilistic deep learning
- Notebooks:
- Optional Notebooks:
- Additional Material: Network Playground
Projects
Please register your project by 11 February 2025 in the following spreadsheet: Project Registration
Example Projects
Below are two example projects you could use, but it is also possible to come up with your own ideas.
- Probabilistic Prediction of Temperature (Tabular Data): There are several possibilities to make a probabilistic prediction of the weather. This notebook contains starter code for predicting temperatures based on historic data.
- Cuteness of Animal Images: In the Kaggle competition Petfinder Pawpularity Score you are asked to predict the popularity of a pet based on its image. This is a regression task, where the target is the popularity score. Starter code is provided in the notebook.