Planning

M19-DEEP-RL: Deep Reinforcement Learning

11h 0min

Value function representation: tabular, shallow approximation, deep learning; deep value-based methods (DQN); policy gradient methods (REINFORCE); actor-critic methods (A2C, A3C, DDPG, PPO, SAC)

Main Content

Acquisition

1 Lecture content

Content: – Value function representation: tabular, approximation using shallow models, deep learning; – Deep value-based methods: DQN; – Policy gradient methods: REINFORCE, with shallow and with deep models; – Actor-critic methods: A2C, A3C, PPO, DDPG, SAC.

2h 0min

Practice

2 Colab Notebooks

A set of Colab notebooks covering, in particular, these topics: – DQN applied to Lunar Lander; – SAC applied to the inverted pendulum; – SAC applied to AntBullet; – ...

3h 0min

Investigation

3 Independent study time + review

The estimated additional time required to study the material independently, using the lecture videos and slides and consulting other literature and material as necessary, so that the material is understood correctly. This activity also includes the time required for review before exams.

5h 30min

Assessment

4 Quiz activities

Quizzes that give students quick, ungraded feedback on their grasp of the material.

30 min

Machine Learning School
