Topics to be covered

  • Markov Decision Processes & Planning
  • Model-free Evaluation
  • Model-free Control
  • Policy Search
  • Offline RL, including RL from human feedback and Direct Preference Optimization
  • Exploration
  • Advanced Topics

Offline RL?

Offline RL: Limited data


Plan of action

  1. Model-free:
    • Q-learning
    • Policy Gradient
    • Actor-Critic
  2. Model-based:
    • Planning
    • Sequence models
  3. Exploration
  4. Offline RL
  5. Inverse Reinforcement Learning
  6. Meta Learning
  7. Transfer Learning
  8. Multi-agent RL

Homework & assignments

  1. HW1: Imitation Learning (control via supervised learning)
  2. HW2: Policy Gradient
  3. HW3: Q-learning & actor-critic algorithms
  4. HW4: Model-based RL
  5. HW5: Offline RL
  6. Final Project: Research-level project of your choice