Topics to be covered
- Markov Decision Processes & Planning
- Model-free Evaluation
- Model-free Control
- Policy Search
- Offline RL, including RL from human feedback (RLHF) and Direct Preference Optimization (DPO)
- Exploration
- Advanced Topics
Offline RL?
Offline RL: Limited data
Plan of action
- Model-free:
  - Q-learning
  - Policy Gradient
  - Actor-Critic
- Model-based:
  - Planning
  - Sequence models
- Exploration
- Offline RL
- Inverse Reinforcement Learning
- Meta Learning
- Transfer Learning
- Multi-agent RL
Homework & assignments
- HW1: Imitation Learning (control via supervised learning)
- HW2: Policy Gradient
- HW3: Q-learning & actor-critic algorithms
- HW4: Model-based RL
- HW5: Offline RL
- Final Project: Research-level project of your choice