Reference Links
Berkeley bootcamp, Reinforcement learning course lectures by David Silver
A (Long) Peek into Reinforcement Learning
Reinforcement Learning Textbook
Week 1, Feb 4: Markov Decision Processes
Topics: Dynamic Programming (value iteration and policy iteration) and tabular Q-learning
Sutton Chapter 3: Markov Decision Processes and Chapter 4: Dynamic Programming
Deep RL Bootcamp Core Lecture 1 Intro to MDPs and Exact Solution Methods – Pieter Abbeel Video, Slides
Deep RL Bootcamp Core Lecture 2 Sample-based Approximations and Fitted Learning – Rocky Duan Video, Slides
Deep RL Bootcamp Lab 1: Markov Decision Processes. You will implement value iteration, policy iteration, and tabular Q-learning and apply these algorithms to simple environments including tabular maze navigation (FrozenLake) and controlling a simple crawler robot.
CS294 Reinforcement learning introduction – Sergey Levine Video, Slides
CS294 Value functions introduction – Sergey Levine Video, Slides
Introduction to Reinforcement Learning – Joshua Achiam Slides
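
As a warm-up for the value iteration part of Lab 1, here is a minimal sketch rather than the lab's reference solution. The chain MDP built by `build_chain_mdp` is a hypothetical toy example, and the transition table mirrors the `P[s][a] = [(prob, next_state, reward, done), ...]` layout used by Gym's FrozenLake. The function names and default parameters are assumptions made for illustration.

```python
def build_chain_mdp(n_states=4):
    """Hypothetical deterministic chain: action 0 moves left, action 1 moves right.
    Entering the last state gives reward 1 and ends the episode."""
    P = {}
    last = n_states - 1
    for s in range(n_states):
        P[s] = {}
        for a, step in ((0, -1), (1, 1)):
            if s == last:
                # Terminal state: self-loop with no reward.
                P[s][a] = [(1.0, s, 0.0, True)]
                continue
            s2 = min(max(s + step, 0), last)
            reward = 1.0 if s2 == last else 0.0
            P[s][a] = [(1.0, s2, reward, s2 == last)]
    return P


def q_value(P, V, s, a, gamma):
    """Expected one-step return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2] * (not done)) for p, s2, r, done in P[s][a])


def value_iteration(P, gamma=0.9, tol=1e-8):
    """Apply Bellman optimality backups in place until the largest change is below tol."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(P, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(P[s], key=lambda a: q_value(P, V, s, a, gamma)) for s in P}
    return V, policy


if __name__ == "__main__":
    V, policy = value_iteration(build_chain_mdp())
    print(V)
    print(policy)
```

Policy iteration can reuse `q_value`: alternate between evaluating a fixed policy and making it greedy, instead of taking the max inside every backup.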
Week 2, Feb 11: Monte Carlo Methods
Topics: Use Blackjack to implement first-visit or every-visit MC prediction
Sutton Chapter 5.3: Monte Carlo Methods
CS294 Optimal control and planning – Sergey Levine Video, Slides
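
Before wiring up Blackjack, it can help to check the first-visit MC prediction logic on hand-written episodes. The following is a minimal sketch under assumptions: the episode format (a list of `(state, reward)` pairs, with the reward received on leaving the state) and the function name are made up for illustration and are not taken from the course material.

```python
from collections import defaultdict


def first_visit_mc_prediction(episodes, gamma=1.0):
    """Estimate V(s) as the average return following the first visit to s.

    episodes: list of episodes, each a list of (state, reward) pairs
    (assumed format), where reward is received after leaving state.
    """
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:
        # Time step of the first occurrence of each state in this episode.
        first_visit = {}
        for t, (state, _) in enumerate(episode):
            first_visit.setdefault(state, t)
        G = 0.0
        # Walk backwards so G holds the discounted return from step t onward.
        for t in range(len(episode) - 1, -1, -1):
            state, reward = episode[t]
            G = reward + gamma * G
            if first_visit[state] == t:
                returns_sum[state] += G
                returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


if __name__ == "__main__":
    episodes = [
        [("A", 0.0), ("B", 1.0)],
        [("A", 2.0), ("A", 0.0), ("B", 2.0)],
    ]
    print(first_visit_mc_prediction(episodes))
```

Dropping the `first_visit[state] == t` check turns this into every-visit MC, which averages the return from every occurrence of a state.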
Week 3, Feb 18: Imitation Learning with MuJoCo
CS294 Supervised learning and imitation – Sergey Levine Video, Slides
CS294 Imitation Learning Project
Week 4, Feb 25: Policy Gradients
Topics: Temporal-Difference (TD) learning; use CartPole and Humanoid for policy gradients
Sutton Chapter 6: Temporal-Difference Learning
Deep RL Bootcamp Core Lecture 4a Policy Gradients and Actor Critic – Pieter Abbeel Video, Slides
Deep RL Bootcamp Core Lecture 4b Pong from Pixels – Andrej Karpathy Video, Slides
CS294 Policy gradients introduction – Sergey Levine Video, Slides, Policy Gradients Project
Policy Gradient Algorithms – Lilian Weng Blog
Sutton Chapter 13.5: Actor-Critic Methods
CS294 Actor-critic introduction – Sergey Levine Video, Slides
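
For the temporal-difference topic, a tabular TD(0) prediction update can be sketched in a few lines. This is an illustrative sketch, not code from the course: the transition format `(state, reward, next_state)`, with `None` marking termination, and the function name are assumptions.

```python
from collections import defaultdict


def td0_prediction(episodes, alpha=0.1, gamma=1.0):
    """Tabular TD(0): move V(s) toward reward + gamma * V(next_state) after each step.

    episodes: list of episodes, each a list of (state, reward, next_state)
    transitions (assumed format); next_state is None when the episode ends.
    """
    V = defaultdict(float)
    for episode in episodes:
        for state, reward, next_state in episode:
            # Terminal transitions bootstrap from a value of zero.
            bootstrap = 0.0 if next_state is None else V[next_state]
            td_error = reward + gamma * bootstrap - V[state]
            V[state] += alpha * td_error
    return dict(V)


if __name__ == "__main__":
    episode = [("A", 0.0, "B"), ("B", 1.0, None)]
    print(td0_prediction([episode, episode], alpha=0.5))
```

Unlike Monte Carlo prediction, the update happens after every step and uses the current estimate of the next state's value instead of waiting for the full return. The same TD error, computed with a learned critic, is what actor-critic methods use as a low-variance signal for the policy gradient.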
Week 5, Mar 4: Deep Q Learning, DQN, Rainbow
Sutton Chapter 16.5: DQN
Deep RL Bootcamp Core Lecture 3 DQN + Variants – Vlad Mnih Video, [Slides](https://drive.google.com/open?id=0BxXI_RttTZAhVUhpbDhiSUFFNjg)
Deep RL Bootcamp Lab 3: Deep Q-Learning. You will implement the DQN algorithm and apply it to Atari games.
CS294 Neural networks review – Joshua Achiam Video, Slides
CS294 Advanced Q-learning algorithms – Sergey Levine Video, Slides, DQN Project
Week 6, Mar 11: Model-based RL
Deep RL Bootcamp Core Lecture 9 Model-based RL – Chelsea Finn Video, Slides
CS294 Learning dynamical systems from data – Sergey Levine Video, Slides
CS294 Learning policies by imitating optimal controllers – Sergey Levine Video, Slides
CS294 Advanced model learning and images – Chelsea Finn Video, Slides
CS294 Connection between inference and control – Sergey Levine Video, Slides
CS294 Model Based RL Project
Week 7, Mar 18: Advanced Policy Gradients
Topics: Natural Policy Gradient, PPO (use Roboschool instead of MuJoCo, which requires a license)
Deep RL Bootcamp Core Lecture 5 Natural Policy Gradients, TRPO, and PPO – John Schulman Video, Slides
Deep RL Bootcamp Lab 4: Policy Optimization Algorithms. You will implement various policy optimization algorithms, including policy gradient, natural policy gradient, trust-region policy optimization (TRPO), and asynchronous advantage actor-critic (A3C). You will apply these algorithms to classic control tasks, Atari games, and roboschool locomotion environments.
CS294 Learning policies by imitating optimal controllers – Sergey Levine Video, Slides
Week 8, Mar 25: Inverse RL
Topics: GAIL
CS294 Inverse reinforcement learning – Sergey Levine Video, Slides
Algorithms for Inverse Reinforcement Learning PDF
Learning Robust Rewards with Adversarial Inverse Reinforcement Learning PDF
Maximum Entropy Inverse Reinforcement Learning PDF
Maximum Entropy Deep Inverse Reinforcement Learning PDF
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization PDF
Generative Adversarial Imitation Learning PDF