Reinforcement learning with policy gradients in pure Python

This post is also available as a Jupyter notebook. It appears to be a right of passage for ML bloggers covering reinforcement learning to show how to implement the simplest algorithms from scratch without relying on any fancy frameworks. There is Karpathy’s now famous Pong from Pixels, and a simple Google search of “policy gradient from scratch” will yield a number of blog posts of implementations with varying levels of detail. [Read More]