Minimal Monte Carlo Policy Gradient (REINFORCE) Algorithm Implementation in Keras
A minimal implementation of the stochastic policy gradient algorithm in Keras.
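As a rough illustration of what Monte Carlo policy gradient (REINFORCE) computes for each episode, the sketch below uses plain Python rather than Keras. It is not taken from this repository: the function names `discounted_returns` and `reinforce_loss`, the discount factor, and the toy episode are all assumptions made for the example. In a Keras version, the log-probabilities would come from the policy network and the loss would be minimized by an optimizer.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * r_{t+1} + ... for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each return reuses the one that follows it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(log_probs, returns):
    """Surrogate loss -sum_t log pi(a_t | s_t) * G_t; minimizing it follows the policy gradient."""
    return -sum(lp * g for lp, g in zip(log_probs, returns))


# Hypothetical three-step episode with a single reward at the end.
rewards = [0.0, 0.0, 1.0]
returns = discounted_returns(rewards, gamma=0.5)

# Assume the policy gave probability 0.5 to each action that was taken.
log_probs = [math.log(0.5)] * 3
loss = reinforce_loss(log_probs, returns)
```

Because the returns are only known once the episode has finished, the update happens once per episode, which is what makes this a Monte Carlo method.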
This policy gradient (PG) agent appears to win more frequently after about 8000 training episodes. The score graph is shown below.