Evolved Policy Gradients
Gotta Learn Fast: A new benchmark for generalization in RL
Retro Contest
Variance reduction for policy gradient with action-dependent factorized baselines
Improving GANs using optimal transport
Report from the OpenAI hackathon
On first-order meta-learning algorithms
Reptile: A scalable meta-learning algorithm
OpenAI Scholars
Some considerations on learning to explore via meta-reinforcement learning