Equivalence between policy gradients and soft Q-learning
Stochastic neural networks for hierarchical reinforcement learning
Unsupervised sentiment neuron
Spam detection in the physical world
Evolution strategies as a scalable alternative to reinforcement learning
One-shot imitation learning
Distill
Learning to communicate
Emergence of grounded compositional language in multi-agent populations
Prediction and control with temporal segment models