Inxa.inSearch.cc | search engine, content portal, news aggregator, circle, nexth

Dota 2

Gathering human feedback

Better exploration with parameter noise

Proximal Policy Optimization

Robust adversarial inputs

Hindsight Experience Replay

Teacher–student curriculum learning

Faster physics in Python

Learning from human preferences

Learning to cooperate, compete, and communicate

Latest

Dota 2

8 years ago

Gathering human feedback

8 years ago

Better exploration with parameter noise

8 years ago

Proximal Policy Optimization

8 years ago

Robust adversarial inputs

8 years ago

Hindsight Experience Replay

8 years ago

Teacher–student curriculum learning

8 years ago

Faster physics in Python

8 years ago

Learning from human preferences

8 years ago

Learning to cooperate, compete, and communicate

8 years ago

UCB exploration via Q-ensembles

8 years ago

OpenAI Baselines: DQN

9 years ago

Robots that learn

9 years ago

Roboschool

9 years ago

Equivalence between policy gradients and soft Q-learning

9 years ago

Stochastic Neural Networks for hierarchical reinforcement learning

9 years ago

Unsupervised sentiment neuron

9 years ago

Spam detection in the physical world

9 years ago