Deep Exploration via Bootstrapped DQN

From statwiki
Revision as of 12:54, 26 October 2017 by Ashishgaurav (talk | contribs) (edited summary a bit)

Gist

Efficient exploration remains a major challenge for reinforcement learning. Common dithering strategies for exploration, like $\epsilon$-greedy and Boltzmann exploration, do not carry out deep exploration, so an agent that relies on them can need exponentially more data to learn. Also, most algorithms for statistically efficient reinforcement learning are not computationally tractable in complex environments. A list of exploration strategies is available on this page.
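To make the two dithering strategies named above concrete, here is a minimal sketch of $\epsilon$-greedy and Boltzmann action selection over a list of Q-values. The function names and the `rng` parameter are hypothetical and chosen for illustration; they do not come from the paper. Both rules perturb the choice independently at every step, which is why they do not produce the multi-step, directed behaviour that deep exploration requires.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def boltzmann_probs(q_values, temperature):
    """Softmax over Q-values; a higher temperature flattens the distribution."""
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann(q_values, temperature, rng=random):
    """Sample an action with probability proportional to exp(Q / temperature)."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

With `epsilon = 0` the first rule reduces to purely greedy selection, and as `temperature` approaches zero the second concentrates its probability mass on the greedy action.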

Randomized value functions offer a promising approach to efficient exploration with generalization, but existing algorithms are not compatible with nonlinearly parameterized value functions. The authors propose bootstrapped DQN as a first step towards addressing such contexts. They go on to demonstrate that bootstrapped DQN can combine deep exploration with deep neural networks for exponentially faster learning than any dithering strategy.
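The core idea can be illustrated with a small sketch. It assumes the commonly described mechanism behind bootstrapped DQN: maintain several value estimates ("heads"), pick one at random at the start of each episode, act greedily with respect to it for the whole episode, and update each head only on a randomly masked subset of transitions. To stay self-contained, the sketch replaces the neural network heads with tabular Q-values, so it is an illustration of the exploration scheme rather than the authors' implementation. The class name `BootstrappedQ`, its methods, and the default `mask_prob`, `lr`, and `gamma` values are all hypothetical.

```python
import random


class BootstrappedQ:
    """K independent tabular Q estimates, a stand-in for K network heads (assumption)."""

    def __init__(self, n_states, n_actions, n_heads, mask_prob=0.5, seed=0):
        self.rng = random.Random(seed)
        self.n_actions = n_actions
        self.n_heads = n_heads
        self.mask_prob = mask_prob
        # Random initialization gives each head a different initial estimate.
        self.q = [[[self.rng.gauss(0.0, 1.0) for _ in range(n_actions)]
                   for _ in range(n_states)]
                  for _ in range(n_heads)]
        self.active = 0

    def start_episode(self):
        """Sample one head uniformly; it drives behaviour for the whole episode."""
        self.active = self.rng.randrange(self.n_heads)
        return self.active

    def act(self, state):
        """Act greedily with respect to the active head (no per-step dithering)."""
        row = self.q[self.active][state]
        return max(range(self.n_actions), key=lambda a: row[a])

    def sample_mask(self):
        """Bernoulli mask deciding which heads train on a given transition."""
        return [self.rng.random() < self.mask_prob for _ in range(self.n_heads)]

    def update(self, s, a, r, s_next, done, mask, lr=0.1, gamma=0.99):
        """One Q-learning step for every head selected by the mask."""
        for k, use in enumerate(mask):
            if not use:
                continue
            target = r if done else r + gamma * max(self.q[k][s_next])
            self.q[k][s][a] += lr * (target - self.q[k][s][a])
```

Because the sampled head is held fixed for a whole episode, the agent commits to one plausible value function at a time, which is what allows exploration to extend over many steps instead of being re-randomized at each action.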