Q-LEARNING IN ENORMOUS ACTION SPACES VIA AMORTIZED APPROXIMATE MAXIMIZATION

Comments

DeepMind: Q-learning is difficult to apply when the number of available actions is large. We show that a simple extension based on amortized stochastic search allows Q-learning to scale to high-dimensional discrete, continuous or hybrid action spaces: https://arxiv.org/abs/2001.08116

6 replies, 917 likes


David Warde-Farley 🇪🇺: Very glad to share this on arXiv today: one weird trick for getting Q-learning to work when the action space is big and complicated. Work led by Tom Van de Wiele, with Andriy Mnih and @VladMnih.

2 replies, 183 likes


Ian Osband: "One weird trick" for DQN in large (continuous) action spaces:
- Initialize uniform action-sampling distribution.
- Choose sampled action with highest Q.
- Train sampling to produce "best action" + also some entropy.
- ...
Works surprisingly well! Great stuff @dwf, @VladMnih!

1 reply, 91 likes
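
The recipe in the comment above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it assumes a small discrete action space, a fixed random vector standing in for Q(s, ·) at a single state, and a categorical proposal parameterised by logits. The function names (`approx_argmax`, `proposal_update`) and the hyperparameters are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D array of logits.
    z = logits - logits.max()
    p = np.exp(z)
    return p / p.sum()

def approx_argmax(q_values, logits, num_samples, rng):
    """Sample actions from the proposal and return the one with the highest Q."""
    probs = softmax(logits)
    candidates = rng.choice(len(probs), size=num_samples, p=probs)
    return int(candidates[np.argmax(q_values[candidates])])

def proposal_update(logits, best_action, lr=0.5, entropy_coef=0.01):
    """One gradient-ascent step on log p(best_action) + entropy_coef * H(p)."""
    probs = softmax(logits)
    one_hot = np.zeros_like(probs)
    one_hot[best_action] = 1.0
    grad_logp = one_hot - probs                  # d log p(best) / d logits
    log_p = np.log(probs + 1e-12)
    entropy = -(probs * log_p).sum()
    grad_entropy = -probs * (log_p + entropy)    # d H(p) / d logits
    return logits + lr * (grad_logp + entropy_coef * grad_entropy)

# Toy run (assumed setup): 1000 actions, uniform initial proposal.
rng = np.random.default_rng(0)
num_actions = 1000
q_values = rng.normal(size=num_actions)   # stand-in for Q(s, .) at one state
logits = np.zeros(num_actions)            # zero logits give a uniform proposal
for _ in range(200):
    best = approx_argmax(q_values, logits, num_samples=16, rng=rng)
    logits = proposal_update(logits, best)
```

Only 16 Q-values are evaluated per step instead of all 1000, and the proposal gradually shifts its probability mass toward actions that won the sampled maximization. In a full agent the proposal would be conditioned on the state and trained alongside the Q-network; that part is omitted here.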


Arash Tavakoli: If you like RL in large action spaces as much as I do, then you will likely enjoy this work from @DeepMind! By Van de Wiele, @dwf, @AndriyMnih & @VladMnih.

0 replies, 7 likes


Daisuke Okanohara: Q-learning needs a maximization over actions and cannot be applied directly to high-dimensional/continuous action spaces. With a proposal distribution trained by amortized inference, Q-learning can be applied to these problems and outperform other SOTA methods. https://arxiv.org/abs/2001.08116

0 replies, 6 likes


Benjamin Singleton: Q-Learning in enormous action spaces via amortized approximate maximization #BigData #Analytics https://arxiv.org/abs/2001.08116

0 replies, 1 like


Patrick Muncaster: Q-LEARNING IN ENORMOUS ACTION SPACES VIA AMORTIZED APPROXIMATE MAXIMIZATION https://arxiv.org/pdf/2001.08116.pdf "We treat the search for the best action as another learning problem & replace the exact maximization over all actions with a maximization over a set of actions sampled from a ...

0 replies, 1 like


Content

Found on Jan 23 2020 at https://arxiv.org/pdf/2001.08116.pdf

PDF content of a computer science paper: Q-LEARNING IN ENORMOUS ACTION SPACES VIA AMORTIZED APPROXIMATE MAXIMIZATION