Q-LEARNING IN ENORMOUS ACTION SPACES VIA AMORTIZED APPROXIMATE MAXIMIZATION

Comments

Jan 23 2020 DeepMind

Q-learning is difficult to apply when the number of available actions is large. We show that a simple extension based on amortized stochastic search allows Q-learning to scale to high-dimensional discrete, continuous or hybrid action spaces: https://arxiv.org/abs/2001.08116
6 replies, 915 likes


Jan 23 2020 David Warde-Farley 🇪🇺

Very glad to share this on arXiv today: one weird trick for getting Q-learning to work when the action space is big and complicated. Work led by Tom Van de Wiele, with Andriy Mnih and @VladMnih.
2 replies, 183 likes


Jan 23 2020 Ian Osband

"One weird trick" for DQN in large (continuous) action spaces:
- Initialize uniform action-sampling distribution.
- Choose sampled action with highest Q.
- Train sampling to produce "best action" + also some entropy.
- ...
Works surprisingly well! Great stuff @dwf, @VladMnih!
1 reply, 91 likes
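
The steps listed in the comment above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it assumes a single state, a small discrete action set, a fixed array of Q-values standing in for a learned Q-network, and a softmax proposal over logits. The names `propose_and_select`, `proposal_update`, `train_proposal`, and the parameters `num_samples`, `lr`, and `entropy_coef` are all hypothetical.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def propose_and_select(logits, q_values, num_samples, rng):
    """Sample candidate actions from the proposal and keep the one with highest Q."""
    probs = softmax(logits)
    candidates = rng.choice(len(probs), size=num_samples, p=probs)
    return int(candidates[np.argmax(q_values[candidates])])


def proposal_update(logits, best_action, lr=0.1, entropy_coef=0.01):
    """One gradient-ascent step on log p(best_action) + entropy_coef * entropy."""
    probs = softmax(logits)
    # Gradient of log softmax(logits)[best_action] w.r.t. logits: onehot - probs.
    grad_loglik = -probs
    grad_loglik[best_action] += 1.0
    # Gradient of the entropy H w.r.t. logits: -p_j * (log p_j + H).
    log_probs = np.log(probs + 1e-12)
    entropy = -(probs * log_probs).sum()
    grad_entropy = -probs * (log_probs + entropy)
    return logits + lr * (grad_loglik + entropy_coef * grad_entropy)


def train_proposal(q_values, num_samples=8, steps=2000, seed=0):
    """Amortize the argmax: train the proposal to put mass on high-Q actions."""
    rng = np.random.default_rng(seed)
    logits = np.zeros(len(q_values))  # zero logits give a uniform proposal
    for _ in range(steps):
        best = propose_and_select(logits, q_values, num_samples, rng)
        logits = proposal_update(logits, best)
    return softmax(logits)
```

In a full agent the Q-values would come from a network conditioned on the state and would be trained alongside the proposal; here they are held fixed so that only the sampling and proposal-training steps are visible.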


Jan 23 2020 Arash Tavakoli

If you like RL in large action spaces as much as I do, then you will likely enjoy this work from @DeepMind! By Van de Wiele, @dwf, @AndriyMnih & @VladMnih.
0 replies, 7 likes


Jan 23 2020 Daisuke Okanohara

Q-learning needs a maximization over actions, so it cannot be applied directly to high-dimensional or continuous action spaces. With a proposal distribution trained by amortized inference, Q-learning can be applied to these problems and outperform other state-of-the-art methods. https://arxiv.org/abs/2001.08116
0 replies, 6 likes
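
The maximization mentioned in the comment above is the one inside the standard Q-learning target. As a hedged sketch of the substitution being described, the exact maximum over all actions is replaced by a maximum over a finite set of sampled candidates. The symbols $\mu$ (the learned proposal distribution) and $K$ (the number of samples) are notation assumed here for illustration and may differ from the paper's.

```latex
% Exact Q-learning target: requires a maximum over every action a'.
y = r + \gamma \max_{a'} Q(s', a')

% Approximate target: maximum over K candidates drawn from a proposal \mu.
a_1, \dots, a_K \sim \mu(\cdot \mid s'), \qquad
y \approx r + \gamma \max_{i \in \{1, \dots, K\}} Q(s', a_i)
```

Because the maximum over a sampled subset can never exceed the maximum over the full action set, the approximate target is a lower bound on the exact one, and it gets tighter as the proposal concentrates on high-value actions.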


Jan 24 2020 Patrick Muncaster

Q-LEARNING IN ENORMOUS ACTION SPACES VIA AMORTIZED APPROXIMATE MAXIMIZATION https://arxiv.org/pdf/2001.08116.pdf "We treat the search for the best action as another learning problem & replace the exact maximization over all actions with a maximization over a set of actions sampled from a ...
0 replies, 1 like


Feb 02 2020 Benjamin Singleton

Q-Learning in enormous action spaces via amortized approximate maximization #BigData #Analytics https://arxiv.org/abs/2001.08116
0 replies, 1 like


Content