
STABILIZING TRANSFORMERS FOR REINFORCEMENT LEARNING

Comments

Max Jaderberg: Finally, Transformers working for RL! Two simple modifications (moving layer-norm and adding gating) create GTrXL: an incredibly stable and effective architecture for integrating experience through time in RL. Great work from Emilio interning at @DeepMindAI https://arxiv.org/abs/1910.06764 https://t.co/uQQlvPpBbX

10 replies, 794 likes
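
For a concrete picture of the two modifications Jaderberg mentions, here is a rough sketch of one gated pre-norm block, assuming the description in this thread: layer normalization moved to the sublayer inputs and a GRU-style gate replacing each residual addition, with the gate biased toward the identity at initialization. It uses plain multi-head self-attention rather than Transformer-XL's relative attention with memory, and all names (GRUGate, GTrXLBlock, bias_init) are illustrative placeholders, not the authors' code.

    # Minimal PyTorch sketch of one GTrXL-style block, based on the thread's
    # description: pre-layer-norm plus GRU-style gating on the skip path.
    import torch
    import torch.nn as nn

    class GRUGate(nn.Module):
        """GRU-style gate g(x, y): combines the skip input x with the sublayer
        output y instead of a plain residual addition x + y."""

        def __init__(self, d_model, bias_init=2.0):
            super().__init__()
            self.w_r = nn.Linear(d_model, d_model, bias=False)
            self.u_r = nn.Linear(d_model, d_model, bias=False)
            self.w_z = nn.Linear(d_model, d_model, bias=False)
            self.u_z = nn.Linear(d_model, d_model, bias=False)
            self.w_g = nn.Linear(d_model, d_model, bias=False)
            self.u_g = nn.Linear(d_model, d_model, bias=False)
            # Bias the update gate so z starts near 0 (stored as a negative
            # offset added to the pre-activation): the block is then close to
            # an identity map at initialization.
            self.bias_z = nn.Parameter(torch.full((d_model,), -bias_init))

        def forward(self, x, y):
            r = torch.sigmoid(self.w_r(y) + self.u_r(x))
            z = torch.sigmoid(self.w_z(y) + self.u_z(x) + self.bias_z)
            h = torch.tanh(self.w_g(y) + self.u_g(r * x))
            return (1.0 - z) * x + z * h

    class GTrXLBlock(nn.Module):
        """One transformer block with pre-layer-norm and gated skip connections."""

        def __init__(self, d_model, n_heads, d_ff):
            super().__init__()
            self.norm1 = nn.LayerNorm(d_model)
            self.norm2 = nn.LayerNorm(d_model)
            self.attn = nn.MultiheadAttention(d_model, n_heads)
            self.ff = nn.Sequential(
                nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
            )
            self.gate1 = GRUGate(d_model)
            self.gate2 = GRUGate(d_model)

        def forward(self, x, attn_mask=None):
            # Layer norm is applied to the sublayer *input*, not its output,
            # so an identity path runs from block input to block output.
            y = self.norm1(x)
            y, _ = self.attn(y, y, y, attn_mask=attn_mask, need_weights=False)
            x = self.gate1(x, torch.relu(y))
            y = self.ff(self.norm2(x))
            return self.gate2(x, torch.relu(y))

Stacking such blocks (with Transformer-XL's memory and relative attention in the real model) gives the GTrXL architecture the comments below refer to.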


Sid Jayakumar: Really excited about our latest work showing that large Transformer-XLs can be used in RL agents. We show SoTA performance on DMLab with gated transformers and a few small changes. Led by Emilio as an internship project! @DeepMindAI

0 replies, 86 likes


Russ Salakhutdinov: Using large Transformer-XLs for stable training of RL agents. Nice work from Emilio Parisotto’s internship and colleagues at DeepMind.

0 replies, 77 likes


Xander Steenbrugge: Transformers now also taking over SOTA from LSTMs in the Reinforcement Learning domain! 🤯😁 Very curious to see this applied to long time-horizon environments like StarCraft, Dota II and more. @SchmidhuberAI is not going to like this 🤣

3 replies, 41 likes


Daisuke Okanohara: For memory-augmented RL, an agent with the gated Transformer-XL (GTrXL) achieves better performance than one with an LSTM. The introduced GRU-style gating significantly stabilizes training by letting the agent start in a Markovian (memoryless) regime at the beginning of training. https://arxiv.org/abs/1910.06764

0 replies, 13 likes
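
A small numeric illustration of the "Markov regime" point above (toy numbers, not from the paper): with a positive bias b_g subtracted inside the update gate, z = sigmoid(· − b_g) starts near zero, so the gated output is dominated by the skip input and the block initially behaves as an identity map; the agent can learn a reactive, memoryless policy first and blend memory in only as the gates open.

    # Toy illustration of the update-gate bias; numbers are arbitrary.
    import math

    def gated_output(x, h, pre_activation, b_g):
        # Update gate z = sigmoid(pre_activation - b_g); the output
        # interpolates between the skip input x and the candidate h.
        z = 1.0 / (1.0 + math.exp(-(pre_activation - b_g)))
        return (1.0 - z) * x + z * h, z

    x, h = 1.0, -3.0   # skip input vs. an arbitrary sublayer-derived candidate
    for b_g in (0.0, 2.0, 5.0):
        out, z = gated_output(x, h, pre_activation=0.0, b_g=b_g)
        print(f"b_g={b_g}: gate z={z:.3f}, output={out:.3f} (skip input x={x})")
    # Larger b_g -> z closer to 0 -> output closer to x: the block starts as
    # (approximately) an identity map.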


Aran Komatsuzaki: Stabilizing Transformers for Reinforcement Learning https://arxiv.org/abs/1910.06764 Proposed the Gated Transformer-XL, which surpasses LSTMs and achieves sota results on the multi-task DMLab-30 benchmark suite.

0 replies, 9 likes


Caglar Gulcehre: We have shown that it is possible to train very large transformers on RL problems by introducing a gating mechanism to stabilize them. It turned out this architecture is very effective on a number of RL benchmarks. As @maxjaderberg pointed out, this is mostly due to the work done by Emilio.

0 replies, 8 likes


Colin Raffel: @mariusmosbach This finding was unpublished but included as default in tensor2tensor for a long time because it clearly works better. Recently https://arxiv.org/abs/2002.04745 and https://arxiv.org/abs/1910.06764 discussed it more explicitly.

0 replies, 5 likes
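
The layer-norm placement Raffel refers to is the "pre-LN" ordering: the only difference from the original "post-LN" Transformer block is where normalization sits relative to the residual path. A schematic sketch, with sublayer and layer_norm as stand-in callables (not any particular library's API):

    # Schematic sketch of the two orderings discussed above; `sublayer` stands
    # for attention or the feed-forward network, `layer_norm` for layer
    # normalization. Names are illustrative only.

    def post_ln_block(x, sublayer, layer_norm):
        # Original Transformer ("post-LN"): normalize after the residual add,
        # so there is no clean identity path from block input to output.
        return layer_norm(x + sublayer(x))

    def pre_ln_block(x, sublayer, layer_norm):
        # Reordered ("pre-LN"): normalize the sublayer input and leave the
        # residual path untouched, which tends to train more stably.
        return x + sublayer(layer_norm(x))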


Phillip Wang: @SonAthenos @OpenAI @CShorten30 https://arxiv.org/abs/1910.06764

0 replies, 2 likes


arXiv in review: #ICLR2020 Stabilizing Transformers for Reinforcement Learning. (arXiv:1910.06764v1 [cs.LG]) http://arxiv.org/abs/1910.06764

0 replies, 1 likes


Benjamin Singleton: Stabilizing Transformers for Reinforcement Learning #BigData #DataScience https://arxiv.org/abs/1910.06764

0 replies, 1 likes


Content

Found on Oct 16 2019 at https://arxiv.org/pdf/1910.06764.pdf

PDF content of a computer science paper: STABILIZING TRANSFORMERS FOR REINFORCEMENT LEARNING