Reward Tampering Problems and Solutions in Reinforcement Learning


Apr 13 2018 Janelle Shane

When machine learning is astonishing - I collected some highlights from a paper on algorithmic creativity.
37 replies, 3057 likes

Aug 14 2019 DeepMind

In our latest AI safety blog post, we explore principled solutions to the reward tampering problem, in which a reinforcement learning agent actively changes its reward function to maximise reward.
3 replies, 447 likes

Sep 10 2019 hardmaru

Found this recent paper by Tom Everitt and Marcus Hutter that looks at the topic of RL agents “cheating” from an AI Safety perspective. Worth a look!
3 replies, 121 likes

Aug 14 2019 Vishal Maini

another step towards developing a set of best practices for designing safe RL agents - in this case, by avoiding incentives for agents to tamper with their own reward function. great work, @tom4everitt and team 🚀 🤖 ✅
1 reply, 19 likes

Aug 14 2019 Andrey Kurenkov 🤖

Random - perhaps academic labs should take some inspiration from industry labs and hire some professional graphic designers to help with figures. Yes it can seem flashy, but communicating via visuals is its own language and skillset and researchers are at best ok at it; win-win.
1 reply, 19 likes

Aug 14 2019 Victoria Krakovna

Exciting work on the reward tampering problem in AI safety, where the agent changes its reward function by exploiting how reward is implemented in the environment. The paper proposes design principles for building agents without an incentive to tamper with the reward function.
0 replies, 18 likes

Aug 14 2019 Kate Parkyn

This is cool. If you like engineering & have an interest in AI Safety then check out this blog ⬇️ and this job ➡️ #AI #AIsafety #ReinforcementLearning #deepmind
0 replies, 1 like