Reward Tampering Problems and Solutions in Reinforcement Learning

Comments

Apr 13 2018 Janelle Shane

When machine learning is astonishing - I collected some highlights from a paper on algorithmic creativity http://aiweirdness.com/post/172894792687/when-algorithms-surprise-us Original paper: https://arxiv.org/pdf/1803.03453.pdf https://t.co/MHS2Gk7zkw
37 replies, 3057 likes


Aug 14 2019 DeepMind

In our latest AI safety blog post, we explore principled solutions to the reward tampering problem, in which a reinforcement learning agent actively changes its reward function to maximise reward. Blog post: https://medium.com/@deepmindsafetyresearch/designing-agent-incentives-to-avoid-reward-tampering-4380c1bb6cd Paper: https://arxiv.org/abs/1908.04734 https://t.co/HRnoYBHBYA
3 replies, 447 likes


Sep 10 2019 hardmaru

Found this recent paper by Tom Everitt and Marcus Hutter that looks at the topic of RL agents “cheating” from an AI Safety perspective. Worth a look! Paper https://arxiv.org/abs/1908.04734 Blog https://link.medium.com/yX1b3UXERZ https://t.co/cak1IUTAlM
3 replies, 121 likes


Aug 14 2019 Vishal Maini

another step towards developing a set of best practices for designing safe RL agents - in this case, by avoiding incentives for agents to tamper with their own reward function. great work, @tom4everitt and team 🚀 🤖 ✅
1 reply, 19 likes


Aug 14 2019 Andrey Kurenkov 🤖

Random - perhaps academic labs should take some inspiration from industry labs and hire some professional graphic designers to help with figures. Yes it can seem flashy, but communicating via visuals is its own language and skillset and researchers are at best ok at it; win-win.
1 reply, 19 likes


Aug 14 2019 Victoria Krakovna

Exciting work on the reward tampering problem in AI safety, where the agent changes its reward function by exploiting how reward is implemented in the environment. The paper proposes design principles for building agents without an incentive to tamper with the reward function.
0 replies, 18 likes


Aug 14 2019 Kate Parkyn

This is cool. If you like engineering & have an interest in AI Safety then check out this blog ⬇️ and this job ➡️ https://deepmind.com/careers/jobs/1433588 #AI #AIsafety #ReinforcementLearning #deepmind
0 replies, 1 like


Content