Title: Why Does My Self-driving Car Keep Crashing? Attacks and Defenses for Reinforcement Learning.
Abstract: For Reinforcement Learning (RL) to be useful in real systems, RL agents must be robust to noise and adversarial attacks. In the subfield of adversarial RL, an external attacker can manipulate the agent's interaction with the environment. Unfortunately, the literature has devised many clever attack strategies, and even the most subtle attacks can lead to disastrous outcomes. In addition, we show that maximally devastating attacks are computable in polynomial time. Fortunately, we also show that optimal defenses correspond to an equilibrium of an appropriately defined game. Although these games are NP-hard to approximate in general, we exploit the structure of most attack surfaces to enable efficient solutions via mutual recursion. Moreover, victims can strengthen their defenses by improving their ability to detect attacks.
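To make the attack setting concrete, here is a minimal illustrative sketch (not the paper's algorithm) of an observation-perturbation attack: an external attacker adds a small, bounded perturbation to the state a fixed policy observes, chosen to reduce the probability of the action the victim would otherwise take. The linear softmax policy, the weight matrix `W`, and the one-step sign-gradient (FGSM-style) update are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable softmax.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def policy_probs(W, obs):
    # Action probabilities of a linear softmax policy.
    # W: (n_actions, obs_dim) weight matrix (illustrative stand-in for a policy).
    return softmax(W @ obs)

def fgsm_attack(W, obs, eps):
    """One-step sign-gradient perturbation of the observation, under an
    L-infinity budget eps, aimed at the victim's currently preferred action."""
    p = policy_probs(W, obs)
    a = int(np.argmax(p))           # action the victim would take on clean input
    # For a linear softmax policy, the gradient of log p(a | obs) w.r.t. obs is
    #   W[a] - sum_b p(b) * W[b].
    grad = W[a] - p @ W
    # Step against the gradient to lower the probability of action a.
    return obs - eps * np.sign(grad)

# Tiny example: 3 actions, 4-dimensional observations.
W = rng.normal(size=(3, 4))
obs = rng.normal(size=4)

clean = policy_probs(W, obs)
adv = policy_probs(W, fgsm_attack(W, obs, eps=0.5))
a = int(np.argmax(clean))
# The attacked observation lowers the victim's confidence in its chosen action.
print(clean[a], adv[a])
```

Because log p(a | obs) is concave in the observation for a linear softmax policy, any step with a negative directional derivative strictly reduces the chosen action's probability, which is why even this one-step perturbation suffices as a demonstration.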