Planning Across Controls, Reinforcement Learning, and Diffusion

Background: Diffusion Models for Reinforcement Learning Survey


"Current works applying Diffusion models on Reinforcement Learning mainly fall into three categories: as planners, as policies, and as data synthesizers." [1]


Planning in RL: Learning to Combat Compounding-Error in Model-Based Reinforcement Learning


"Planning in RL refers to using a dynamic model to make
decisions imaginarily and selecting the appropriate action to
maximize cumulative rewards." [1:1]
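
For concreteness, here is a minimal sketch of this kind of planning via random shooting, where a learned one-step dynamics model is used to imagine candidate action sequences and only the first action of the best sequence is executed. The `model(s, a) -> (s_next, r)` interface is a hypothetical stand-in, not an API from the survey:

```python
import numpy as np

def plan_random_shooting(model, s0, horizon=15, n_candidates=256, action_dim=2, rng=None):
    """Return the first action of the best imagined action sequence (MPC style).

    `model(s, a) -> (s_next, r)` is a learned one-step dynamics model
    (hypothetical interface); rewards are summed over the imagined rollout.
    """
    rng = rng or np.random.default_rng()
    best_return, best_action = -np.inf, None
    for _ in range(n_candidates):
        actions = rng.uniform(-1.0, 1.0, size=(horizon, action_dim))
        s, ret = s0, 0.0
        for a in actions:                    # imagined rollout, no environment interaction
            s, r = model(s, a)
            ret += r
        if ret > best_return:
            best_return, best_action = ret, actions[0]
    return best_action                       # re-plan from the new state at every step
```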


"Planning is commonly used in the MBRL framework with a learned dynamic model. However, the planning sequences are usually simulated auto-regressively, which may lead to severe compounding errors, especially in the offline setting due to limited data support." [1:2]


Planning in Diffusion: Planning with Diffusion for Flexible Behavior Synthesis


"Diffusion models are designed to generate clips of the trajectory τ = (s1, a1, r1, . . . , sH , aH , rH ), denoted as x(τ ) = (e1, e2, . . . , eH ). H is the planning horizon. Here et represents the selected elements from (st, at, rt)" [1:3]


"In order to make the diffusion planner generate high rewarded trajectories during evaluations, guided sampling techniques are widely adopted." [1:4]


"When deploying the trained diffusion planner for online evaluations, fast sampling methods are usually adopted to reduce the inference time." [1:5]


Hierarchical Modeling


Safe Planning and Offline Reinforcement Learning


Multi-agent Reinforcement Learning


Multi-Task Reinforcement Learning


Imitation Learning and Robotics

___


1. Diffusion Models for Reinforcement Learning Survey ↩︎