Planning Across Controls, Reinforcement Learning, and Diffusion
Background: Diffusion Models for Reinforcement Learning Survey
"Current works applying Diffusion models on Reinforcement Learning mainly fall into three categories: as planners, as policies, and as data synthesizers." [1]
Planning in RL: Learning to Combat Compounding-Error in Model-Based Reinforcement Learning
"Planning in RL refers to using a dynamic model to make decisions imaginarily and selecting the appropriate action to maximize cumulative rewards." [1:1]
"Planning is commonly used in the MBRL framework with a learned dynamic model. However, the planning sequences are usually simulated auto-regressively, which may lead to severe compounding errors, especially in the offline setting due to limited data support." [1:2]
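The compounding-error point can be seen in a toy sketch. The linear dynamics below are purely illustrative stand-ins (not from the survey): a learned model with a small one-step bias is rolled out autoregressively, and the gap to the true trajectory grows with the horizon.

```python
import numpy as np

def true_step(s):
    return 0.9 * s + 1.0          # ground-truth dynamics (hypothetical)

def learned_step(s):
    return 0.9 * s + 1.0 + 0.05   # learned model with a small one-step bias

def rollout(step_fn, s0, horizon):
    # Autoregressive simulation: each prediction is fed back as input,
    # so per-step errors accumulate instead of averaging out.
    states, s = [], s0
    for _ in range(horizon):
        s = step_fn(s)
        states.append(s)
    return np.array(states)

true_traj = rollout(true_step, 0.0, 30)
model_traj = rollout(learned_step, 0.0, 30)
errors = np.abs(model_traj - true_traj)
# errors[0] is the one-step bias (0.05); by step 30 the gap is ~10x larger
```

In the offline setting the bias term is typically worse in regions with little data support, which is exactly where a long rollout tends to drift.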
Planning in Diffusion: Planning with Diffusion for Flexible Behavior Synthesis
"Diffusion models are designed to generate clips of the trajectory τ = (s_1, a_1, r_1, …, s_H, a_H, r_H), denoted as x(τ) = (e_1, e_2, …, e_H). H is the planning horizon. Here e_t represents the selected elements from (s_t, a_t, r_t)" [1:3]
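The key structural difference from autoregressive planning is that the diffusion planner denoises all H steps of x(τ) jointly. A minimal sketch, with the trained noise predictor replaced by an illustrative closed-form stand-in (the schedule values and the `denoiser` are assumptions, not the survey's):

```python
import numpy as np

rng = np.random.default_rng(0)

H, D = 16, 3            # planning horizon, dims of each element e_t
T = 50                  # diffusion steps
betas = np.linspace(1e-4, 0.1, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def denoiser(x, t):
    # Placeholder for a trained noise-prediction network eps_theta(x, t);
    # this toy version just reports the expected noise magnitude toward
    # a zero-mean trajectory, purely for illustration.
    return x * np.sqrt(1.0 - alpha_bars[t])

def sample_trajectory():
    # Reverse diffusion over the *whole* trajectory x(tau) at once,
    # rather than rolling a dynamics model out one step at a time.
    x = rng.standard_normal((H, D))
    for t in reversed(range(T)):
        eps = denoiser(x, t)
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            x = x + np.sqrt(betas[t]) * rng.standard_normal((H, D))
    return x

plan = sample_trajectory()   # shape (H, D): all H steps generated jointly
```

Because every e_t is refined simultaneously, there is no single-step prediction being fed back into itself, which is the mechanism behind the compounding errors noted above.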
"In order to make the diffusion planner generate high rewarded trajectories during evaluations, guided sampling techniques are widely adopted." [1:4]
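One common guided-sampling recipe (classifier-style guidance, as in Diffuser) nudges each denoising step along the gradient of a learned return predictor. The sketch below is a hedged toy: `return_grad` and both scale parameters are hypothetical stand-ins, and the unguided mean update is collapsed to the identity for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

def return_grad(x):
    # Placeholder for grad_x J(x), the gradient of a learned return
    # predictor; here a made-up objective preferring values near 1.0.
    return 1.0 - x

def guided_denoise_step(x, guidance_scale=0.1, noise_scale=0.05):
    # One reverse step: (trivialized) unguided update plus a
    # reward-gradient nudge toward high-return trajectories.
    x = x + guidance_scale * return_grad(x)
    return x + noise_scale * rng.standard_normal(x.shape)

x = rng.standard_normal((16, 2))     # noisy trajectory x(tau)
for _ in range(100):
    x = guided_denoise_step(x)
# samples drift toward the high-return region around 1.0
```

The same slot can instead hold classifier-free guidance (conditioning the denoiser on returns directly), but the gradient-nudge view is the easiest to see in a few lines.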
"When deploying the trained diffusion planner for online evaluations, fast sampling methods are usually adopted to reduce the inference time." [1:5]
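A standard fast-sampling trick is to stride the reverse process DDIM-style: visit a short subsequence of the training timesteps with a deterministic update, so inference costs ~20 network calls instead of ~1000. The schedule and the closed-form `predict_eps` stand-in below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)

T = 1000                                   # training diffusion steps
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)

def predict_eps(x, t):
    # Placeholder for the trained noise predictor eps_theta(x, t).
    return x * np.sqrt(1.0 - alpha_bars[t])

def ddim_sample(shape, num_steps=20):
    # Deterministic DDIM update over a strided subset of timesteps.
    steps = np.linspace(T - 1, 0, num_steps).astype(int)
    x = rng.standard_normal(shape)
    for t, t_prev in zip(steps[:-1], steps[1:]):
        eps = predict_eps(x, t)
        # Reconstruct x0, then jump directly to the earlier timestep.
        x0 = (x - np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alpha_bars[t])
        x = np.sqrt(alpha_bars[t_prev]) * x0 + np.sqrt(1.0 - alpha_bars[t_prev]) * eps
    return x

plan = ddim_sample((16, 3))   # 19 denoising calls instead of 999
```

This matters for planners in particular because a fresh plan is typically sampled at every (or every few) environment steps, so sampling latency sits directly on the control loop.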
Safe Planning and Offline Reinforcement Learning
Multi-agent Reinforcement Learning
Multi-Task Reinforcement Learning
Imitation Learning and Robotics