Bellman Links HJB To Reinforcement Learning
On March 30, 2026, a technical article revisits Richard Bellman’s 1952 dynamic programming and traces its equivalence with the Hamilton–Jacobi framework from the 1840s. The piece derives the Hamilton–Jacobi–Bellman PDE for deterministic and Itô stochastic systems, connects Q-functions and policy iteration (using MLPs and model-based generators), and frames diffusion-model training as stochastic optimal control.
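The summary's core objects can be made concrete in discrete time: the Bellman optimality equation, whose continuous-time limit is the HJB PDE the article derives, and the Q-function from which a greedy policy is read off. A minimal value-iteration sketch on a hypothetical two-state MDP (all numbers are illustrative, not taken from the article):

```python
import numpy as np

# Discrete-time Bellman optimality equation:
#   V(s) = max_a [ r(s,a) + gamma * sum_{s'} P(s'|s,a) V(s') ]
# Its continuous-time limit is the HJB PDE; here we solve it by value iteration.
gamma = 0.9
# P[a, s, s']: transition probabilities; R[s, a]: rewards (toy numbers).
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.3, 0.7]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

V = np.zeros(2)
for _ in range(1000):
    Q = R + gamma * np.einsum("ast,t->sa", P, V)  # Q(s, a)
    V_new = Q.max(axis=1)                          # Bellman backup
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=1)  # greedy policy extracted from the Q-function
```

In the article's setting the tabular Q above is replaced by an MLP and the known transition model plays the role of the generator; this sketch only shows the fixed-point structure both share.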
Scoring Rationale
Solid theoretical synthesis with clear practical implications for RL and generative-model researchers. Scored high for credibility and relevance, moderate for novelty and actionability; timely (published today) and detailed, which slightly boosts the final score.
Sources
- Hamilton-Jacobi-Bellman Equation: Reinforcement Learning and Diffusion Models (dani2442.github.io)