arXiv cs.LG
5/12/2026
Distributional Reinforcement Learning via the Cramér Distance

Short summary

Researchers introduce C-DSAC, a distributional reinforcement learning algorithm that extends Soft Actor-Critic (SAC) by minimizing the Cramér distance. On robotic benchmarks it outperforms baseline SAC and other distributional methods, with larger gains on high-complexity tasks. Key insight: confidence-driven Q-value updates treat high-variance distributions as low-confidence targets, which reduces overestimation errors.

  • Novel C-DSAC algorithm combines distributional RL with SAC using Cramér distance minimization
  • Empirically outperforms baseline SAC and contemporary distributional methods on robotic control tasks
  • Theoretical mechanism: confidence-driven updates reduce Q-value overestimation and improve convergence
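
For readers unfamiliar with the Cramér distance, the sketch below shows one common way to compute its squared form between two categorical distributions that share the same equally spaced atoms: sum the squared differences of their cumulative distribution functions and scale by the atom spacing. This is only an illustration of the distance itself, not the C-DSAC loss from the paper; the function name `cramer_distance_sq` and the parameter `delta_z` are hypothetical, and some authors reserve the name for the square root of this quantity.

```python
from itertools import accumulate


def cramer_distance_sq(p, q, delta_z=1.0):
    """Squared Cramér distance between two categorical distributions.

    Assumes p and q are probability vectors over the same equally spaced
    atoms, with spacing delta_z (an illustrative parameter name).
    """
    if len(p) != len(q):
        raise ValueError("p and q must be defined on the same atoms")
    # Running sums give the CDF of each distribution at every atom.
    cdf_p = accumulate(p)
    cdf_q = accumulate(q)
    # Sum of squared CDF gaps, scaled by the atom spacing.
    return delta_z * sum((a - b) ** 2 for a, b in zip(cdf_p, cdf_q))


# Identical distributions are at distance zero.
same = cramer_distance_sq([0.2, 0.5, 0.3], [0.2, 0.5, 0.3])

# Moving all mass one atom over costs less than moving it two atoms over.
near = cramer_distance_sq([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
far = cramer_distance_sq([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
```

Because the distance is built from CDF differences, it grows with how far probability mass has to move along the return axis, which is a property that bin-wise measures such as KL divergence lack.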

Generated with AI, which can make mistakes.
