arXiv CS.AI
5/12/2026
MemQ: Integrating Q-Learning into Self-Evolving Memory Agents over Provenance DAGs

Short summary

MemQ applies TD(λ) eligibility traces from reinforcement learning to the episodic memory of LLM agents, propagating credit through provenance DAGs that record which earlier memories enabled later ones. Across six tasks it achieves the highest success rates among the methods compared, with the largest gains (+5.7pp) on multi-step problems that require deep memory chains. The authors say code will be released soon.

  • Novel method (MemQ) for credit propagation through memory dependencies using TD(λ) traces
  • Outperforms baselines across 6 benchmarks including code generation, OS interaction, and QA
  • Largest improvements (+5.7pp) on multi-step tasks with deep provenance chains
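
To make the credit-propagation idea concrete, here is a minimal sketch of trace-weighted updates over a provenance graph. It is an illustration only, not the paper's algorithm: the function name `propagate_credit`, the `parents` mapping, the single terminal TD error, and the values of `alpha`, `gamma`, and `lam` are all assumptions made for this example.

```python
from collections import defaultdict


def propagate_credit(parents, q, used, reward, alpha=0.5, gamma=0.9, lam=0.8):
    """Update Q-values of the memories used in an episode and of their ancestors.

    parents: hypothetical provenance map, memory id -> ids of the memories it was derived from
    q:       dict of current Q-values per memory id (missing ids count as 0.0)
    used:    non-empty list of memory ids retrieved in the episode
    """
    # Assumption: one terminal TD error, measured against the mean Q of the used memories.
    baseline = sum(q.get(m, 0.0) for m in used) / len(used)
    delta = reward - baseline

    # Eligibility is 1.0 for directly used memories and shrinks by
    # gamma * lam for every provenance hop toward older ancestors.
    trace = defaultdict(float)
    frontier = [(m, 1.0) for m in used]
    while frontier:
        node, e = frontier.pop()
        if e <= trace[node]:
            continue  # already reached through a path with an equal or stronger trace
        trace[node] = e
        for p in parents.get(node, ()):
            frontier.append((p, e * gamma * lam))

    # Apply the same TD error to every reached memory, scaled by its trace.
    for node, e in trace.items():
        q[node] = q.get(node, 0.0) + alpha * e * delta
    return q


# Chain a -> b -> c: memory "c" was derived from "b", which was derived from "a".
parents = {"c": ["b"], "b": ["a"]}
q = propagate_credit(parents, {}, used=["c"], reward=1.0)
print(q)
```

In this toy chain the directly used memory `c` receives the largest update and the most distant ancestor `a` the smallest, which is the behavior one would expect to matter most on tasks with deep provenance chains.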

Generated with AI, which can make mistakes.
