MemQ: Integrating Q-Learning into Self-Evolving Memory Agents over Provenance DAGs

Short summary

MemQ applies TD(λ) eligibility traces from reinforcement learning to episodic memory in LLM agents, propagating credit through provenance DAGs that record which memories enabled later ones. Across six benchmarks it achieves the highest success rates among the compared baselines, with the largest gains (+5.7 pp) on multi-step problems that require deep memory chains. Code will be released soon.

• Novel method (MemQ) for credit propagation through memory dependencies using TD(λ) traces
• Outperforms baselines across six benchmarks, including code generation, OS interaction, and QA
• Largest improvements (+5.7 pp) on multi-step tasks with deep provenance chains
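
To make the idea of trace-based credit propagation concrete, here is a minimal sketch in Python. It is not the paper's implementation, and the summary does not specify the actual update rule. Everything in it is an assumption made for illustration: the function name propagate_credit, the parents mapping, the choice of a shared error term computed from the retrieved memories, and the rule that a trace decays by gamma * lam per provenance hop (taking the strongest path when several exist).

```python
from collections import defaultdict


def propagate_credit(parents, q, used, reward, alpha=0.1, gamma=0.9, lam=0.8):
    """Hypothetical TD(lambda)-style update over a provenance DAG.

    parents: dict mapping a memory id to the memory ids it was derived from.
    q:       dict mapping a memory id to its current value estimate (updated in place).
    used:    memory ids retrieved in the episode that earned `reward`.
    """
    if not used:
        return q

    # Assumption: one shared error term, measured against the mean value
    # of the memories that were actually retrieved.
    baseline = sum(q.get(m, 0.0) for m in used) / len(used)
    delta = reward - baseline

    # Eligibility trace: 1.0 for retrieved memories, multiplied by
    # gamma * lam for every hop back along a provenance edge.
    trace = defaultdict(float)
    frontier = {m: 1.0 for m in used}
    while frontier:
        next_frontier = {}
        for node, e in frontier.items():
            if e <= trace[node]:
                continue  # a stronger or equal path already reached this node
            trace[node] = e
            for parent in parents.get(node, []):
                decayed = e * gamma * lam
                if decayed > next_frontier.get(parent, 0.0):
                    next_frontier[parent] = decayed
        frontier = next_frontier

    # Each ancestor receives a share of the error proportional to its trace.
    for node, e in trace.items():
        q[node] = q.get(node, 0.0) + alpha * delta * e
    return q


# Chain a -> b -> c: memory "c" was derived from "b", which was derived from "a".
chain = {"c": ["b"], "b": ["a"]}
values = propagate_credit(chain, {}, used=["c"], reward=1.0,
                          alpha=0.5, gamma=1.0, lam=0.5)
print(values)
```

In this sketch a memory three hops upstream of a successful retrieval still receives some credit, just less of it, which is one plausible reading of why deep provenance chains would benefit most from this kind of update.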

Generated with AI, which can make mistakes.

#ai-agents #research-breakthrough

Read full article at arXiv CS.AI

