Dev.to
5/13/2026

I Was Calling It 'Setup' for Six Months. arXiv Has a Better Word: Harness
Short summary
An engineer discovers an arXiv paper "Natural-Language Agent Harnesses" formalizing what they'd been calling "the setup" for six months. The paper identifies four structural primitives—roles, contracts, verification gates, delegation boundaries—and argues that natural-language specs remain valuable even as models improve. The author restructures their agent harness accordingly, finding verification gates the weakest primitive needing explicit definition.
- •ArXiv paper formalizes "harness" as a first-class concept in agent engineering with four key primitives: roles, contracts, verification gates, delegation boundaries
- •Natural-language specs are durable across model upgrades because they define requirements rather than one-shot prompts
- •Author restructures CLAUDE.md and orchestration scripts to formalize these primitives, discovering verification gates was their most neglected pattern
Generated with AI, which can make mistakes.
Is this a good recommendation for you?



