Back to feed
Dev.to
Dev.to
5/13/2026
I Was Calling It 'Setup' for Six Months. arXiv Has a Better Word: Harness

I Was Calling It 'Setup' for Six Months. arXiv Has a Better Word: Harness

Short summary

An engineer discovers an arXiv paper "Natural-Language Agent Harnesses" formalizing what they'd been calling "the setup" for six months. The paper identifies four structural primitives—roles, contracts, verification gates, delegation boundaries—and argues that natural-language specs remain valuable even as models improve. The author restructures their agent harness accordingly, finding verification gates the weakest primitive needing explicit definition.

  • ArXiv paper formalizes "harness" as a first-class concept in agent engineering with four key primitives: roles, contracts, verification gates, delegation boundaries
  • Natural-language specs are durable across model upgrades because they define requirements rather than one-shot prompts
  • Author restructures CLAUDE.md and orchestration scripts to formalize these primitives, discovering verification gates was their most neglected pattern

Generated with AI, which can make mistakes.

Is this a good recommendation for you?

Explore more