Dev.to
5/9/2026
How a $0.02/Call Model Scored 78.2% on SWE-bench Verified — Beating Every Model on the Leaderboard

Short summary

Xanther's MCP-based Context Engine lets MiniMax M2.5 ($0.02 per call) score 78.2% on SWE-bench Verified, ahead of Claude Opus 4.5 at 76.8% ($0.75). The improvement is attributed entirely to architectural context injection, not to model capability. The gains grow with codebase complexity (sympy +17%, pytest +8%), and the result is 76%+ performance at 3.4x lower cost.

  • MiniMax M2.5 + Xanther Context Engine scores 78.2% on SWE-bench Verified, beating Claude Opus 4.5 (76.8%) at 3.4x lower cost
  • Context advantage is architectural: understanding codebase dependencies and inheritance chains enables better bug fixes
  • Benefit scales with complexity: deep-dependency codebases (sympy +17%) gain more than flat ones (pytest +8%)
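
To make "understanding codebase dependencies and inheritance chains" concrete, here is a minimal sketch of how a tool might extract an inheritance chain from Python source and turn it into text that could be injected into a model's prompt. This is not Xanther's implementation, which the summary does not describe. The function names (`class_bases`, `inheritance_chain`, `build_context`) and the toy `Shape`/`Polygon`/`Square` classes are hypothetical, and the sketch assumes Python 3.9+ for `ast.unparse`.

```python
import ast
from typing import Dict, List


def class_bases(source: str) -> Dict[str, List[str]]:
    """Map each class defined in `source` to the names of its direct bases."""
    tree = ast.parse(source)
    bases: Dict[str, List[str]] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            bases[node.name] = [ast.unparse(b) for b in node.bases]
    return bases


def inheritance_chain(name: str, bases: Dict[str, List[str]]) -> List[str]:
    """Follow the first base class upward until the chain leaves the known classes."""
    chain = [name]
    seen = {name}
    while bases.get(chain[-1]):
        parent = bases[chain[-1]][0]
        chain.append(parent)
        # Stop on a cycle or when the parent is defined outside this source.
        if parent in seen or parent not in bases:
            break
        seen.add(parent)
    return chain


def build_context(target: str, source: str) -> str:
    """Format the chain as a short line of text suitable for a prompt (hypothetical format)."""
    chain = inheritance_chain(target, class_bases(source))
    return "Inheritance chain for {}: {}".format(target, " -> ".join(chain))


# Toy stand-in for a real codebase; the class names are illustrative only.
EXAMPLE = '''
class Shape: ...
class Polygon(Shape): ...
class Square(Polygon): ...
'''

print(build_context("Square", EXAMPLE))
# prints: Inheritance chain for Square: Square -> Polygon -> Shape
```

A model asked to fix a bug in `Square` would see that the behavior may be inherited from `Polygon` or `Shape`. The key points above suggest this kind of information matters more in deep hierarchies than in flat ones.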

Generated with AI, which can make mistakes.
