Dev.to
5/9/2026

How a $0.02/Call Model Scored 78.2% on SWE-bench Verified — Beating Every Model on the Leaderboard
Short summary
Xanther's MCP-based Context Engine enables MiniMax M2.5 ($0.02 per call) to reach 78.2% on SWE-bench Verified, ahead of Claude Opus 4.5 at 76.8% ($0.75). The improvement is attributed entirely to architectural context injection rather than to model capability. Gains correlate with codebase complexity (sympy +17%, pytest +8%), and the setup is reported to deliver 76%+ performance at 3.4x lower cost.
- MiniMax M2.5 + Xanther Context Engine scores 78.2% on SWE-bench Verified, beating Claude Opus 4.5 (76.8%) at 3.4x lower cost
- Context advantage is architectural: understanding codebase dependencies and inheritance chains enables better bug fixes
- Benefit scales with complexity: deep-dependency codebases (sympy +17%) gain more than flat ones (pytest +8%)
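
To make the "inheritance chains" point concrete, here is a minimal Python sketch of what injecting structural context into a prompt could look like. It is not Xanther's implementation, which the summary does not describe. The helper names (`inheritance_context`, `build_prompt`) and the toy classes are hypothetical, and only the standard-library `inspect` module is used.

```python
import inspect


def inheritance_context(cls: type) -> str:
    """Summarize a class's inheritance chain as plain text.

    Hypothetical helper: walks the method resolution order and lists
    the methods each ancestor defines itself (not inherited ones).
    """
    lines = []
    for ancestor in inspect.getmro(cls):
        if ancestor is object:
            continue  # the universal base class adds no useful context
        own_methods = sorted(
            name
            for name, member in vars(ancestor).items()
            if inspect.isfunction(member)
        )
        lines.append(f"{ancestor.__name__}: {', '.join(own_methods) or '(no methods)'}")
    return "\n".join(lines)


# Toy classes standing in for a real codebase (assumption for illustration).
class Expr:
    def simplify(self):
        return self


class Add(Expr):
    def evaluate(self):
        return 0


def build_prompt(issue_text: str, cls: type) -> str:
    """Prepend the inheritance summary to the issue description."""
    return f"Inheritance chain:\n{inheritance_context(cls)}\n\nIssue:\n{issue_text}"


if __name__ == "__main__":
    print(build_prompt("Add.simplify() returns the wrong type", Add))
```

In this sketch the model sees that `simplify` is defined on `Expr` rather than on `Add`, which is the kind of dependency information the summary credits for better fixes in deep hierarchies such as sympy.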