AR
arXiv CS.AI
5/12/2026

Auto-Rubric as Reward: From Implicit Preferences to Explicit Multimodal Generative Criteria
Short summary
Auto-Rubric as Reward (ARR) converts implicit preference knowledge into explicit, interpretable rubrics for multimodal model alignment, addressing RLHF vulnerabilities like reward hacking. Rubric Policy Optimization (RPO) maintains multi-dimensional quality evaluation while stabilizing training. Outperforms pairwise reward models on text-to-image and image-editing tasks.
- •ARR externalizes preference knowledge as explicit rubrics instead of opaque scalar signals
- •Reduces vulnerabilities to reward hacking and positional bias in RLHF
- •RPO achieves stronger performance on multimodal generation benchmarks
Generated with AI, which can make mistakes.
Is this a good recommendation for you?