arXiv cs.CL
5/12/2026

SalesSim: Benchmarking and Aligning Multimodal Language Models as Retail User Simulators
Short summary
SalesSim is a benchmarking framework for evaluating how well multimodal language models simulate realistic customer behavior in retail conversations. Testing reveals significant behavioral gaps: models reach less than 79% alignment with persona specifications and show lower lexical diversity than humans. UserGRPO, a reinforcement learning approach, improves decision alignment by 13.8% while maintaining conversational quality.
- New benchmarking framework (SalesSim) for evaluating MLLMs as retail customer simulators with persona-driven behavior
- Current MLLMs struggle with decision alignment: they drift from persona specifications and are too easily persuaded by sales agents
- UserGRPO, an RL-based approach, improves decision alignment by 13.8% while maintaining fluent conversation quality
Generated with AI, which can make mistakes.