arXiv cs.CL
5/12/2026
SalesSim: Benchmarking and Aligning Multimodal Language Models as Retail User Simulators

Short summary

SalesSim is a benchmarking framework for evaluating how well multimodal language models simulate realistic customer behavior in retail conversations. Testing revealed significant behavioral gaps: models achieve less than 79% alignment with their persona specifications and show lower lexical diversity than humans. UserGRPO, a reinforcement learning approach, improves decision alignment by 13.8% while maintaining conversational quality.

  • SalesSim is a new benchmarking framework for evaluating multimodal language models as retail customer simulators with persona-driven behavior
  • Current multimodal language models struggle with decision alignment: they drift from their persona specifications and are too easily persuaded by sales agents
  • UserGRPO, a reinforcement learning approach, improves decision alignment by 13.8% while maintaining fluent conversation quality
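
The summary does not say how decision alignment or lexical diversity are computed, so the following is only a rough sketch of what such measurements could look like. It assumes that decision alignment is the share of simulated decisions matching what the persona specifies, and that lexical diversity is measured with distinct-n (unique n-grams divided by total n-grams), one common choice. The function names and the "buy"/"decline" labels are hypothetical and are not taken from SalesSim.

```python
def decision_alignment(expected, simulated):
    """Share of simulated decisions that match the persona-specified ones.

    Assumption: each decision is a simple label such as "buy" or "decline".
    """
    if len(expected) != len(simulated):
        raise ValueError("expected and simulated must have the same length")
    if not expected:
        return 0.0
    matches = sum(e == s for e, s in zip(expected, simulated))
    return matches / len(expected)


def distinct_n(utterances, n=2):
    """Distinct-n: unique n-grams divided by total n-grams over all utterances.

    Assumption: whitespace tokenization on lowercased text is good enough
    for illustration; a real evaluation would likely use a proper tokenizer.
    """
    unique = set()
    total = 0
    for text in utterances:
        tokens = text.lower().split()
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    return len(unique) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical persona-specified decisions vs. what a simulator did.
    expected = ["buy", "decline", "buy"]
    simulated = ["buy", "buy", "buy"]  # the simulator was persuaded once
    print(decision_alignment(expected, simulated))

    # Hypothetical simulated customer turns.
    turns = ["i want a red shoe", "i want a blue shoe"]
    print(distinct_n(turns, n=1), distinct_n(turns, n=2))
```

Under these assumptions, a simulator that gives in to the sales agent when its persona would decline lowers the first score, and a simulator that repeats the same phrasing across turns lowers the second.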

Generated with AI, which can make mistakes.
