SalesSim: Benchmarking and Aligning Multimodal Language Models as Retail User Simulators

Short summary

SalesSim is a benchmarking framework for evaluating how well multimodal language models simulate realistic customer behavior in retail conversations. Testing reveals significant behavioral gaps: models achieve less than 79% alignment with persona specifications and show lower lexical diversity than humans. UserGRPO, a reinforcement learning approach, improves decision alignment by 13.8% while maintaining conversational quality.

• New benchmarking framework (SalesSim) for evaluating multimodal language models (MLLMs) as retail customer simulators with persona-driven behavior
• Current MLLMs struggle with decision alignment: they drift from persona specifications and are too easily persuaded by sales agents
• UserGRPO, a reinforcement learning approach, improves decision alignment by 13.8% while maintaining fluent conversation quality

Generated with AI, which can make mistakes.

#ai-tools #ai-agents #research-breakthrough

Read full article at arXiv cs.CL

