Where Reliability Lives in Vision-Language Models: A Mechanistic Study of Attention, Hidden States, and Causal Circuits

Short summary

Attention sharpness does not predict VLM correctness in 3B to 7B parameter models (R_pb≈0.001), whereas hidden-state geometry and self-consistency are strong predictors (R_pb=0.43). Causal ablations show that late-fusion architectures concentrate reliability in fragile bottlenecks, while early-fusion models distribute robustness across layers. The practical takeaway: monitor hidden states and sparse circuits rather than attention maps.

• Attention structure is a near-zero predictor of correctness across LLaVA-1.5, PaliGemma, and Qwen2-VL
• Self-consistency and hidden-state geometry are 10-430x better predictors of VLM reliability
• Late-fusion models have fragile late bottlenecks; early-fusion models absorb neuron ablations with minimal degradation
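
To make the R_pb figures above concrete, here is a minimal sketch of how such a score could be computed, assuming R_pb denotes the point-biserial correlation between a continuous per-example signal (for instance an attention-sharpness or self-consistency score) and a binary correct/incorrect label. The function name `point_biserial` and the toy numbers are illustrative only; they are not taken from the paper and do not reproduce its results.

```python
import math


def point_biserial(scores, correct):
    """Point-biserial correlation between a continuous score and a 0/1 label.

    scores  : list of floats, one signal value per example (hypothetical signal)
    correct : list of 0/1 flags, 1 if the model answered that example correctly
    """
    if len(scores) != len(correct):
        raise ValueError("scores and correct must have the same length")

    ones = [s for s, c in zip(scores, correct) if c]
    zeros = [s for s, c in zip(scores, correct) if not c]
    n, n1, n0 = len(scores), len(ones), len(zeros)
    if n1 == 0 or n0 == 0:
        raise ValueError("need at least one correct and one incorrect example")

    # Population standard deviation of the score over all examples.
    mean_all = sum(scores) / n
    std_all = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    if std_all == 0.0:
        return 0.0  # a constant score carries no information about correctness

    # Difference of group means, scaled by the spread and the class balance.
    m1 = sum(ones) / n1
    m0 = sum(zeros) / n0
    return (m1 - m0) / std_all * math.sqrt(n1 * n0) / n


# Made-up toy data: one signal that separates correct from incorrect answers,
# and one whose group means are identical.
labels = [1, 1, 0, 0]
informative = point_biserial([0.9, 0.8, 0.2, 0.1], labels)
uninformative = point_biserial([0.5, 0.1, 0.5, 0.1], labels)
print(f"informative signal:   {informative:.3f}")
print(f"uninformative signal: {uninformative:.3f}")
```

A value near zero means the signal's average is about the same for correct and incorrect answers, which is the sense in which the summary calls attention structure a near-zero predictor. This is equivalent to the Pearson correlation with a 0/1 variable, so a library routine could be substituted where one is available.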

Generated with AI, which can make mistakes.


Read full article at arXiv CS.AI
