arXiv cs.LG
6/17/2026
Correct When Paired, Wrong When Split: Decoupling and Editing Modality-Specific Neurons in MLLMs

Short summary

A paper posted to arXiv reports that knowledge edits in multimodal large language models (MLLMs) hold for paired text-image inputs but revert to the outdated fact when the same entity is queried through text or image alone. The authors attribute this to entity knowledge being stored in separate, modality-specific neural pathways rather than in a single unified representation. Their DECODE method explicitly targets these pathways so that an edit applies consistently across all input modalities.

  • Knowledge editing in MLLMs fails when multimodal inputs are split into text or image alone
  • Entity knowledge is distributed across decoupled modality-specific neural pathways
  • The DECODE method disentangles and targets modality-specific neurons so that edits stay consistent

Generated with AI, which can make mistakes.
