arXiv cs.CL
6/17/2026
Self-Generated Error Training for Token Editing in Diffusion Language Models

Short summary

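This work addresses a training-inference mismatch in token editing for diffusion language models: the editing ability is trained on random corruptions but applied at inference to errors the model itself generated. Training on self-generated errors, applied as LoRA continued pretraining on LLaDA2.1-mini, improves decoding accuracy while reducing edit intensity. The method mitigates failure modes including digit transcription errors and excessive self-correction before factual answers.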

  • Self-generated training approach fixes mismatch between training on random corruptions and inference on model-generated errors
  • LoRA continued-pretraining on LLaDA2.1-mini improves accuracy and reduces token-editing intensity
  • Addresses specific failure modes: digit transcription errors and excessive self-correction before factual answers
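
The mismatch described in the first bullet can be illustrated with a toy sketch. This is not the paper's procedure, which the summary does not spell out; it only contrasts two ways of building inputs for an editing objective. All names here (`random_corruption`, `self_generated_corruption`, `make_toy_predict`, `edit_targets`) are hypothetical, and the "model" is a stand-in that deliberately gets digits wrong.

```python
import random

def random_corruption(tokens, vocab, rate, rng):
    """Baseline: swap each token for a random vocab token with probability `rate`."""
    return [rng.choice(vocab) if rng.random() < rate else t for t in tokens]

def self_generated_corruption(tokens, predict, mask_rate, rng, mask="<mask>"):
    """Mask some positions, let the model fill them in, and keep its
    predictions (including its mistakes) as the sequence to be edited."""
    positions = [i for i in range(len(tokens)) if rng.random() < mask_rate]
    masked = list(tokens)
    for i in positions:
        masked[i] = mask
    drafted = list(tokens)
    for i in positions:
        drafted[i] = predict(masked, i)
    return drafted

def edit_targets(corrupted, clean):
    """Per-position labels: True where an editor should change the token."""
    return [c != t for c, t in zip(corrupted, clean)]

def make_toy_predict(clean):
    """Hypothetical stand-in for a model: correct on words, off by one on digits."""
    def predict(masked, i):
        tok = clean[i]
        return str((int(tok) + 1) % 10) if tok.isdigit() else tok
    return predict

clean = ["the", "answer", "is", "4", "2"]
rng = random.Random(0)
drafted = self_generated_corruption(clean, make_toy_predict(clean), 1.0, rng)
labels = edit_targets(drafted, clean)
# drafted == ["the", "answer", "is", "5", "3"]
# labels  == [False, False, False, True, True]
```

In this toy setup the self-generated errors concentrate on digit tokens, whereas `random_corruption` would spread errors uniformly across positions. That difference in error distribution is the kind of gap the summary attributes to training on random corruptions.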

Generated with AI, which can make mistakes.
