arXiv cs.CL
6/17/2026

Self-Generated Error Training for Token Editing in Diffusion Language Models
Short summary
This work addresses the training-inference mismatch in token editing for diffusion language models by training on self-generated errors, using LLaDA2.1. LoRA continued pretraining improves decoding accuracy while reducing edit intensity. The method also mitigates failure modes, including digit transcription errors and excessive self-correction.
- Self-generated error training fixes the mismatch between training on random corruptions and inference on model-generated errors
- LoRA continued pretraining on LLaDA2.1-mini improves accuracy and reduces token-editing intensity
- Addresses specific failure modes: digit transcription errors and excessive self-correction before factual answers
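
The mismatch named in the first bullet can be illustrated with a toy sketch. This is not the paper's implementation: the function names, the `<mask>` token, and the stand-in model `toy_predict` are all hypothetical, and a real setup would call the diffusion language model where `predict` appears. The sketch only contrasts two ways of building corrupted inputs for an editing objective: substituting random vocabulary tokens, versus masking positions and keeping whatever the model itself fills in.

```python
import random


def random_corrupt(tokens, vocab, rate, rng):
    """Random corruption: swap each token for a random vocabulary token
    with probability `rate`."""
    return [rng.choice(vocab) if rng.random() < rate else t for t in tokens]


def self_generated_corrupt(tokens, predict, mask_rate, rng, mask="<mask>"):
    """Self-generated corruption (assumed scheme, for illustration only):
    mask some positions, let the model fill them, and keep its output.
    Wherever the fill differs from the reference, the error is one the
    model actually makes, not a random substitution."""
    masked = [mask if rng.random() < mask_rate else t for t in tokens]
    filled = predict(masked)
    return [f if m == mask else t for t, m, f in zip(tokens, masked, filled)]


def editing_pairs(corrupted, reference):
    """List (position, wrong_token, target_token) for every token the
    editing objective should learn to correct."""
    return [
        (i, c, r)
        for i, (c, r) in enumerate(zip(corrupted, reference))
        if c != r
    ]


def toy_predict(masked, mask="<mask>"):
    """Hypothetical stand-in for a model: fills every masked slot with '1'."""
    return ["1" if t == mask else t for t in masked]
```

With a real model in place of `toy_predict`, the pairs returned by `editing_pairs` would concentrate on the model's own typical mistakes (for example, miscopied digits), which is the kind of input the editor sees at inference time.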
Generated with AI, which can make mistakes.