arXiv cs.LG
5/13/2026

LEAP: Unlocking dLLM Parallelism via Lookahead Early-Convergence Token Detection
Short summary
LEAP is a training-free method that identifies tokens that converge early during diffusion language model (dLLM) denoising. It reduces denoising steps by ~30% compared to confidence-based methods and reaches 7.2 tokens per step on GSM8K while preserving accuracy. Because it does not rely on high confidence thresholds, it enables faster parallel decoding without retraining.
- LEAP detects early-converging tokens in diffusion LLM denoising via future context filtering and multi-sequence superposition
- Reduces denoising steps ~30% compared to confidence-based methods while maintaining model precision
- Achieves 7.2 tokens per step on benchmarks, making parallel decoding practical without confidence constraints
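For context, the confidence-based baseline that the summary compares against can be sketched roughly as follows. This is not LEAP itself, whose filtering and superposition steps are not described here. It is a minimal illustration of threshold-gated parallel unmasking, assuming each masked position exposes a predicted probability distribution. The function name `select_confident`, the toy distributions, and the thresholds are all hypothetical.

```python
def select_confident(probs, threshold):
    """Pick masked positions to commit in one denoising step.

    probs: dict mapping masked position -> list of token probabilities
           (a toy stand-in for model output; an assumption of this sketch).
    threshold: minimum top-token probability required to commit a position.
    """
    # Confidence of a position is the probability of its most likely token.
    confidence = {pos: max(dist) for pos, dist in probs.items()}
    chosen = [pos for pos, c in confidence.items() if c >= threshold]
    # Always commit at least one position so decoding makes progress.
    if not chosen:
        chosen = [max(confidence, key=confidence.get)]
    return sorted(chosen)


# Toy distributions over a 2-token vocabulary for three masked positions.
toy_probs = {0: [0.95, 0.05], 1: [0.60, 0.40], 2: [0.91, 0.09]}

strict = select_confident(toy_probs, threshold=0.99)   # high threshold
relaxed = select_confident(toy_probs, threshold=0.90)  # lower threshold
print(len(strict), len(relaxed))
```

In this toy setup a higher threshold commits fewer positions per step, which is the kind of constraint the summary says LEAP avoids.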
Generated with AI, which can make mistakes.