✨ TL;DR
This paper introduces Neural Indicator Sampling (NI Sampling), a method that optimizes the token sampling order in discrete diffusion language models. It reports up to 14.3× speedup over full-step sampling with negligible loss in accuracy. A trained neural indicator, rather than a heuristic, selects which tokens to sample at each step, which reduces the number of sampling iterations required.
Discrete diffusion language models (dLLMs) can generate tokens in arbitrary order and have the potential to decode several tokens in parallel, which autoregressive models cannot do. However, existing sampling strategies are inefficient: at each step they sample only a small subset of tokens, chosen by heuristics. The core challenge is to determine the order in which to sample tokens so that the total number of sampling iterations is minimized while generation quality is maintained.
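To make the heuristic baseline concrete, here is a minimal sketch of confidence-threshold sampling, the baseline named in the results. Everything in it is an assumption for illustration: `toy_predict` is a hand-written stand-in for a dLLM forward pass, the token values and confidences are invented, and `threshold_sample` is a hypothetical helper, not code from the paper.

```python
from typing import Callable, List, Optional, Tuple

Seq = List[Optional[int]]        # None marks a still-masked position
Preds = List[Tuple[int, float]]  # (predicted token, confidence) per position


def toy_predict(seq: Seq) -> Preds:
    """Toy stand-in for a dLLM (assumption, not a real model).

    Even positions are always confident; odd positions become confident
    only after their left neighbour has been committed.
    """
    preds = []
    for i in range(len(seq)):
        confident = i % 2 == 0 or seq[i - 1] is not None
        preds.append((100 + i, 0.95 if confident else 0.5))
    return preds


def threshold_sample(
    predict: Callable[[Seq], Preds], length: int, threshold: float
) -> Tuple[List[int], int]:
    """Commit every masked position whose confidence reaches `threshold`.

    If no position qualifies, commit the single most confident one so the
    loop always makes progress. Returns the sequence and the step count.
    """
    seq: Seq = [None] * length
    steps = 0
    while any(tok is None for tok in seq):
        preds = predict(seq)
        masked = [i for i, tok in enumerate(seq) if tok is None]
        chosen = [i for i in masked if preds[i][1] >= threshold]
        if not chosen:
            chosen = [max(masked, key=lambda i: preds[i][1])]
        for i in chosen:
            seq[i] = preds[i][0]
        steps += 1
    return seq, steps


# A threshold no position can reach degenerates to one token per step.
_, one_per_step = threshold_sample(toy_predict, 8, threshold=0.99)
# A reachable threshold commits several tokens per step.
_, thresholded = threshold_sample(toy_predict, 8, threshold=0.9)
print(one_per_step, thresholded)
```

In this toy setting the threshold is the only knob trading steps against the risk of committing a wrong token, which is the accuracy-step trade-off the summary refers to.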
The paper proposes Neural Indicator Sampling (NI Sampling), a framework that uses a learned neural indicator to optimize token sampling order. The key insight is that fully leveraging the correct predictions available at each step can dramatically reduce the number of sampling iterations. Instead of relying on heuristics, the neural indicator is trained to decide which tokens should be sampled at each step. To train it, the authors introduce a trajectory-preserving objective, which is intended to keep the learned sampling strategy from degrading the quality of the generation process. The approach is designed to be general and applicable across different discrete diffusion models.
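The idea of an indicator trained against a reference trajectory can be sketched roughly as follows. This is a toy under stated assumptions, not the paper's method: the summary does not give the objective's form, so the labeling rule here (a position is labeled 1 when its current prediction already equals the token in the one-token-per-step reference output) is a guess at what "trajectory-preserving" could mean. The "indicator" is a lookup table over two invented features rather than a neural network, and `toy_predict`, `features`, `fit_indicator`, and `sample` are all hypothetical names.

```python
from typing import Callable, Dict, List, Optional, Tuple

Seq = List[Optional[int]]        # None marks a still-masked position
Preds = List[Tuple[int, float]]  # (predicted token, confidence) per position
Indicator = Callable[[Seq, Preds, int], bool]  # hypothetical interface


def toy_predict(seq: Seq) -> Preds:
    """Toy stand-in for a dLLM: odd positions predict a wrong token (-1)
    until their left neighbour is committed. Confidence is deliberately
    uninformative (always 0.6), so a threshold cannot separate them."""
    return [
        (100 + i if i % 2 == 0 or seq[i - 1] is not None else -1, 0.6)
        for i in range(len(seq))
    ]


def sample(predict, indicator: Indicator, length: int) -> Tuple[List[int], int]:
    """Commit every masked position the indicator accepts; if it accepts
    none, fall back to the single most confident position."""
    seq: Seq = [None] * length
    steps = 0
    while any(tok is None for tok in seq):
        preds = predict(seq)
        masked = [i for i, tok in enumerate(seq) if tok is None]
        chosen = [i for i in masked if indicator(seq, preds, i)]
        if not chosen:
            chosen = [max(masked, key=lambda i: preds[i][1])]
        for i in chosen:
            seq[i] = preds[i][0]
        steps += 1
    return seq, steps


def trajectory_labels(predict, reference: List[int], seq: Seq) -> Dict[int, int]:
    """Assumed labeling rule: 1 if the current prediction for a masked
    position already matches the reference output, else 0."""
    preds = predict(seq)
    return {i: int(preds[i][0] == reference[i])
            for i, tok in enumerate(seq) if tok is None}


def features(seq: Seq, i: int) -> Tuple[bool, bool]:
    # Invented features; a real indicator would more plausibly read model states.
    return (i % 2 == 0, i > 0 and seq[i - 1] is not None)


def fit_indicator(predict, reference: List[int]) -> Indicator:
    """'Train' a lookup table: accept a feature pattern only if every
    labeled example with that pattern matched the reference."""
    table: Dict[Tuple[bool, bool], bool] = {}
    seq: Seq = [None] * len(reference)
    for i in range(len(reference)):  # replay states with the first i tokens committed
        for j, label in trajectory_labels(predict, reference, seq).items():
            key = features(seq, j)
            table[key] = table.get(key, True) and bool(label)
        seq[i] = reference[i]
    return lambda s, p, j: table.get(features(s, j), False)


# Reference trajectory: an indicator that never accepts gives one token per step.
reference, full_steps = sample(toy_predict, lambda s, p, j: False, 8)
learned = fit_indicator(toy_predict, reference)
fast_seq, fast_steps = sample(toy_predict, learned, 8)
print(full_steps, fast_steps, fast_seq == reference)
```

In this toy the fitted indicator reproduces the reference output in fewer steps, whereas an indicator that accepts on confidence alone (for example `lambda s, p, j: p[j][1] >= 0.5`) commits the wrong odd-position tokens in a single step. That contrast is only meant to illustrate why a learned selector might beat a threshold; it says nothing about the paper's actual architecture or results.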
What the paper shows.
Experiments on the LLaDA and Dream models across multiple benchmarks show that NI Sampling achieves up to 14.3× acceleration over full-step sampling with negligible performance degradation. The method also consistently outperforms confidence threshold sampling across different points on the accuracy-step trade-off. These results suggest that the learned neural indicator identifies which tokens can be reliably predicted early in the sampling process, which substantially reduces the number of sampling steps needed for high-quality generation.
The paper does not discuss limitations in detail. Implicit limitations may include the computational overhead of training the neural indicator, possible difficulty generalizing across text generation tasks or domains, and dependence on the quality of the underlying discrete diffusion model. The method requires training beyond the base diffusion model, which adds complexity to the overall pipeline. How well the approach scales to very large models or extremely long sequences is not thoroughly explored.
✨ Generated by Claude · Apr 21, 2026 · Read the PDF for authoritative content.