✨ TL;DR
Distinct Leaf Enumeration (DLE) is a deterministic decoding method that systematically explores distinct reasoning paths in a truncated decoding tree instead of sampling with replacement, improving inference efficiency and performance on math, coding, and reasoning tasks.
Self-consistency sampling improves inference performance by generating multiple reasoning traces and voting on answers. However, this approach is computationally inefficient in constrained domains like mathematics and code because it repeatedly samples the same high-probability prefixes and generates duplicate completions, wasting compute budget on redundant exploration.
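The redundancy described above can be illustrated with a minimal sketch. It assumes a hypothetical toy distribution over four reasoning traces (the `TRACES` table, the weights, and the `self_consistency` helper are illustrative and not taken from the paper) and shows that sampling with replacement must produce duplicates once the number of samples exceeds the number of plausible traces.

```python
import random
from collections import Counter

# Hypothetical toy distribution: (reasoning trace, final answer, probability).
# In a constrained domain, a few traces carry most of the probability mass.
TRACES = [
    ("trace-1", "42", 0.55),
    ("trace-2", "42", 0.25),
    ("trace-3", "41", 0.15),
    ("trace-4", "7", 0.05),
]

def self_consistency(n_samples, seed=0):
    """Sample traces with replacement, then majority-vote on the final answer."""
    rng = random.Random(seed)
    weights = [w for _, _, w in TRACES]
    samples = rng.choices(TRACES, weights=weights, k=n_samples)
    votes = Counter(answer for _, answer, _ in samples)
    distinct = len({trace for trace, _, _ in samples})
    winner = votes.most_common(1)[0][0]
    return winner, distinct

if __name__ == "__main__":
    winner, distinct = self_consistency(16)
    # At most 4 distinct traces exist, so at least 12 of 16 samples are repeats.
    print(f"voted answer: {winner}, distinct traces: {distinct} of 16 samples")
```

Every repeated trace costs a full generation but adds no new candidate, which is the waste that enumeration without replacement is meant to avoid.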
Distinct Leaf Enumeration (DLE) treats truncated sampling as traversal of a pruned decoding tree and deterministically enumerates distinct leaf nodes rather than sampling with replacement. The method improves efficiency in two ways: algorithmically, by systematically exploring previously unvisited high-probability branches to increase coverage of the search space, and at the systems level, by reusing shared prefixes to reduce redundant token generation.
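The idea of enumerating distinct leaves can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the toy model table, the probability-threshold truncation rule (`min_prob`), and the best-first visiting order are all choices made here for clarity, and every name is hypothetical.

```python
import heapq
import math

EOS = "<eos>"

# Hypothetical toy "model": prefix (tuple of tokens) -> next-token probabilities.
# A real system would query a language model instead.
TOY_MODEL = {
    (): {"A": 0.6, "B": 0.35, "C": 0.05},
    ("A",): {"x": 0.7, "y": 0.3},
    ("B",): {"x": 0.9, "z": 0.1},
    ("C",): {"x": 1.0},
}

def next_token_probs(prefix):
    # Prefixes missing from the toy table terminate the sequence.
    return TOY_MODEL.get(prefix, {EOS: 1.0})

def truncate(probs, min_prob):
    # Assumed truncation rule: drop tokens below a probability threshold.
    return {tok: p for tok, p in probs.items() if p >= min_prob}

def enumerate_distinct_leaves(budget, min_prob=0.1):
    """Visit up to `budget` distinct leaves of the pruned tree, best-first.

    Each prefix is expanded (one model call) at most once, so sequences that
    share a prefix share its cost. Returned probabilities are products of the
    untruncated token probabilities (not renormalised).
    """
    heap = [(0.0, ())]  # entries are (negative log-probability, prefix)
    leaves = []
    expansions = 0
    while heap and len(leaves) < budget:
        neg_logp, prefix = heapq.heappop(heap)
        if prefix and prefix[-1] == EOS:
            leaves.append((prefix[:-1], math.exp(-neg_logp)))
            continue
        expansions += 1
        for tok, p in truncate(next_token_probs(prefix), min_prob).items():
            heapq.heappush(heap, (neg_logp - math.log(p), prefix + (tok,)))
    return leaves, expansions

if __name__ == "__main__":
    leaves, expansions = enumerate_distinct_leaves(budget=4)
    for seq, prob in leaves:
        print(seq, round(prob, 3))
    print("model calls:", expansions)
```

In this toy tree the branch `C` is pruned by truncation, and the four surviving leaves are each visited exactly once. Because the prefixes `A` and `B` are expanded a single time and shared by their children, the enumeration needs fewer model calls than four independent samples, each of which would start again from the root.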
What the paper shows.
DLE outperforms stochastic self-consistency on math, coding, and general reasoning tasks by exploring higher-quality reasoning traces. The method achieves better performance while maintaining or improving computational efficiency through reduced redundant token generation and more systematic coverage of the truncated search space.
The paper does not explicitly discuss limitations, though the approach appears tailored to constrained domains with structured outputs (math, code) where duplicate completions are common. Applicability to open-ended generation tasks or domains with high output diversity is not addressed. The specific computational savings and performance gains are not quantified with detailed metrics or comparisons across different budget constraints.
✨ Generated by Claude · Apr 25, 2026 · Read the PDF for authoritative content.