✨ TL;DR
This paper proposes Tokenised Flow Matching for Posterior Estimation (TFMPE), a method that reduces simulator evaluation costs in hierarchical simulation-based inference by learning per-site neural surrogates and assembling synthetic multi-site observations from them. The approach is validated on infectious disease and computational fluid dynamics models, where it lowers computational cost relative to existing hierarchical SBI methods.
Simulation-based inference (SBI) is computationally expensive because it requires many simulator evaluations. In hierarchical settings with shared global parameters and site-level observations, existing SBI methods still require simulating across multiple sites per training sample, which is inefficient. There is a need to exploit the hierarchical structure to reduce the number of required simulator calls while maintaining inference quality.
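To make the structure concrete, a generic hierarchical model of this kind can be written as below. The notation is assumed here for illustration and may differ from the paper's: $\theta$ denotes the shared global parameters, $\phi_i$ the parameters of site $i$, $y_i$ the observation at site $i$, and $N$ the number of sites.

```latex
p(\theta, \phi_{1:N} \mid y_{1:N})
  \;\propto\;
  p(\theta) \prod_{i=1}^{N} p(\phi_i \mid \theta)\, p(y_i \mid \theta, \phi_i)
```

Under this form, drawing one joint training sample $(\theta, \phi_{1:N}, y_{1:N})$ by direct simulation takes $N$ simulator calls, one per site. Each likelihood term $p(y_i \mid \theta, \phi_i)$ involves only a single site, and that is the structure a factorised approach can exploit.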
The paper proposes likelihood factorisation (LF), which trains from single-site simulations rather than multi-site ones. The method learns a per-site neural surrogate of the simulator, then assembles synthetic multi-site observations from it to amortise inference for the full hierarchical posterior. Building on this foundation, TFMPE applies tokenised flow matching on top of likelihood factorisation to support function-valued observations, enabling efficient hierarchical posterior estimation.
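The single-site training and assembly steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the toy simulator, the Gaussian priors, and the names `simulate_site`, `surrogate_sample`, and `assemble` are all hypothetical, and a least-squares Gaussian model stands in for the neural surrogate.

```python
import numpy as np

rng = np.random.default_rng(0)
calls = {"n": 0}  # counts calls to the (toy) expensive simulator


def simulate_site(theta, phi):
    """Hypothetical single-site simulator: y = theta + phi + noise."""
    calls["n"] += 1
    return theta + phi + 0.1 * rng.normal()


# 1) Single-site training set: exactly one simulator call per sample.
n_train = 500
theta_tr = rng.normal(size=n_train)                  # global parameter draws
phi_tr = theta_tr + 0.5 * rng.normal(size=n_train)   # site parameter given theta
y_tr = np.array([simulate_site(t, p) for t, p in zip(theta_tr, phi_tr)])

# 2) Per-site surrogate of p(y | theta, phi). A linear-Gaussian fit is used
#    here as a stand-in for the neural surrogate (an assumption of this sketch).
X = np.column_stack([theta_tr, phi_tr, np.ones(n_train)])
coef, *_ = np.linalg.lstsq(X, y_tr, rcond=None)
resid_std = np.std(y_tr - X @ coef)


def surrogate_sample(theta, phi):
    """Draw y from the fitted surrogate; theta broadcasts against phi."""
    mean = coef[0] * theta + coef[1] * phi + coef[2]
    return mean + resid_std * rng.normal(size=np.shape(phi))


# 3) Assemble synthetic multi-site observations with no further simulator calls.
def assemble(n_sets, n_sites):
    theta = rng.normal(size=(n_sets, 1))                     # shared across sites
    phi = theta + 0.5 * rng.normal(size=(n_sets, n_sites))   # one per site
    y = surrogate_sample(theta, phi)                         # shape (n_sets, n_sites)
    return theta[:, 0], phi, y


theta_syn, phi_syn, y_syn = assemble(n_sets=1000, n_sites=8)
```

In this sketch the simulator is called only during step 1, so the 1000 eight-site training sets cost no additional simulations. An amortised posterior estimator would then be trained on `(theta_syn, phi_syn, y_syn)`. That final step, which TFMPE handles with tokenised flow matching, is not shown here.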
What the paper shows.
TFMPE produces well-calibrated posteriors while significantly reducing computational cost compared to existing hierarchical SBI methods. The approach is validated on a newly introduced benchmark for hierarchical SBI and on realistic infectious disease and computational fluid dynamics models, demonstrating practical applicability across different domains.
The paper focuses on hierarchical settings with exchangeable site-level parameters and observations, which may limit applicability to non-hierarchical or non-exchangeable problems. The quality of the learned neural surrogates depends on their training, and the approach's performance on very high-dimensional or highly nonlinear simulators is not thoroughly explored. The computational savings are relative to existing hierarchical SBI methods, but absolute computational requirements for training surrogates are not extensively discussed.
✨ Generated by Claude · Apr 25, 2026 · Read the PDF for authoritative content.