✨ TL;DR
BACO is a framework that compresses embedding tables in recommender systems by clustering similar users and items so that they share embeddings, achieving over 75% parameter reduction with at most a 1.85% drop in recall. It also runs up to 346X faster than the strongest baseline methods.
Modern recommender systems rely on dense embedding vectors for users and items, but at industrial scale these embedding tables require enormous numbers of parameters. This creates substantial computational and memory overhead during both training and inference, making deployment difficult under resource constraints. Existing compression approaches face a critical trade-off: they either severely degrade recommendation accuracy or require prohibitively high computational costs, making them impractical for real-world applications.
BACO compresses embeddings by exploiting collaborative signals in user-item interactions to group similar users and items, which then share the same embeddings from a smaller codebook. The method formulates a balanced co-clustering objective that maximizes connectivity within clusters while keeping cluster sizes balanced to prevent codebook collapse. The framework unifies canonical graph clustering techniques and combines three components: a principled weighting scheme for users and items, an efficient label propagation solver, and secondary user clusters. Together these produce effective groupings while avoiding degenerate solutions.
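To make the idea concrete, here is a minimal sketch of size-penalized label propagation on a bipartite user-item graph, followed by a shared-codebook lookup. It is an illustration under stated assumptions, not the paper's algorithm: the scoring rule (neighbor votes minus `gamma` times cluster size), the function names `balanced_label_propagation` and `lookup`, the single codebook shared by users and items, and the toy data are all hypothetical. The paper's weighting scheme and secondary user clusters are omitted.

```python
import random
from collections import Counter


def balanced_label_propagation(edges, n_users, n_items, n_clusters,
                               gamma=0.5, n_iters=10, seed=0):
    """Toy size-penalized label propagation on a bipartite user-item graph.

    Hypothetical scoring rule (an assumption, not the paper's objective):
    a node moves to the cluster c that maximizes
        (number of its neighbors in c) - gamma * (current size of c),
    rewarding within-cluster connectivity and penalizing large clusters.
    """
    rng = random.Random(seed)
    n_nodes = n_users + n_items
    # Users take node ids 0..n_users-1; items follow at n_users..n_nodes-1.
    neighbors = [[] for _ in range(n_nodes)]
    for user, item in edges:
        neighbors[user].append(n_users + item)
        neighbors[n_users + item].append(user)

    labels = [rng.randrange(n_clusters) for _ in range(n_nodes)]
    sizes = Counter(labels)
    for _ in range(n_iters):
        order = list(range(n_nodes))
        rng.shuffle(order)
        for node in order:
            sizes[labels[node]] -= 1  # take the node out before scoring
            votes = Counter(labels[nb] for nb in neighbors[node])
            labels[node] = max(range(n_clusters),
                               key=lambda c: votes[c] - gamma * sizes[c])
            sizes[labels[node]] += 1
    return labels[:n_users], labels[n_users:]


def lookup(entity_id, assignment, codebook):
    """Return the shared embedding of an entity via its cluster id."""
    return codebook[assignment[entity_id]]


# Toy data: users 0-1 interact with items 0-1, users 2-3 with items 2-3.
edges = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (2, 3), (3, 2), (3, 3)]
user_clusters, item_clusters = balanced_label_propagation(
    edges, n_users=4, n_items=4, n_clusters=2)
codebook = [[0.1, 0.2], [0.3, 0.4]]  # one toy 2-d embedding per cluster
print(user_clusters, item_clusters, lookup(0, user_clusters, codebook))
```

Setting `gamma` to zero recovers plain majority-vote label propagation, which can pile every node into one cluster. The size penalty is one simple way to discourage that kind of collapse.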
What the paper shows.
BACO reduces embedding parameters by over 75% while maintaining recommendation quality, with at most a 1.85% drop in recall metrics across benchmark datasets. It is also substantially more efficient, running up to 346X faster than the strongest baseline methods. These results indicate that BACO balances compression ratio against accuracy while providing practical speedups for industrial deployment.
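As a rough illustration of what a 75% reduction means in parameter counts, the sketch below compares a full embedding table with a shared codebook. The entity counts, codebook size, embedding dimension, and the helper name `embedding_param_counts` are hypothetical and not taken from the paper.

```python
def embedding_param_counts(n_users, n_items, n_codes, dim):
    """Compare trainable parameters of a full table and a shared codebook.

    The per-entity cluster index (one integer each) is not counted here,
    since it is a fixed lookup table rather than a trained parameter.
    """
    full = (n_users + n_items) * dim
    compressed = n_codes * dim
    return full, compressed, 1 - compressed / full


# Hypothetical sizes: 1M entities sharing a 250k-entry codebook, 64-d embeddings.
full, compressed, reduction = embedding_param_counts(600_000, 400_000, 250_000, 64)
print(f"{full:,} -> {compressed:,} parameters ({reduction:.0%} reduction)")
```

With these made-up sizes the codebook holds a quarter of the original rows, so the reduction is exactly 75%. Any smaller codebook would exceed that figure.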
The paper does not discuss limitations in detail. Implicit limitations may include dependence on the quality of collaborative signals in the interaction data, potential challenges on extremely sparse datasets where clustering may be less effective, and the need to tune hyperparameters such as cluster sizes and weighting schemes for each dataset. The approach also requires an initial graph construction phase based on user-item interactions, which may add preprocessing overhead.
✨ Generated by Claude · Apr 21, 2026 · Read the PDF for authoritative content.