Closing the Domain Gap in Biomedical Imaging by In-Context Control Samples

Ana Sanchez-Fernandez; Thomas Pinetz; Werner Zellinger; Günter Klambauer

✨ TL;DR

This paper proposes CS-ARM-BN, a meta-learning method that uses negative control samples to adapt deep learning models to new experimental batches in biomedical imaging, closing the domain gap caused by batch effects. The approach achieves 0.935±0.018 accuracy on drug mechanism-of-action classification, recovering performance from 0.862±0.060 back to near training-domain levels of 0.939±0.005.

01 · Problem

Batch effects, systematic technical variations unrelated to biological signal, are a critical problem in biomedical imaging and severely limit the practical deployment of deep learning models. A model trained on one experimental batch loses substantial accuracy on new batches from different experimental conditions or labs: on the JUMP-CP dataset, accuracy falls from 0.939 to 0.862. Despite years of research, no existing method has closed this domain gap for deep learning systems, which has held back their use in real-world clinical and research applications.

02 · Approach

The authors propose Control-Stabilized Adaptive Risk Minimization via Batch Normalization (CS-ARM-BN), a meta-learning adaptation method that leverages negative control samples—unperturbed reference images that are routinely included in every experimental batch by design. These control samples serve as stable in-context anchors for adaptation, allowing the model to recalibrate to new batch conditions without requiring labeled data from the target domain. The method exploits batch normalization statistics to perform principled domain adaptation.
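The core idea of normalizing with control-sample statistics can be illustrated with a small NumPy sketch. This is not the authors' implementation: it omits the meta-learning training loop and the network entirely, and it assumes a toy batch effect that is a per-batch affine distortion of the features. All names (`control_statistics`, `normalize_with_controls`, `apply_batch_effect`) and the synthetic data are hypothetical.

```python
import numpy as np


def control_statistics(controls, eps=1e-5):
    """Per-feature mean and standard deviation of one batch's control samples."""
    mean = controls.mean(axis=0)
    std = np.sqrt(controls.var(axis=0) + eps)
    return mean, std


def normalize_with_controls(samples, controls, eps=1e-5):
    """Standardize samples using statistics from the same batch's controls only."""
    mean, std = control_statistics(controls, eps)
    return (samples - mean) / std


def apply_batch_effect(x, scale, shift):
    """Toy batch effect (an assumption): an affine distortion of every feature."""
    return scale * x + shift


rng = np.random.default_rng(0)

# Hypothetical latent features: 256 controls and 64 treated samples, 4 features.
latent_controls = rng.normal(size=(256, 4))
latent_treated = rng.normal(loc=1.0, size=(64, 4))

# The same biology observed in two batches with different technical distortions.
controls_a = apply_batch_effect(latent_controls, scale=1.0, shift=0.0)
treated_a = apply_batch_effect(latent_treated, scale=1.0, shift=0.0)
controls_b = apply_batch_effect(latent_controls, scale=2.5, shift=3.0)
treated_b = apply_batch_effect(latent_treated, scale=2.5, shift=3.0)

# Each batch is normalized with its own controls; no target labels are used.
adapted_a = normalize_with_controls(treated_a, controls_a)
adapted_b = normalize_with_controls(treated_b, controls_b)

raw_gap = np.abs(treated_a - treated_b).max()
adapted_gap = np.abs(adapted_a - adapted_b).max()
print(f"max feature difference before: {raw_gap:.3f}, after: {adapted_gap:.6f}")
```

In this toy setting the treated samples from the two batches become nearly identical after normalization, because the controls carry exactly the batch-specific shift and scale. Real batch effects are not purely affine, which is presumably why the paper pairs this kind of normalization with meta-learned network weights.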

03 · Key insights

What the paper shows.

01 Negative control samples, which are standard in biomedical experiments, provide a reliable and always-available signal for domain adaptation without additional experimental cost

02 Meta-learning approaches can effectively close the domain gap in biomedical imaging, unlike standard fine-tuning or foundation models with normalization techniques

03 Batch effects can be neutralized through in-context adaptation, making models both practically deployable and computationally efficient

04 The method recovers about 95% of the performance gap (accuracy rises from 0.862 to 0.935, against a training-domain level of 0.939), demonstrating that domain shift in biomedical imaging is addressable with appropriate adaptation strategies

04 · Results

On the JUMP-CP dataset for Mechanism-of-Action classification, CS-ARM-BN achieves 0.935±0.018 accuracy on new experimental batches, compared to 0.862±0.060 for standard ResNets; foundation models fail to close the gap even with Typical Variation Normalization. Relative to the training-domain performance of 0.939±0.005, this shrinks the domain gap from 7.7 percentage points to 0.4 percentage points. The method is particularly effective when batches exhibit strong domain shifts, such as data generated in different laboratories.
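The gap figures can be checked directly from the three mean accuracies quoted above. The short script below is only that arithmetic; the variable names are illustrative, and the reported standard deviations are ignored.

```python
# Mean accuracies as reported in this summary.
train_domain_acc = 0.939   # training-domain performance
baseline_acc = 0.862       # standard ResNet on new batches
adapted_acc = 0.935        # CS-ARM-BN on new batches

gap_before = train_domain_acc - baseline_acc   # domain gap without adaptation
gap_after = train_domain_acc - adapted_acc     # remaining gap with adaptation
fraction_recovered = (adapted_acc - baseline_acc) / gap_before

print(f"gap before: {100 * gap_before:.1f} percentage points")
print(f"gap after:  {100 * gap_after:.1f} percentage points")
print(f"fraction of gap recovered: {fraction_recovered:.3f}")
```

The recovered fraction works out to 0.073 / 0.077, or roughly 0.95 of the original gap.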

05 · Limitations

The paper focuses evaluation on a single task (Mechanism-of-Action classification) on one large-scale dataset (JUMP-CP), limiting generalizability claims to other biomedical imaging tasks and modalities. The method assumes negative control samples are available in every batch, which may not hold for all experimental designs or historical datasets. The paper does not provide detailed computational cost analysis or comparison of adaptation efficiency relative to alternatives. Ablation studies on the specific components of CS-ARM-BN (batch normalization choice, meta-learning algorithm details) are not discussed.

✨ Generated by Claude · Apr 25, 2026 · Read the PDF for authoritative content.
