Brian Bernhard Moser

arXiv · 2026

LUMA: Benchmarking Segmentation via a Lightweight Universal Mask Adapter

Tobias Christian Nauen, Anosh Billimoria, Federico Raue, Stanislav Frolov, Brian Bernhard Moser, Andreas Dengel

LUMA is a lightweight, *backbone-agnostic* mask-transformer head that lets us fairly compare segmentation backbones by fixing the decoder. Benchmarking 20 backbones and 11 pretraining schemes, we find that "efficient" token mixers don't actually deliver efficiency and that the pretraining objective, not the architecture, governs segmentation quality.

→ project page ↗ pdf

arXiv · 2026

OA-CutMix: Correcting the Label Bias of CutMix

Tobias Christian Nauen, Stanislav Frolov, Federico Raue, Brian Bernhard Moser, Andreas Dengel

CutMix assigns labels by patch area, not by visible object content, a systematic bias that mislabels 21.5% of samples and creates ghost labels in 17%. OA-CutMix replaces the label with one derived from object area, leaving the image mixing unchanged. It matches or beats 10+ static and dynamic mixing methods across 4 architectures and 6 datasets.

→ project page ↗ pdf ↗ code ↗ Precomputed Segmentations
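
The gap between area-based and object-area-based labels can be illustrated with a minimal sketch. It assumes binary foreground masks are available for both images; the function names, the nested-list mask representation, and the normalization by visible foreground pixels are illustrative assumptions, not the paper's implementation.

```python
def cutmix_label(box, height, width):
    """Standard CutMix label weights (base image, pasted image) from patch area."""
    top, left, bottom, right = box
    lam = (bottom - top) * (right - left) / (height * width)
    return 1.0 - lam, lam


def object_area_label(mask_base, mask_paste, box):
    """Label weights from visible foreground pixels (assumed normalization).

    mask_base / mask_paste are binary foreground masks (nested lists of 0/1);
    box = (top, left, bottom, right) is the region copied from the pasted image.
    """
    top, left, bottom, right = box
    visible_base = 0
    visible_paste = 0
    for y, (row_base, row_paste) in enumerate(zip(mask_base, mask_paste)):
        for x, (fg_base, fg_paste) in enumerate(zip(row_base, row_paste)):
            if top <= y < bottom and left <= x < right:
                visible_paste += fg_paste  # inside the box: pasted image is visible
            else:
                visible_base += fg_base    # outside the box: base image is visible
    total = visible_base + visible_paste
    if total == 0:
        # No foreground visible at all: fall back to the area-based label.
        return cutmix_label(box, len(mask_base), len(mask_base[0]))
    return visible_base / total, visible_paste / total


# Both 4x4 images have their object in the top-left 2x2 corner.
mask = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
box = (0, 0, 2, 2)  # the pasted patch covers the base object completely

area_weights = cutmix_label(box, 4, 4)
object_weights = object_area_label(mask, mask, box)
```

In this toy case the area-based label still gives the base class a weight of 0.75 although its object is fully hidden, which is the kind of ghost label described above; the object-area label puts all weight on the pasted class.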

CVPR · 2026

When Pretty Isn't Useful: Investigating Why Modern Text-to-Image Models Fail as Reliable Training Data Generators

Krzysztof Adamkiewicz, Brian Bernhard Moser, Stanislav Frolov, Tobias Christian Nauen, Federico Raue, Andreas Dengel

We show that newer text-to-image models are progressively worse as training data generators, despite better visual quality, because they collapse to a narrow aesthetic-centric distribution that diverges from real data.

→ project page ↗ pdf ↗ supplementary material

TMLR · 2026

TextTeacher: What Can Language Teach About Images?

Tobias Christian Nauen, Stanislav Frolov, Brian Bernhard Moser, Federico Raue, Ahmed Anwar, Andreas Dengel

We use a frozen text encoder on image captions as a lightweight training-time auxiliary objective for image classifiers. The text components are dropped at inference, leaving a fast, unimodal vision model. Accuracy on ImageNet improves by up to +2.7 p.p. and downstream transfer by +1.0 p.p. on average, outperforming vision knowledge distillation at a fraction of the compute.

→ project page ↗ pdf ↗ code ↗ Precomputed Embeddings ↗ openreview

arXiv · 2026

Hyperspherical Forward-Forward with Prototypical Representations

Shalini Sarode, Brian Bernhard Moser, Joachim Folz, Federico Raue, Tobias Christian Nauen, Stanislav Frolov, Andreas Dengel

We fix Forward-Forward's slow inference by replacing per-class passes with a single forward pass via hyperspherical prototype matching. This yields 40× faster inference with competitive accuracy.

→ project page ↗ pdf
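
Prototype matching on a hypersphere can be sketched as a nearest-prototype lookup by cosine similarity, as shown below. This is a minimal illustration under assumptions: the feature vector stands in for the network's output after one forward pass, and the function names and toy prototypes are hypothetical rather than taken from the paper.

```python
import math


def normalize(vector):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def predict(feature, prototypes):
    """Return the index of the class prototype with the highest cosine similarity."""
    unit_feature = normalize(feature)
    scores = [
        sum(a * b for a, b in zip(unit_feature, normalize(prototype)))
        for prototype in prototypes
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Three toy class prototypes in 2D and one feature vector from a single pass.
prototypes = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
predicted_class = predict([0.2, 3.0], prototypes)
```

All classes are scored from one feature vector, so the cost of inference no longer grows with one network pass per class.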

TMLR · 2026

PRISM: Diversifying Dataset Distillation by Decoupling Architectural Priors

Brian Bernhard Moser, Shalini Sarode, Federico Raue, Krzysztof Adamkiewicz, Arundhati Shanbhag, Joachim Folz, Tobias Christian Nauen, Andreas Dengel

We introduce PRISM, a framework that disentangles architectural priors for dataset distillation, outperforming single-teacher setups.

→ project page ↗ pdf ↗ openreview

arXiv · 2025

SubZeroCore: A Submodular Approach with Zero Training for Coreset Selection

Brian Bernhard Moser, Tobias Christian Nauen, Arundhati Shanbhag, Federico Raue, Stanislav Frolov, Joachim Folz, Andreas Dengel

We introduce SubZeroCore, a novel, training-free coreset selection method that integrates submodular coverage and density into a single, unified objective.

→ project page ↗ pdf

Accepted to ICPR 2026 · 2025

HyperCore: Coreset Selection under Noise via Hypersphere Models

Brian Bernhard Moser, Arundhati Shanbhag, Tobias Christian Nauen, Stanislav Frolov, Federico Raue, Joachim Folz, Andreas Dengel

We present HyperCore, a lightweight adaptive coreset selection framework designed for noisy environments. HyperCore uses per-class hypersphere models and adaptively selects pruning thresholds.

→ project page ↗ pdf

ICIP · 2025

When 512×512 is not Enough: Local Degradation-Aware Multi-Diffusion for Extreme Image Super-Resolution

Brian Bernhard Moser, Stanislav Frolov, Tobias Christian Nauen, Federico Raue, Andreas Dengel

We extend pretrained super-resolution models to larger images by using local, degradation-aware prompts.

→ project page ↗ pdf ↗ code ↗ doi

IJCNN · 2025

Distill the Best, Ignore the Rest: A Study in Latent Dataset Distillation on Core-Sets

Brian Bernhard Moser, Federico Raue, Tobias Christian Nauen, Stanislav Frolov, Andreas Dengel

We improve dataset distillation by distilling only a representative coreset.

→ project page ↗ pdf ↗ code ↗ doi

arXiv · 2025

ForAug: Mitigating Biases in Image Classification via Controlled Image Compositions

Tobias Christian Nauen, Brian Bernhard Moser, Federico Raue, Stanislav Frolov, Andreas Dengel

Image classification datasets carry compositional biases: objects are centered, appear at a typical size, and sit on class-specific backgrounds. Models lean on these biases and then break under distribution shift. ForAug segments and recombines foregrounds and backgrounds to break these correlations, making models across 10 architectures more accurate and more robust.

→ project page ↗ pdf ↗ code ↗ dataset ↗ Supplementary Material

Accepted to ICPR 2026 · 2025

A Study in Dataset Distillation for Image Super-Resolution

Tobias Dietz, Brian Bernhard Moser, Tobias Christian Nauen, Federico Raue, Stanislav Frolov, Andreas Dengel

We conduct the first systematic study of dataset distillation for Super-Resolution.

→ project page ↗ pdf

arXiv · 2024

Just Leaf It: Accelerating Diffusion Classifiers with Hierarchical Class Pruning

Arundhati Shanbhag, Brian Bernhard Moser, Tobias Christian Nauen, Stanislav Frolov, Federico Raue, Andreas Dengel

We speed up diffusion classifiers by utilizing a label hierarchy and pruning unrelated paths.

→ project page ↗ pdf
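
One way to picture hierarchical class pruning is a greedy descent through the label tree, scoring only the children of the currently best node. The sketch below assumes a lower-is-better error per node (standing in for a diffusion classifier's denoising error); the tree, the error values, and the greedy keep-one rule are hypothetical and may differ from the paper's actual pruning strategy.

```python
def classify_with_pruning(tree, root, error):
    """Greedily descend a label hierarchy, following the lowest-error child.

    tree maps a node to its list of children; leaves have no entry.
    error(node) returns a lower-is-better score for that node.
    Returns the chosen leaf and the number of nodes that were scored.
    """
    node = root
    evaluated = 0
    while tree.get(node):
        children = tree[node]
        evaluated += len(children)        # only these nodes are scored
        node = min(children, key=error)   # all sibling subtrees are pruned
    return node, evaluated


# Toy hierarchy with six leaf classes.
tree = {
    "root": ["animal", "vehicle"],
    "animal": ["cat", "dog", "bird"],
    "vehicle": ["car", "truck", "bus"],
}
# Hypothetical per-node errors for one input image.
errors = {
    "animal": 0.2, "vehicle": 0.9,
    "cat": 0.3, "dog": 0.1, "bird": 0.4,
    "car": 0.5, "truck": 0.6, "bus": 0.7,
}

label, evaluated = classify_with_pruning(tree, "root", errors.__getitem__)
```

Here five nodes are scored instead of all six leaves; the saving grows with deeper and wider hierarchies, at the risk of an early wrong turn excluding the correct leaf.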

Accepted to ICPR 2026 · 2024

A Low-Resolution Image is Worth 1x1 Words: Enabling Fine Image Super-Resolution with Transformers and TaylorShift

Sanath Budakegowdanadoddi Nagaraju, Brian Bernhard Moser, Tobias Christian Nauen, Stanislav Frolov, Federico Raue, Andreas Dengel

We use the TaylorShift attention mechanism for global pixel-wise attention in image super-resolution.

→ project page ↗ pdf

Brian Bernhard Moser.

Co-authored Publications: 14
