

Tobias Christian Nauen
PhD Student
My research interests include the efficiency of machine learning models, multimodal learning, and transformer models.
Publications
A comprehensive benchmark and analysis of more than 45 transformer models for image classification, evaluating their efficiency across a range of performance metrics. We identify the most efficient architectures and find that scaling the model is more efficient than scaling the image resolution.
Tobias Christian Nauen, Sebastian Palacio, Federico Raue, Andreas Dengel
This paper introduces TaylorShift, a novel reformulation of the attention mechanism using Taylor softmax that enables computing full token-to-token interactions in linear time. We analytically and empirically determine the crossover points where employing TaylorShift becomes more efficient than traditional attention. TaylorShift outperforms the traditional transformer architecture in 4 out of 5 tasks.
Tobias Christian Nauen, Sebastian Palacio, Andreas Dengel
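The core idea behind linear-time attention of this kind can be illustrated with a minimal sketch: replacing the exponential in softmax with a Taylor approximation lets the token sums be reordered so the N×N score matrix is never materialized. The sketch below uses only the first-order term, f(x) = 1 + x, as a simplified stand-in (TaylorShift itself uses a higher-order Taylor softmax); all variable names are illustrative, not from the paper.

```python
import numpy as np

# Illustrative first-order Taylor attention: exp(q·k) is approximated
# by f(q·k) = 1 + q·k, which makes the attention sums linear in N.
rng = np.random.default_rng(0)
N, d = 6, 4                      # sequence length, head dimension
Q = rng.normal(size=(N, d)) * 0.1
K = rng.normal(size=(N, d)) * 0.1
V = rng.normal(size=(N, d))

# Quadratic formulation: materialize the full N x N score matrix.
scores = 1.0 + Q @ K.T                                   # f(q_i · k_j)
out_quadratic = (scores @ V) / scores.sum(axis=1, keepdims=True)

# Linear formulation: push the sums over j inside, never forming N x N.
k_sum = K.sum(axis=0)            # (d,)   sum_j k_j
v_sum = V.sum(axis=0)            # (d,)   sum_j v_j
kv = K.T @ V                     # (d, d) sum_j k_j v_j^T
numerator = v_sum + Q @ kv       # (N, d)
denominator = N + Q @ k_sum      # (N,)
out_linear = numerator / denominator[:, None]

# Both formulations produce identical outputs.
assert np.allclose(out_quadratic, out_linear)
```

With the small score magnitudes used here the denominator stays positive; handling the general case (and recovering accuracy) is where higher-order terms and the paper's analysis come in.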
We improve dataset distillation by distilling only a representative coreset.
Brian Bernhard Moser, Federico Raue, Tobias Christian Nauen, Stanislav Frolov, Andreas Dengel
We speed up diffusion classifiers by utilizing a label hierarchy and pruning unrelated paths.
Arundhati S Shanbhag, Brian Bernhard Moser, Tobias Christian Nauen, Stanislav Frolov, Federico Raue, Andreas Dengel
We extend pretrained super-resolution models to larger images by using local-aware prompts.
Brian Bernhard Moser, Stanislav Frolov, Tobias Christian Nauen, Federico Raue, Andreas Dengel
We utilize the TaylorShift attention mechanism for global pixel-wise attention in image super-resolution.
Sanath Budakegowdanadoddi Nagaraju, Brian Bernhard Moser, Tobias Christian Nauen, Stanislav Frolov, Federico Raue, Andreas Dengel