Tobias Christian Nauen
PhD Student
My research interests include the efficiency of machine learning models, multimodal learning, and transformer architectures.
Publications
This paper introduces TaylorShift, a novel reformulation of the attention mechanism based on the Taylor softmax that enables computing full token-to-token interactions in linear time. We determine, analytically and empirically, the crossover points at which TaylorShift becomes more efficient than standard attention. TaylorShift outperforms the standard transformer architecture on 4 out of 5 tasks.
Tobias Christian Nauen, Sebastian Palacio, Andreas Dengel
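The core idea can be illustrated with a small sketch (not the paper's actual implementation; all function names here are illustrative). Replacing the exponential in softmax with its second-order Taylor approximation, exp(x) ≈ 1 + x + x²/2, makes each attention weight a quadratic polynomial in the query-key dot product. Such a polynomial admits an exact finite feature map φ with φ(q)·φ(k) = 1 + q·k + (q·k)²/2, so the attention output can be computed by first aggregating keys and values, giving cost linear in sequence length:

```python
import numpy as np

def taylor_features(x):
    """Feature map phi with phi(q) . phi(k) = 1 + q.k + (q.k)^2 / 2."""
    n, d = x.shape
    # vec(x x^T) / sqrt(2): its inner product between two tokens is (q.k)^2 / 2
    outer = np.einsum('ni,nj->nij', x, x).reshape(n, d * d) / np.sqrt(2)
    return np.concatenate([np.ones((n, 1)), x, outer], axis=1)

def taylor_attention_linear(Q, K, V):
    """Taylor-softmax attention in time linear in the sequence length."""
    phi_q = taylor_features(Q)
    phi_k = taylor_features(K)
    kv = phi_k.T @ V          # aggregate keys/values once: O(n) in tokens
    z = phi_k.sum(axis=0)     # normalization statistics
    return (phi_q @ kv) / (phi_q @ z)[:, None]

def taylor_attention_quadratic(Q, K, V):
    """Reference: materialize the full n x n attention matrix."""
    s = Q @ K.T
    w = 1 + s + s**2 / 2      # Taylor softmax numerator, always positive
    return (w / w.sum(axis=1, keepdims=True)) @ V
```

Both functions compute the same output; the linear variant trades the n×n attention matrix for a (1 + d + d²)-dimensional feature space, which pays off once the sequence length exceeds the feature dimension.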
A comprehensive benchmark of more than 30 transformer models for vision, evaluating their efficiency across various performance metrics.
Tobias Christian Nauen, Sebastian Palacio, Andreas Dengel