Tobias Nauen
Conference paper
TaylorShift: Shifting the Complexity of Self-Attention from Squared to Linear (and Back) using Taylor-Softmax
This paper introduces TaylorShift, a reformulation of the attention mechanism based on the Taylor approximation of softmax that enables computing full token-to-token interactions in time linear in the sequence length. We analytically and empirically determine the crossover points at which TaylorShift becomes more efficient than standard attention. TaylorShift outperforms the standard transformer architecture on 4 out of 5 tasks.
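To illustrate the core idea, the sketch below contrasts a direct O(N²) attention using a second-order Taylor approximation of the exponential, exp(x) ≈ 1 + x + x²/2, with an equivalent linear-in-N form obtained by expanding that polynomial into an explicit feature map. This is a minimal illustrative sketch only; the function names, the feature-map construction, and the omission of scaling and normalization tricks are assumptions, not the paper's actual implementation.

```python
import numpy as np

def taylor_attention_quadratic(Q, K, V):
    # Direct O(N^2) form: Taylor-softmax weights over all token pairs.
    # Note 1 + x + x^2/2 = ((x + 1)^2 + 1) / 2 > 0, so the weights are
    # always positive and the normalization is well defined.
    S = Q @ K.T                            # (N, N) token-to-token scores
    num = 1.0 + S + 0.5 * S**2             # Taylor approximation of exp
    return (num / num.sum(-1, keepdims=True)) @ V

def phi(X):
    # Explicit feature map with phi(q) . phi(k) = 1 + q.k + (q.k)^2 / 2,
    # using flatten(q q^T) . flatten(k k^T) = (q.k)^2 for the square term.
    n, d = X.shape
    ones = np.ones((n, 1))
    quad = np.einsum('ni,nj->nij', X, X).reshape(n, d * d) / np.sqrt(2.0)
    return np.concatenate([ones, X, quad], axis=1)

def taylor_attention_linear(Q, K, V):
    # Linear-in-N form: aggregate keys and values once into fixed-size
    # summaries, then one product per query. Cost is O(N) in sequence
    # length (at the price of a d^2-sized feature dimension).
    Pq, Pk = phi(Q), phi(K)
    S = Pk.T @ V           # (1 + d + d^2, d_v) key/value summary
    z = Pk.sum(axis=0)     # normalizer accumulator
    return (Pq @ S) / (Pq @ z)[:, None]
```

Because the Taylor polynomial is exact as a kernel (not an approximation of itself), the two forms produce identical outputs; which one is cheaper depends on how the sequence length N compares with the head dimension d, which is the crossover the paper analyzes.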
Tobias Christian Nauen, Sebastian Palacio, Andreas Dengel