TaylorShift: Shifting the Complexity of Self-Attention from Squared to Linear (and Back) using Taylor-Softmax

Tobias Christian Nauen

ICPR 2024 (oral)

December 2024

1 slide

Abstract

Oral presentation at ICPR 2024 introducing TaylorShift, a novel reformulation of the attention mechanism using Taylor-Softmax that enables full token-to-token interactions in time linear in the sequence length.
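To make the idea in the abstract concrete, here is a minimal sketch in pure Python. It is not the authors' implementation. It assumes a second-order Taylor-Softmax, where each attention weight is 1 + s + s^2/2 for a query-key score s, and it omits any scaling or normalization the actual method may apply. The helper names (`feature_map`, `taylor_attention_quadratic`, `taylor_attention_linear`) are hypothetical. The sketch compares the direct form, which visits every query-key pair, with a rearranged form that summarizes all keys and values once and then processes each query independently.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def feature_map(x):
    """Hypothetical helper: phi(x) such that
    dot(phi(q), phi(k)) == 1 + q.k + (q.k)**2 / 2."""
    scale = 0.5 ** 0.5
    # Flattened outer product: dot(outer(q), outer(k)) == dot(q, k) ** 2
    outer = [a * b for a in x for b in x]
    return [1.0] + list(x) + [scale * z for z in outer]


def taylor_attention_quadratic(Q, K, V):
    """Direct form: one weight per (query, key) pair, O(N^2) pairs."""
    d_v = len(V[0])
    out = []
    for q in Q:
        weights = []
        for k in K:
            s = dot(q, k)
            # Second-order Taylor expansion of exp(s); always positive
            weights.append(1.0 + s + 0.5 * s * s)
        total = sum(weights)
        out.append([sum(w * v[c] for w, v in zip(weights, V)) / total
                    for c in range(d_v)])
    return out


def taylor_attention_linear(Q, K, V):
    """Rearranged form: key/value summaries are built once,
    so the cost grows linearly with the number of tokens N."""
    phi_K = [feature_map(k) for k in K]
    m, d_v = len(phi_K[0]), len(V[0])
    kv = [[0.0] * d_v for _ in range(m)]  # sum_j phi(k_j) v_j^T
    k_sum = [0.0] * m                     # sum_j phi(k_j)
    for pk, v in zip(phi_K, V):
        for a in range(m):
            k_sum[a] += pk[a]
            for c in range(d_v):
                kv[a][c] += pk[a] * v[c]
    out = []
    for q in Q:
        pq = feature_map(q)
        denom = dot(pq, k_sum)  # normalizer for this query
        out.append([sum(pq[a] * kv[a][c] for a in range(m)) / denom
                    for c in range(d_v)])
    return out


if __name__ == "__main__":
    random.seed(0)
    N, d = 6, 3  # illustrative sizes, not taken from the talk
    Q = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(N)]
    K = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(N)]
    V = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(N)]
    a = taylor_attention_quadratic(Q, K, V)
    b = taylor_attention_linear(Q, K, V)
    print(max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)))
```

Both functions compute the same output up to floating-point rounding. The rearranged form pays for its linear scaling in N with a feature dimension that grows as d^2, so the direct form can still be cheaper for short sequences, which is one plausible reading of the "(and Back)" in the title.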

Topics

Deep Learning, Efficient AI

Related Resources

PDF, Code