Attention Mechanism Variants
Kernel Attention
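An approach that replaces the softmax in attention with a positive kernel feature map, so the attention output can be computed in time linear in sequence length rather than quadratic. It gives an efficient approximation of softmax attention while keeping the attention weights non-negative and normalized to sum to one for each query.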
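As a rough illustration of why the cost becomes linear, the sketch below compares two ways of computing non-causal kernel attention in plain Python. It assumes the feature map elu(x) + 1, which is one common positive choice and not necessarily the one intended here; the function names `feature_map`, `quadratic_kernel_attention`, and `linear_kernel_attention` are hypothetical and chosen only for this example.

```python
import math


def feature_map(x):
    """Assumed feature map: elu(x) + 1 elementwise, strictly positive for any real input."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def quadratic_kernel_attention(Q, K, V):
    """Reference version: forms every query-key weight explicitly (n * n of them)."""
    fq = [feature_map(q) for q in Q]
    fk = [feature_map(k) for k in K]
    dv = len(V[0])
    out = []
    for q in fq:
        # Kernel similarity phi(q) . phi(k) replaces exp(q . k) from softmax.
        weights = [sum(a * b for a, b in zip(q, k)) for k in fk]
        total = sum(weights)  # positive, so the normalized weights sum to one
        out.append([sum(w * v[j] for w, v in zip(weights, V)) / total
                    for j in range(dv)])
    return out


def linear_kernel_attention(Q, K, V):
    """Same result, but keys and values are summarized once and reused per query."""
    fq = [feature_map(q) for q in Q]
    fk = [feature_map(k) for k in K]
    d, dv = len(fk[0]), len(V[0])
    S = [[0.0] * dv for _ in range(d)]  # S = sum_i phi(k_i) v_i^T, shape d x dv
    z = [0.0] * d                       # z = sum_i phi(k_i), length d
    for k, v in zip(fk, V):
        for a in range(d):
            z[a] += k[a]
            for j in range(dv):
                S[a][j] += k[a] * v[j]
    out = []
    for q in fq:
        denom = sum(q[a] * z[a] for a in range(d))
        out.append([sum(q[a] * S[a][j] for a in range(d)) / denom
                    for j in range(dv)])
    return out
```

Both functions return the same values up to floating-point rounding. The second one never builds the n × n weight matrix: it relies on the associativity of matrix products, computing φ(K)ᵀV once and reusing it for every query, so its cost grows with n · d · d_v instead of n² · d.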