Efficient Transformers
Local Attention
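An attention mechanism in which each position attends only to a fixed-size window of neighboring positions rather than to the whole sequence. For sequence length n and window size w, this cuts the number of token pairs to consider from O(n²) to O(n·w). The approach is particularly effective for data with strong local structure, where most of the relevant context lies close to each position.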
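The restriction can be illustrated with a minimal, pure-Python sketch. It assumes a symmetric window (position i attends to positions j with |i - j| <= window) and single-head scaled dot-product scores. The names `local_attention` and `count_pairs` are hypothetical, chosen for illustration, and are not taken from any particular library.

```python
import math

def local_attention(q, k, v, window):
    """Sliding-window attention over lists of vectors (illustrative sketch).

    Position i attends only to positions j with |i - j| <= window.
    q, k: n vectors of dimension d; v: n vectors of any dimension.
    """
    n, d = len(q), len(q[0])
    out = []
    for i in range(n):
        # Only keys inside the local neighborhood are ever scored.
        lo, hi = max(0, i - window), min(n, i + window + 1)
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            for j in range(lo, hi)
        ]
        # Numerically stable softmax over the window.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        # Weighted sum of the values inside the window.
        row = [0.0] * len(v[0])
        for e, j in zip(exps, range(lo, hi)):
            for c in range(len(row)):
                row[c] += (e / total) * v[j][c]
        out.append(row)
    return out

def count_pairs(n, window):
    """Number of (query, key) pairs scored for sequence length n."""
    return sum(min(n, i + window + 1) - max(0, i - window) for i in range(n))
```

With 8 tokens and `window=1`, `count_pairs(8, 1)` gives 22 scored pairs, compared with 64 for full attention. The gap widens as the sequence grows, because the local count scales linearly with n while the full count scales quadratically.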