Sparse Attention
Random Attention
A sparse attention approach in which each token attends to a randomly chosen subset of distant tokens instead of the full sequence, preserving long-range connections at low computational cost.
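As a minimal sketch of the idea, the snippet below builds a boolean attention mask in which every token attends to itself plus a few randomly sampled distant positions. The function name `random_attention_mask` and the parameters `num_random`, `min_distance`, and `seed` are hypothetical names chosen for illustration, and the self-attention entry is an assumption rather than part of the definition above.

```python
import random


def random_attention_mask(seq_len, num_random, min_distance=1, seed=None):
    """Return a seq_len x seq_len boolean mask.

    mask[i][j] is True when query token i is allowed to attend to key token j.
    """
    rng = random.Random(seed)
    mask = [[False] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        # Assumption: each token always attends to itself.
        mask[i][i] = True
        # "Distant" is taken here to mean more than min_distance positions away.
        candidates = [j for j in range(seq_len) if abs(i - j) > min_distance]
        # Sample without replacement; short sequences may have fewer candidates.
        for j in rng.sample(candidates, min(num_random, len(candidates))):
            mask[i][j] = True
    return mask


if __name__ == "__main__":
    # Print the pattern: "x" marks an allowed query/key pair.
    for row in random_attention_mask(seq_len=8, num_random=2, seed=0):
        print("".join("x" if allowed else "." for allowed in row))
```

With `num_random` held fixed, each row of the mask contains at most `num_random + 1` allowed positions, so the number of attended pairs grows linearly with the sequence length rather than quadratically.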