Sparse Attention
Clustering-based Attention
A method that first groups similar tokens into clusters and then applies attention at the cluster level rather than between every pair of tokens, reducing the number of required computations.
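As a rough illustration, here is a minimal NumPy sketch of one possible variant, not a reference implementation. It assumes that the queries are clustered with a plain k-means, that attention is computed once per cluster centroid against all keys, and that every query reuses the output of its cluster. The function names (`kmeans`, `clustered_attention`) and the parameters `num_clusters`, `num_iters`, and `seed` are hypothetical and chosen for this example only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def kmeans(x, num_clusters, num_iters=10, seed=0):
    # Plain k-means (an assumption; other clustering schemes could be used).
    # x: float array (n, d). Returns centroids (num_clusters, d), labels (n,).
    rng = np.random.default_rng(seed)
    # Initialise centroids from distinct rows of x (fancy indexing copies).
    centroids = x[rng.choice(len(x), size=num_clusters, replace=False)]
    for _ in range(num_iters):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        # Move each non-empty centroid to the mean of its members.
        for c in range(num_clusters):
            members = x[labels == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)
    return centroids, labels


def clustered_attention(q, k, v, num_clusters):
    # q, k: (n, d); v: (n, d_v). Returns an (n, d_v) array.
    d = q.shape[-1]
    # Step 1: group similar queries into clusters.
    centroids, labels = kmeans(q, num_clusters)
    # Step 2: attention at the cluster level, one row per centroid.
    # Score matrix is (num_clusters, n) instead of (n, n).
    weights = softmax(centroids @ k.T / np.sqrt(d), axis=-1)
    cluster_out = weights @ v
    # Step 3: every query reuses the output of its own cluster.
    return cluster_out[labels]
```

With `n` tokens and `C` clusters, the score matrix in this sketch has `C * n` entries instead of `n * n`, at the price of queries in the same cluster receiving identical outputs. Setting `num_clusters` equal to `n` (with distinct queries) recovers ordinary full attention.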