
AI Glossary

A complete dictionary of Artificial Intelligence terms

162 categories · 2,032 subcategories · 23,060 terms

Longformer

Transformer architecture that combines local sliding-window attention with global attention on a few designated tokens, processing very long sequences with complexity that grows linearly in sequence length.

BigBird

Model implementing sparse attention through three combined patterns (local window, global, and random attention), handling sequences of up to 4,096 tokens while theoretically preserving the expressive power of full attention, such as universal approximation.

Sliding Window Attention

Technique where each token only attends to a fixed number of neighbors in a sliding window, reducing complexity to O(n*w) where w is the window size.
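A minimal NumPy sketch of such a banded mask (function names are illustrative, not from any library): each row keeps only the 2w + 1 positions inside the window.

```python
import numpy as np

def sliding_window_mask(n, w):
    """Boolean attention mask: token i may attend to token j iff |i - j| <= w."""
    idx = np.arange(n)
    return np.abs(idx[:, None] - idx[None, :]) <= w

# Only O(n * w) entries are kept instead of the full n^2.
mask = sliding_window_mask(8, 2)
```

In practice the O(n*w) cost comes from computing only the banded score entries rather than masking a full n x n matrix, but the attended pattern is the same.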

Dilated Sliding Window

Variant of sliding window attention using jumps (dilation) to increase the receptive field without increasing computational complexity.
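A toy sketch of the dilated pattern (illustrative, assuming a single dilation rate d): each row still attends to 2w + 1 positions, but strided d apart, so the receptive field stretches to w*d.

```python
import numpy as np

def dilated_window_mask(n, w, d):
    """Token i attends to i + k*d for k in [-w, w]: same number of positions
    per row as a plain window of size w, but a receptive field of w*d."""
    diff = np.arange(n)[:, None] - np.arange(n)[None, :]
    return (np.abs(diff) <= w * d) & (diff % d == 0)
```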

Global Attention

Mechanism where certain predefined tokens (like [CLS] tokens) can attract attention from all other tokens, allowing information propagation across the entire sequence.
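A sketch of adding global tokens to a sparse mask (hypothetical helper, not from any library): the designated rows and columns are opened up fully, so information can hop through the global tokens.

```python
import numpy as np

def global_attention_mask(n, global_idx, base_mask=None):
    """Make the tokens in global_idx attend to, and be attended by, every position."""
    m = np.zeros((n, n), dtype=bool) if base_mask is None else base_mask.copy()
    m[global_idx, :] = True  # global tokens see the whole sequence
    m[:, global_idx] = True  # every token sees the global tokens
    return m
```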

Random Attention

Approach where each token randomly attends to a subset of distant tokens, preserving long-distance connections with low computational overhead.
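A sketch of the random component (illustrative; real implementations such as BigBird sample per attention block, not per token): each row picks r positions uniformly at random.

```python
import numpy as np

def random_attention_mask(n, r, seed=None):
    """Each token attends to r positions drawn uniformly at random (no repeats)."""
    rng = np.random.default_rng(seed)
    m = np.zeros((n, n), dtype=bool)
    for i in range(n):
        m[i, rng.choice(n, size=r, replace=False)] = True
    return m
```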

Pattern-based Attention

Strategy applying predefined sparse attention patterns (like fixed or learned patterns) to determine which query-key pairs to compute.

Linear Complexity Attention

Class of attention methods reducing algorithmic complexity from O(n²) to O(n), enabling scaling for very long sequences.

Kernel-based Attention

Approach using kernels to approximate softmax attention, enabling linear complexity calculations through techniques like FAVOR+ (Fast Attention Via Positive Orthogonal Random Features).
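A minimal sketch of the kernel trick, using the simple positive feature map elu(x) + 1 rather than FAVOR+'s random features (which are more involved): because the softmax is replaced by a kernel, the matrix product can be reassociated to avoid ever forming the n x n score matrix.

```python
import numpy as np

def feature_map(x):
    """A positive feature map, elu(x) + 1; FAVOR+ uses random features instead."""
    return np.where(x > 0, x + 1.0, np.exp(x))

def kernel_linear_attention(Q, K, V):
    """Associativity trick: phi(Q) @ (phi(K)^T V) costs O(n * d^2), not O(n^2)."""
    q, k = feature_map(Q), feature_map(K)
    kv = k.T @ V                 # (d, d_v) summary whose size is independent of n
    z = q @ k.sum(axis=0)        # per-query normalizer
    return (q @ kv) / z[:, None]
```

Each output row is still a convex combination of value rows, as in softmax attention; only the weighting kernel differs.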

Low-rank Approximation

Technique approximating the attention matrix through low-rank decomposition, significantly reducing memory and computational requirements.
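A Linformer-style sketch of the idea (the projections E and F are learned in the real model; here they are just arbitrary matrices): projecting keys and values along the sequence dimension shrinks the score matrix from (n, n) to (n, k).

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def low_rank_attention(Q, K, V, E, F):
    """Project K and V from n rows down to k rows before attention,
    so the score matrix is (n, k) instead of (n, n)."""
    scores = Q @ (E @ K).T / np.sqrt(Q.shape[-1])
    return softmax(scores) @ (F @ V)
```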

Clustering-based Attention

Method that first groups tokens into similar clusters then applies attention at the cluster level, reducing the number of required computations.
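A toy sketch of cluster-level attention, assuming cluster assignments are already given (real methods learn them, e.g. with online k-means): queries attend to c centroids instead of n keys, so only O(n*c) scores are computed.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cluster_attention(Q, K, V, assign, c):
    """Attend to c cluster centroids instead of n individual keys.
    `assign` maps each key position to a cluster id in [0, c)."""
    Kc = np.stack([K[assign == j].mean(axis=0) for j in range(c)])
    Vc = np.stack([V[assign == j].mean(axis=0) for j in range(c)])
    return softmax(Q @ Kc.T / np.sqrt(Q.shape[-1])) @ Vc
```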

Routing Attention

Mechanism that learns to route queries to the most relevant keys using content-based routing functions, avoiding unnecessary computations.

Reformer

Architecture using locality-sensitive hashing (LSH) to restrict attention computation to the most similar query-key pairs, achieving near-linear O(n log n) complexity in sequence length.
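A sketch of only the bucketing step of Reformer's angular LSH (attention is then restricted to sorted buckets, which this sketch omits): project with a random matrix R and take the argmax over the concatenation [XR, -XR], so vectors pointing in similar directions land in the same bucket.

```python
import numpy as np

def lsh_bucket(X, n_buckets, seed=None):
    """Assign each row of X to one of n_buckets via random rotations."""
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(X.shape[1], n_buckets // 2))
    h = X @ R
    return np.argmax(np.concatenate([h, -h], axis=1), axis=1)
```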

Performer

Model based on FAVOR+ attention that efficiently approximates softmax attention through positive orthogonal random features, enabling linear complexity.

Linformer

Architecture that projects the keys and values along the sequence dimension into a lower-dimensional space, reducing complexity from O(n²) to O(n*k) where k << n.

Routing Transformer

Model using k-means based routing to group tokens and apply attention selectively, optimizing computations for long-distance dependencies.

Sinkhorn Sorting

Algorithm using Sinkhorn iteration to transform attention into a differentiable permutation, applied in sparse attention architectures.

Efficient Attention

Paradigm encompassing all attention variants aimed at reducing computational complexity while preserving the modeling capabilities of Transformers.
