
AI Glossary

A Complete Dictionary of Artificial Intelligence

162 categories · 2,032 subcategories · 23,060 terms

Multi-Head Self-Attention (MHSA)

Mechanism that lets the model attend to different parts of the image simultaneously by computing several attention heads in parallel, each with its own attention matrix, so that different heads can capture different types of spatial relationships.
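
As a minimal sketch of the idea, the toy function below splits each token's features into `num_heads` slices and runs scaled dot-product attention on every slice independently. It assumes identity query, key, value, and output projections (a real layer learns these), and all function names are illustrative.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain matrix product of two lists of lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def attention_head(q, k, v):
    # Scaled dot-product attention for a single head.
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v), weights

def multi_head_self_attention(x, num_heads):
    # Assumption: learned projections are replaced by the identity for brevity.
    dim = len(x[0])
    assert dim % num_heads == 0
    head_dim = dim // num_heads
    outputs, all_weights = [], []
    for h in range(num_heads):
        x_slice = [row[h * head_dim:(h + 1) * head_dim] for row in x]
        out, weights = attention_head(x_slice, x_slice, x_slice)
        outputs.append(out)
        all_weights.append(weights)
    # Concatenate the per-head outputs back into one vector per token.
    merged = [sum((outputs[h][i] for h in range(num_heads)), []) for i in range(len(x))]
    return merged, all_weights

tokens = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.5], [1.0, 1.0, 0.0, 1.0]]
merged, per_head_weights = multi_head_self_attention(tokens, num_heads=2)
```

Each entry of `per_head_weights` is one attention matrix; with two heads the model gets two separate views of how the three tokens relate.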

Layer Scale

Technique introduced for deep ViTs (in CaiT) in which a learnable per-channel weight, initialized to a small value, scales the output of each residual branch before it is added back to the input, stabilizing the training of deep models.
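
A minimal sketch of the scaled residual connection, assuming a single token represented as a list of channel values. The `branch` callable stands in for an attention or MLP block, and the initial value `1e-4` is an illustrative choice, not a prescribed one.

```python
def layer_scale_residual(x, branch, gamma):
    # Computes x + gamma * branch(x), with one learnable scale per channel.
    y = branch(x)
    return [xi + g * yi for xi, g, yi in zip(x, gamma, y)]

# Hypothetical branch that produces large activations.
def big_branch(v):
    return [10.0 * t for t in v]

gamma = [1e-4, 1e-4]  # assumption: small initial scale, learned during training
result = layer_scale_residual([1.0, 2.0], big_branch, gamma)
```

With a small `gamma`, each block starts close to the identity function, so the residual branch contributes only gradually as training proceeds.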

Windowed Attention

Attention mechanism restricted to local non-overlapping windows of the image. For a fixed window size, this reduces computational complexity from O(n²) to O(n), where n is the number of patches.
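
The complexity claim can be checked by counting query-key pairs. The sketch below assumes a grid of `height` x `width` patches and square windows of `m` x `m` patches; the function names are illustrative.

```python
def attention_pairs_global(height, width):
    # Every patch attends to every patch: n * n pairs.
    n = height * width
    return n * n

def attention_pairs_windowed(height, width, m):
    # Each window of m*m patches attends only within itself.
    assert height % m == 0 and width % m == 0
    num_windows = (height // m) * (width // m)
    return num_windows * (m * m) ** 2

global_pairs = attention_pairs_global(56, 56)         # 9834496
windowed_pairs = attention_pairs_windowed(56, 56, 7)  # 153664
```

The windowed count equals n · m², so with m fixed it grows linearly in the number of patches n.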

Shifted Window Attention

Technique in which the window partition is shifted between consecutive layers so that patches from neighboring windows end up sharing a window, creating cross-window connections and improving the model's ability to capture long-range relationships.
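
A small sketch of the effect, assuming the shift is implemented as a cyclic roll of the patch grid by `shift` patches along both axes. The helper name is hypothetical, and the attention mask that real implementations apply to wrapped-around patches is omitted.

```python
def window_of(r, c, height, width, m, shift=0):
    # Window (row, col) that contains patch (r, c) once the grid
    # has been cyclically shifted by `shift` patches on both axes.
    return (((r - shift) % height) // m, ((c - shift) % width) // m)

# Two diagonal neighbors on an 8x8 grid with 4x4 windows.
before = (window_of(3, 3, 8, 8, 4), window_of(4, 4, 8, 8, 4))
after = (window_of(3, 3, 8, 8, 4, shift=2), window_of(4, 4, 8, 8, 4, shift=2))
```

Without the shift, the two patches sit in different windows and never attend to each other; with a shift of half the window size they land in the same window.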

DeiT (Data-efficient Image Transformer)

Variant of ViT that can be trained on more modest amounts of data thanks to a knowledge distillation strategy in which an added distillation token learns from a teacher model, typically a CNN.

Distillation Token

Additional token in DeiT that learns to mimic the predictions of a teacher model (often a CNN), facilitating knowledge transfer and improving performance with less data.
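
One way to picture the role of the distillation token is a hard-label distillation loss: the class token is trained on the ground-truth label and the distillation token on the teacher's predicted class. The sketch below assumes equal weighting of the two terms and uses illustrative function names.

```python
import math

def cross_entropy(logits, target):
    # Cross-entropy of one example from raw logits (log-sum-exp form).
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]

def hard_distillation_loss(cls_logits, dist_logits, true_label, teacher_logits):
    # The teacher's argmax serves as the target for the distillation token.
    teacher_label = max(range(len(teacher_logits)), key=teacher_logits.__getitem__)
    return (0.5 * cross_entropy(cls_logits, true_label)
            + 0.5 * cross_entropy(dist_logits, teacher_label))

# The distillation head agrees with the teacher (class 1) in the first case only.
loss_agree = hard_distillation_loss([5.0, 0.0], [0.0, 5.0], 0, [0.0, 1.0])
loss_disagree = hard_distillation_loss([5.0, 0.0], [5.0, 0.0], 0, [0.0, 1.0])
```

The loss is lower when the distillation token's prediction matches the teacher, even if the teacher disagrees with the ground-truth label.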

Masked Autoencoder (MAE)

Self-supervised approach for ViT in which a large fraction of randomly chosen image patches (typically 75%) is masked and the model learns to reconstruct the missing content from the visible patches alone.
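
The masking step can be sketched as a random split of patch indices. The function name and the `seed` parameter are illustrative assumptions; the reconstruction model itself is not shown.

```python
import random

def random_masking(num_patches, mask_ratio=0.75, seed=None):
    # Shuffle the patch indices and keep only the first (1 - mask_ratio) share.
    rng = random.Random(seed)
    num_keep = int(num_patches * (1 - mask_ratio))
    order = list(range(num_patches))
    rng.shuffle(order)
    visible = sorted(order[:num_keep])
    masked = sorted(order[num_keep:])
    return visible, masked

# A 14x14 grid of patches: 49 stay visible, 147 must be reconstructed.
visible, masked = random_masking(196, mask_ratio=0.75, seed=0)
```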

Patch Merging

Operation in hierarchical transformers that combines groups of 2x2 adjacent patches into a single lower-resolution token, thereby increasing the channel dimension and the receptive field.
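
A minimal sketch of the merge, assuming the grid is a list of rows whose cells are feature lists of length C. The concatenation order is an illustrative choice, and the linear projection that typically follows (for example from 4C to 2C channels in Swin Transformer) is omitted.

```python
def patch_merging(grid):
    # Concatenate the features of every 2x2 block into a single token.
    height, width = len(grid), len(grid[0])
    assert height % 2 == 0 and width % 2 == 0
    merged = []
    for r in range(0, height, 2):
        row = []
        for c in range(0, width, 2):
            row.append(grid[r][c] + grid[r + 1][c] + grid[r][c + 1] + grid[r + 1][c + 1])
        merged.append(row)
    return merged

# 4x4 grid with one feature per patch, numbered row by row.
grid = [[[r * 4 + c] for c in range(4)] for r in range(4)]
merged_grid = patch_merging(grid)
```

The 4x4 grid becomes a 2x2 grid, and each merged token carries four times as many features as an input patch.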

Relative Position Bias

Bias added to attention scores that depends on the relative positions of patches, improving the model's ability to understand spatial relationships without absolute position encoding.
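
A sketch of how such a bias can be looked up, assuming an `m` x `m` window and a bias table with one entry per possible relative offset, that is (2m − 1)² entries. The function names and the flat table layout are assumptions made for illustration.

```python
def relative_position_index(m):
    # For each pair of patches in an m x m window, map their relative
    # offset (dr, dc) to a single index into the bias table.
    coords = [(r, c) for r in range(m) for c in range(m)]
    index = []
    for r1, c1 in coords:
        row = []
        for r2, c2 in coords:
            dr = r1 - r2 + m - 1  # shifted into the range 0 .. 2m-2
            dc = c1 - c2 + m - 1
            row.append(dr * (2 * m - 1) + dc)
        index.append(row)
    return index

def add_relative_bias(scores, bias_table, m):
    # Add the table entry for each pair's offset to its attention score.
    index = relative_position_index(m)
    return [[s + bias_table[i] for s, i in zip(score_row, index_row)]
            for score_row, index_row in zip(scores, index)]

index = relative_position_index(2)
biased = add_relative_bias([[0.0] * 4 for _ in range(4)], [float(i) for i in range(9)], 2)
```

Pairs of patches with the same relative offset share one table entry, so the bias depends only on relative position and never on where the pair sits in the window.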

Hybrid Architecture

Approach combining an initial convolutional network for feature extraction with a transformer for global processing, used in early ViT implementations to reduce data requirements.

Token Labeling

Training strategy where each patch token receives its own supervised label, rather than the model relying only on a single label per image, forcing the model to learn richer and more localized representations.
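
One way per-patch supervision can enter the training objective is as an extra loss term averaged over the patch tokens. The sketch below assumes hard integer labels per patch and a weighting factor `beta`; both are simplifications chosen for illustration, and the function names are hypothetical.

```python
import math

def cross_entropy(logits, target):
    # Cross-entropy of one example from raw logits (log-sum-exp form).
    m = max(logits)
    return m + math.log(sum(math.exp(l - m) for l in logits)) - logits[target]

def token_labeling_loss(cls_logits, image_label, patch_logits, patch_labels, beta=0.5):
    # Image-level term on the class token plus an averaged per-patch term.
    patch_term = sum(cross_entropy(l, t) for l, t in zip(patch_logits, patch_labels))
    patch_term /= len(patch_logits)
    return cross_entropy(cls_logits, image_label) + beta * patch_term

# Two classes, three patches, uninformative (all-zero) logits everywhere.
loss = token_labeling_loss([0.0, 0.0], 1, [[0.0, 0.0]] * 3, [0, 1, 1])
```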
