AI Glossary

The Complete Dictionary of Artificial Intelligence

162 categories · 2,032 subcategories · 23,060 terms

Multi-Head Self-Attention (MHSA)

Attention mechanism that lets the model attend to different parts of the image simultaneously. Several heads, each with its own learned query, key, and value projections, compute attention matrices in parallel, so different heads can capture different types of spatial relationships.
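
A minimal NumPy sketch of the idea, assuming a fused query/key/value projection and random weights in place of trained ones. The function name, shapes, and weight matrices are illustrative assumptions, not taken from any particular library.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_qkv, w_out, num_heads):
    """x: (n, d) patch embeddings; w_qkv: (d, 3d); w_out: (d, d)."""
    n, d = x.shape
    head_dim = d // num_heads
    q, k, v = np.split(x @ w_qkv, 3, axis=-1)  # each (n, d)

    def split_heads(t):
        # (n, d) -> (heads, n, head_dim)
        return t.reshape(n, num_heads, head_dim).transpose(1, 0, 2)

    q, k, v = split_heads(q), split_heads(k), split_heads(v)
    # One (n, n) attention matrix per head, computed in parallel.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)
    attn = softmax(scores, axis=-1)
    out = attn @ v                              # (heads, n, head_dim)
    out = out.transpose(1, 0, 2).reshape(n, d)  # concatenate the heads
    return out @ w_out, attn

rng = np.random.default_rng(0)
n, d, heads = 16, 32, 4                 # 16 patches, 32-dim embeddings
x = rng.normal(size=(n, d))
w_qkv = rng.normal(size=(d, 3 * d)) * 0.1   # stand-in for learned weights
w_out = rng.normal(size=(d, d)) * 0.1
y, attn = multi_head_self_attention(x, w_qkv, w_out, heads)
print(y.shape, attn.shape)
```

Each row of every head's attention matrix sums to 1, so each output token is a weighted average of the value vectors for that head.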

Layer Scale

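Technique introduced for deep ViTs (CaiT) in which the output of each residual branch is multiplied by a learnable per-channel scale, initialized to a small value, before being added back to the input. This stabilizes training as the number of layers grows.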
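
A small sketch of where the scale sits in a residual block. The class name, the initial value of 1e-4, and the stand-in branch are assumptions chosen for illustration; in real training `gamma` would be a learned parameter.

```python
import numpy as np

class LayerScale:
    """Per-channel scale applied to a residual branch output."""

    def __init__(self, dim, init_value=1e-4):
        # Assumed small initial value; learned during real training.
        self.gamma = np.full(dim, init_value)

    def __call__(self, x):
        return x * self.gamma

def residual_block(x, branch, scale):
    # The branch output is scaled before being added to the skip path.
    return x + scale(branch(x))

dim = 8
scale = LayerScale(dim)
x = np.ones((4, dim))
branch = lambda t: 10.0 * t   # stand-in for an attention or MLP branch
y = residual_block(x, branch, scale)
print(y[0, 0])
```

With a small initial scale, every block starts close to the identity function, which is the property that helps deep stacks train stably.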

Windowed Attention

Attention mechanism restricted to local, non-overlapping windows of the image. With a fixed window size of M×M patches, the cost drops from O(n²) for global attention to O(n·M²), which is linear in the number of patches n.
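
The sketch below partitions a patch grid into windows and counts query-key pairs for both schemes. The function names and the window size of 7 are assumptions for illustration.

```python
import numpy as np

def window_partition(x, m):
    """Split an (h, w, c) patch grid into non-overlapping m x m windows.

    Returns an array of shape (num_windows, m * m, c).
    """
    h, w, c = x.shape
    x = x.reshape(h // m, m, w // m, m, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, m * m, c)

def attention_pairs_global(h, w):
    n = h * w
    return n * n                      # every patch attends to every patch

def attention_pairs_windowed(h, w, m):
    num_windows = (h // m) * (w // m)
    return num_windows * (m * m) ** 2  # equals n * m^2

x = np.arange(8 * 8 * 3, dtype=float).reshape(8, 8, 3)
windows = window_partition(x, 4)
print(windows.shape)

for side in (14, 28, 56):
    print(side * side,
          attention_pairs_global(side, side),
          attention_pairs_windowed(side, side, 7))
```

Attention is then computed independently inside each window, so doubling the number of patches doubles the windowed cost instead of quadrupling it.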

Shifted Window Attention

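Technique, used in the Swin Transformer, in which the window grid is shifted between consecutive layers so that patches from neighboring windows share a window in the next layer. These cross-window connections improve the model's ability to capture long-range relationships.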
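
A sketch of the shift as a cyclic roll of the patch grid, assuming an 8×8 grid, 4×4 windows, and a shift of half a window. The helper names are hypothetical, and the attention mask that Swin applies to wrapped-around regions is omitted.

```python
import numpy as np

def window_groups(h, w, m, shift):
    """Return the patch ids grouped by window after a cyclic shift."""
    ids = np.arange(h * w).reshape(h, w)
    # Cyclic shift of the grid toward the top-left by `shift` patches.
    shifted = np.roll(ids, shift=(-shift, -shift), axis=(0, 1))
    groups = shifted.reshape(h // m, m, w // m, m).transpose(0, 2, 1, 3)
    return groups.reshape(-1, m * m)

def window_of(groups, patch_id):
    # Index of the window that contains the given patch.
    return int(np.argwhere(groups == patch_id)[0, 0])

regular = window_groups(8, 8, 4, shift=0)
shifted = window_groups(8, 8, 4, shift=2)

a, b = 3 * 8 + 3, 4 * 8 + 4   # patches (3, 3) and (4, 4), diagonal neighbors
print(window_of(regular, a), window_of(regular, b))
print(window_of(shifted, a), window_of(shifted, b))
```

The two patches sit in different windows in the regular layout but share one after the shift, which is how information crosses window borders from one layer to the next.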

DeiT (Data-efficient Image Transformer)

Variant of ViT that can be trained on more modest amounts of data through a knowledge distillation strategy: a distillation token is added to the input sequence and learns from a teacher model, typically a CNN.

Distillation Token

Additional token in DeiT, processed alongside the class token and the patch tokens, whose output is trained to match the predictions of a teacher model (often a CNN). It facilitates knowledge transfer and improves performance with less data.
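
A sketch of the hard-label variant of this objective, assuming the class head is trained on the true label and the distillation head on the teacher's predicted class, with equal weights. The logits below are made-up toy values and the function names are illustrative.

```python
import numpy as np

def cross_entropy(logits, target):
    # Numerically stable log-softmax followed by negative log-likelihood.
    logits = logits - logits.max()
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[target]

def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, label):
    # The distillation head is supervised by the teacher's argmax.
    teacher_label = int(np.argmax(teacher_logits))
    return (0.5 * cross_entropy(cls_logits, label)
            + 0.5 * cross_entropy(dist_logits, teacher_label))

cls_logits = np.array([3.0, 0.0, 0.0])      # output at the class token
dist_logits = np.array([0.0, 2.0, 0.0])     # output at the distillation token
teacher_logits = np.array([0.0, 5.0, 0.0])  # teacher prefers class 1
label = 0                                   # ground-truth class

loss = hard_distillation_loss(cls_logits, dist_logits, teacher_logits, label)
print(loss)
```

Because the two heads have separate targets, the distillation token can follow the teacher even where the teacher disagrees with the ground-truth label.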

Masked Autoencoder (MAE)

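Self-supervised approach for ViT in which a large fraction of randomly chosen image patches (typically 75%) is masked. The encoder processes only the visible patches, and a lightweight decoder learns to reconstruct the masked ones, which yields strong representations without labels.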
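
A sketch of the masking step and of a reconstruction loss restricted to masked patches. The function names and the 196×16 toy input are assumptions, and the encoder and decoder themselves are left out.

```python
import numpy as np

def random_masking(patches, mask_ratio=0.75, rng=None):
    """Keep a random subset of patches; return them with their indices and a mask."""
    if rng is None:
        rng = np.random.default_rng()
    n = patches.shape[0]
    num_keep = int(n * (1 - mask_ratio))
    keep_idx = np.sort(rng.permutation(n)[:num_keep])
    mask = np.ones(n, dtype=bool)   # True marks a masked (hidden) patch
    mask[keep_idx] = False
    return patches[keep_idx], keep_idx, mask

def masked_mse(pred, target, mask):
    # Reconstruction error averaged over masked patches only.
    per_patch = ((pred - target) ** 2).mean(axis=-1)
    return per_patch[mask].mean()

rng = np.random.default_rng(0)
patches = rng.normal(size=(196, 16))   # 14 x 14 patches, toy 16-dim content
visible, keep_idx, mask = random_masking(patches, 0.75, rng)
print(visible.shape, int(mask.sum()))

# A trivial "prediction" of zeros, just to exercise the loss.
print(masked_mse(np.zeros_like(patches), patches, mask))
```

Only the visible quarter of the patches would be fed to the encoder, which is what makes pre-training with a high masking ratio comparatively cheap.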

Patch Merging

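Operation in hierarchical transformers that concatenates the features of each group of 2×2 adjacent patches into a single lower-resolution token, usually followed by a linear projection. Each merge halves the spatial resolution while increasing the channel dimension and the receptive field.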
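
A sketch of the merge on an (h, w, c) grid of tokens, assuming a 4c to 2c linear reduction after concatenation. The random projection stands in for learned weights, and any normalization layer is omitted.

```python
import numpy as np

def merge_2x2(x):
    """Concatenate each 2 x 2 neighborhood: (h, w, c) -> (h/2, w/2, 4c)."""
    top_left = x[0::2, 0::2]
    bottom_left = x[1::2, 0::2]
    top_right = x[0::2, 1::2]
    bottom_right = x[1::2, 1::2]
    return np.concatenate(
        [top_left, bottom_left, top_right, bottom_right], axis=-1)

def patch_merging(x, w_reduce):
    # Linear projection from 4c channels down to 2c (assumed ratio).
    return merge_2x2(x) @ w_reduce

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 8))             # 8 x 8 tokens with 8 channels
w_reduce = rng.normal(size=(32, 16)) * 0.1  # stand-in for learned weights
y = patch_merging(x, w_reduce)
print(x.shape, "->", y.shape)
```

The token count drops by a factor of four while the channel width doubles, giving the pyramid-like feature hierarchy familiar from CNNs.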

Relative Position Bias

Learned bias added to attention scores that depends only on the relative position of each pair of patches, improving the model's ability to capture spatial relationships without absolute position encoding.
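
A sketch of one common way to implement this for an m×m window: a table with one entry per possible 2D offset, and an index that maps every pair of patches to its entry. The single-head table with random values and the window size of 3 are assumptions for illustration.

```python
import numpy as np

def relative_position_index(m):
    """Map each (query, key) pair in an m x m window to a bias-table entry."""
    coords = np.stack(np.meshgrid(np.arange(m), np.arange(m), indexing="ij"))
    flat = coords.reshape(2, -1)                # (2, m*m) row and column ids
    rel = flat[:, :, None] - flat[:, None, :]   # offsets in [-(m-1), m-1]
    rel = rel + (m - 1)                         # shift to [0, 2m-2]
    return rel[0] * (2 * m - 1) + rel[1]        # (m*m, m*m) flat index

m = 3
rng = np.random.default_rng(0)
bias_table = rng.normal(size=(2 * m - 1) ** 2)  # stand-in for learned values
index = relative_position_index(m)
bias = bias_table[index]                        # (m*m, m*m)

scores = rng.normal(size=(m * m, m * m))        # toy attention logits
scores_with_bias = scores + bias                # added before the softmax
print(index.shape, bias.shape)
```

Pairs of patches separated by the same offset share one table entry, so the bias is the same wherever in the window that pair occurs.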

Hybrid Architecture

Approach combining an initial convolutional network for feature extraction with a transformer for global processing, used in early ViT implementations to reduce data requirements.
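
A sketch of the hand-off between the two stages: each spatial location of the convolutional feature map becomes one transformer token. Note that `toy_backbone` is only a placeholder (strided average pooling), not a real CNN, and all names and sizes are assumptions.

```python
import numpy as np

def toy_backbone(image, stride=16):
    # Placeholder for a CNN: average-pool stride x stride blocks.
    h, w, c = image.shape
    blocks = image.reshape(h // stride, stride, w // stride, stride, c)
    return blocks.mean(axis=(1, 3))              # (h/stride, w/stride, c)

def feature_map_to_tokens(feature_map, w_proj):
    h, w, c = feature_map.shape
    tokens = feature_map.reshape(h * w, c)       # one token per location
    return tokens @ w_proj                       # project to embedding size

rng = np.random.default_rng(0)
image = rng.normal(size=(224, 224, 3))
w_proj = rng.normal(size=(3, 64)) * 0.1          # stand-in for learned weights

feature_map = toy_backbone(image)
tokens = feature_map_to_tokens(feature_map, w_proj)
print(feature_map.shape, "->", tokens.shape)
```

The resulting token sequence would then be passed to the transformer encoder in place of linearly projected raw patches.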

Token Labeling

Training strategy in which each patch token receives its own supervised label, typically a soft label produced by a pretrained annotator model, in addition to the image-level label. This pushes the model to learn richer and more localized representations.
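
A sketch of the combined objective, assuming an image-level loss on the class token plus a weighted average of per-patch losses against soft labels. The weight `beta`, the random logits, and the randomly generated soft labels are assumptions for illustration.

```python
import numpy as np

def soft_cross_entropy(logits, target_probs):
    # Cross-entropy against a probability vector (one-hot or soft).
    logits = logits - logits.max(axis=-1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -(target_probs * log_probs).sum(axis=-1)

def token_labeling_loss(cls_logits, patch_logits, image_label, patch_labels,
                        beta=0.5):
    cls_loss = soft_cross_entropy(cls_logits, image_label)
    patch_loss = soft_cross_entropy(patch_logits, patch_labels).mean()
    return cls_loss + beta * patch_loss

rng = np.random.default_rng(0)
num_patches, num_classes = 196, 10
cls_logits = rng.normal(size=num_classes)
patch_logits = rng.normal(size=(num_patches, num_classes))
image_label = np.eye(num_classes)[3]                    # one-hot image label
patch_labels = rng.dirichlet(np.ones(num_classes), size=num_patches)

loss = token_labeling_loss(cls_logits, patch_logits, image_label, patch_labels)
print(loss)
```

Setting `beta` to zero recovers ordinary image-level training, which makes the contribution of the per-patch term easy to isolate.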
