
AI Glossary

The complete dictionary of Artificial Intelligence

162 Categories · 2,032 Subcategories · 23,060 Terms
📖 Terms

Relative Position Encoding

Positional encoding technique based on relative distances between tokens rather than their absolute positions. Improves the model's ability to generalize to sequence lengths not seen during training.
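A minimal NumPy sketch of one common variant (a learned bias per clipped relative distance added to the attention logits, in the spirit of Shaw et al. and T5); all names and shapes here are illustrative, not any library's API:

```python
import numpy as np

def attention_with_relative_bias(q, k, v, rel_bias, max_dist):
    # q, k, v: (n, d); rel_bias: (2 * max_dist + 1,) learned scalars,
    # one per clipped relative distance (hypothetical shapes).
    n, d = q.shape
    logits = q @ k.T / np.sqrt(d)                  # (n, n) content scores
    # relative distance j - i at [i, j], clipped to [-max_dist, max_dist]
    idx = np.clip(np.arange(n)[None, :] - np.arange(n)[:, None],
                  -max_dist, max_dist) + max_dist
    logits += rel_bias[idx]                        # add distance-based bias
    w = np.exp(logits - logits.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)                  # row-wise softmax
    return w @ v

n, d, max_dist = 8, 16, 4
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
out = attention_with_relative_bias(
    q, k, v, rng.standard_normal(2 * max_dist + 1), max_dist)
```

Because the bias depends only on j - i, the same table is reused at every position, which is what lets the scheme extrapolate to longer sequences.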

Rotary Position Embedding (RoPE)

Positional encoding method that applies a rotation to query and key embeddings based on their positions. Naturally integrates positional information into the attention mechanism without adding parameters.
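A compact NumPy sketch of the rotation itself; the base of 10000 and the pairing of adjacent feature dimensions follow the usual convention, but the function is illustrative rather than any library's API:

```python
import numpy as np

def rope(x, base=10000.0):
    # x: (n, d) with even d; rotate each consecutive pair of features
    # by a position-dependent angle.
    n, d = x.shape
    pos = np.arange(n)[:, None]                    # (n, 1) positions
    freqs = base ** (-np.arange(0, d, 2) / d)      # (d/2,) frequencies
    theta = pos * freqs                            # (n, d/2) angles
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[:, 0::2], x[:, 1::2]                # split into pairs
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin             # 2-D rotation per pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

q = np.random.randn(6, 8)
k = np.random.randn(6, 8)
# Rotating both q and k makes their dot product depend only on the
# relative position, which is the point of RoPE.
scores = rope(q) @ rope(k).T
```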

Linear Attention

Family of attention mechanisms with linear O(n) complexity that use matrix decompositions or kernel feature maps to avoid explicitly computing the attention matrix. Enables processing of very long sequences with greatly improved computational efficiency.
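A minimal NumPy sketch under an assumed simple feature map (ReLU plus a small epsilon; real implementations choose other kernels):

```python
import numpy as np

def linear_attention(q, k, v, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Kernelized attention: instead of softmax(Q K^T) V, compute
    # phi(Q) @ (phi(K)^T @ V), so the n x n attention matrix never
    # materializes and cost is linear in sequence length n.
    qp, kp = phi(q), phi(k)          # (n, d) feature-mapped queries/keys
    kv = kp.T @ v                    # (d, d) summary of keys and values
    z = qp @ kp.sum(axis=0)          # (n,) per-query normalizer
    return (qp @ kv) / z[:, None]

n, d = 1024, 64
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
out = linear_attention(q, k, v)      # O(n * d^2) rather than O(n^2 * d)
```

The reordering (K^T V first) is what turns the quadratic computation into a linear one; it is exact for the chosen kernel, and only an approximation of softmax attention.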

Longformer Attention

Hybrid architecture combining local attention via a sliding window with global attention for selected tokens. Enables processing of documents several thousand tokens long with linear complexity.
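A toy NumPy sketch of the resulting attention mask; `longformer_mask` and its parameters are hypothetical names for illustration:

```python
import numpy as np

def longformer_mask(n, window, global_idx):
    # Boolean mask: True where attention is allowed. A local band of
    # half-width `window` plus full rows/columns for the global tokens.
    i, j = np.indices((n, n))
    mask = np.abs(i - j) <= window      # sliding-window band
    mask[global_idx, :] = True          # global tokens attend everywhere
    mask[:, global_idx] = True          # and every token attends to them
    return mask

mask = longformer_mask(10, window=2, global_idx=[0])
# Apply by setting disallowed logits to -inf before the softmax:
# scores = np.where(mask, scores, -np.inf)
```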

BigBird Attention

Sparse attention mechanism combining three types of connections (random, local, and global) to approximate full attention. Theoretically shown to be a universal approximator of sequence functions while retaining linear complexity.
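A toy NumPy sketch combining the three connection types into one boolean mask (all names and the tiny sizes are illustrative):

```python
import numpy as np

def bigbird_mask(n, window=1, n_global=1, n_random=2, seed=0):
    # Toy version of the three BigBird connection types:
    # a local band, a few global tokens, and random links per query row.
    rng = np.random.default_rng(seed)
    i, j = np.indices((n, n))
    mask = np.abs(i - j) <= window                  # local band
    mask[:n_global, :] = True                       # global rows
    mask[:, :n_global] = True                       # global columns
    for row in range(n):                            # random links
        mask[row, rng.choice(n, size=n_random, replace=False)] = True
    return mask

mask = bigbird_mask(12)
```

Each row stays O(1) in size, so the whole pattern has O(n) nonzero entries; the random links keep the connection graph well-mixed, which underpins the approximation guarantees.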

Reformer Attention

Efficient implementation using locality-sensitive hashing (LSH) to restrict attention to similar tokens only. Drastically reduces complexity, to O(n log n), while preserving important semantic relationships.
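A simplified NumPy sketch of the angular-LSH bucketing step; the chunking, multi-round hashing, and causal details of the real Reformer are omitted:

```python
import numpy as np

def lsh_buckets(x, n_buckets, seed=0):
    # Angular LSH via a random projection: vectors pointing in similar
    # directions tend to land in the same bucket.
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((x.shape[1], n_buckets // 2))
    h = x @ proj
    return np.argmax(np.concatenate([h, -h], axis=1), axis=1)

x = np.random.randn(16, 8)     # shared query/key vectors, as in Reformer
buckets = lsh_buckets(x, n_buckets=4)
# Token i then attends only to tokens j with buckets[j] == buckets[i],
# instead of to all 16 positions.
```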

Linformer Attention

Low-dimensional projection of the key and value matrices that reduces complexity from O(n²) to O(n). Based on the observation that attention matrices are approximately low-rank in many practical scenarios.
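A minimal NumPy sketch, with random projection matrices E and F standing in for the learned ones:

```python
import numpy as np

def linformer_attention(q, k, v, E, F):
    # E, F: (r, n) projections compressing K and V along the sequence
    # axis, so the score matrix is (n, r) instead of (n, n).
    d = q.shape[1]
    k_proj, v_proj = E @ k, F @ v            # (r, d)
    scores = q @ k_proj.T / np.sqrt(d)       # (n, r)
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)
    return w @ v_proj                        # (n, d)

n, d, r = 32, 16, 8
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
E = rng.standard_normal((r, n)) / n
F = rng.standard_normal((r, n)) / n
out = linformer_attention(q, k, v, E, F)
```

With a fixed projection length r, cost grows as O(n · r) in sequence length; the approximation is good exactly when the full attention matrix is close to rank r.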

Kernel Attention

Approach that replaces the softmax with positive kernel functions to achieve linear complexity. Enables efficient approximations while preserving the mathematical properties of attention.
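As one concrete instance, a simplified sketch of Performer-style positive random features, which approximate the softmax kernel exp(q·k) in expectation (unnormalized and illustrative):

```python
import numpy as np

def positive_random_features(x, n_features, seed=0):
    # Positive feature map: in expectation, phi(q) . phi(k)
    # approximates exp(q . k), the (unnormalized) softmax kernel.
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((x.shape[1], n_features))
    return (np.exp(x @ w - (x ** 2).sum(-1, keepdims=True) / 2)
            / np.sqrt(n_features))

q = np.random.randn(16, 8)
k = np.random.randn(16, 8)
# Both calls share seed=0, i.e. the same random features for q and k.
approx = positive_random_features(q, 256) @ positive_random_features(k, 256).T
exact = np.exp(q @ k.T)   # approx converges to this as n_features grows
```

Such a feature map can be plugged into the linear-attention routine sketched above, giving a linear-time approximation of softmax attention itself.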

Adaptive Attention Span

Mechanism in which each attention head dynamically learns its optimal span during training. Optimizes compute by concentrating attention where it is needed, according to the learned patterns.
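A rough NumPy sketch of the soft span mask used by this idea, assuming the clip-based ramp from Sukhbaatar et al.; names and the toy spans are illustrative:

```python
import numpy as np

def soft_span_mask(n, z, ramp=4.0):
    # Soft masking: a token at distance x gets weight
    # clip((ramp + z - x) / ramp, 0, 1), where z is the span learned
    # per head and `ramp` smooths the cutoff so z stays differentiable.
    dist = np.abs(np.arange(n)[None, :] - np.arange(n)[:, None])
    return np.clip((ramp + z - dist) / ramp, 0.0, 1.0)

# Two heads with different learned spans; the mask multiplies the
# attention weights (before renormalization), smoothly cutting off
# tokens beyond each head's span.
short_head = soft_span_mask(10, z=2.0)
long_head = soft_span_mask(10, z=6.0)
```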
