
AI Glossary

The complete dictionary of Artificial Intelligence

162 Categories · 2,032 Subcategories · 23,060 Terms
📖
Terms

Additive Attention

Proposed by Bahdanau, this method combines the decoder's hidden state and encoder outputs through a feed-forward network to calculate attention weights.
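This Bahdanau-style scoring can be sketched in NumPy. The parameter names `W1`, `W2`, `v` and the dimensions are illustrative stand-ins for learned weights, not a reference implementation:

```python
import numpy as np

def additive_scores(dec_state, enc_outputs, W1, W2, v):
    # score_i = v^T tanh(W1 @ s + W2 @ h_i)  (Bahdanau-style additive scoring)
    proj = np.tanh(enc_outputs @ W2.T + dec_state @ W1.T)  # (T, d_att)
    scores = proj @ v                                      # (T,)
    e = np.exp(scores - scores.max())                      # stable softmax
    return e / e.sum()                                     # weights sum to 1

rng = np.random.default_rng(0)
T, d_h, d_att = 5, 8, 16
w = additive_scores(rng.normal(size=d_h),
                    rng.normal(size=(T, d_h)),
                    rng.normal(size=(d_att, d_h)),
                    rng.normal(size=(d_att, d_h)),
                    rng.normal(size=d_att))
```

The feed-forward network here is a single `tanh` layer followed by a projection onto `v`, which is the form used in the original encoder-decoder attention.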


Multiplicative Attention

Introduced by Luong, this method calculates attention scores as the dot product between the decoder state and the encoder outputs, offering a more efficient implementation than additive attention.
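A minimal NumPy sketch of the Luong "dot" variant (shapes are illustrative; the general "multiplicative" form inserts a learned matrix between the two operands):

```python
import numpy as np

def dot_product_scores(dec_state, enc_outputs):
    # Luong "dot" scoring: score_i = s . h_i, then softmax over positions.
    scores = enc_outputs @ dec_state          # (T,)
    e = np.exp(scores - scores.max())         # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(1)
weights = dot_product_scores(rng.normal(size=8), rng.normal(size=(5, 8)))
```

The efficiency gain over the additive variant comes from replacing a feed-forward network with a single matrix-vector product.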


Multi-Head Attention

Extension of self-attention using multiple attention heads in parallel to capture different types of relationships in the data.
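The head-splitting idea can be sketched as follows. For brevity this uses identity projections (real multi-head attention applies learned Q/K/V and output projections per head), so it only illustrates how heads partition the feature dimension and run in parallel:

```python
import numpy as np

def multi_head_self_attention(X, n_heads):
    # Illustrative sketch: each head attends over its own slice of features.
    T, d = X.shape
    d_head = d // n_heads
    heads = []
    for h in range(n_heads):
        Qh = Kh = Vh = X[:, h * d_head:(h + 1) * d_head]  # identity projections
        s = Qh @ Kh.T / np.sqrt(d_head)
        s -= s.max(axis=-1, keepdims=True)
        w = np.exp(s)
        w /= w.sum(axis=-1, keepdims=True)
        heads.append(w @ Vh)
    return np.concatenate(heads, axis=-1)                 # (T, d)

rng = np.random.default_rng(3)
Y = multi_head_self_attention(rng.normal(size=(5, 8)), n_heads=2)
```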


Context Vector

Weighted representation of encoder outputs, calculated using attention weights and provided to the decoder as contextual information.
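Concretely, the context vector is just the attention-weighted sum of the encoder states, as in this small hand-checkable example (values are arbitrary):

```python
import numpy as np

enc_outputs = np.arange(12.0).reshape(4, 3)   # T=4 encoder states, d=3
weights = np.array([0.1, 0.2, 0.3, 0.4])      # attention weights, sum to 1
context = weights @ enc_outputs               # (3,) -> [6.0, 7.0, 8.0]
```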


Scaled Dot-Product Attention

Attention variant used in Transformers where the dot product is divided by the square root of the dimension to stabilize training.
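The formula softmax(QKᵀ/√d_k)V translates directly into NumPy; shapes below are illustrative:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (T_q, T_k)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V                                  # (T_q, d_v)

rng = np.random.default_rng(2)
out = scaled_dot_product_attention(rng.normal(size=(3, 4)),
                                   rng.normal(size=(6, 4)),
                                   rng.normal(size=(6, 4)))
```

Dividing by √d_k keeps the dot products from growing with the dimension, which would otherwise push the softmax into regions with vanishing gradients.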


Global Attention

Attention mechanism considering all positions of the source sequence to calculate the context vector at each decoding step.


Local Attention

Attention variant considering only a subset of predicted positions around a central position, reducing computational complexity.


Hierarchical Attention

Multi-level architecture applying attention at different granularities, first at the word level then at the sentence or document level.


Query, Key, Value

Triple of fundamental vectors in attention: the Query represents the current request, the Keys index the available items to match against, and the Values hold the content to retrieve.


Temporal Attention

Mechanism specialized in capturing temporal dependencies in time series by weighting relevant time steps.


Spatial Attention

Application of attention to spatial data (images, videos) to focus on the most informative regions in space.


Adaptive Attention

Approach where the attention mechanism dynamically adjusts during training to optimize its parameters according to the task.


Sparse Attention

Attention variant that computes weights only for a subset of positions, enabling efficient processing of longer sequences.


Attention Mask

Technique that masks certain positions to prevent attention on irrelevant tokens such as padding or future tokens.
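A common instance is the causal mask used in autoregressive decoders: scores for future positions are set to −∞ before the softmax so their weights become exactly zero. A small NumPy sketch (uniform scores, for clarity):

```python
import numpy as np

T = 4
scores = np.zeros((T, T))                          # placeholder raw scores
mask = np.triu(np.ones((T, T), dtype=bool), k=1)   # True above the diagonal
scores[mask] = -np.inf                             # forbid future positions
w = np.exp(scores - scores.max(axis=-1, keepdims=True))
w /= w.sum(axis=-1, keepdims=True)
# Row i now attends uniformly over positions 0..i and puts 0 on the rest.
```

The same mechanism, with a different boolean mask, zeroes out attention on padding tokens.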


Linear Attention

Approximation of standard attention with linear complexity rather than quadratic, enabling processing of much longer sequences.
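One family of linear-attention methods replaces the softmax with a feature map φ and reorders the computation so no T×T matrix is ever formed. A hedged NumPy sketch, assuming the commonly used φ(x) = elu(x) + 1 feature map:

```python
import numpy as np

def linear_attention(Q, K, V):
    # phi(Q) @ (phi(K)^T V), computed right-to-left:
    # O(T * d * d_v) instead of the O(T^2 * d) of standard attention.
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1, positive
    Qp, Kp = phi(Q), phi(K)
    kv = Kp.T @ V                        # (d, d_v) -- no T x T matrix formed
    z = Qp @ Kp.sum(axis=0)              # (T_q,) per-row normalizer
    return (Qp @ kv) / z[:, None]

rng = np.random.default_rng(4)
out = linear_attention(rng.normal(size=(6, 4)),
                       rng.normal(size=(6, 4)),
                       rng.normal(size=(6, 3)))
```

Because φ is positive, each output row is still a convex combination of the value rows, mimicking softmax attention while scaling linearly in sequence length.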


Performer Attention

Variant using feature mapping kernels to approximate attention with efficient linear complexity in memory and computation.
