AI Glossary

The complete dictionary of Artificial Intelligence

162 categories · 2,032 subcategories · 23,060 terms

Attention Mechanism

Mathematical foundation allowing models to weight the relative importance of elements in a data sequence.

5 terms
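
As a minimal sketch of the weighting idea, the snippet below scores each key against a query with a dot product, turns the scores into weights with a softmax, and returns the weighted average of the values. The function names `softmax` and `attend` and the toy vectors are illustrative assumptions, not part of the glossary.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Weight each value by how well its key matches the query."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy data (hypothetical): three elements with 2-dimensional keys and values.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attend([1.0, 0.0], keys, values)
```

The first and third keys match the query equally well, so they receive equal weights, and the second key receives the smallest one.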

Self-Attention

Mechanism in which each element of a sequence computes attention weights over every element of the same sequence, itself included.

0 terms
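
The sketch below illustrates the definition by building the full weight matrix for one sequence: row i holds the weights that element i assigns to every element, itself included. It assumes identity projections (queries and keys are the raw vectors), whereas a real Transformer layer would apply learned projections first; `self_attention_weights` is a hypothetical name.

```python
import math

def self_attention_weights(sequence):
    """Return an n x n matrix: row i is element i's attention over the sequence."""
    d = len(sequence[0])
    matrix = []
    for query in sequence:
        # Assumption: queries and keys are the raw vectors (no learned projections).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in sequence]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        matrix.append([e / total for e in exps])
    return matrix

sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matrix = self_attention_weights(sequence)
```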

Multi-Head Attention

Extension of attention that runs several attention heads in parallel, each on its own projection of the input, to capture different types of relationships.

3 terms
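
A rough sketch of the head-splitting idea: the embedding is cut into equal slices, attention runs independently on each slice, and the per-head results are concatenated. This is a simplification; it assumes plain slicing in place of the learned per-head projections and output projection a trained model would use, and all names are illustrative.

```python
import math

def head_attention(queries, keys, values):
    """Scaled dot-product attention for a single head."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        outputs.append([sum(e / total * v[i] for e, v in zip(exps, values))
                        for i in range(len(values[0]))])
    return outputs

def multi_head_self_attention(sequence, num_heads):
    """Run one attention head per embedding slice, then concatenate."""
    d_model = len(sequence[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads
    head_outputs = []
    for h in range(num_heads):
        # Assumption: slicing stands in for learned per-head projections.
        chunk = [tok[h * d_head:(h + 1) * d_head] for tok in sequence]
        head_outputs.append(head_attention(chunk, chunk, chunk))
    return [sum((head[t] for head in head_outputs), [])
            for t in range(len(sequence))]

sequence = [[1.0, 0.0, 0.5, 2.0], [0.0, 1.0, 1.5, 0.0], [1.0, 1.0, 0.0, 1.0]]
result = multi_head_self_attention(sequence, num_heads=2)
```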

Positional Encoding

Technique for incorporating the sequential position of elements into embeddings without using an RNN.

12 terms
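
One widely used instance is the fixed sinusoidal encoding from the original Transformer, sketched below. Even dimensions use a sine and odd dimensions a cosine, with wavelengths that grow geometrically; the function name and the small `d_model` are illustrative choices, and learned or rotary encodings are equally valid alternatives.

```python
import math

def sinusoidal_encoding(position, d_model):
    """Fixed sinusoidal position vector of length d_model."""
    encoding = []
    for i in range(d_model):
        # Each sine/cosine pair (2k, 2k+1) shares one frequency.
        angle = position / (10000 ** ((2 * (i // 2)) / d_model))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding

first = sinusoidal_encoding(0, 4)   # → [0.0, 1.0, 0.0, 1.0]
second = sinusoidal_encoding(1, 4)
```

The resulting vector is typically added to the token embedding so that otherwise identical tokens at different positions become distinguishable.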

Encoder-Decoder Architecture

Structure of the original Transformer that separates input processing (encoder) from output generation (decoder).

2 terms

Attention Scaling

Division of attention scores by the square root of the key dimensionality, which keeps the softmax out of its saturated region, avoids vanishing gradients, and stabilizes training.

14 terms
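
The motivation for the square-root factor can be checked numerically. Assuming query and key components are independent with zero mean and unit variance, their dot product has variance close to the dimension d_k, and dividing by sqrt(d_k) brings it back to roughly 1. The helper below is a hypothetical experiment with a fixed seed, not a library routine.

```python
import math
import random

def dot_product_variance(d_k, samples=2000, seed=0):
    """Estimate the variance of q.k before and after dividing by sqrt(d_k)."""
    rng = random.Random(seed)
    raw = []
    for _ in range(samples):
        # Assumption: components are i.i.d. standard normal.
        q = [rng.gauss(0.0, 1.0) for _ in range(d_k)]
        k = [rng.gauss(0.0, 1.0) for _ in range(d_k)]
        raw.append(sum(a * b for a, b in zip(q, k)))
    mean = sum(raw) / samples
    raw_var = sum((x - mean) ** 2 for x in raw) / samples
    scaled = [x / math.sqrt(d_k) for x in raw]
    scaled_mean = sum(scaled) / samples
    scaled_var = sum((x - scaled_mean) ** 2 for x in scaled) / samples
    return raw_var, scaled_var

raw_var, scaled_var = dot_product_variance(64)
```

With d_k = 64 the unscaled variance lands near 64 while the scaled variance stays near 1, so the softmax inputs keep a comparable spread regardless of dimension.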

Cross-Attention

Attention mechanism between two different sequences, used in translation and multimodal tasks.

8 terms
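
In the sketch below, queries come from one sequence (here called the target) while keys and values come from another (the source), so the two sequences may have different lengths. It again assumes identity projections for brevity, and the names `cross_attention`, `target_states`, and `source_states` are illustrative.

```python
import math

def cross_attention(target_states, source_states):
    """Each target element attends over all source elements."""
    d = len(source_states[0])
    outputs = []
    for query in target_states:
        # Keys and values both come from the source sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in source_states]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        outputs.append([sum(e / total * src[i] for e, src in zip(exps, source_states))
                        for i in range(d)])
    return outputs

source = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]   # e.g. encoded input tokens
target = [[1.0, 1.0], [0.0, 3.0]]               # e.g. decoder states
mixed = cross_attention(target, source)
```

The output has one vector per target element, each a weighted mix of source vectors.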

Sparse Attention

Variant of attention computed only on a subset of positions to reduce computational complexity.

3 terms
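
One common sparsity pattern is a local sliding window, sketched here as a list of the positions each query is allowed to attend to. This is only one member of the family (strided, global-token, and block patterns also exist), and `local_window_pattern` is a hypothetical helper.

```python
def local_window_pattern(seq_len, window):
    """For each position, list the neighbours within `window` steps."""
    return [list(range(max(0, i - window), min(seq_len, i + window + 1)))
            for i in range(seq_len)]

pattern = local_window_pattern(seq_len=8, window=1)
sparse_pairs = sum(len(row) for row in pattern)  # → 22
dense_pairs = 8 * 8                              # → 64
```

The number of scored pairs grows linearly with sequence length for a fixed window, instead of quadratically.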

Attention Masks

Control mechanisms that hide certain positions, such as padding or future tokens, during attention computation to prevent information leakage.

9 terms
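
A causal mask is a typical example: scores for future positions are set to negative infinity before the softmax, so they receive exactly zero weight. The sketch assumes a square matrix of precomputed scores, and the function name is illustrative.

```python
import math

def causal_attention_weights(scores):
    """Softmax each row after hiding positions to the right of the diagonal."""
    weights = []
    for i, row in enumerate(scores):
        # Position i may only look at positions 0..i.
        masked = [s if j <= i else float("-inf") for j, s in enumerate(row)]
        m = max(masked)
        exps = [math.exp(s - m) for s in masked]  # exp(-inf) evaluates to 0.0
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

scores = [[0.0, 0.0, 0.0] for _ in range(3)]
weights = causal_attention_weights(scores)
# weights[0] → [1.0, 0.0, 0.0]
```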

Vision Transformers

Adaptation of the Transformer architecture to computer vision tasks by treating images as sequences of patches.

9 terms
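
The "sequence of patches" step can be sketched as follows: a 2D image is cut into non-overlapping square tiles and each tile is flattened into one vector. The single-channel nested-list image and the name `image_to_patches` are simplifying assumptions; a real model would then project each patch linearly and add positional information.

```python
def image_to_patches(image, patch_size):
    """Split a 2D image into flattened, non-overlapping square patches."""
    height, width = len(image), len(image[0])
    assert height % patch_size == 0 and width % patch_size == 0
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patches.append([image[r][c]
                            for r in range(top, top + patch_size)
                            for c in range(left, left + patch_size)])
    return patches

# Toy 4x4 "image" whose pixel values are 0..15 in row-major order.
image = [[4 * r + c for c in range(4)] for r in range(4)]
patches = image_to_patches(image, patch_size=2)
# patches[0] → [0, 1, 4, 5]
```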

Efficient Attention

Set of optimizations aimed at reducing the quadratic complexity of standard attention for longer sequences.

2 terms
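
As one possible example of such an optimization, the sketch below implements kernelized (linear) attention with the feature map elu(x) + 1. Keys and values are summarized once into a small matrix, and each query reuses that summary, so cost grows linearly with sequence length. Whether this particular method falls under this glossary category is an assumption, and the function names are illustrative.

```python
import math

def feature_map(x):
    """elu(v) + 1 applied elementwise; always positive."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def linear_attention(queries, keys, values):
    """Non-causal linear attention: one pass over keys, one over queries."""
    d_k, d_v = len(keys[0]), len(values[0])
    kv = [[0.0] * d_v for _ in range(d_k)]  # sum over j of phi(k_j) v_j^T
    k_sum = [0.0] * d_k                     # sum over j of phi(k_j)
    for k, v in zip(keys, values):
        fk = feature_map(k)
        for a in range(d_k):
            k_sum[a] += fk[a]
            for b in range(d_v):
                kv[a][b] += fk[a] * v[b]
    outputs = []
    for q in queries:
        fq = feature_map(q)
        norm = sum(fq[a] * k_sum[a] for a in range(d_k))
        outputs.append([sum(fq[a] * kv[a][b] for a in range(d_k)) / norm
                        for b in range(d_v)])
    return outputs

queries = [[0.5, -1.0], [2.0, 0.0]]
keys = [[1.0, 0.0], [-0.5, 1.0], [0.0, -2.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
outputs = linear_attention(queries, keys, values)
```

The result matches what the same kernel would give if every query-key pair were scored explicitly; it approximates, rather than reproduces, softmax attention.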

Hierarchical Attention

Multi-level attention structure capturing relationships at different hierarchical scales in the data.

12 terms
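
A two-level sketch of the idea, assuming a document made of sentences made of word vectors: attention first pools the words of each sentence into a sentence vector, then pools the sentence vectors into a document vector. The context vectors stand in for learned parameters, and all names and values are hypothetical.

```python
import math

def attention_pool(vectors, context):
    """Weighted average of vectors, weighted by similarity to a context vector."""
    scores = [sum(a * b for a, b in zip(v, context)) for v in vectors]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    dim = len(vectors[0])
    return [sum(e / total * v[i] for e, v in zip(exps, vectors)) for i in range(dim)]

def hierarchical_pool(document, word_context, sentence_context):
    """Word-level attention inside each sentence, then sentence-level attention."""
    sentence_vectors = [attention_pool(sentence, word_context)
                        for sentence in document]
    return attention_pool(sentence_vectors, sentence_context)

# Toy document: two sentences with 2-dimensional word vectors.
document = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 2.0], [0.0, 0.0], [1.0, 1.0]],
]
doc_vector = hierarchical_pool(document, word_context=[1.0, 0.0],
                               sentence_context=[0.0, 1.0])
```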