AI Glossary

The complete dictionary of Artificial Intelligence

162 categories, 2,032 subcategories, 23,060 terms

Memory Registers

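Fastest memory on the GPU, private to each thread and allocated from the register file of its SM (Streaming Multiprocessor); used to store local variables, normally at no extra clock-cycle cost unless a read-after-write dependency or register bank conflict intervenes.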

Memory Thrashing

Performance degradation in which poorly optimized access patterns repeatedly evict data that is still needed, producing a high rate of cache misses and often compounded by memory bank conflicts.

Memory Bank Conflicts

Contention that arises when several threads of a warp simultaneously access different addresses in the same shared memory bank; the accesses are serialized, which reduces performance.
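
The serialization described above can be estimated with a little arithmetic. The following is a minimal sketch in Python, not a hardware-accurate model: it assumes 32 banks of 4-byte words (a common shared memory layout on NVIDIA GPUs, but an assumption here), and `conflict_degree` is a hypothetical helper name.

```python
from collections import defaultdict

NUM_BANKS = 32   # assumption: 32 shared memory banks
BANK_WIDTH = 4   # assumption: each bank serves one 4-byte word at a time


def conflict_degree(byte_addresses):
    """Return the worst-case number of serialized accesses for one warp.

    Threads that read the same word are served together (broadcast),
    so only distinct words mapped to the same bank count as a conflict.
    """
    words_per_bank = defaultdict(set)
    for address in byte_addresses:
        word = address // BANK_WIDTH
        words_per_bank[word % NUM_BANKS].add(word)
    return max(len(words) for words in words_per_bank.values())


# One address per thread of a 32-thread warp.
stride_1 = [4 * t for t in range(32)]     # consecutive 4-byte words
stride_2 = [8 * t for t in range(32)]     # every second word
stride_32 = [128 * t for t in range(32)]  # every 32nd word

print(conflict_degree(stride_1))   # 1  (conflict-free)
print(conflict_degree(stride_2))   # 2  (2-way conflict)
print(conflict_degree(stride_32))  # 32 (fully serialized)
```

Under these assumptions, a stride of 32 words maps every thread to bank 0, which is why padding a shared array by one element per row is a common remedy.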

Asynchronous Memory Transfer

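CPU-GPU data transfers issued on CUDA streams so that they run in parallel with kernel computations, hiding transfer latency and improving GPU utilization; the host buffers must be pinned (page-locked) for the copies to be truly asynchronous.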

Memory Alignment

Alignment of data structures on specific address boundaries (for example 128, 256, or 512 bits, that is 16, 32, or 64 bytes) so that memory accesses are coalesced into as few full-width transactions as possible.
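
As an illustration of why the boundary matters, here is a minimal Python sketch. The helper names `align_up` and `segments_touched` are hypothetical, and the 128-byte segment size is an assumption chosen for the example rather than a property of any particular GPU.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset + alignment - 1) & ~(alignment - 1)


def segments_touched(base, threads=32, element_bytes=4, segment_bytes=128):
    """Count the memory segments covered by one contiguous warp access.

    Assumes each thread reads one element and the elements are adjacent,
    starting at byte address `base`.
    """
    first = base // segment_bytes
    last = (base + threads * element_bytes - 1) // segment_bytes
    return last - first + 1


print(align_up(100, 64))    # 128
print(segments_touched(0))  # 1: the warp's 128 bytes fit in one segment
print(segments_touched(4))  # 2: a 4-byte misalignment spills into a second segment
```

In this simplified model a misaligned base address doubles the number of segments a warp touches even though the amount of useful data is unchanged.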

Zero-Copy Memory

Technique that lets the GPU access pinned (page-locked) host memory directly by mapping it into the device address space, without an explicit copy, reducing device memory consumption and, for data that is read only once, transfer time.

CUDA Streams

Sequence of operations that execute on the GPU in the order they were issued; operations in different streams may run concurrently, enabling task parallelism and overlap of computation with data transfers to improve resource utilization.

Memory Pool

Pre-allocated block of GPU memory from which allocations and deallocations are served quickly, reducing fragmentation and the cost of dynamic allocation at run time.
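
The idea can be sketched in a few lines of Python. This is a simplified illustration that hands out byte offsets within one pre-allocated region, not a real GPU allocator; the class name `FixedBlockPool` and the fixed block size are assumptions made for the example.

```python
class FixedBlockPool:
    """Hypothetical pool that carves one large region into equal blocks."""

    def __init__(self, total_bytes, block_bytes):
        self.block_bytes = block_bytes
        # Offsets of every block, stored in reverse so pop() returns the lowest first.
        offsets = range(0, total_bytes - block_bytes + 1, block_bytes)
        self.free_offsets = list(offsets)[::-1]

    def allocate(self):
        """Return the offset of a free block in constant time."""
        if not self.free_offsets:
            raise MemoryError("pool exhausted")
        return self.free_offsets.pop()

    def release(self, offset):
        """Return a block to the pool (no double-free checks in this sketch)."""
        self.free_offsets.append(offset)


pool = FixedBlockPool(total_bytes=1024, block_bytes=256)
first = pool.allocate()    # offset 0
second = pool.allocate()   # offset 256
pool.release(first)
reused = pool.allocate()   # offset 0 again, with no new underlying allocation
```

Because every block has the same size, freeing and reallocating never fragments the region, which is the trade-off this kind of pool makes against flexibility in allocation size.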

Memory Prefetching

Loading data into GPU cache memory before it is actually used, hiding memory latency by overlapping data movement with computation.

Memory Paging

Management of memory pages shared between CPU and GPU, with on-demand migration and usage-based eviction, to make the best use of limited GPU memory.

CUDA Unified Virtual Addressing

Single virtual address space shared by host and device memory, so that a pointer value identifies where its memory resides, pointers remain valid on both CPU and GPU, and copy operations can infer the transfer direction.

Memory Occupancy

Ratio of active warps on an SM to the maximum number of warps the SM supports, limited by per-thread register use and per-block shared memory use; it determines the achievable level of parallelism and how efficiently GPU resources are used.
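
A rough estimate of this ratio can be computed from a kernel's resource usage. The Python sketch below is a simplified model under stated assumptions: the per-SM limits are illustrative defaults rather than the specification of any particular GPU, allocation granularity is ignored, and `occupancy` is a hypothetical helper name.

```python
def occupancy(threads_per_block, regs_per_thread, smem_per_block, *,
              max_warps_per_sm=64,   # assumed hardware limit
              regs_per_sm=65536,     # assumed register file size (32-bit registers)
              smem_per_sm=49152,     # assumed shared memory per SM, in bytes
              max_blocks_per_sm=32,  # assumed hardware limit
              warp_size=32):
    """Estimate active warps / maximum warps for one SM."""
    warps_per_block = -(-threads_per_block // warp_size)  # ceiling division

    # How many blocks fit under each resource limit.
    by_warps = max_warps_per_sm // warps_per_block
    regs_per_block = regs_per_thread * warps_per_block * warp_size
    by_regs = regs_per_sm // regs_per_block if regs_per_block else max_blocks_per_sm
    by_smem = smem_per_sm // smem_per_block if smem_per_block else max_blocks_per_sm

    blocks = min(by_warps, by_regs, by_smem, max_blocks_per_sm)
    return blocks * warps_per_block / max_warps_per_sm


print(occupancy(256, 32, 0))      # 1.0
print(occupancy(256, 64, 0))      # 0.5: registers limit the SM to 4 blocks
print(occupancy(256, 32, 16384))  # 0.375: shared memory limits the SM to 3 blocks
```

Under these assumed limits, doubling register use per thread halves the estimate, which shows how memory usage feeds directly into the ratio.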
