GPU Memory Management

Memory Registers

The fastest on-chip memory, private to each thread and allocated from the register file of its SM (Streaming Multiprocessor). Registers hold a thread's local variables and are accessed with a latency on the order of one clock cycle.

Memory Thrashing

Performance degradation caused by poorly optimized memory access patterns that generate a high rate of cache misses and memory bank conflicts, so that data is repeatedly evicted and reloaded.

Memory Bank Conflicts

Contention that occurs when several threads of a warp simultaneously access different locations in the same shared memory bank. The hardware serializes the conflicting accesses, which reduces performance.
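To make the definition concrete, the sketch below counts the conflict degree of one warp's shared memory accesses. It is a host-side model, not GPU code. It assumes 32 banks that are each 4 bytes wide; the glossary does not state these figures, and the name `bank_conflict_degree` is hypothetical.

```python
from collections import defaultdict

NUM_BANKS = 32   # assumed number of shared memory banks
WORD_BYTES = 4   # assumed bank width in bytes


def bank_conflict_degree(byte_addresses):
    """Return the largest number of distinct words that map to one bank.

    1 means conflict-free; n means the access is serialized into n passes.
    Threads that read the same word are counted once.
    """
    words_per_bank = defaultdict(set)
    for addr in byte_addresses:
        word = addr // WORD_BYTES
        words_per_bank[word % NUM_BANKS].add(word)
    return max((len(words) for words in words_per_bank.values()), default=0)


# Stride of 1 word: each of the 32 threads hits a different bank.
unit_stride = [4 * t for t in range(32)]
# Stride of 32 words (a column of a 32x32 float tile): every thread hits bank 0.
column_stride = [4 * 32 * t for t in range(32)]
# Padding each row to 33 words spreads the column over all banks.
padded_stride = [4 * 33 * t for t in range(32)]

print(bank_conflict_degree(unit_stride))    # 1
print(bank_conflict_degree(column_stride))  # 32
print(bank_conflict_degree(padded_stride))  # 1
```

Under these assumptions, padding a tile row by one word is enough to turn a 32-way conflict into a conflict-free access.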

Asynchronous Memory Transfer

CPU-GPU data transfers issued on CUDA streams so that they run in parallel with kernel computations, hiding transfer latency and keeping the GPU busy.
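The benefit of overlapping transfers with computation can be estimated with a small timing model. The sketch below is a host-side simulation, not CUDA code. It assumes the data is split into equal chunks, that one copy engine and one compute engine each process chunks in order, and that a kernel can start only once its own chunk has arrived. The function names and timings are hypothetical.

```python
def serial_finish_time(num_chunks, copy_time, kernel_time):
    """Total time when every copy and kernel runs one after another."""
    return num_chunks * (copy_time + kernel_time)


def pipeline_finish_time(num_chunks, copy_time, kernel_time):
    """Total time when the copy of chunk i+1 may overlap the kernel of chunk i."""
    copy_end = 0.0
    kernel_end = 0.0
    for _ in range(num_chunks):
        copy_end += copy_time                     # copies run back to back
        kernel_start = max(copy_end, kernel_end)  # wait for the data and the previous kernel
        kernel_end = kernel_start + kernel_time
    return kernel_end


print(serial_finish_time(4, 2.0, 3.0))    # 20.0
print(pipeline_finish_time(4, 2.0, 3.0))  # 14.0
```

In this model all transfer time except the first copy is hidden behind computation whenever the kernel is the slower stage.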

Memory Alignment

Placement of data structures on specific address boundaries (for example 128, 256, or 512 bits, that is 16, 32, or 64 bytes) so that memory accesses are coalesced into full-width memory transactions.
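The effect of alignment can be shown with plain address arithmetic. The sketch below rounds an offset up to a boundary and counts how many fixed-size memory segments a contiguous access spans. The 128-byte segment size is an assumption chosen for illustration, and both helper names are hypothetical.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset + alignment - 1) & ~(alignment - 1)


def segments_touched(start_byte, num_bytes, segment_bytes=128):
    """Count the segment_bytes-sized segments a contiguous access spans."""
    first = start_byte // segment_bytes
    last = (start_byte + num_bytes - 1) // segment_bytes
    return last - first + 1


print(align_up(100, 16))         # 112
# 32 threads reading one 4-byte float each request 128 contiguous bytes.
print(segments_touched(0, 128))  # 1: aligned start, one transaction
print(segments_touched(4, 128))  # 2: misaligned start straddles two segments
```

A misaligned start doubles the number of segments touched for the same amount of useful data, which is why allocators and structure layouts round offsets up to a boundary.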

Zero-Copy Memory

Technique that lets the GPU access host memory directly, without an explicit copy, by mapping host memory into the device address space. It reduces GPU memory consumption and can shorten transfer times.

CUDA Streams

Sequence of operations that execute in issue order on the GPU. Operations in different streams may run concurrently, which enables task parallelism and the overlap of computation with transfers to improve resource utilization.

Memory Pool

Pre-allocated block of GPU memory from which allocations and deallocations are served quickly, reducing fragmentation and the cost of dynamic allocation at run time.
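The idea behind a pool can be sketched with a fixed-size block allocator. The example below is a host-side model in which a `bytearray` stands in for the pre-allocated GPU block; it is not a real device allocator, and the class `BlockPool` and its methods are hypothetical.

```python
class BlockPool:
    """Fixed-size block pool carved out of one up-front allocation."""

    def __init__(self, block_bytes, num_blocks):
        self.block_bytes = block_bytes
        # Single large allocation made once; stands in for device memory.
        self.buffer = bytearray(block_bytes * num_blocks)
        # Free list of block offsets; popping from the end yields low offsets first.
        self.free_offsets = [i * block_bytes for i in reversed(range(num_blocks))]

    def allocate(self):
        """Return the offset of a free block in constant time."""
        if not self.free_offsets:
            raise MemoryError("pool exhausted")
        return self.free_offsets.pop()

    def release(self, offset):
        """Return a block to the pool in constant time."""
        self.free_offsets.append(offset)


pool = BlockPool(block_bytes=256, num_blocks=4)
a = pool.allocate()
b = pool.allocate()
pool.release(a)
c = pool.allocate()  # reuses the block that was just released
print(a, b, c)       # 0 256 0
```

Because every block has the same size, releasing and reallocating never fragments the buffer, and neither operation has to go back to the underlying allocator.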

Memory Prefetching

Loading data into GPU cache memory before it is actually used, hiding memory latency by overlapping data movement with computation.

Memory Paging

Management of memory pages shared between CPU and GPU: pages migrate on demand to the processor that accesses them and are evicted based on usage, so that the limited GPU memory is used efficiently.
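On-demand migration with usage-based eviction can be modeled in a few lines. The sketch below counts how many pages must migrate to the GPU for a given access sequence. It assumes a least-recently-used eviction policy, which is an illustrative choice and not a statement about what any driver actually does; the function name is hypothetical.

```python
from collections import OrderedDict


def count_migrations(page_accesses, gpu_capacity_pages):
    """Count page migrations to the GPU under an assumed LRU eviction policy."""
    resident = OrderedDict()  # pages currently on the GPU, oldest use first
    migrations = 0
    for page in page_accesses:
        if page in resident:
            resident.move_to_end(page)     # mark as most recently used
            continue
        migrations += 1                    # page fault: migrate on demand
        if len(resident) >= gpu_capacity_pages:
            resident.popitem(last=False)   # evict the least recently used page
        resident[page] = True
    return migrations


accesses = [0, 1, 2, 0, 1, 2]
print(count_migrations(accesses, gpu_capacity_pages=3))  # 3
print(count_migrations(accesses, gpu_capacity_pages=2))  # 6
```

When the working set fits, each page migrates once. With one page too few, every access in this pattern triggers a migration.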

CUDA Unified Virtual Addressing

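Single virtual address space that covers both host and device memory. Because a pointer's value identifies where its memory resides, the same pointers remain valid in API calls on the CPU and GPU sides, and transfers can be issued without specifying a copy direction.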

Memory Occupancy

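Ratio of active warps on an SM to the maximum number of warps the SM supports. It is limited by per-thread register use and per-block shared memory use, and it determines the achievable level of parallelism and how efficiently GPU resources are used.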
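How memory usage limits occupancy can be worked through numerically. The sketch below is a simplified estimate that ignores allocation granularity. The per-SM limits used as defaults (64 warps, 65536 registers, 65536 bytes of shared memory, 32 blocks) are illustrative assumptions, not the figures of any particular GPU, and the function name is hypothetical.

```python
def occupancy(threads_per_block, regs_per_thread, smem_per_block,
              max_warps_per_sm=64, regs_per_sm=65536, smem_per_sm=65536,
              max_blocks_per_sm=32, warp_size=32):
    """Estimate active warps per SM divided by the SM's maximum warps."""
    warps_per_block = -(-threads_per_block // warp_size)  # ceiling division
    # Number of resident blocks allowed by each resource.
    limit_warps = max_warps_per_sm // warps_per_block
    limit_regs = regs_per_sm // (regs_per_thread * warps_per_block * warp_size)
    limit_smem = smem_per_sm // smem_per_block if smem_per_block else max_blocks_per_sm
    blocks = min(limit_warps, limit_regs, limit_smem, max_blocks_per_sm)
    return blocks * warps_per_block / max_warps_per_sm


print(occupancy(256, 32, 0))      # 1.0: no resource limits the kernel
print(occupancy(256, 64, 0))      # 0.5: registers allow only 4 blocks
print(occupancy(256, 32, 32768))  # 0.25: shared memory allows only 2 blocks
```

Under these assumed limits, doubling register use per thread halves occupancy, and a large shared memory footprint per block can reduce it further.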
