GPU Memory Management - AI Glossary

Terms

Memory Registers

The fastest memory on the GPU, private to each thread and allocated from the register file of its SM (Streaming Multiprocessor); holds local variables with an access latency of roughly one clock cycle.
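
Register usage can be inspected from host code. The following is a minimal sketch, assuming a CUDA-capable device 0; the kernel `scale_sum` and the helper `registers_per_thread` are hypothetical names introduced only for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: `i` and `acc` are local scalars, which the compiler
// normally keeps in registers.
__global__ void scale_sum(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float acc = in[i];
        acc = acc * 2.0f + 1.0f;
        out[i] = acc;
    }
}

// Returns the number of registers each thread of scale_sum uses, or -1 on error.
int registers_per_thread() {
    cudaFuncAttributes attr;
    if (cudaFuncGetAttributes(&attr, scale_sum) != cudaSuccess) return -1;
    return attr.numRegs;
}

int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return 1;
    printf("registers per thread (scale_sum): %d\n", registers_per_thread());
    printf("registers available per SM:       %d\n", prop.regsPerMultiprocessor);
    return 0;
}
```

The reported count depends on the compiler version and target architecture, so no fixed value should be expected.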

Memory Thrashing

Performance degradation caused by unoptimized memory access patterns that produce a high rate of cache misses and memory bank conflicts.

Memory Bank Conflicts

Contention that occurs when threads of the same warp simultaneously access different addresses in the same shared memory bank, forcing the accesses to be serialized and reducing performance.
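
A common way to avoid such conflicts is to pad a shared memory tile by one column. The sketch below assumes 32 banks of 4-byte words and a warp size of 32; `transpose_tile` and `run_transpose` are hypothetical names.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Sketch-level error handling: leave the enclosing function on failure.
#define CHECK(call) do { if ((call) != cudaSuccess) return false; } while (0)

constexpr int TILE = 32;

// Transposes one TILE x TILE matrix held in row-major order.
__global__ void transpose_tile(const float* in, float* out) {
    // The "+ 1" padding shifts each row by one bank. Without it, the
    // column-wise read below would hit a single bank 32 times.
    __shared__ float tile[TILE][TILE + 1];
    int x = threadIdx.x, y = threadIdx.y;
    tile[y][x] = in[y * TILE + x];   // row-wise write, one bank per thread
    __syncthreads();
    out[y * TILE + x] = tile[x][y];  // column-wise read, bank = (x + y) % 32
}

// Returns true if the device result matches a host-side transpose.
bool run_transpose() {
    const size_t bytes = TILE * TILE * sizeof(float);
    std::vector<float> h_in(TILE * TILE), h_out(TILE * TILE);
    for (int i = 0; i < TILE * TILE; ++i) h_in[i] = static_cast<float>(i);
    float *d_in = nullptr, *d_out = nullptr;
    CHECK(cudaMalloc(&d_in, bytes));
    CHECK(cudaMalloc(&d_out, bytes));
    CHECK(cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice));
    transpose_tile<<<1, dim3(TILE, TILE)>>>(d_in, d_out);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost));
    cudaFree(d_in);
    cudaFree(d_out);
    for (int y = 0; y < TILE; ++y)
        for (int x = 0; x < TILE; ++x)
            if (h_out[y * TILE + x] != h_in[x * TILE + y]) return false;
    return true;
}

int main() {
    printf("transpose %s\n", run_transpose() ? "ok" : "failed");
    return 0;
}
```

With the padded row length of 33 words, element `tile[x][y]` lands in bank `(33 * x + y) % 32 = (x + y) % 32`, so the 32 threads of a warp (fixed `y`, varying `x`) touch 32 distinct banks.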

Asynchronous Memory Transfer

CPU-GPU data transfers issued through CUDA streams so that they run in parallel with kernel computations, hiding transfer latency and improving GPU utilization.
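
As an illustration, an asynchronous round trip might look like the sketch below. It assumes pinned (page-locked) host memory, which `cudaMemcpyAsync` needs in order to be truly asynchronous; `add_one` and `run_async_roundtrip` are hypothetical names.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Sketch-level error handling: leave the enclosing function on failure.
#define CHECK(call) do { if ((call) != cudaSuccess) return false; } while (0)

__global__ void add_one(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Copies data to the GPU, runs a kernel, and copies it back, all queued
// on one stream without blocking the host until the final synchronize.
bool run_async_roundtrip() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float *h = nullptr, *d = nullptr;
    cudaStream_t stream;
    CHECK(cudaMallocHost(&h, bytes));  // pinned host buffer
    CHECK(cudaMalloc(&d, bytes));
    CHECK(cudaStreamCreate(&stream));
    for (int i = 0; i < n; ++i) h[i] = static_cast<float>(i);

    // Each call returns immediately; the stream keeps the three steps in order.
    CHECK(cudaMemcpyAsync(d, h, bytes, cudaMemcpyHostToDevice, stream));
    add_one<<<(n + 255) / 256, 256, 0, stream>>>(d, n);
    CHECK(cudaMemcpyAsync(h, d, bytes, cudaMemcpyDeviceToHost, stream));
    // The CPU is free to do unrelated work here.
    CHECK(cudaStreamSynchronize(stream));

    bool ok = true;
    for (int i = 0; i < n; ++i) ok = ok && (h[i] == static_cast<float>(i) + 1.0f);
    cudaStreamDestroy(stream);
    cudaFree(d);
    cudaFreeHost(h);
    return ok;
}

int main() {
    printf("async round trip %s\n", run_async_roundtrip() ? "ok" : "failed");
    return 0;
}
```

Within a single stream the copy and the kernel still run one after another; overlapping a transfer with a kernel requires work queued on more than one stream.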

Memory Alignment

Alignment of data structures on specific boundaries (128, 256, or 512 bits, that is 16, 32, or 64 bytes) so that memory accesses coalesce into the fewest, widest possible transactions.
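
A small sketch of a 128-bit aligned structure follows. The `Particle` type and the function names are hypothetical; the alignment qualifier `__align__` and the 256-byte minimum alignment of `cudaMalloc` allocations are standard CUDA behavior.

```cuda
#include <cstdint>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Sketch-level error handling: leave the enclosing function on failure.
#define CHECK(call) do { if ((call) != cudaSuccess) return false; } while (0)

// Hypothetical 16-byte record. __align__(16) allows the compiler to move
// one element with a single 128-bit load or store.
struct __align__(16) Particle { float x, y, z, w; };
static_assert(sizeof(Particle) == 16 && alignof(Particle) == 16,
              "Particle should be 16 bytes, 16-byte aligned");

__global__ void copy_particles(const Particle* in, Particle* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];  // consecutive threads touch consecutive 16-byte slots
}

bool run_aligned_copy() {
    const int n = 1024;
    const size_t bytes = n * sizeof(Particle);
    std::vector<Particle> h_in(n), h_out(n);
    for (int i = 0; i < n; ++i) h_in[i] = Particle{1.0f * i, 2.0f * i, 3.0f * i, 4.0f * i};
    Particle *d_in = nullptr, *d_out = nullptr;
    CHECK(cudaMalloc(&d_in, bytes));
    CHECK(cudaMalloc(&d_out, bytes));
    // cudaMalloc returns addresses aligned to at least 256 bytes.
    bool ok = reinterpret_cast<uintptr_t>(d_in) % 256 == 0;
    CHECK(cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice));
    copy_particles<<<(n + 255) / 256, 256>>>(d_in, d_out, n);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost));
    cudaFree(d_in);
    cudaFree(d_out);
    for (int i = 0; i < n; ++i) ok = ok && (h_out[i].w == h_in[i].w);
    return ok;
}

int main() {
    printf("aligned copy %s\n", run_aligned_copy() ? "ok" : "failed");
    return 0;
}
```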

Zero-Copy Memory

Technique allowing the GPU to access pinned host memory directly through memory mapping, without an explicit copy, reducing device memory consumption and avoiding separate transfer steps.
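
In the CUDA runtime this is typically done with mapped pinned memory. The sketch below assumes a 64-bit platform with unified addressing, where host mapping is available without extra device flags; `double_values` and `run_zero_copy` are hypothetical names.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Sketch-level error handling: leave the enclosing function on failure.
#define CHECK(call) do { if ((call) != cudaSuccess) return false; } while (0)

__global__ void double_values(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;  // every access travels over the host interconnect
}

bool run_zero_copy() {
    const int n = 1024;
    const size_t bytes = n * sizeof(float);
    float* h = nullptr;        // host view of the buffer
    float* d_alias = nullptr;  // device view of the same physical memory
    CHECK(cudaHostAlloc((void**)&h, bytes, cudaHostAllocMapped));
    CHECK(cudaHostGetDevicePointer((void**)&d_alias, h, 0));
    for (int i = 0; i < n; ++i) h[i] = static_cast<float>(i);

    // No cudaMalloc and no cudaMemcpy: the kernel works on host memory directly.
    double_values<<<(n + 255) / 256, 256>>>(d_alias, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());  // make the kernel's writes visible to the CPU

    bool ok = true;
    for (int i = 0; i < n; ++i) ok = ok && (h[i] == 2.0f * i);
    cudaFreeHost(h);
    return ok;
}

int main() {
    printf("zero-copy %s\n", run_zero_copy() ? "ok" : "failed");
    return 0;
}
```

Because each access crosses the interconnect, this pattern tends to suit data that is read or written once rather than data reused many times by the kernel.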

CUDA Streams

Sequence of operations that execute in issue order on the GPU; operations in different streams may run concurrently, enabling task parallelism and computation-transfer overlap to optimize resource utilization.
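
The sketch below splits one buffer across two streams so that the copy for one half may overlap with the kernel for the other. Whether overlap actually happens depends on the device's copy engines; `square` and `run_two_streams` are hypothetical names.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Sketch-level error handling: leave the enclosing function on failure.
#define CHECK(call) do { if ((call) != cudaSuccess) return false; } while (0)

__global__ void square(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= data[i];
}

bool run_two_streams() {
    const int n = 1 << 20, half = n / 2;
    const size_t half_bytes = half * sizeof(float);
    float *h = nullptr, *d = nullptr;
    CHECK(cudaMallocHost(&h, n * sizeof(float)));  // pinned, so copies can be async
    CHECK(cudaMalloc(&d, n * sizeof(float)));
    for (int i = 0; i < n; ++i) h[i] = static_cast<float>(i % 100);

    cudaStream_t streams[2];
    for (int k = 0; k < 2; ++k) CHECK(cudaStreamCreate(&streams[k]));

    // Operations inside one stream stay ordered; the two streams are independent.
    for (int k = 0; k < 2; ++k) {
        const int off = k * half;
        CHECK(cudaMemcpyAsync(d + off, h + off, half_bytes, cudaMemcpyHostToDevice, streams[k]));
        square<<<(half + 255) / 256, 256, 0, streams[k]>>>(d + off, half);
        CHECK(cudaMemcpyAsync(h + off, d + off, half_bytes, cudaMemcpyDeviceToHost, streams[k]));
    }
    for (int k = 0; k < 2; ++k) {
        CHECK(cudaStreamSynchronize(streams[k]));
        CHECK(cudaStreamDestroy(streams[k]));
    }

    bool ok = true;
    for (int i = 0; i < n; ++i) {
        const float v = static_cast<float>(i % 100);
        ok = ok && (h[i] == v * v);
    }
    cudaFree(d);
    cudaFreeHost(h);
    return ok;
}

int main() {
    printf("two streams %s\n", run_two_streams() ? "ok" : "failed");
    return 0;
}
```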

Memory Pool

Pre-allocated block of GPU memory that serves fast allocations and deallocations, reducing fragmentation and the cost of dynamic allocation during execution.
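
One concrete form of pooling is the stream-ordered allocator of the CUDA runtime. The sketch below assumes CUDA 11.2 or newer and a device that reports memory pool support; the 64 MiB release threshold and the names `fill` and `run_pool_allocations` are arbitrary choices for illustration.

```cuda
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>

// Sketch-level error handling: leave the enclosing function on failure.
#define CHECK(call) do { if ((call) != cudaSuccess) return false; } while (0)

__global__ void fill(float* data, int n, float value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

bool run_pool_allocations() {
    const int device = 0, n = 1 << 16;
    int supported = 0;
    CHECK(cudaDeviceGetAttribute(&supported, cudaDevAttrMemoryPoolsSupported, device));
    if (!supported) return false;

    // Ask the default pool to keep up to 64 MiB cached instead of returning
    // freed memory to the driver at every synchronization point.
    cudaMemPool_t pool;
    CHECK(cudaDeviceGetDefaultMemPool(&pool, device));
    uint64_t threshold = 64ull << 20;
    CHECK(cudaMemPoolSetAttribute(pool, cudaMemPoolAttrReleaseThreshold, &threshold));

    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));
    for (int iter = 0; iter < 100; ++iter) {
        float* buf = nullptr;
        // After the first iteration these requests can be served from the pool.
        CHECK(cudaMallocAsync((void**)&buf, n * sizeof(float), stream));
        fill<<<(n + 255) / 256, 256, 0, stream>>>(buf, n, 1.0f);
        CHECK(cudaFreeAsync(buf, stream));
    }
    CHECK(cudaStreamSynchronize(stream));
    CHECK(cudaStreamDestroy(stream));
    return true;
}

int main() {
    printf("pool allocations %s\n", run_pool_allocations() ? "ok" : "failed or unsupported");
    return 0;
}
```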

Memory Prefetching

Loading data into GPU memory or cache before it is actually used, masking memory latency and improving the overlap of computation with data movement.

Memory Paging

Management of memory pages between CPU and GPU involving on-demand migration and usage-based eviction to optimize the use of limited GPU memory.
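
Managed (unified) memory is one place where this paging is visible to the programmer. The sketch below assumes a device that supports managed memory; fine-grained, fault-driven migration additionally assumes a Pascal-or-newer GPU on Linux, and other setups may migrate whole allocations instead. `increment` and `run_managed` are hypothetical names.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Sketch-level error handling: leave the enclosing function on failure.
#define CHECK(call) do { if ((call) != cudaSuccess) return false; } while (0)

__global__ void increment(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

bool run_managed() {
    const int n = 1 << 20;
    float* data = nullptr;
    // One pointer, usable from both CPU and GPU; the driver moves the pages.
    CHECK(cudaMallocManaged(&data, n * sizeof(float)));

    for (int i = 0; i < n; ++i) data[i] = 1.0f;  // first touched on the CPU

    increment<<<(n + 255) / 256, 256>>>(data, n);  // GPU access triggers migration
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());  // required before the CPU touches the data again

    bool ok = true;
    for (int i = 0; i < n; ++i) ok = ok && (data[i] == 2.0f);  // CPU access brings pages back
    cudaFree(data);
    return ok;
}

int main() {
    printf("managed memory %s\n", run_managed() ? "ok" : "failed");
    return 0;
}
```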

CUDA Unified Virtual Addressing

Single virtual address space shared by host and device memory, so that each pointer value is unique across CPU and GPU and transfers can infer their direction from the pointers alone.
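
A short sketch of what this enables: querying where a pointer lives and copying with `cudaMemcpyDefault`. It assumes a 64-bit platform with unified addressing and CUDA 10 or newer (for the `type` field of `cudaPointerAttributes`); `run_uva_copy` is a hypothetical name.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Sketch-level error handling: leave the enclosing function on failure.
#define CHECK(call) do { if ((call) != cudaSuccess) return false; } while (0)

bool run_uva_copy() {
    const int n = 256;
    const size_t bytes = n * sizeof(int);
    int *h_src = nullptr, *h_dst = nullptr, *d = nullptr;
    CHECK(cudaMallocHost(&h_src, bytes));
    CHECK(cudaMallocHost(&h_dst, bytes));
    CHECK(cudaMalloc(&d, bytes));
    for (int i = 0; i < n; ++i) { h_src[i] = i; h_dst[i] = -1; }

    // The runtime can tell from the address alone which memory a pointer refers to.
    cudaPointerAttributes host_attr, dev_attr;
    CHECK(cudaPointerGetAttributes(&host_attr, h_src));
    CHECK(cudaPointerGetAttributes(&dev_attr, d));
    bool ok = host_attr.type == cudaMemoryTypeHost &&
              dev_attr.type == cudaMemoryTypeDevice;

    // cudaMemcpyDefault: the copy direction is inferred from the two pointers.
    CHECK(cudaMemcpy(d, h_src, bytes, cudaMemcpyDefault));
    CHECK(cudaMemcpy(h_dst, d, bytes, cudaMemcpyDefault));
    for (int i = 0; i < n; ++i) ok = ok && (h_dst[i] == i);

    cudaFree(d);
    cudaFreeHost(h_src);
    cudaFreeHost(h_dst);
    return ok;
}

int main() {
    printf("UVA copy %s\n", run_uva_copy() ? "ok" : "failed");
    return 0;
}
```

A shared address space does not make device memory dereferenceable on the CPU; it only guarantees that host and device addresses never collide.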

Memory Occupancy

Ratio of active warps on an SM to the maximum number of warps the SM supports, limited by per-thread register usage and per-block shared memory usage; determines the achievable level of parallelism and GPU resource utilization efficiency.
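
The ratio can be estimated with the occupancy API of the CUDA runtime. This is a minimal sketch for device 0 with no dynamic shared memory; `saxpy` and `theoretical_occupancy` are hypothetical names, and the result is a theoretical upper bound rather than a measured value.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void saxpy(float a, const float* x, float* y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

// Returns active warps / maximum warps per SM for the given block size,
// or -1.0f on error.
float theoretical_occupancy(int block_size) {
    int blocks_per_sm = 0;
    if (cudaOccupancyMaxActiveBlocksPerMultiprocessor(
            &blocks_per_sm, saxpy, block_size, 0) != cudaSuccess) return -1.0f;
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return -1.0f;
    const int active_warps = blocks_per_sm * block_size / prop.warpSize;
    const int max_warps = prop.maxThreadsPerMultiProcessor / prop.warpSize;
    return static_cast<float>(active_warps) / max_warps;
}

int main() {
    const int sizes[] = {64, 128, 256, 512, 1024};
    for (int block_size : sizes)
        printf("block size %4d -> occupancy %.2f\n",
               block_size, theoretical_occupancy(block_size));
    return 0;
}
```

Kernels that use more registers per thread or more shared memory per block fit fewer resident blocks on an SM, which lowers the value this function reports.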
