AI Glossary
A complete glossary of artificial intelligence
Memory Registers
Fastest memory on the GPU, private to each thread running on an SM (Streaming Multiprocessor). Registers hold local variables and are typically accessed with a latency of about one clock cycle.
Memory Thrashing
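Performance degradation caused by non-optimized memory access patterns that keep evicting and reloading the same data, producing a high rate of cache misses; it is often accompanied by other access penalties such as memory bank conflicts.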
Memory Bank Conflicts
Contention that occurs when several threads access different locations in the same shared memory bank at the same time; the accesses are serialized, which reduces performance.
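The serialization described above can be estimated with simple arithmetic. The following is a minimal sketch, assuming 32 banks of one word each and a 32-thread warp (a common configuration, but an assumption here, not something this glossary states). The function name `conflict_degree` is hypothetical.

```python
from collections import defaultdict

NUM_BANKS = 32   # assumed number of shared memory banks
WARP_SIZE = 32   # assumed number of threads issuing the access together


def conflict_degree(stride_words, num_banks=NUM_BANKS, warp_size=WARP_SIZE):
    """Return the largest number of distinct words requested from one bank.

    Thread t reads word index t * stride_words. Distinct words that map to
    the same bank are serialized; threads reading the same word are counted
    once, since that word only has to be fetched a single time.
    """
    words_per_bank = defaultdict(set)
    for t in range(warp_size):
        word = t * stride_words
        words_per_bank[word % num_banks].add(word)
    return max(len(words) for words in words_per_bank.values())
```

Under these assumptions, a stride of 1 word gives a degree of 1 (no conflict), a stride of 32 words sends every thread to the same bank (degree 32), and padding the stride to 33 words brings the degree back to 1.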
Asynchronous Memory Transfer
CPU-GPU data transfers issued on CUDA streams so that they run concurrently with kernel computations, hiding transfer latency and improving GPU utilization.
Memory Alignment
Alignment of data structures on specific boundaries (for example 128, 256, or 512 bits, i.e. 16, 32, or 64 bytes) so that memory accesses can be coalesced into as few full-width transactions as possible.
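To make the effect of alignment concrete, here is a small sketch with two hypothetical helpers: `align_up` rounds a byte offset up to a power-of-two boundary, and `segments_touched` counts how many fixed-size memory segments a contiguous access spans. The 128-byte segment size is an illustrative assumption; the real transaction size depends on the hardware.

```python
def align_up(offset, alignment):
    """Round a byte offset up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset + alignment - 1) & ~(alignment - 1)


def segments_touched(start_byte, num_bytes, segment=128):
    """Count the segment-sized blocks spanned by a contiguous access."""
    first = start_byte // segment
    last = (start_byte + num_bytes - 1) // segment
    return last - first + 1
```

With these assumptions, a 128-byte read starting at byte 0 touches one segment, while the same read starting at byte 4 straddles two, which is the extra cost that alignment is meant to avoid.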
Zero-Copy Memory
Technique that lets the GPU access host memory directly, without an explicit copy, by mapping that memory into the device address space; it reduces device memory consumption and avoids separate transfer steps.
CUDA Streams
Sequence of operations that execute in issue order on the GPU. Operations in different streams may run concurrently, which enables task parallelism and the overlap of computation with data transfers to improve resource utilization.
Memory Pool
Pre-allocation of a block of GPU memory from which later allocations and deallocations are served quickly, reducing fragmentation and the cost of dynamic allocation at run time.
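The idea can be sketched with a toy fixed-size block pool that tracks offsets only and touches no real GPU memory. The class `BlockPool` and its methods are hypothetical names for illustration, not part of any CUDA API.

```python
class BlockPool:
    """Toy fixed-size block pool over one pre-allocated region (offsets only)."""

    def __init__(self, total_bytes, block_bytes):
        if block_bytes <= 0 or total_bytes < block_bytes:
            raise ValueError("pool must hold at least one block")
        self.block_bytes = block_bytes
        # Offsets of all free blocks, stored so that the lowest offset pops first.
        offsets = range(0, total_bytes - block_bytes + 1, block_bytes)
        self.free_offsets = list(offsets)[::-1]

    def allocate(self):
        """Hand out a free block offset in O(1), or raise if the pool is empty."""
        if not self.free_offsets:
            raise MemoryError("pool exhausted")
        return self.free_offsets.pop()

    def release(self, offset):
        """Return a block to the pool so that a later allocation can reuse it."""
        self.free_offsets.append(offset)
```

Because every block has the same size, a released block can always satisfy the next request, which is why this simple design avoids fragmentation; real pools that serve variable sizes need more bookkeeping.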
Memory Prefetching
Loading data into GPU cache memory before it is actually used, hiding memory latency and allowing computation to overlap with data movement.
Memory Paging
Management of memory pages shared between CPU and GPU: pages migrate on demand and are evicted based on usage, making the best use of limited GPU memory.
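On-demand migration and usage-based eviction can be modeled in a few lines. The sketch below assumes a least-recently-used (LRU) eviction policy, which is an illustrative choice and not necessarily what a real driver does; the class `DevicePageCache` is hypothetical.

```python
from collections import OrderedDict


class DevicePageCache:
    """Toy model of GPU-resident pages with on-demand migration and LRU eviction."""

    def __init__(self, capacity_pages):
        if capacity_pages < 1:
            raise ValueError("capacity must be at least one page")
        self.capacity = capacity_pages
        self.resident = OrderedDict()  # page id -> True, least recently used first
        self.migrations = 0            # pages brought in after a fault
        self.evictions = 0             # pages pushed out to make room

    def access(self, page):
        """Touch a page; return True if it had to be migrated in (a fault)."""
        if page in self.resident:
            self.resident.move_to_end(page)      # mark as most recently used
            return False
        if len(self.resident) >= self.capacity:
            self.resident.popitem(last=False)    # evict the least recently used page
            self.evictions += 1
        self.resident[page] = True
        self.migrations += 1
        return True
```

With a capacity of two pages, the access sequence 0, 1, 0, 2, 1 causes four migrations and two evictions in this model, which shows how a working set larger than GPU memory leads to repeated page traffic.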
CUDA Unified Virtual Addressing
Single virtual address space shared by host and device memory, in which every pointer has a unique value that identifies where the memory resides; this simplifies transfers and the exchange of pointers between CPU and GPU.
Memory Occupancy
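Ratio of active warps on an SM to the maximum number of warps that the SM supports. It is limited by memory usage (registers per thread and shared memory per block) and determines the achievable level of parallelism and the efficiency of GPU resource utilization.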
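The ratio can be estimated from per-kernel resource usage. The following is a simplified sketch: the default hardware limits (64 warps, 65536 registers, 49152 bytes of shared memory, and 32 blocks per SM) are illustrative assumptions that vary between GPU generations, and the hypothetical function `occupancy` ignores allocation granularity.

```python
def occupancy(threads_per_block, regs_per_thread, smem_per_block,
              max_warps_per_sm=64, regs_per_sm=65536,
              smem_per_sm=49152, max_blocks_per_sm=32, warp_size=32):
    """Estimate occupancy as active warps divided by the maximum warps per SM."""
    if threads_per_block <= 0:
        raise ValueError("threads_per_block must be positive")
    warps_per_block = -(-threads_per_block // warp_size)  # ceiling division
    # Each resource caps how many blocks can be resident on one SM.
    limits = [max_blocks_per_sm, max_warps_per_sm // warps_per_block]
    if regs_per_thread > 0:
        regs_per_block = regs_per_thread * warps_per_block * warp_size
        limits.append(regs_per_sm // regs_per_block)
    if smem_per_block > 0:
        limits.append(smem_per_sm // smem_per_block)
    resident_blocks = min(limits)
    return resident_blocks * warps_per_block / max_warps_per_sm
```

Under the assumed limits, a 256-thread block using 32 registers per thread reaches an occupancy of 1.0, doubling the register count to 64 halves it to 0.5, and requesting 24576 bytes of shared memory per block drops it to 0.25.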