GPU Virtualization - AI Glossarium

GPU Passthrough

Technique that assigns a physical GPU directly and exclusively to one virtual machine, typically through an IOMMU (Intel VT-d or AMD-Vi), so the guest driver talks to the hardware without an emulated device in between. Performance is near native, but the GPU cannot be shared with other VMs while it is assigned.

Virtual GPU (vGPU)

Virtualization technology that divides a physical GPU into multiple virtual instances, each assigned to a different virtual machine or container. Each vGPU appears to its guest as an independent GPU with its own allocated resources.

Multi-Instance GPU (MIG)

NVIDIA feature, introduced with the Ampere architecture (A100), that partitions a supported GPU into multiple isolated instances (up to seven on the A100), each with dedicated compute, memory, and cache resources. This strict isolation keeps the load on one instance from degrading the quality of service of another.

Time-Sliced Sharing

GPU sharing method in which multiple workloads take turns on the GPU, each running for a short time slice. This keeps the GPU busy, but latency varies with load because each workload must wait for the others' slices.
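As an illustration only, the following sketch simulates round-robin time slicing. The function `time_slice`, the job names, and the 10 ms quantum are assumptions made for this example; real GPU time slicing happens in the driver or hardware scheduler, not in application code.

```python
from collections import deque


def time_slice(jobs, quantum_ms):
    """Simulate round-robin sharing of one GPU.

    jobs maps a (hypothetical) workload name to its remaining work in ms.
    Returns a list of (name, start_ms, end_ms) slices in execution order.
    """
    queue = deque(jobs.items())
    timeline = []
    clock = 0
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum_ms, remaining)
        timeline.append((name, clock, clock + run))
        clock += run
        if remaining > run:
            # Unfinished work goes to the back of the queue.
            queue.append((name, remaining - run))
    return timeline


timeline = time_slice({"vm-a": 30, "vm-b": 10, "vm-c": 20}, quantum_ms=10)
# Completion time per workload = end of its last slice.
finish = {name: end for name, _, end in timeline}
```

In this run `vm-a` needs 30 ms of GPU time but finishes at 60 ms, because it waits while the other two workloads take their slices. That waiting is the variable latency the definition refers to.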

CUDA Virtualization

Virtualization at the level of the CUDA API, which lets CUDA applications run in virtualized environments. It works by intercepting CUDA calls and routing them to the appropriate GPU resources.

API Forwarding

Mechanism that intercepts graphics or compute API calls from VMs and redirects them to the host physical GPU. Enables compatibility with existing applications without code modification.
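The interception step can be sketched with a small proxy object. Everything here is hypothetical: `HostGPU` stands in for a host-side driver, and `ForwardingStub` for the guest-side library. A real implementation would carry each call across the VM boundary over some transport rather than calling the backend in-process.

```python
class HostGPU:
    """Stand-in for the host-side GPU driver (hypothetical, not a real API)."""

    def alloc(self, nbytes):
        return {"handle": 1, "nbytes": nbytes}

    def launch(self, kernel, grid):
        return f"ran {kernel} on grid {grid}"


class ForwardingStub:
    """Guest-side stub that intercepts every API call and forwards it."""

    def __init__(self, backend, vm_id):
        self._backend = backend
        self._vm_id = vm_id
        self.log = []

    def __getattr__(self, name):
        # Called only for attributes the stub itself does not define,
        # i.e. for the API calls the application makes.
        target = getattr(self._backend, name)

        def forward(*args, **kwargs):
            self.log.append((self._vm_id, name, args))  # record the interception
            return target(*args, **kwargs)              # redirect to the host GPU

        return forward


stub = ForwardingStub(HostGPU(), "vm-1")
buf = stub.alloc(1024)
result = stub.launch("saxpy", (4, 1, 1))
```

The application calls `stub.alloc` and `stub.launch` exactly as it would call the real API, which is why this approach needs no changes to application code.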

Profile-based Allocation

GPU allocation strategy based on predefined resource profiles (memory, compute, bandwidth). Allows precise adaptation of GPU resources to the specific needs of different workloads.
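A minimal sketch of profile selection, assuming three made-up profiles (`small`, `medium`, `large`) and a hypothetical `pick_profile` helper. Real profile names and sizes depend on the vendor and the GPU model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    memory_gb: int
    compute_slices: int


# Illustrative profiles only; not taken from any vendor's catalogue.
PROFILES = [
    Profile("small", 4, 1),
    Profile("medium", 8, 2),
    Profile("large", 16, 4),
]


def pick_profile(need_memory_gb, need_slices, profiles=PROFILES):
    """Return the smallest profile that satisfies the request, or None."""
    candidates = [
        p for p in profiles
        if p.memory_gb >= need_memory_gb and p.compute_slices >= need_slices
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda p: (p.memory_gb, p.compute_slices))


chosen = pick_profile(need_memory_gb=6, need_slices=1)
```

A request for 6 GB does not fit `small`, so the helper picks `medium`, the smallest profile that covers it. Requests larger than every profile return `None`.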

GPU Partitioning

Process of logical or physical division of GPU resources into smaller segments assignable to different applications or VMs. Includes partitioning of memory, compute units, and memory controllers.

Mediated Passthrough

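Hybrid of direct passthrough and full virtualization that gives near-native GPU access through a minimal mediation layer. Performance-critical operations reach the hardware directly while the host mediates privileged ones, which combines near-native performance with better resource management and isolation.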

GPU Scheduler

Component that manages scheduling and allocation of GPU resources between multiple concurrent requests. Optimizes GPU usage while respecting priorities and quality of service constraints.
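One simple way to respect priorities is a priority queue. The sketch below assumes a hypothetical `GpuScheduler` class in which a lower number means higher priority and requests with equal priority run in submission order. Production schedulers also deal with preemption, fairness, and QoS budgets, none of which is modeled here.

```python
import heapq
import itertools


class GpuScheduler:
    """Toy priority scheduler: lowest priority number runs first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order

    def submit(self, job, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), job))

    def next_job(self):
        """Return the next job to run, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


sched = GpuScheduler()
sched.submit("batch-train", priority=5)
sched.submit("inference", priority=1)
sched.submit("batch-eval", priority=5)
order = [sched.next_job() for _ in range(3)]
```

Here the latency-sensitive `inference` request jumps ahead of the two batch jobs, which then run in the order they were submitted.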

Direct GPU Access

Architecture allowing virtualized applications to directly access GPU resources without going through software emulation layers. Reduces latency and maximizes computational performance.

Virtual GPU Manager

Centralized administration software that manages the lifecycle of vGPU instances, their allocation and monitoring. Coordinates available GPU resources according to policies defined by the administrator.

GPU Memory Virtualization

Technique for abstracting physical GPU memory allowing multiple VMs to share VRAM while maintaining the illusion of dedicated memory. Includes paging, dynamic allocation and memory isolation.
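The "illusion of dedicated memory" can be pictured as one page table per VM. The sketch below is a simplified model with a hypothetical `VramManager` class; it covers dynamic allocation and isolation only and does not model paging VRAM contents out to host memory.

```python
class VramManager:
    """Toy model: each VM has its own virtual-page -> physical-frame table."""

    def __init__(self, total_frames):
        self._free = list(range(total_frames))
        self._tables = {}  # vm_id -> {virtual_page: physical_frame}

    def map_page(self, vm_id, virtual_page):
        """Back a VM's virtual page with a physical frame on first use."""
        table = self._tables.setdefault(vm_id, {})
        if virtual_page in table:
            return table[virtual_page]
        if not self._free:
            raise MemoryError("no free VRAM frames")
        frame = self._free.pop(0)
        table[virtual_page] = frame
        return frame

    def translate(self, vm_id, virtual_page):
        # A VM can only resolve pages in its own table (KeyError otherwise).
        return self._tables[vm_id][virtual_page]

    def release(self, vm_id):
        """Return all of a VM's frames to the free list."""
        self._free.extend(self._tables.pop(vm_id, {}).values())


vram = VramManager(total_frames=4)
a0 = vram.map_page("vm-a", 0)
b0 = vram.map_page("vm-b", 0)
```

Both VMs address their own virtual page 0, yet the manager backs them with different physical frames, so neither can read the other's data.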

SR-IOV for GPUs

Application of the PCIe Single Root I/O Virtualization (SR-IOV) standard to GPUs, in which the device exposes virtual functions (VFs) that can be assigned to VMs with direct hardware access paths. Offers isolation and near-bare-metal performance.

GPU Containerization

Exposure of GPU devices to lightweight containers. Containers share the host's kernel-mode GPU driver but carry their own CUDA toolkit libraries, which enables rapid deployment of GPU applications with less overhead than VMs.

Remote GPU Virtualization

Architecture allowing access to remote GPU resources over the network as if they were local. Uses optimized protocols to minimize latency and preserve computational performance.

Dynamic GPU Allocation

Ability to allocate and release GPU resources at run time according to the immediate needs of applications. Optimizes GPU usage by adjusting resource quotas in real time.
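One possible quota-adjustment policy, shown purely as an assumption, is to grant every request in full when capacity allows and to scale grants proportionally to demand when it does not. The `rebalance` function and the abstract "units" are hypothetical.

```python
def rebalance(total_units, demand):
    """Recompute per-application GPU quotas from current demand.

    demand maps an application name to the units it currently asks for.
    If everything fits, each application gets what it asked for; otherwise
    quotas are scaled down in proportion to demand (rounded down).
    """
    total_demand = sum(demand.values())
    if total_demand <= total_units:
        return dict(demand)
    return {
        name: total_units * units // total_demand
        for name, units in demand.items()
    }


light = rebalance(100, {"app-a": 20, "app-b": 30})
heavy = rebalance(100, {"app-a": 50, "app-b": 150})
```

Under light load both applications receive what they asked for. Under heavy load the 100 available units are split 25 to 75, matching the 1:3 ratio of the requests. Calling `rebalance` again whenever demand changes gives the adjustment over time that the definition describes.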

GPU Pooling

Aggregation of multiple physical GPUs into a unified resource pool that can be distributed on demand. Enables load balancing and elasticity of GPU computational resources at the datacenter scale.
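A minimal sketch of on-demand distribution from a pool, assuming a hypothetical `GpuPool` class that tracks free memory per GPU and places each request on the GPU with the most free memory (a simple load-balancing choice, not the only one).

```python
class GpuPool:
    """Toy pool: hands out memory from whichever GPU has the most free."""

    def __init__(self, capacities_gb):
        self.free_gb = dict(capacities_gb)  # gpu name -> free memory in GB

    def acquire(self, need_gb):
        """Reserve need_gb on the least-loaded GPU; return its name or None."""
        fits = [g for g, free in self.free_gb.items() if free >= need_gb]
        if not fits:
            return None
        gpu = max(fits, key=lambda g: self.free_gb[g])
        self.free_gb[gpu] -= need_gb
        return gpu

    def release(self, gpu, gb):
        """Give gb of memory back to the named GPU."""
        self.free_gb[gpu] += gb


pool = GpuPool({"gpu0": 16, "gpu1": 16})
first = pool.acquire(8)
second = pool.acquire(8)
third = pool.acquire(12)
```

The first two requests land on different GPUs, which spreads the load. The third request finds no GPU with 12 GB free and is refused until something is released.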
