GPU Virtualization - AI Glossary

📖

Term

GPU Passthrough

Technique that gives a virtual machine direct, exclusive access to a physical GPU, with no intermediate virtualization layer between the guest driver and the hardware. This approach offers near-native performance, but a passed-through GPU cannot be shared with other VMs at the same time.

Virtual GPU (vGPU)

Virtualization technology that divides a physical GPU into multiple virtual instances shared between different virtual machines or containers. Each vGPU functions as an independent GPU with its own allocated resources.

Multi-Instance GPU (MIG)

NVIDIA feature, introduced with the Ampere architecture, that partitions a supported GPU into multiple isolated instances, each with dedicated resources (compute, memory, cache). MIG enforces strict isolation between instances to guarantee quality of service.

Time-Sliced Sharing

GPU sharing method where multiple users alternate access to the GPU through time slices. This approach maximizes utilization but may introduce variable latency depending on the load.
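As a rough illustration of why latency varies with load, the following is a minimal round-robin sketch. The tenant names, work units, and quantum are hypothetical, and real GPU time-slicing is done by the driver or hardware scheduler rather than by application code.

```python
from collections import deque


def time_slice(jobs, quantum):
    """Round-robin over jobs (name -> remaining work units).

    Returns the ordered list of (name, units_run) slices.
    """
    queue = deque(jobs.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)
        timeline.append((name, run))
        if remaining > run:
            # Unfinished work goes to the back of the queue.
            queue.append((name, remaining - run))
    return timeline


def completion_times(timeline):
    """Clock value at which each tenant's last slice ends."""
    clock, done = 0, {}
    for name, run in timeline:
        clock += run
        done[name] = clock  # overwritten until the tenant's final slice
    return done


# Hypothetical tenants sharing one GPU with a 2-unit quantum.
timeline = time_slice({"vm-a": 5, "vm-b": 2, "vm-c": 4}, quantum=2)
finish = completion_times(timeline)
```

In this sketch the 5-unit job of "vm-a" completes at time 11 rather than 5, because it waits while the other tenants take their turns.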

CUDA Virtualization

Virtualization of the CUDA API that lets GPU applications run in virtualized environments with low overhead. It works by intercepting CUDA calls and routing them to the appropriate GPU resources.

API Forwarding

Mechanism that intercepts graphics or compute API calls from VMs and redirects them to the host physical GPU. Enables compatibility with existing applications without code modification.
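The interception idea can be sketched with a small proxy object. Both classes below are hypothetical stand-ins (no real driver or hypervisor API is used); the point is only that the guest-side caller keeps using the same method names while every call is captured and forwarded.

```python
class HostGPU:
    """Hypothetical stand-in for the host-side GPU driver."""

    def vector_add(self, a, b):
        return [x + y for x, y in zip(a, b)]


class GuestAPIProxy:
    """Intercepts any method call and forwards it to the backend."""

    def __init__(self, backend):
        self._backend = backend
        self.call_log = []

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself,
        # i.e. the API calls we want to forward.
        target = getattr(self._backend, name)

        def forward(*args, **kwargs):
            self.call_log.append(name)  # interception point
            return target(*args, **kwargs)

        return forward


gpu = GuestAPIProxy(HostGPU())
result = gpu.vector_add([1, 2], [3, 4])
```

The calling code is unchanged whether `gpu` is the proxy or the backend itself, which is what allows existing applications to run unmodified.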

Profile-based Allocation

GPU allocation strategy based on predefined resource profiles (memory, compute, bandwidth). Allows precise adaptation of GPU resources to the specific needs of different workloads.
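One way to picture this is a best-fit lookup over a fixed catalogue. The profile names and sizes below are invented for illustration and do not correspond to any vendor's actual profiles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    memory_gb: int
    compute_units: int


# Hypothetical catalogue of predefined profiles.
PROFILES = [
    Profile("small", 4, 16),
    Profile("medium", 8, 32),
    Profile("large", 16, 64),
]


def pick_profile(memory_gb, compute_units, profiles=PROFILES):
    """Return the smallest profile that satisfies the request."""
    fitting = [
        p for p in profiles
        if p.memory_gb >= memory_gb and p.compute_units >= compute_units
    ]
    if not fitting:
        raise ValueError("no profile satisfies the request")
    return min(fitting, key=lambda p: (p.memory_gb, p.compute_units))
```

For example, `pick_profile(6, 10)` selects the "medium" profile, since "small" lacks the memory and "large" would waste resources.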

GPU Partitioning

Process of logical or physical division of GPU resources into smaller segments assignable to different applications or VMs. Includes partitioning of memory, compute units, and memory controllers.

Mediated Passthrough

Hybrid between direct passthrough and full virtualization, offering near-native GPU access through a minimal mediation layer. Combines high performance with better resource management and isolation than plain passthrough.

GPU Scheduler

Component that manages scheduling and allocation of GPU resources between multiple concurrent requests. Optimizes GPU usage while respecting priorities and quality of service constraints.
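A minimal sketch of priority-aware scheduling, assuming a simple model in which a lower number means higher priority and ties are served in submission order. The class and job names are hypothetical; production schedulers also handle preemption, fairness, and quotas.

```python
import heapq
import itertools


class GPUScheduler:
    """Priority queue of pending GPU jobs (lower number runs first)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # FIFO tie-breaker

    def submit(self, job, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), job))

    def next_job(self):
        """Pop the highest-priority job, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


sched = GPUScheduler()
sched.submit("train", priority=2)
sched.submit("infer", priority=0)
sched.submit("batch", priority=2)
order = [sched.next_job() for _ in range(3)]
```

Here the latency-sensitive "infer" job runs first even though it was submitted second, and the two priority-2 jobs keep their submission order.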

Direct GPU Access

Architecture allowing virtualized applications to directly access GPU resources without going through software emulation layers. Reduces latency and maximizes computational performance.

Virtual GPU Manager

Centralized administration software that manages the lifecycle of vGPU instances, their allocation and monitoring. Coordinates available GPU resources according to policies defined by the administrator.

GPU Memory Virtualization

Technique for abstracting physical GPU memory allowing multiple VMs to share VRAM while maintaining the illusion of dedicated memory. Includes paging, dynamic allocation and memory isolation.
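The "illusion of dedicated memory" can be sketched as a per-VM page table that maps virtual page numbers to physical VRAM pages. This is a simplified model with hypothetical names; it omits eviction and paging to host memory.

```python
class VRAMPager:
    """Per-VM virtual-to-physical page mapping over shared VRAM."""

    def __init__(self, total_pages):
        self.free_pages = list(range(total_pages))
        self.tables = {}  # vm -> {virtual page: physical page}

    def map_page(self, vm, virtual_page):
        """Allocate a physical page on first touch (dynamic allocation)."""
        table = self.tables.setdefault(vm, {})
        if virtual_page not in table:
            if not self.free_pages:
                raise MemoryError("out of physical VRAM pages")
            table[virtual_page] = self.free_pages.pop(0)
        return table[virtual_page]

    def translate(self, vm, virtual_page):
        """Look up a mapping; a VM can only see its own table."""
        return self.tables[vm][virtual_page]


pager = VRAMPager(total_pages=4)
a0 = pager.map_page("vm-a", 0)
b0 = pager.map_page("vm-b", 0)
a1 = pager.map_page("vm-a", 1)
```

Both VMs address virtual page 0, yet each is backed by a different physical page, so neither can read the other's data through its own address space.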

SR-IOV for GPUs

Adaptation of the Single Root I/O Virtualization (SR-IOV) standard for GPUs, enabling the creation of virtual functions (VFs) with direct hardware access paths. Offers isolation and near-bare-metal performance.

GPU Containerization

Exposure of GPU resources to lightweight containers, with the CUDA user-space libraries isolated per container while the kernel-mode driver stays on the host. Enables rapid deployment of GPU applications with minimal overhead compared to VMs.

Remote GPU Virtualization

Architecture allowing access to remote GPU resources over the network as if they were local. Uses optimized protocols to minimize latency and preserve computational performance.

Dynamic GPU Allocation

Ability to allocate and deallocate GPU resources on the fly according to the immediate needs of applications. Optimizes GPU usage by adjusting resource quotas in real time.

GPU Pooling

Aggregation of multiple physical GPUs into a unified resource pool that can be distributed on demand. Enables load balancing and elasticity of GPU computational resources at the datacenter scale.
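A minimal sketch of on-demand distribution from a pool, assuming memory is the only tracked resource and that requests go to the GPU with the most free memory as a simple load-balancing heuristic. The class and GPU names are hypothetical.

```python
class GPUPool:
    """Tracks free memory per GPU and hands out capacity on demand."""

    def __init__(self, free_memory_gb):
        self.free = dict(free_memory_gb)

    def allocate(self, memory_gb):
        """Place the request on the GPU with the most free memory.

        Returns the chosen GPU name, or None if no GPU has room.
        """
        candidates = [g for g, free in self.free.items() if free >= memory_gb]
        if not candidates:
            return None
        gpu = max(candidates, key=lambda g: self.free[g])
        self.free[gpu] -= memory_gb
        return gpu

    def release(self, gpu, memory_gb):
        """Return capacity to the pool (elasticity)."""
        self.free[gpu] += memory_gb


pool = GPUPool({"gpu0": 16, "gpu1": 24})
placements = [pool.allocate(10) for _ in range(4)]
```

Successive 10 GB requests alternate between the two GPUs until neither has room, at which point `allocate` returns `None` and the caller can wait for a `release`.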
