
AI Glossary

The complete dictionary of Artificial Intelligence

162 categories · 2,032 subcategories · 23,060 terms

TVM (Tensor Virtual Machine)

An open-source compilation framework that optimizes and executes tensor computations across diverse hardware architectures by progressively lowering deep learning models toward machine-level code.


Just-In-Time (JIT) Compilation

A compilation technique that translates bytecode or intermediate code into native machine code at runtime, enabling optimizations based on information only available during execution.
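A toy illustration of the idea (not a real JIT compiler): Python's built-in `compile()`/`exec()` can generate a function specialized to a value that is only known at runtime, here a hypothetical power function whose loop is unrolled once the exponent is known.

```python
def jit_power(exponent):
    """Generate a power function specialized at runtime: the
    multiplication chain is unrolled for the given exponent."""
    body = " * ".join(["x"] * exponent) or "1"
    src = f"def _pow(x):\n    return {body}\n"
    namespace = {}
    exec(compile(src, "<jit>", "exec"), namespace)
    return namespace["_pow"]

cube = jit_power(3)   # _pow(x) returns x * x * x
print(cube(2))        # 8
```

A real JIT (in a JavaScript engine, or PyTorch's `torch.compile`) additionally profiles hot code paths and emits native machine code rather than Python bytecode.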


Ahead-of-Time (AOT) Compilation

The process of compiling source code into native machine code before execution, reducing startup latency and enabling aggressive optimizations independent of the runtime environment.


Graph IR (Intermediate Representation)

An abstract representation of an AI model's computation graph, used by compilers to analyze dependencies and apply optimization transformations before code generation.


Operator Fusion

An optimization technique that combines multiple elementary operations from the computation graph into a single computation kernel, reducing memory overhead and improving data locality.
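A minimal sketch of the idea in plain Python/NumPy (illustrative only; real compilers fuse at the IR level): the unfused multiply-then-add materializes an intermediate array between two kernels, while the fused version computes `x*w + b` in a single pass.

```python
import numpy as np

def multiply_add_unfused(x, w, b):
    t = x * w        # kernel 1: writes the intermediate t to memory
    return t + b     # kernel 2: reads t back from memory

def multiply_add_fused(x, w, b):
    out = np.empty_like(x)
    for i in range(x.size):      # one fused kernel, no intermediate array
        out[i] = x[i] * w[i] + b[i]
    return out
```

The fused loop touches each element exactly once, which is the data-locality win fusion aims for (the Python loop itself is slow, of course; the point is the memory-access pattern).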


Auto-scheduling

An automated process of searching for the best execution configuration (tiling, vectorization, parallelization) for a computation kernel on a given target hardware architecture.


Target-specific Optimization

A set of compilation techniques that adapt the generated code to the unique characteristics of a hardware architecture (CPU, GPU, TPU, ASIC) to maximize performance.


Relay IR

A high-level functional intermediate representation in TVM, supporting computation graphs with control flow and enabling complex semantic optimizations.


Tensor Expression (TE)

A domain-specific language in TVM for describing tensor computations at a high level of abstraction, facilitating automatic generation of optimized code for various targets.


Kernel Auto-tuning

The process of systematically exploring the optimization parameter space of a computational kernel to identify the configuration that performs best on specific hardware.
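A hypothetical sketch of the search loop: time a blocked matrix multiply for several candidate tile sizes and keep the fastest. Real auto-tuners (e.g. TVM's) use far smarter search strategies and learned cost models, but the structure is the same.

```python
import time
import numpy as np

def matmul_tiled(A, B, tile):
    """Blocked matrix multiply; `tile` is the tunable parameter."""
    n = A.shape[0]
    C = np.zeros_like(A)
    for i in range(0, n, tile):
        for j in range(0, n, tile):
            for k in range(0, n, tile):
                C[i:i+tile, j:j+tile] += A[i:i+tile, k:k+tile] @ B[k:k+tile, j:j+tile]
    return C

def autotune_tile(A, B, candidates=(8, 16, 32, 64)):
    """Exhaustively time each candidate and return the fastest tile size."""
    best_tile, best_time = None, float("inf")
    for tile in candidates:
        start = time.perf_counter()
        matmul_tiled(A, B, tile)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_tile, best_time = tile, elapsed
    return best_tile
```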


HLO (High-Level Optimizer) IR

The intermediate representation used by XLA, describing computations as high-level tensor operations that are optimized before code is generated for accelerators.


Codegen (Code Generation)

The final phase of compilation, in which the optimized intermediate representation is translated into executable machine code for the specific target architecture.


Polyhedral Model

A mathematical model for representing and transforming nested loops, enabling complex optimizations such as tiling and automatic parallelization.


LLVM (Low Level Virtual Machine)

A modular compilation infrastructure used by many AI compilers to generate optimized machine code for different processor architectures.


Memory Layout Optimization

A technique that reorganizes data in memory to improve spatial and temporal locality, reducing access latency and increasing computational throughput.
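The effect is easy to see in NumPy, where the same logical matrix can be stored row-major (C order) or column-major (Fortran order); a compiler picks the layout whose physical memory order matches the access pattern of the hot loop.

```python
import numpy as np

a_row = np.arange(6).reshape(2, 3)    # C order: rows are contiguous
a_col = np.asfortranarray(a_row)      # Fortran order: columns are contiguous

# ravel(order="K") reads elements in physical memory order:
print(a_row.ravel(order="K"))  # [0 1 2 3 4 5]
print(a_col.ravel(order="K"))  # [0 3 1 4 2 5]
```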


Hardware Abstraction Layer (HAL)

A software interface that hides the specific details of the underlying hardware, allowing compilers to generate portable code while leveraging native optimizations.


Vectorization

An optimization technique that transforms scalar operations into vector (SIMD) operations, exploiting the parallel execution units of modern processors.
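As a sketch, the scalar and vector forms of the classic `saxpy` kernel (`alpha*x + y`); the whole-array NumPy expression dispatches to compiled loops that use the CPU's SIMD units where available.

```python
import numpy as np

def saxpy_scalar(alpha, x, y):
    # One scalar multiply-add per loop iteration.
    out = np.empty_like(x)
    for i in range(x.size):
        out[i] = alpha * x[i] + y[i]
    return out

def saxpy_vector(alpha, x, y):
    # Whole-array expression: NumPy's compiled loops can use SIMD.
    return alpha * x + y
```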


Tiling

A strategy that partitions data into blocks (tiles) to improve cache reuse and parallelization efficiency in tensor computations.
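A minimal sketch: a cache-blocked matrix transpose that copies one tile at a time, so the source and destination tiles stay resident in cache (the tile size of 32 is an arbitrary illustrative choice).

```python
import numpy as np

def transpose_tiled(A, tile=32):
    n, m = A.shape
    out = np.empty((m, n), dtype=A.dtype)
    # Visit the matrix tile by tile; slicing handles ragged edges.
    for i in range(0, n, tile):
        for j in range(0, m, tile):
            out[j:j+tile, i:i+tile] = A[i:i+tile, j:j+tile].T
    return out
```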


Graph Rewriting

The systematic transformation of a computation graph by applying rewrite rules that replace subgraphs with more efficient equivalents.
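A toy sketch, representing the graph as nested tuples and applying a single hypothetical rule, `add(mul(a, b), c) → fma(a, b, c)`, bottom-up over the tree.

```python
def rewrite(node):
    """Apply the fma rewrite rule to every matching subgraph."""
    if not isinstance(node, tuple):
        return node                                   # leaf (tensor name)
    node = tuple(rewrite(child) for child in node)    # rewrite children first
    if node[0] == "add" and isinstance(node[1], tuple) and node[1][0] == "mul":
        _, (_, a, b), c = node
        return ("fma", a, b, c)
    return node

print(rewrite(("add", ("mul", "x", "w"), "b")))  # ('fma', 'x', 'w', 'b')
```

Production compilers apply whole libraries of such rules (algebraic simplification, constant folding, layout rewrites), typically with a cost model deciding when a rewrite pays off.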
