Mixed Precision Computing
Precision Matrix Operations
A set of linear-algebra operations (GEMM, convolution) in which different parts of the calculation use different precisions. Typically the multiplications take FP16 or BF16 inputs while the products are accumulated in FP32, which raises throughput on modern GPUs and keeps rounding error in long sums under control.
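The effect of the accumulator precision can be illustrated with a small sketch. It assumes NumPy on the CPU as a stand-in for GPU hardware, and the function names `dot_fp16_accumulate` and `dot_fp32_accumulate` are hypothetical, chosen only for this example. Both functions take FP16 inputs; they differ only in the precision of the running sum.

```python
import numpy as np

def dot_fp16_accumulate(a, b):
    """Dot product where products and the running sum both stay in FP16."""
    acc = np.float16(0.0)
    for x, y in zip(a, b):
        # Round the product and the updated sum back to FP16 at every step.
        acc = np.float16(acc + np.float16(x * y))
    return acc

def dot_fp32_accumulate(a, b):
    """Dot product with FP16 inputs but an FP32 running sum (mixed precision)."""
    acc = np.float32(0.0)
    for x, y in zip(a, b):
        # Widen the FP16 inputs, then multiply and accumulate in FP32.
        acc += np.float32(x) * np.float32(y)
    return acc

n = 4096
a = np.full(n, 0.5, dtype=np.float16)
b = np.full(n, 0.5, dtype=np.float16)  # every product is exactly 0.25

exact = 0.25 * n                       # 1024.0
low = dot_fp16_accumulate(a, b)
mixed = dot_fp32_accumulate(a, b)

print("exact:            ", exact)
print("FP16 accumulation:", float(low))
print("FP32 accumulation:", float(mixed))
```

In this setup the FP16 running sum stops growing once it reaches 512, because the spacing between adjacent FP16 values there is 0.5 and each added 0.25 is rounded away. The FP32 accumulator reaches the exact result of 1024. Real GPU kernels do this per tile in hardware rather than in a Python loop, so treat the sketch as an illustration of the numerics only.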