Edge MLOps
Model Quantization
Technique for reducing the numerical precision of an ML model's weights and activations (typically from 32-bit floating point to 8-bit integers) in order to shrink the model and speed up inference on edge devices.
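As a rough illustration of the idea, the sketch below maps a float32 tensor to int8 using affine (asymmetric) quantization, one common scheme among several. The function names `quantize_int8` and `dequantize_int8` are hypothetical and chosen for this example only. Production toolchains typically add calibration data, per-channel scales, and fused integer kernels, none of which is shown here.

```python
import numpy as np

def quantize_int8(x):
    """Affine quantization of a float32 array to int8 (illustrative sketch).

    Returns the int8 array plus the scale and zero point needed to
    map it back to approximate float values.
    """
    x = np.asarray(x, dtype=np.float32)
    # Make sure the representable range includes 0.0 so zero maps to an exact integer.
    lo = min(float(x.min()), 0.0)
    hi = max(float(x.max()), 0.0)
    scale = (hi - lo) / 255.0          # 256 int8 levels span the range [lo, hi]
    if scale == 0.0:                   # all-zero input: any positive scale works
        scale = 1.0
    zero_point = int(round(-128 - lo / scale))
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    """Recover approximate float32 values from the int8 representation."""
    return (q.astype(np.float32) - zero_point) * scale

# Example: quantize some made-up "weights" and measure the round-trip error.
weights = np.linspace(-0.5, 1.5, 1000, dtype=np.float32)
q, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(q, scale, zero_point)
print("bytes before:", weights.nbytes, "bytes after:", q.nbytes)
print("max abs error:", float(np.abs(weights - restored).max()))
```

The int8 array takes a quarter of the storage of the float32 original. The cost is a rounding error of roughly half a quantization step (`scale / 2`) per value, and whether that is acceptable depends on the model and the task.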