Quantization and Optimization
Block-wise Quantization
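A quantization strategy that divides each weight tensor into smaller blocks and quantizes every block independently, typically with its own scale factor. Because an outlier only widens the quantization range of the block that contains it, the local value distribution is preserved better and the overall error is lower than with global (per-tensor) quantization, which uses a single scale for the whole tensor. The trade-off is the extra storage needed for one scale per block.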
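
The idea can be illustrated with a minimal NumPy sketch. It assumes symmetric absmax quantization to 8-bit integers and a block size of 64; the function names (`quantize_per_tensor`, `quantize_blockwise`, `dequantize_blockwise`), the block size, and the synthetic weights are hypothetical choices made for illustration, not taken from any particular library.

```python
import numpy as np


def quantize_per_tensor(weights, bits=8):
    """Global quantization: one absmax scale for the whole tensor (bits <= 8 assumed)."""
    qmax = 2 ** (bits - 1) - 1
    scale = float(np.max(np.abs(weights))) / qmax
    if scale == 0.0:
        scale = 1.0  # all-zero tensor: any scale works
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)
    return q, scale


def quantize_blockwise(weights, block_size=64, bits=8):
    """Block-wise quantization: one absmax scale per block of `block_size` values."""
    qmax = 2 ** (bits - 1) - 1
    flat = weights.ravel()
    pad = (-flat.size) % block_size  # zero-pad so the length divides evenly
    blocks = np.pad(flat, (0, pad)).reshape(-1, block_size)
    scales = np.max(np.abs(blocks), axis=1, keepdims=True) / qmax
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero blocks
    q = np.clip(np.round(blocks / scales), -qmax, qmax).astype(np.int8)
    return q, scales


def dequantize_blockwise(q, scales, shape):
    """Rescale each block, drop the padding, and restore the original shape."""
    n = int(np.prod(shape))
    return (q.astype(np.float32) * scales).ravel()[:n].reshape(shape)


# Synthetic weights: small values plus one large outlier (illustrative data only).
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(128, 128)).astype(np.float32)
w[0, 0] = 4.0

q_global, s_global = quantize_per_tensor(w)
w_global = q_global.astype(np.float32) * s_global

q_blocks, s_blocks = quantize_blockwise(w, block_size=64)
w_blocks = dequantize_blockwise(q_blocks, s_blocks, w.shape)

mse_global = float(np.mean((w - w_global) ** 2))
mse_blocks = float(np.mean((w - w_blocks) ** 2))
print(f"per-tensor MSE: {mse_global:.3e}")
print(f"block-wise MSE: {mse_blocks:.3e}")
```

With a single outlier in the tensor, the per-tensor scale is stretched to cover it and most small weights collapse onto a few integer levels, whereas in the block-wise version only the block containing the outlier is affected. Smaller blocks usually reduce the error further, at the cost of storing more scale factors.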