Quantization
LLM.int8()
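An 8-bit quantization method for large language model inference. It combines vector-wise quantization with mixed-precision decomposition: the few outlier feature dimensions are multiplied in 16-bit precision, while all remaining values are multiplied in int8.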
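To make the decomposition concrete, here is a minimal pure-Python sketch of one mixed-precision matrix multiplication. It is an illustration under stated assumptions, not the reference implementation: the function names `quantize_absmax` and `matmul_int8_mixed` are hypothetical, the outlier threshold of 6.0 is an assumed default, and ordinary Python floats stand in for the 16-bit path.

```python
def quantize_absmax(vec):
    """Symmetric absmax quantization of one vector to the int8 range [-127, 127].

    Returns (list of ints, scale) such that ints[i] * scale approximates vec[i].
    """
    amax = max((abs(v) for v in vec), default=0.0)
    scale = amax / 127.0 if amax > 0 else 1.0
    return [round(v / scale) for v in vec], scale


def matmul_int8_mixed(X, W, threshold=6.0):
    """Approximate X @ W, where X is (rows x k) and W is (k x cols).

    Feature dimensions of X containing any magnitude >= threshold (assumed
    value) are treated as outliers and multiplied in full precision. The
    remaining dimensions are quantized vector-wise: one scale per row of X
    and one scale per column of W.
    """
    k, n_cols = len(W), len(W[0])
    outliers = [j for j in range(k)
                if any(abs(row[j]) >= threshold for row in X)]
    regular = [j for j in range(k) if j not in outliers]

    # Quantize the non-outlier part of each row of X and each column of W.
    Xq = [quantize_absmax([row[j] for j in regular]) for row in X]
    Wq = [quantize_absmax([W[j][c] for j in regular]) for c in range(n_cols)]

    out = []
    for r, row in enumerate(X):
        xq, sx = Xq[r]
        out_row = []
        for c in range(n_cols):
            wq, sw = Wq[c]
            # Integer dot product, then dequantize with both scales.
            int_acc = sum(a * b for a, b in zip(xq, wq))
            value = int_acc * sx * sw
            # Outlier dimensions are added back in higher precision.
            value += sum(row[j] * W[j][c] for j in outliers)
            out_row.append(value)
        out.append(out_row)
    return out


# Usage: the third feature dimension holds outliers (|x| >= 6.0).
X = [[0.5, -1.2, 40.0, 0.3],
     [1.1, 0.7, -35.0, -0.9]]
W = [[0.2, -0.1],
     [0.4, 0.3],
     [0.05, -0.02],
     [-0.3, 0.6]]
result = matmul_int8_mixed(X, W)
```

Keeping the outlier dimension out of the int8 path matters because a single large value would otherwise inflate the absmax scale for its whole row, leaving few quantization levels for the small values next to it.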