Quantization
LLM.int8()
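An 8-bit quantization method for large language models that combines matrix decomposition with hybrid (mixed-precision) quantization: the few outlier feature dimensions are split out and multiplied in 16-bit, while all remaining values are quantized vector-wise and multiplied in int8.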
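To make the decomposition concrete, here is a minimal pure-Python sketch of the idea rather than the actual library implementation. The function names (`absmax_quantize`, `int8_matmul_with_outliers`) are hypothetical, the outlier rule (any magnitude of at least 6.0 in a feature dimension) is a simplifying assumption, and Python floats stand in for the 16-bit path.

```python
def absmax_quantize(vec):
    """Scale a vector so its largest magnitude maps to 127, then round to ints."""
    # Fall back to a scale of 1.0 for empty or all-zero vectors.
    scale = max((abs(v) for v in vec), default=0.0) / 127.0 or 1.0
    return [round(v / scale) for v in vec], scale


def int8_matmul_with_outliers(X, W, threshold=6.0):
    """Approximate X @ W (X: rows x k, W: k x cols) with a hybrid scheme.

    Assumption for this sketch: a feature dimension j is an outlier if any
    activation in column j of X has magnitude >= threshold.
    """
    k, n_cols = len(W), len(W[0])
    is_outlier = [any(abs(row[j]) >= threshold for row in X) for j in range(k)]
    regular = [j for j in range(k) if not is_outlier[j]]
    outliers = [j for j in range(k) if is_outlier[j]]

    # Vector-wise quantization: one scale per row of X and per column of W,
    # computed over the regular (non-outlier) dimensions only.
    x_q = [absmax_quantize([row[j] for j in regular]) for row in X]
    w_q = [absmax_quantize([W[j][c] for j in regular]) for c in range(n_cols)]

    out = []
    for r, row in enumerate(X):
        x_int, x_scale = x_q[r]
        out_row = []
        for c in range(n_cols):
            w_int, w_scale = w_q[c]
            # Integer dot product, then dequantize with both scales.
            acc = sum(a * b for a, b in zip(x_int, w_int))
            low_precision_part = acc * x_scale * w_scale
            # Outlier dimensions skip quantization entirely.
            high_precision_part = sum(row[j] * W[j][c] for j in outliers)
            out_row.append(low_precision_part + high_precision_part)
        out.append(out_row)
    return out


if __name__ == "__main__":
    X = [[0.5, -1.0, 20.0], [0.25, 0.75, -0.5]]  # third feature is an outlier
    W = [[1.0, -2.0], [0.5, 0.25], [-1.5, 3.0]]
    print(int8_matmul_with_outliers(X, W))
```

Keeping the outlier dimensions out of the quantized path matters because a single large value would otherwise inflate the absmax scale and crush the precision of every other value in the same row or column.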