Quantification and Optimization
Low-Rank Adaptation (LoRA)
Efficient adaptation method that freezes the weights of a pre-trained model and injects small decomposable low-rank matrices, drastically reducing the number of trainable parameters for fine-tuning while preserving performance.
← 뒤로