AI glossary
The complete AI glossary
Parameter-efficient fine-tuning (PEFT)
Fine-tuning methods that modify only a small subset of the model's parameters while freezing the majority of weights, thereby reducing computational and storage costs.
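The best-known PEFT method is LoRA, which freezes the pre-trained weight matrix and learns only a low-rank correction to it. A minimal sketch (all dimensions and names below are illustrative assumptions, not a specific library's API):

```python
import numpy as np

# LoRA-style PEFT sketch: the frozen weight W is never updated;
# only the small low-rank factors A and B are trainable.
d_in, d_out, rank = 768, 768, 8

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))        # frozen pre-trained weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))                   # trainable up-projection, init to zero

def forward(x):
    # Effective weight is W + B @ A, but the full product is never
    # materialized; the low-rank path is computed separately.
    return W @ x + B @ (A @ x)

frozen = W.size
trainable = A.size + B.size
print(f"trainable fraction: {trainable / (frozen + trainable):.2%}")
```

With rank 8 on a 768x768 layer, the trainable parameters are around 2% of the total, which is where the compute and storage savings come from.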
QLoRA (Quantized LoRA)
A variant of LoRA that combines 4-bit quantization of the frozen base model with low-rank adaptation, enabling fine-tuning of very large models on limited hardware.
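The core idea can be sketched as follows: the frozen base weights are stored in 4-bit precision and dequantized on the fly, while the LoRA factors stay in full precision. This uses simple absmax quantization for illustration (QLoRA itself uses a more elaborate NF4 scheme); all names and shapes are assumptions:

```python
import numpy as np

def quantize_4bit(W):
    # Absmax quantization to signed 4-bit integers in [-7, 7].
    scale = np.abs(W).max() / 7.0
    q = np.clip(np.round(W / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)).astype(np.float32)
q, scale = quantize_4bit(W)               # 4-bit frozen base weights
A = rng.standard_normal((4, 64)).astype(np.float32) * 0.01
B = np.zeros((64, 4), dtype=np.float32)   # full-precision LoRA factors

def forward(x):
    # Dequantize the base weight on the fly, then add the
    # full-precision low-rank correction.
    return dequantize(q, scale) @ x + B @ (A @ x)
```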
Prefix tuning
A method that prepends trainable continuous vectors (prefixes) to the keys and values of each attention layer, leaving the model's original weights frozen, to adapt its behavior to specific tasks.
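At a single attention layer, this amounts to concatenating learned prefix keys and values in front of the sequence's own keys and values. A minimal single-head sketch (dimensions and initialization are illustrative assumptions):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

d, n_prefix, seq = 16, 4, 10
rng = np.random.default_rng(0)
prefix_k = rng.standard_normal((n_prefix, d)) * 0.1   # trainable prefix keys
prefix_v = rng.standard_normal((n_prefix, d)) * 0.1   # trainable prefix values

def attention_with_prefix(Q, K, V):
    # The frozen model produces Q, K, V; only the prepended
    # prefix keys/values are optimized during tuning.
    K = np.concatenate([prefix_k, K])   # (n_prefix + seq, d)
    V = np.concatenate([prefix_v, V])
    scores = softmax(Q @ K.T / np.sqrt(d))
    return scores @ V

Q = rng.standard_normal((seq, d))
K = rng.standard_normal((seq, d))
V = rng.standard_normal((seq, d))
out = attention_with_prefix(Q, K, V)
```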
Prompt tuning
Optimization of learned prompt embeddings specifically designed to guide the behavior of a pre-trained model without modifying its internal parameters.
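Unlike prefix tuning, the learned vectors live only at the input: a small matrix of soft-prompt embeddings is prepended to the token embeddings before the frozen model runs. A minimal sketch (sizes and names are assumptions):

```python
import numpy as np

vocab, d_model, n_prompt = 100, 32, 5
rng = np.random.default_rng(0)
embedding = rng.standard_normal((vocab, d_model))             # frozen table
soft_prompt = rng.standard_normal((n_prompt, d_model)) * 0.1  # trainable

def embed_with_prompt(token_ids):
    # Prepend the learned continuous prompt to the frozen
    # token embeddings; only soft_prompt receives gradients.
    return np.concatenate([soft_prompt, embedding[token_ids]])

tokens = np.array([3, 14, 15, 92])
x = embed_with_prompt(tokens)
```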
Instruction fine-tuning
Further training on instruction-response pairs that teaches the model to follow instructions precisely and generate appropriate responses.
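In practice, each instruction-response pair is rendered into a single training string with a fixed template. The template below is one common convention, shown purely as an assumption:

```python
# Sketch of building instruction fine-tuning examples; the
# "### Instruction / ### Response" template is an illustrative
# convention, not a fixed standard.
def format_example(instruction, response):
    return f"### Instruction:\n{instruction}\n\n### Response:\n{response}"

pairs = [
    ("Translate 'hej' to English.", "hello"),
    ("List three primes.", "2, 3, 5"),
]
dataset = [format_example(i, r) for i, r in pairs]
```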
DPO (Direct Preference Optimization)
An alternative to RLHF that directly optimizes the model from human preference data without requiring an intermediate reward model, simplifying the alignment process.
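The per-pair DPO loss compares how much more the policy prefers the chosen response over the rejected one, relative to the frozen reference model. A minimal sketch with scalar sequence log-probabilities (argument names are illustrative):

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    # logp_w / logp_l: policy log-probs of the chosen / rejected response;
    # ref_logp_*: the same quantities under the frozen reference model.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)

# When the policy prefers the chosen answer more strongly than the
# reference does, the margin is positive and the loss drops below log(2).
loss = dpo_loss(logp_w=-2.0, logp_l=-5.0, ref_logp_w=-3.0, ref_logp_l=-4.0)
```

Minimizing this loss pushes the policy to widen the preference margin, which is why no separate reward model is needed.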