AI Glossary
The complete dictionary of Artificial Intelligence
Low-rank matrices
Mathematical representation where a matrix is expressed as the product of two smaller matrices with reduced rank. This decomposition reduces the number of required parameters while capturing the essential information of transformations.
Memory efficiency
Optimization of RAM and VRAM usage during AI model training and inference. Techniques like LoRA drastically reduce memory consumption by limiting the modified parameters.
Trainable parameters
Subset of neural network weights that are actually modified during the learning process. In LoRA, only a small percentage (typically 0.1-1%) of total parameters are trainable.
Rank decomposition
Algebraic technique factoring a weight matrix W into W + BA where B and A are low-rank matrices. This decomposition forms the mathematical foundation of LoRA adaptation.
Efficient fine-tuning
Paradigm of adapting pre-trained models aiming to minimize computational and memory resources required. Methods like LoRA, Adapters or Prefix-tuning allow model specialization without modifying all their parameters.
PEFT (Parameter-Efficient Fine-Tuning)
Category of model adaptation techniques aiming to modify a minimum of parameters during fine-tuning. LoRA is one of the most popular PEFT approaches along with Adapters, Prefix-tuning and soft prompts.
Alpha scaling factor
Crucial hyperparameter in LoRA controlling the amplitude of adaptation applied to original weights. This scaling factor adjusts the relative influence of low-rank matrices compared to pre-trained weights.
Multi-LoRA
Architecture allowing simultaneous application of multiple specialized LoRA adaptations to the same base model. This approach facilitates rapid switching between different tasks or domains of expertise without full model reloading.
Zero-shot adaptation
Ability of a model adapted with LoRA to generalize to tasks or domains not seen during adaptation training. This property emerges from preserving the base model's general knowledge while adding targeted specializations.
LoRA rank hyperparameter
Parameter determining the dimension of the low-rank matrices in LoRA decomposition, controlling the trade-off between expressiveness and efficiency. Typical ranks range from 4 to 64 depending on the complexity of the adaptation task.
Weight merging
Process of integrating LoRA adaptations into the base model weights to eliminate computational overhead during inference. This merging allows recovering a standard model with the same performance as the adapted version.