AI Glossary
A complete glossary of artificial intelligence
Chinchilla Scaling Law
Empirical principle established by DeepMind stating that, for a compute-optimal training run, model size and training data volume should be scaled in equal proportion, with a tokens-to-parameters ratio of approximately 20:1.
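As a worked example of the rule of thumb above, here is a minimal Python sketch that derives a compute-optimal (N, D) pair from a FLOPs budget. It assumes the common approximation that training cost C ≈ 6·N·D FLOPs; both that approximation and the 20 tokens-per-parameter ratio are rules of thumb, not exact constants.

```python
# Minimal sketch: compute-optimal allocation under the Chinchilla rule of thumb.
# Assumptions: training cost C ~= 6 * N * D FLOPs, and a compute-optimal
# tokens-per-parameter ratio of about 20 (both approximations, not exact laws).

def chinchilla_optimal(compute_flops: float, tokens_per_param: float = 20.0):
    """Return (parameters N, training tokens D) for a given FLOPs budget."""
    # From C = 6 * N * D and D = r * N:  C = 6 * r * N**2  =>  N = sqrt(C / (6 * r)).
    n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

n, d = chinchilla_optimal(5.76e23)  # roughly the Chinchilla training budget
print(f"N ≈ {n:.2e} parameters, D ≈ {d:.2e} tokens")
```

With the roughly 5.76e23 FLOPs budget reported for Chinchilla, this recovers approximately 70B parameters and 1.4T tokens, matching the published configuration.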
Power Law
Mathematical relationship of the form L(N) ∝ N^(−α), L(D) ∝ D^(−β), L(C) ∝ C^(−γ), where loss L decreases predictably as a power of the number of parameters N, the dataset size D, or the computational budget C.
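One widely used concrete instance is the additive Chinchilla parametrization L(N, D) = E + A·N^(−α) + B·D^(−β). The sketch below evaluates it with illustrative constants in the ballpark of the fit reported by Hoffmann et al. (2022); treat them as examples, not canonical values.

```python
# Sketch: Chinchilla-style parametric loss L(N, D) = E + A/N**alpha + B/D**beta.
# Constants are illustrative, close to the Hoffmann et al. (2022) fit.

def predicted_loss(n_params: float, n_tokens: float,
                   E: float = 1.69, A: float = 406.4, B: float = 410.7,
                   alpha: float = 0.34, beta: float = 0.28) -> float:
    # E is the irreducible loss; the two power-law terms shrink as N and D grow.
    return E + A / n_params**alpha + B / n_tokens**beta

print(predicted_loss(70e9, 1.4e12))  # a Chinchilla-scale model, roughly 1.94
```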
Scaling Transfer
Phenomenon whereby scaling laws fitted on smaller models accurately predict the performance of much larger models, even before those models are fully trained.
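A minimal illustration: fit a power law L(N) = a·N^(−α) to results from small pilot models by linear regression in log-log space, then extrapolate to a larger target size. All data points and the target below are invented for the example.

```python
# Sketch: fit L(N) = a * N**(-alpha) to small-model losses, then extrapolate.
import numpy as np

n_params = np.array([1e7, 3e7, 1e8, 3e8])      # small pilot models (hypothetical)
losses   = np.array([4.20, 3.85, 3.52, 3.24])  # their validation losses (invented)

# A power law is linear in log-log space: log L = log a - alpha * log N.
slope, intercept = np.polyfit(np.log(n_params), np.log(losses), 1)
alpha, a = -slope, np.exp(intercept)

n_target = 1e10  # a model ~30x larger than anything in the fit
print(f"alpha ≈ {alpha:.3f}; predicted loss at N = {n_target:.0e}: "
      f"{a * n_target ** (-alpha):.2f}")
```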
Optimal Computational Budget
Resource allocation (FLOPs) that maximizes model performance for a given computational cost by judiciously balancing model size against training data quantity.
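As a sketch of how such an allocation can be found numerically, the snippet below sweeps candidate model sizes under a fixed FLOPs budget (again assuming C ≈ 6·N·D and the illustrative loss constants from the Power Law entry) and keeps the minimum-loss split.

```python
# Sketch: grid-search the compute-optimal split between parameters and tokens.
import numpy as np

def loss(n, d, E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    # Illustrative Chinchilla-style parametric loss (see the Power Law entry).
    return E + A / n**alpha + B / d**beta

C = 1e22                          # fixed compute budget in FLOPs (example value)
ns = np.logspace(8, 11, 400)      # candidate parameter counts, 100M to 100B
ds = C / (6.0 * ns)               # tokens implied by the C ~= 6*N*D approximation

best = np.argmin(loss(ns, ds))
print(f"best split: N ≈ {ns[best]:.2e} parameters, D ≈ {ds[best]:.2e} tokens")
```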
Data Saturation
Point beyond which increasing the training data volume no longer yields significant improvement for a given model size, indicating that model capacity has become the bottleneck.
Scaling Exponent
Coefficient (α, β, γ) in the power law that quantifies how efficiently performance improves when increasing the number of parameters, data size, or computational budget respectively.
Compute-Bound Regime
Training regime in which performance is limited primarily by the available computational resources, making it more effective to grow the model than to add data.
Data-Bound Regime
Training regime in which performance is limited primarily by the quantity and quality of available data, making it more effective to add data than to grow the model.
Predicted Test Loss
Value of the loss on a test dataset, estimated in advance using scaling laws based on model size, data size, and computational budget.
Critical Scaling
Model size threshold beyond which performance gains follow a steeper scaling law, often observed in very large language models.
Emergence via Scaling
Appearance of new capabilities (reasoning, understanding) that did not exist in smaller models and emerge spontaneously when model size exceeds a certain critical threshold.
Scaling Efficiency
Measure of performance obtained per unit of resource (parameter, data, or FLOP), allowing comparison of different allocation strategies for a given budget.
Chinchilla Isomorphism Hypothesis
Postulate that for a fixed computational budget, model parameter count and training tokens must be increased proportionally to achieve optimal performance.
Kaplan's Law
Set of early scaling laws proposed by OpenAI (Kaplan et al., 2020) suggesting that performance is primarily a function of model size, with data volume playing a lesser role.
Pareto Frontier in Scaling
Set of optimal resource allocations (model size vs. data) where it is impossible to improve one factor without degrading the other, defining efficient trade-offs in scaling.
Scaling Performance Metric
Quantitative indicator (validation loss, perplexity, benchmark score) used to measure model effectiveness and track its improvement based on scaling different resources.
Predictability of Scaling
Ability of scaling laws to accurately anticipate the performance of models not yet trained, based on extrapolation of trends observed on smaller models.
Multi-Objective Optimization in Scaling
Process aimed at finding the best compromise between multiple conflicting objectives (performance, cost, latency) when determining the optimal model and data size.
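A toy sketch of one common approach, weighted scalarization: collapse two conflicting objectives (predicted loss and a latency proxy) into a single score and minimize it. The latency model and the weight below are invented purely for illustration.

```python
# Sketch: weighted scalarization of loss vs. serving latency over model sizes.
import numpy as np

ns = np.logspace(8, 11, 200)          # candidate parameter counts
loss = 1.69 + 406.4 / ns**0.34        # illustrative loss term (see Power Law entry)
latency = ns / 1e10                   # hypothetical: latency grows linearly with N

w = 0.5                               # relative weight of loss vs. latency
score = w * loss + (1 - w) * latency  # lower is better on both objectives
print(f"chosen size: N ≈ {ns[np.argmin(score)]:.2e} parameters")
```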