AI Glossary

The complete dictionary of Artificial Intelligence

162 categories, 2,032 subcategories, 23,060 terms

Chinchilla Scaling Law

Empirical principle established by DeepMind stating that, for compute-optimal training, model size and training data volume should be scaled in equal proportion, with a data-to-parameter ratio of approximately 20 tokens per parameter.
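
As a rough illustration, the 20:1 ratio can be turned into a sizing rule. The sketch below assumes the common approximation that training cost is about 6 FLOPs per parameter per token (C ≈ 6 * N * D), which is not part of the definition above; the function name and constants are illustrative only.

```python
import math

TOKENS_PER_PARAM = 20.0        # ratio quoted in the definition
FLOPS_PER_PARAM_TOKEN = 6.0    # assumption: C ≈ 6 * N * D


def chinchilla_allocation(compute_flops):
    """Return (parameters, tokens) for a training budget given in FLOPs."""
    # C = 6 * N * D and D = 20 * N  =>  C = 120 * N^2
    n_params = math.sqrt(
        compute_flops / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM)
    )
    n_tokens = TOKENS_PER_PARAM * n_params
    return n_params, n_tokens


n, d = chinchilla_allocation(5.88e23)
print(f"{n:.3g} parameters, {d:.3g} tokens")
```

Under these assumptions, a budget of 5.88e23 FLOPs corresponds to about 70 billion parameters trained on about 1.4 trillion tokens.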

Power Law

Mathematical relationship of the form L(N, D, C) = A * N^α * D^β * C^γ, with negative exponents α, β, and γ, so that the loss L decreases predictably as the number of parameters N, the dataset size D, and the computational budget C increase.
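
A minimal sketch of this formula, assuming negative exponents so that loss falls as resources grow. The constant A and the exponent values are hypothetical placeholders, not fitted numbers.

```python
def power_law_loss(n_params, n_tokens, compute,
                   a=50.0, alpha=-0.07, beta=-0.09, gamma=-0.05):
    """Evaluate L = A * N^alpha * D^beta * C^gamma (hypothetical constants)."""
    return a * n_params**alpha * n_tokens**beta * compute**gamma


base = power_law_loss(1e9, 2e10, 1.2e20)
doubled = power_law_loss(2e9, 2e10, 1.2e20)

# Doubling N multiplies the loss by 2**alpha, whatever the starting point.
ratio = doubled / base
print(f"loss ratio after doubling N: {ratio:.4f}")
```

The constant ratio is what makes a power law appear as a straight line on a log-log plot.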

Scaling Transfer

Phenomenon where scaling laws observed on smaller models can accurately predict the performance of much larger models, even before those models are fully trained.
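
The idea can be sketched as a straight-line fit in log-log space followed by extrapolation. The measurements below are synthetic, generated from a known law L = 12 * N^-0.08 purely for illustration; real measurements are noisy and the fit would be approximate.

```python
import math


def fit_power_law(sizes, losses):
    """Least-squares fit of log(L) = log(A) + alpha * log(N)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(l) for l in losses]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    alpha = sxy / sxx
    a = math.exp(mean_y - alpha * mean_x)
    return a, alpha


# Synthetic "small model" measurements (hypothetical data).
small_sizes = [1e7, 3e7, 1e8, 3e8, 1e9]
small_losses = [12.0 * n ** -0.08 for n in small_sizes]

a_hat, alpha_hat = fit_power_law(small_sizes, small_losses)

# Extrapolate to a model 70 times larger than the largest one fitted.
predicted = a_hat * (7e10) ** alpha_hat
print(f"alpha = {alpha_hat:.3f}, predicted loss at 7e10 params = {predicted:.3f}")
```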

Optimal Computational Budget

Resource allocation (FLOPs) that maximizes model performance for a given computational cost, by judiciously balancing model size and training data quantity.

Data Saturation

Point beyond which increasing training data volume no longer provides significant improvement to model performance for a given model size, indicating that model capacity rather than data has become the limiting factor.

Scaling Exponent

Exponent (α, β, or γ) in the power law that quantifies how quickly loss falls as the number of parameters, the data size, or the computational budget, respectively, is increased.
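
Given two runs that differ in a single resource, the exponent can be estimated as the slope between them in log-log space. This is a minimal sketch with hypothetical measurements; in practice more than two points would be fitted.

```python
import math


def scaling_exponent(resource_a, loss_a, resource_b, loss_b):
    """Slope of the loss curve in log-log space between two runs."""
    return math.log(loss_b / loss_a) / math.log(resource_b / resource_a)


# Hypothetical measurements: 10x more parameters lowers loss from 3.20 to 2.72.
alpha = scaling_exponent(1e8, 3.20, 1e9, 2.72)
print(f"alpha = {alpha:.4f}")
```

A negative value means loss decreases as the resource grows; a larger magnitude means faster improvement.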

Compute-Bound Regime

Training regime in which performance is limited mainly by the available computational resources rather than by data, so that spending additional resources on a larger model is more effective than spending them on more data.

Data-Bound Regime

Training phase where performance is primarily limited by the quantity and quality of available data, making increasing data volume more effective than increasing model size.

Predicted Test Loss

Value of the loss on a test dataset, estimated in advance using scaling laws based on model size, data size, and computational budget.

Critical Scaling

Model size threshold above which performance gains follow a steeper scaling law, often observed in very large language models.

Emergence via Scaling

Appearance of new capabilities (such as reasoning or understanding) that are absent in smaller models and appear once model size exceeds a certain critical threshold.

Scaling Efficiency

Measure of performance obtained per unit of resource (parameter, data, or FLOP), allowing comparison of different allocation strategies for a given budget.

Chinchilla Isomorphism Hypothesis

Postulate that, as the computational budget grows, model parameter count and training tokens must be increased proportionally to achieve optimal performance at each budget.

Kaplan's Law

Set of early scaling laws proposed by OpenAI (Kaplan et al., 2020) suggesting that performance was primarily a function of model size, with less importance given to data volume.

Pareto Frontier in Scaling

Set of resource allocations (model size vs. data) for which no objective, such as loss or training cost, can be improved without degrading another, defining the efficient trade-offs in scaling.
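
A minimal sketch of extracting such a frontier, assuming two objectives to minimize: training cost and validation loss. The run names and numbers are hypothetical.

```python
def pareto_frontier(candidates):
    """Keep candidates that no other candidate beats on both cost and loss."""
    frontier = []
    for name, cost, loss in candidates:
        dominated = any(
            c <= cost and l <= loss and (c < cost or l < loss)
            for _, c, l in candidates
        )
        if not dominated:
            frontier.append(name)
    return frontier


# Hypothetical (name, training FLOPs, validation loss) triples.
runs = [
    ("small-long", 1e21, 2.60),
    ("medium", 2e21, 2.45),
    ("wasteful", 3e21, 2.50),  # costs more than "medium" and is worse
    ("large", 4e21, 2.30),
]
print(pareto_frontier(runs))
```

Only "wasteful" is dropped here, because "medium" is both cheaper and better; every other run represents a different, equally defensible trade-off.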

Scaling Performance Metric

Quantitative indicator (validation loss, perplexity, benchmark score) used to measure model effectiveness and track its improvement based on scaling different resources.

Predictability of Scaling

Ability of scaling laws to accurately anticipate the performance of models not yet trained, based on extrapolation of trends observed on smaller models.

Multi-Objective Optimization in Scaling

Process aimed at finding the best compromise between multiple conflicting objectives (performance, cost, latency) when determining the optimal model and data size.
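
One simple way to express such a compromise is a weighted sum of normalized objectives. This is only one of several possible approaches; the configurations, field names, and weights below are hypothetical.

```python
def best_configuration(candidates, weights):
    """Pick the candidate with the lowest weighted sum of normalized objectives."""
    keys = list(weights)
    # Normalize each objective to [0, 1] so the weights are comparable.
    lo = {k: min(c[k] for c in candidates) for k in keys}
    hi = {k: max(c[k] for c in candidates) for k in keys}

    def score(c):
        return sum(
            weights[k] * (c[k] - lo[k]) / ((hi[k] - lo[k]) or 1.0)
            for k in keys
        )

    return min(candidates, key=score)


# Hypothetical configurations; lower is better for every objective.
configs = [
    {"name": "1B", "loss": 2.80, "cost": 1.0, "latency_ms": 20},
    {"name": "7B", "loss": 2.40, "cost": 6.0, "latency_ms": 60},
    {"name": "70B", "loss": 2.00, "cost": 60.0, "latency_ms": 400},
]

quality_first = best_configuration(
    configs, {"loss": 0.8, "cost": 0.1, "latency_ms": 0.1}
)
budget_first = best_configuration(
    configs, {"loss": 0.2, "cost": 0.4, "latency_ms": 0.4}
)
print(quality_first["name"], budget_first["name"])
```

Changing the weights changes the winner: weighting loss heavily selects the largest model, while weighting cost and latency selects a mid-sized one.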
