AI Glossary

The complete dictionary of Artificial Intelligence

162 categories
2,032 subcategories
23,060 terms

📖 Terms

Generative Adversarial Imitation Learning

Method combining the adversarial training of generative adversarial networks with imitation learning: a policy is trained until a discriminator can no longer distinguish the agent's behavior from expert demonstrations, without requiring an explicit reward function.

GAIL (Generative Adversarial Imitation Learning)

Algorithm introduced by Ho and Ermon (2016) that sets up an adversarial game between a discriminator and a generator (the agent's policy) to learn a policy that imitates expert demonstrations.

Discriminator Network

Neural network trained to classify state-action pairs (or whole trajectories) as coming from either the expert or the agent; its output provides an implicit reward signal for the policy.
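
As a rough illustration of the classification objective described above, the sketch below computes a binary cross-entropy loss for a discriminator. It assumes the discriminator outputs the probability that a sample is expert-like (expert labeled 1, agent labeled 0); the function name `discriminator_loss` and the sample probabilities are hypothetical, and a real implementation would compute these probabilities with a neural network.

```python
import math

def discriminator_loss(expert_probs, agent_probs, eps=1e-8):
    """Binary cross-entropy with expert samples labeled 1 and agent samples labeled 0.

    Each argument is a list of discriminator outputs in [0, 1], read as
    "probability that this sample came from the expert" (an assumed convention).
    """
    # Expert samples should be scored close to 1.
    expert_term = sum(-math.log(max(p, eps)) for p in expert_probs) / len(expert_probs)
    # Agent samples should be scored close to 0.
    agent_term = sum(-math.log(max(1.0 - p, eps)) for p in agent_probs) / len(agent_probs)
    return expert_term + agent_term

# A discriminator that separates the two sources has a lower loss
# than one that outputs 0.5 for everything.
good = discriminator_loss([0.9, 0.8], [0.1, 0.2])
confused = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

When the discriminator cannot tell the two sources apart (all outputs 0.5), the loss equals 2 log 2, which is the value the policy is trying to push it toward.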

Generator Network

The agent's policy, which generates actions in the environment and is trained so that the discriminator cannot tell its trajectories apart from expert demonstrations.

Implicit Reward Function

Reward signal derived from the discriminator's output, replacing traditional explicit reward functions in reinforcement learning.
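
One common way to turn the discriminator's output into a reward can be sketched as follows. This assumes the discriminator returns the probability D that a state-action pair is expert-like and uses the reward -log(1 - D); other choices (such as log D) also appear in practice. The name `implicit_reward` is hypothetical.

```python
import math

def implicit_reward(d_expert_prob, eps=1e-8):
    """Reward -log(1 - D), where D is the assumed probability that (s, a) is expert-like."""
    # Clamp D away from 0 and 1 so the logarithm stays finite.
    d = min(max(d_expert_prob, eps), 1.0 - eps)
    return -math.log(1.0 - d)

# The more expert-like the discriminator finds a pair, the larger the reward.
rewards = [implicit_reward(d) for d in (0.1, 0.5, 0.9)]
```

These rewards can then be passed to any standard reinforcement learning update in place of an environment reward.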

Behavior Distribution

Probability distribution over state-action trajectories induced by a policy; the agent seeks to align its own distribution with that of the expert demonstrations.

Jensen-Shannon Divergence

Symmetric, bounded divergence measuring the dissimilarity between two probability distributions (its square root is a true metric); in GAIL it quantifies how far the agent's behavior distribution is from the expert's.
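
For discrete distributions, the Jensen-Shannon divergence can be computed as the average of two KL divergences to the midpoint distribution. The sketch below is a minimal pure-Python version using natural logarithms; the helper names `kl` and `js_divergence` are illustrative.

```python
import math

def kl(p, q):
    """KL divergence KL(p || q) for discrete distributions given as lists."""
    # Terms with p_i == 0 contribute nothing by convention.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

identical = js_divergence([0.5, 0.5], [0.5, 0.5])
disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])
```

With natural logarithms the value ranges from 0 (identical distributions) to log 2 (distributions with disjoint support).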

Min-Max Game

Mathematical formulation in which the discriminator maximizes and the generator minimizes a shared objective function; the solution is a saddle point (equilibrium) of that objective.
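
Under one common convention, the shared objective can be written as below. This is a sketch that assumes D(s, a) is the probability that a state-action pair comes from the expert, π is the agent's policy, π_E the expert policy, H(π) the policy entropy, and λ an entropy coefficient; some presentations swap the roles of D and 1 - D.

```latex
\min_{\pi} \max_{D} \;
  \mathbb{E}_{(s,a) \sim \pi_E}\big[\log D(s,a)\big]
  + \mathbb{E}_{(s,a) \sim \pi}\big[\log\big(1 - D(s,a)\big)\big]
  - \lambda H(\pi)
```

The discriminator raises the objective by scoring expert pairs high and agent pairs low, while the policy lowers it by producing pairs the discriminator scores as expert-like and by keeping its entropy high.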

State-Action Trajectory

Chronological sequence of observed states and actions executed by the agent or expert in the learning environment.

Adversarial Optimization

Training process in which discriminator and generator parameters are updated simultaneously or in alternation, each toward an objective that opposes the other's.

Observation Space

Set of all possible observations the agent can perceive from the environment; observations form the input to the policy and discriminator networks.

Replay Memory

Buffer storing previous trajectories of the agent and expert to stabilize training and improve sample efficiency.
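
A fixed-capacity buffer of this kind can be sketched with a double-ended queue. The class name `ReplayMemory` and its methods are hypothetical, and the sketch stores plain (state, action) tuples rather than full trajectories.

```python
import random
from collections import deque

class ReplayMemory:
    """Fixed-size buffer of (state, action) pairs; oldest entries are dropped first."""

    def __init__(self, capacity):
        # deque with maxlen discards the oldest item automatically when full.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action):
        self.buffer.append((state, action))

    def sample(self, batch_size):
        # Uniform sampling without replacement, capped at the current size.
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)

memory = ReplayMemory(capacity=3)
for step in range(5):
    memory.add(state=step, action=step % 2)
batch = memory.sample(2)
```

In an imitation learning setup, one such buffer would typically hold agent samples and another the expert demonstrations, so the discriminator can draw a batch from each.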

Entropy Coefficient

Regularization parameter that weights an entropy bonus in the policy objective, encouraging exploration by penalizing overly deterministic action distributions.
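
The effect of the coefficient can be illustrated with a categorical policy: the entropy of the action distribution is scaled by the coefficient and added to the objective. The function names and the default value 0.01 below are illustrative assumptions, not prescribed settings.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def regularized_objective(expected_reward, action_probs, entropy_coef=0.01):
    """Expected reward plus the entropy bonus weighted by the coefficient."""
    return expected_reward + entropy_coef * entropy(action_probs)

# With equal expected reward, the more random policy scores higher.
uniform = regularized_objective(1.0, [0.5, 0.5], entropy_coef=0.1)
deterministic = regularized_objective(1.0, [1.0, 0.0], entropy_coef=0.1)
```

A larger coefficient favors exploration more strongly; setting it to zero removes the bonus entirely.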

Total Variation Distance

Alternative metric measuring dissimilarity between two probability distributions, sometimes used instead of JS divergence.
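
For discrete distributions, the total variation distance is half the sum of absolute differences between the probabilities. A minimal sketch, with the function name `total_variation` chosen for illustration:

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions given as lists."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

same = total_variation([0.5, 0.5], [0.5, 0.5])
disjoint = total_variation([1.0, 0.0], [0.0, 1.0])
partial = total_variation([0.5, 0.5], [0.8, 0.2])
```

The value lies between 0 (identical distributions) and 1 (distributions with disjoint support).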

Importance Ratio

Correction factor weighting off-policy samples to adjust for the difference between behavior policy and target policy.
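
The correction can be sketched as the ratio of the target policy's action probability to the behavior policy's, applied as a weight when averaging off-policy samples. The function names and the toy probabilities below are hypothetical.

```python
def importance_ratio(target_prob, behavior_prob):
    """Ratio pi_target(a|s) / pi_behavior(a|s) for one sampled action."""
    if behavior_prob <= 0:
        raise ValueError("behavior policy must give the sampled action nonzero probability")
    return target_prob / behavior_prob

def weighted_mean(values, target_probs, behavior_probs):
    """Importance-weighted average of values collected under the behavior policy."""
    weights = [importance_ratio(t, b) for t, b in zip(target_probs, behavior_probs)]
    return sum(w * v for w, v in zip(weights, values)) / len(values)

# Two actions sampled once each under a uniform behavior policy;
# action 0 yields value 1 and action 1 yields value 0.
estimate = weighted_mean(values=[1.0, 0.0],
                         target_probs=[0.9, 0.1],
                         behavior_probs=[0.5, 0.5])
```

In this toy case the weighted estimate matches the expected value under the target policy (0.9), even though the samples were drawn uniformly.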

Training Stabilization

Set of techniques (such as gradient penalty and spectral normalization) that reduce oscillation and instability in adversarial learning.

Mode Collapse

Phenomenon where the generator only produces a limited subset of possible behaviors, ignoring the diversity of expert demonstrations.

Alignment Metric

Quantitative indicator evaluating the similarity between the behavior distributions of the agent and the expert during learning.
