AI Glossary
The complete dictionary of Artificial Intelligence
Distributional Q-Learning
A variant of Q-learning that learns the full probability distribution of returns, whose expectation is the Q-value, instead of estimating only that expectation, allowing for a richer characterization of uncertainty.
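A minimal NumPy sketch (the function name is ours, not from any library) of one consequence of this definition: even when the full distribution is learned, greedy control can still use the implied expectation Q(s, a) = Σ_i z_i · p_i(s, a).

```python
import numpy as np

def greedy_action(atoms, action_probs):
    """Given the support atoms z_i and, per action, the learned probabilities
    p_i(s, a), recover the expected Q-values and act greedily on them."""
    action_probs = np.asarray(action_probs, dtype=float)   # (n_actions, n_atoms)
    q_values = action_probs @ np.asarray(atoms, dtype=float)
    return int(np.argmax(q_values)), q_values
```

The distribution itself remains available for risk-sensitive criteria (see the entries on risk-aware methods below).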
Categorical Distributional RL
A method that represents the distribution of returns as a discrete set of probabilities over predefined value atoms, using categorical projection to ensure stability.
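The categorical projection mentioned here can be sketched in a few lines of NumPy (a single-transition, tabular illustration in the style of C51; names and defaults are ours): the Bellman-updated atoms r + γz generally fall between the fixed support points, so each probability mass is split between its two neighbouring atoms.

```python
import numpy as np

def project_distribution(next_probs, reward, gamma=0.99,
                         v_min=-10.0, v_max=10.0, n_atoms=51):
    """Project the shifted/shrunk distribution r + gamma*z back onto the
    fixed support atoms, distributing mass to neighbouring atoms."""
    z = np.linspace(v_min, v_max, n_atoms)          # fixed support atoms
    delta = (v_max - v_min) / (n_atoms - 1)
    tz = np.clip(reward + gamma * z, v_min, v_max)  # Bellman-updated atoms
    b = (tz - v_min) / delta                        # fractional atom index
    lower = np.floor(b).astype(int)
    upper = np.ceil(b).astype(int)
    projected = np.zeros(n_atoms)
    for j in range(n_atoms):
        if lower[j] == upper[j]:                    # lands exactly on an atom
            projected[lower[j]] += next_probs[j]
        else:                                       # split between neighbours
            projected[lower[j]] += next_probs[j] * (upper[j] - b[j])
            projected[upper[j]] += next_probs[j] * (b[j] - lower[j])
    return z, projected
```

Because every mass is fully redistributed, the projected vector remains a valid probability distribution, which is what keeps the update stable.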
Quantile Regression RL
A distributional approach using quantile regression to directly model the quantiles of the return distribution, offering a flexible and continuous representation.
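The quantile-regression objective can be sketched as follows (a NumPy illustration of the quantile Huber loss used in QR-DQN-style methods; names and the default kappa are ours): each predicted quantile θ_i is pulled toward its quantile level τ_i of the target distribution via an asymmetrically weighted Huber loss.

```python
import numpy as np

def quantile_huber_loss(pred_quantiles, target_samples, kappa=1.0):
    """Asymmetric Huber loss: under-/over-estimation errors are weighted
    by |tau - 1{u < 0}|, so theta_i converges to the tau_i-quantile."""
    n = len(pred_quantiles)
    taus = (np.arange(n) + 0.5) / n                       # midpoint levels
    u = target_samples[None, :] - pred_quantiles[:, None]  # pairwise TD errors
    huber = np.where(np.abs(u) <= kappa,
                     0.5 * u ** 2,
                     kappa * (np.abs(u) - 0.5 * kappa))
    weight = np.abs(taus[:, None] - (u < 0).astype(float))
    return (weight * huber).mean()
```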
Implicit Quantile Network (IQN)
A neural network architecture that learns the return distribution by sampling quantile fractions τ and mapping them through a quantile embedding function, approximating the inverse cumulative distribution of returns and enabling continuous quantile estimation.
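The quantile embedding can be sketched with the cosine basis used in the IQN paper (a NumPy illustration; the function name and default dimension are ours): each sampled fraction τ ∈ [0, 1] becomes a feature vector φ_i(τ) = cos(π·i·τ).

```python
import numpy as np

def quantile_embedding(taus, embed_dim=64):
    """Cosine embedding of quantile fractions. In the full network this
    vector passes through a learned linear layer + ReLU and is merged
    (elementwise) with the state features."""
    i = np.arange(embed_dim)
    taus = np.asarray(taus, dtype=float)[:, None]
    return np.cos(np.pi * i[None, :] * taus)   # shape (n_taus, embed_dim)
```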
Distributional Bellman Operator
A generalization of the classical Bellman operator that acts on return distributions rather than scalar values, preserving the complete distribution structure.
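A sample-based view of this operator (a minimal sketch; the function name is ours): the distributional Bellman equation (T Z)(s, a) = R + γ·Z(S′, A′) holds in distribution, so each sampled next-state return is simply shifted by the reward and shrunk by γ.

```python
import numpy as np

def bellman_target_samples(reward, next_return_samples, gamma=0.99):
    """Push samples of Z(s', a') through the distributional Bellman
    operator: the whole distribution is transformed, not just its mean."""
    return reward + gamma * np.asarray(next_return_samples, dtype=float)
```

Taking the mean of the output recovers the classical scalar Bellman backup, which is exactly the sense in which this operator generalizes it.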
Batch-Constrained Distributional RL
An offline approach that applies batch constraints to distributional methods to ensure that policies remain close to the behavior observed in the training dataset.
Offline Distributional Critic
A critic module in offline learning that estimates the distribution of returns to evaluate actions, using techniques to handle distribution shift and selection bias.
Distributional Policy Gradient
An extension of policy gradient methods that directly optimizes the distribution parameters of returns rather than only their expectation, enabling fine control over risk properties.
Risk-Aware Distributional RL
An offline approach that uses the complete distribution of returns to make risk-sensitive decisions, optimizing metrics such as Conditional Value-at-Risk (CVaR) or other coherent risk measures.
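CVaR is straightforward to estimate once return samples (or quantiles) are available, which is why it pairs naturally with distributional critics. A minimal sketch (names are ours):

```python
import numpy as np

def cvar(return_samples, alpha=0.1):
    """Conditional Value-at-Risk at level alpha: the mean of the worst
    alpha-fraction of returns. A risk-averse policy maximizes this
    instead of the plain expectation."""
    sorted_returns = np.sort(np.asarray(return_samples, dtype=float))
    k = max(1, int(np.ceil(alpha * len(sorted_returns))))
    return sorted_returns[:k].mean()
```

With α = 1 this reduces to the ordinary mean; smaller α focuses the objective on the left tail of the return distribution.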
Distributional Dynamics Model
A data-driven model of the environment that captures not only its average dynamics but also the distribution of transitions and rewards, which is essential for robust offline learning.
Distributional Advantage Estimation
An advantage estimation technique that considers the complete distribution of returns rather than only their means, allowing for a more nuanced offline evaluation of actions.
Conservative Distributional Learning
An offline learning paradigm that maintains conservative estimates of the return distribution to avoid overestimation caused by distribution shift in the offline data.
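One simple way to make a distributional estimate conservative (a hypothetical sketch, not a specific published algorithm; names and the default beta are ours) is to penalize the mean by the spread of the learned return distribution, so that high-uncertainty, likely out-of-distribution actions score lower.

```python
import numpy as np

def conservative_estimate(return_samples, beta=1.0):
    """Pessimistic value: mean minus beta times the standard deviation of
    the return distribution. beta trades off conservatism vs. fidelity."""
    samples = np.asarray(return_samples, dtype=float)
    return samples.mean() - beta * samples.std()
```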
Distributional Sample Efficiency
A measure of how efficiently offline distributional methods learn from limited samples by exploiting the rich structure of distributional information.