Distributional RL - AI Glossary

Distributional Reinforcement Learning

Reinforcement learning paradigm that models the full probability distribution of returns rather than only its expectation, which captures the variability of outcomes and can improve robustness.

Quantile Regression DQN

Distributional RL algorithm that uses quantile regression to approximate the return distribution by a fixed number of quantiles at evenly spaced quantile fractions, allowing fine-grained distribution estimation.
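
The quantile regression idea can be sketched with the plain pinball loss. This is a minimal illustration rather than the QR-DQN implementation: the function names are hypothetical, and the published algorithm uses a Huber-smoothed variant of this loss.

```python
def quantile_midpoints(n):
    """Quantile fractions tau_i = (2i + 1) / (2n) for i = 0..n-1."""
    return [(2 * i + 1) / (2 * n) for i in range(n)]


def pinball_loss(theta, target, tau):
    """Quantile regression loss of estimate `theta` against one target sample.

    Under-estimates are weighted by tau and over-estimates by (1 - tau),
    so the minimizer over many targets is the tau-quantile.
    """
    u = target - theta
    indicator = 1.0 if u < 0 else 0.0
    return u * (tau - indicator)
```

For a high fraction such as tau = 0.9, estimating too low costs nine times more than estimating too high by the same amount, which is what pushes the estimate toward the upper tail.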

Categorical DQN (C51)

Distributional RL method that represents the return distribution on a fixed, evenly spaced support of 51 atoms and learns the probability of each atom, giving a categorical distribution that represents value uncertainty.
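
A minimal sketch of such a support, assuming illustrative bounds of -10 and 10 (the bounds and function names are assumptions, not part of the definition above):

```python
def make_support(v_min, v_max, n_atoms=51):
    """Evenly spaced atoms z_i = v_min + i * dz covering [v_min, v_max]."""
    dz = (v_max - v_min) / (n_atoms - 1)
    return [v_min + i * dz for i in range(n_atoms)]


def expected_value(probs, support):
    """Scalar Q-value recovered from the categorical distribution."""
    return sum(p * z for p, z in zip(probs, support))


support = make_support(-10.0, 10.0)        # assumed bounds, 51 atoms
uniform = [1.0 / len(support)] * len(support)
q_value = expected_value(uniform, support)  # close to 0 for a symmetric support
```

Action selection can still use the scalar `expected_value`, while the learned probabilities keep the shape of the distribution.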

Value Distribution

Probability distribution of the random future return for a state-action pair, replacing the traditional approach of storing only the expected return as a single value.

Risk-Sensitive RL

Reinforcement learning approach that explicitly considers risk by using the complete return distribution rather than just the mean for decision making.
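
One common risk measure that can be read off a return distribution is conditional value at risk (CVaR). The sketch below assumes each action's distribution is given as equally weighted return samples; the function name and the example numbers are hypothetical.

```python
import math


def cvar(samples, alpha):
    """Mean of the worst ceil(alpha * n) outcomes of equally weighted samples."""
    ordered = sorted(samples)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


returns_safe = [1.0, 1.0, 1.0, 1.0]      # mean 1.0, no downside
returns_risky = [-10.0, 5.0, 5.0, 6.0]   # mean 1.5, rare large loss

mean_safe = sum(returns_safe) / len(returns_safe)
mean_risky = sum(returns_risky) / len(returns_risky)
```

A mean-based agent prefers the risky action here, while an agent ranking actions by `cvar(..., 0.25)` prefers the safe one.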

Wasserstein Distance

Metric used in Distributional RL to measure the distance between probability distributions. It remains finite and informative when the distributions have different or even disjoint supports, where divergences such as KL are undefined or infinite.
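
For one-dimensional distributions given as equally sized, equally weighted samples, the 1-Wasserstein distance reduces to the mean absolute difference between sorted samples. A minimal sketch under that assumption (the function name is hypothetical):

```python
def wasserstein_1(xs, ys):
    """1-Wasserstein distance between two equally sized 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally sized samples")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(x - y) for x, y in pairs) / len(xs)
```

Two point masses at 0 and 3 share no support, yet `wasserstein_1([0.0], [3.0])` still reports how far apart they are.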

Distributional Bellman Equation

Extension of the classical Bellman equation that operates on return distributions rather than scalar values, preserving the complete distribution information.
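
In a commonly used notation, which is an assumption here since the glossary defines no symbols, the equation reads as an equality in distribution between random variables, with Z the random return, R the random reward, gamma the discount factor, P the transition kernel and pi the policy:

```latex
Z^{\pi}(s, a) \overset{D}{=} R(s, a) + \gamma \, Z^{\pi}(S', A'),
\qquad S' \sim P(\cdot \mid s, a), \quad A' \sim \pi(\cdot \mid S')
```

Taking expectations on both sides recovers the classical equation for Q^{\pi}(s, a) = \mathbb{E}[Z^{\pi}(s, a)].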

Projected Bellman Operator

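Composition of the distributional Bellman operator with a projection step: the updated return distribution is mapped back onto the chosen parameterized representation (for example a fixed categorical support), so that every update remains representable by the model.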

Aleatoric Uncertainty

Intrinsic and irreducible randomness in environment returns, captured by Distributional RL models through the spread of the learned return distribution, for example its variance.
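
As a small illustration, assuming the return distribution is categorical with known atoms and probabilities (names and numbers are hypothetical), the mean and variance can be computed directly:

```python
def mean_and_variance(probs, atoms):
    """Mean and variance of a categorical distribution over return atoms."""
    mean = sum(p * z for p, z in zip(probs, atoms))
    variance = sum(p * (z - mean) ** 2 for p, z in zip(probs, atoms))
    return mean, variance


deterministic = mean_and_variance([1.0], [1.0])        # always returns 1
coin_flip = mean_and_variance([0.5, 0.5], [0.0, 2.0])  # returns 0 or 2
```

Both distributions have the same mean, so a scalar value function cannot tell them apart; only the variance reveals the randomness of the second one.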

Epistemic Uncertainty

Uncertainty due to limited knowledge or data, which shrinks as the agent gathers more experience. It concerns how well the return distribution has been estimated rather than the randomness of the returns themselves.

Expectile Regression

Regression method based on an asymmetric squared loss. Expectiles generalize the mean in the same way that quantiles generalize the median, and they are used in some Distributional RL algorithms to approximate return distributions efficiently.
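
A minimal sketch of computing a sample expectile by iteratively reweighted averaging. The function name and the fixed-point scheme are illustrative choices, and tau is assumed to lie strictly between 0 and 1.

```python
def expectile(samples, tau, iterations=100):
    """tau-expectile of a sample: minimizer of the asymmetric squared loss.

    Points above the estimate get weight tau and the rest get (1 - tau);
    the estimate is the weighted mean under those weights.
    """
    estimate = sum(samples) / len(samples)  # the 0.5-expectile is the mean
    for _ in range(iterations):
        weights = [tau if x > estimate else 1.0 - tau for x in samples]
        updated = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
        if abs(updated - estimate) < 1e-12:
            break
        estimate = updated
    return estimate
```

With tau = 0.5 the weights are equal and the result is the ordinary mean; larger values of tau move the estimate toward the upper tail.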

Distributional Value Function

Function that associates each state-action pair with a complete probability distribution over future returns, replacing the traditional scalar value function.

Categorical Projection

Mathematical operation used in C51 to project the updated return distribution onto the predefined discrete support of categorical atoms.
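
The projection can be sketched for a single transition as follows. This is a simplified illustration, not a reference implementation: the function name is hypothetical, the support is assumed to be evenly spaced and sorted, and terminal transitions are not handled.

```python
import math


def categorical_projection(probs, support, reward, gamma):
    """Project the shifted distribution reward + gamma * Z onto `support`."""
    v_min, v_max = support[0], support[-1]
    n = len(support)
    dz = (v_max - v_min) / (n - 1)
    projected = [0.0] * n
    for p, z in zip(probs, support):
        # Shift and scale the atom, then clip it to the support range.
        tz = min(v_max, max(v_min, reward + gamma * z))
        # Fractional index of the shifted atom, guarded against rounding.
        b = min(n - 1.0, max(0.0, (tz - v_min) / dz))
        lower, upper = math.floor(b), math.ceil(b)
        if lower == upper:
            projected[lower] += p
        else:
            # Split the mass between neighbours in proportion to proximity.
            projected[lower] += p * (upper - b)
            projected[upper] += p * (b - lower)
    return projected
```

Splitting each atom's probability between its two neighbours preserves the total mass and, when no clipping occurs, the mean of the shifted distribution.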

Quantile Projection

Mechanism in QR-DQN that represents the distribution by a fixed number of quantiles and moves their locations directly through the quantile regression loss, so no explicit projection onto a fixed support is needed.
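
How a quantile location moves can be illustrated with a plain subgradient step on the quantile regression loss. This is a toy sketch with hypothetical names and a fixed learning rate, not the network update used in QR-DQN.

```python
def quantile_step(theta, sample, tau, lr):
    """One subgradient step moving `theta` toward the tau-quantile."""
    indicator = 1.0 if sample < theta else 0.0
    return theta + lr * (tau - indicator)


def fit_quantile(samples, tau, lr=0.01, epochs=1000, theta=0.0):
    """Repeatedly sweep over the samples, nudging theta after each one."""
    for _ in range(epochs):
        for x in samples:
            theta = quantile_step(theta, x, tau, lr)
    return theta


median_estimate = fit_quantile([float(i) for i in range(1, 101)], tau=0.5)
```

The estimate rises while fewer than a fraction tau of the samples lie below it and falls otherwise, so it settles near the requested quantile of the data.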

Distributional Policy Evaluation

Process of evaluating a policy by learning the complete distribution of returns rather than just their expected value, providing a richer analysis of performance.

Implicit Quantile Networks (IQN)

Network architecture that takes a sampled quantile fraction as an input and outputs the corresponding quantile of the return distribution, so any quantile can be generated on the fly, giving a continuous and flexible representation of distributions.
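
The input side of this idea can be sketched as sampling quantile fractions uniformly and encoding each one with cosine features. The function names and feature dimension are assumptions; in the published architecture, features of this kind pass through a learned layer and are combined with the state embedding, which is omitted here.

```python
import math
import random


def sample_fractions(n, rng):
    """Draw n quantile fractions uniformly from [0, 1)."""
    return [rng.random() for _ in range(n)]


def cosine_features(tau, dim):
    """Cosine encoding of one fraction: cos(pi * i * tau) for i = 0..dim-1."""
    return [math.cos(math.pi * i * tau) for i in range(dim)]


rng = random.Random(0)
fractions = sample_fractions(8, rng)
embeddings = [cosine_features(tau, dim=4) for tau in fractions]
```

Because the fractions are resampled on every forward pass, the network is trained across the whole range of quantiles instead of a fixed grid.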

Distributional Bellman Backup

Distributional RL update operation that scales the next-state return distribution by the discount factor and shifts it by the reward to form the new value distribution target, following the distributional Bellman equation.
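
For a single observed transition, and assuming the next-state distribution is represented by a list of atoms or quantile locations, the backup can be sketched as below. The function name and the `done` flag are illustrative.

```python
def bellman_backup(reward, gamma, next_atoms, done=False):
    """Target atoms reward + gamma * z for each atom z of the next state.

    On a terminal transition there is no future return, so every target
    atom collapses to the immediate reward.
    """
    if done:
        return [reward for _ in next_atoms]
    return [reward + gamma * z for z in next_atoms]
```

For example, `bellman_backup(1.0, 0.5, [0.0, 2.0, 4.0])` halves the spread of the next-state atoms and shifts them up by the reward.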

Return Distribution Modeling

Fundamental technique of Distributional RL that involves explicitly modeling the probability distribution of cumulative returns rather than just their mathematical expectation.
