AI Glossary
The Complete Dictionary of Artificial Intelligence
Distributional Reinforcement Learning
Reinforcement learning paradigm that models the complete distribution of returns rather than just their mathematical expectation to capture uncertainty and improve robustness.
Quantile Regression DQN
Distributional RL algorithm using quantile regression to approximate the return distribution as a set of quantiles, allowing fine-grained distribution estimation.
Categorical DQN (C51)
Distributional RL method that represents the return distribution as a categorical distribution over 51 fixed, evenly spaced atoms between a minimum and a maximum return, learning the probability assigned to each atom.
Value Distribution
Complete probability distribution of the random future return for a state-action pair, replacing the traditional approach of keeping only its expected value.
Risk-Sensitive RL
Reinforcement learning approach that explicitly considers risk by using the complete return distribution rather than just the mean for decision making.
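As an illustration of how a return distribution can drive risk-aware decisions, the sketch below compares a risk-neutral choice (highest mean) with a risk-averse one based on conditional value at risk (CVaR). CVaR is only one possible risk measure, and the function names, the action names, and the return values are hypothetical; the quantile lists are assumed to be equally weighted, as in a quantile-based representation.

```python
import math

def mean_value(quantiles):
    """Risk-neutral value: the mean of equally weighted return quantiles."""
    return sum(quantiles) / len(quantiles)

def cvar(quantiles, alpha):
    """Average of the worst alpha-fraction of equally weighted return quantiles."""
    k = max(1, math.ceil(alpha * len(quantiles)))
    worst = sorted(quantiles)[:k]
    return sum(worst) / k

# Hypothetical return quantiles for two actions (illustrative numbers only).
actions = {
    "safe": [1.0, 1.0, 1.0, 1.0],
    "risky": [-10.0, 2.0, 6.0, 8.0],
}

risk_neutral = max(actions, key=lambda a: mean_value(actions[a]))
risk_averse = max(actions, key=lambda a: cvar(actions[a], alpha=0.25))

print(risk_neutral)  # risky (mean 1.5 beats mean 1.0)
print(risk_averse)   # safe (worst-case 1.0 beats worst-case -10.0)
```

The two criteria disagree here because the risky action has the higher mean but a much worse lower tail, which only the distribution reveals.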
Wasserstein Distance
Metric used in Distributional RL to measure the distance between probability distributions. Unlike the KL divergence, it remains finite and informative when the two distributions have different or even disjoint supports.
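For one-dimensional distributions represented by the same number of equally weighted samples, the 1-Wasserstein distance reduces to the mean absolute difference between sorted samples. The following is a minimal sketch under that assumption; the function name `wasserstein_1` is illustrative.

```python
def wasserstein_1(samples_a, samples_b):
    """1-Wasserstein distance between two empirical 1-D distributions.

    Assumes both inputs contain the same number of equally weighted samples.
    """
    if len(samples_a) != len(samples_b):
        raise ValueError("this sketch assumes equal sample counts")
    a = sorted(samples_a)
    b = sorted(samples_b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# Two point masses with disjoint supports: the distance is simply the gap.
print(wasserstein_1([0.0, 0.0], [3.0, 3.0]))  # 3.0
```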
Distributional Bellman Equation
Extension of the classical Bellman equation that operates on return distributions rather than scalar values, preserving the complete distribution information.
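In one common notation (the symbols here are assumptions, not taken from this glossary), with $Z^{\pi}(s, a)$ the random return under policy $\pi$, $R(s, a)$ the random reward, $\gamma$ the discount factor, and $P$ the transition kernel, the equation can be written as an equality in distribution:

```latex
Z^{\pi}(s, a) \overset{D}{=} R(s, a) + \gamma \, Z^{\pi}(S', A'),
\qquad S' \sim P(\cdot \mid s, a), \quad A' \sim \pi(\cdot \mid S')

% Taking expectations on both sides recovers the classical Bellman equation:
Q^{\pi}(s, a) = \mathbb{E}\left[ Z^{\pi}(s, a) \right]
             = \mathbb{E}\left[ R(s, a) \right] + \gamma \, \mathbb{E}\left[ Q^{\pi}(S', A') \right]
```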
Projected Bellman Operator
Operator that applies the distributional Bellman update and then projects the result back onto the chosen parameterized family (for example, a fixed set of atoms or quantiles), so that the updated distribution remains representable by the model.
Aleatoric Uncertainty
Intrinsic and irreducible uncertainty in environment returns, arising from stochastic rewards and transitions. Distributional RL models capture it through the spread and shape of the learned return distribution, for example its variance.
Epistemic Uncertainty
Uncertainty due to lack of knowledge or data, which can be reduced with more experience. A learned return distribution does not capture it by itself; it is usually estimated separately, for example with ensembles or Bayesian methods.
Expectile Regression
Regression method based on an asymmetric squared loss. Expectiles generalize the mean in the same way that quantiles generalize the median, and some Distributional RL algorithms use them to approximate return distributions.
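To make the definition concrete, the sketch below computes an expectile of a sample set as the minimizer of an asymmetrically weighted squared error, using a simple fixed-point iteration. The function name, the iteration count, and the sample values are illustrative assumptions.

```python
def expectile(samples, tau, iterations=100):
    """tau-expectile of samples, for tau strictly between 0 and 1.

    Minimizes sum_i w_i * (x_i - e)^2 with w_i = tau if x_i > e else 1 - tau,
    by repeatedly recomputing the weighted mean under the current weights.
    """
    e = sum(samples) / len(samples)  # the mean is the 0.5-expectile
    for _ in range(iterations):
        weights = [tau if x > e else 1.0 - tau for x in samples]
        e = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
    return e

data = [0.0, 1.0]
print(expectile(data, 0.5))   # 0.5, the mean
print(expectile(data, 0.75))  # 0.75
```

For comparison, every value strictly above the 0.5 level on a standard quantile definition of these two samples is 1.0, which shows that expectiles and quantiles are different statistics.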
Distributional Value Function
Function that associates each state-action pair with a complete probability distribution over future returns, replacing the traditional scalar value function.
Categorical Projection
Mathematical operation used in C51 to project the updated return distribution onto the predefined discrete support of categorical atoms.
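The projection can be sketched as follows for a single non-terminal transition: each atom is scaled by the discount, shifted by the reward, clipped to the support, and its probability is split between the two nearest atoms. This is a minimal sketch, not a full C51 implementation; the function and parameter names are assumptions, and the tiny 3-atom support is chosen only to keep the arithmetic easy to follow.

```python
import math

def categorical_projection(probs, reward, gamma, v_min, v_max):
    """Project the shifted distribution r + gamma * Z onto a fixed support.

    probs[j] is the probability of atom z_j = v_min + j * delta.
    Terminal transitions are not handled in this sketch.
    """
    n = len(probs)
    delta = (v_max - v_min) / (n - 1)
    projected = [0.0] * n
    for j, p in enumerate(probs):
        z = v_min + j * delta
        # Apply the Bellman shift and clip to the support range.
        tz = min(v_max, max(v_min, reward + gamma * z))
        # Fractional index of the shifted atom, guarded against rounding.
        b = min(float(n - 1), max(0.0, (tz - v_min) / delta))
        lower = int(math.floor(b))
        upper = int(math.ceil(b))
        if lower == upper:
            projected[lower] += p
        else:
            # Split the mass in proportion to the distance to each neighbor.
            projected[lower] += p * (upper - b)
            projected[upper] += p * (b - lower)
    return projected

# Support {0, 1, 2}; all mass at 1. With reward 1 and gamma 0.5 the mass
# moves to 1.5 and is split evenly between atoms 1 and 2.
print(categorical_projection([0.0, 1.0, 0.0], 1.0, 0.5, 0.0, 2.0))  # [0.0, 0.5, 0.5]
```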
Quantile Projection
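Mechanism in QR-DQN that approximates the distribution by a fixed number of equally weighted quantile values at fixed probability levels. The projection is not computed as a separate step; it is carried out implicitly by minimizing a quantile regression loss, which moves the quantile positions toward those of the target distribution.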
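A minimal sketch of the quantile regression (pinball) loss that drives this update is shown below. It assumes N equally weighted quantile estimates placed at the midpoint levels (2i + 1) / (2N) and omits the Huber smoothing often used in practice; all function names and sample values are illustrative.

```python
def quantile_midpoints(n):
    """Probability levels (2i + 1) / (2n) for i = 0 .. n-1."""
    return [(2 * i + 1) / (2 * n) for i in range(n)]

def pinball_loss(theta, target, tau):
    """Asymmetric absolute error: penalizes over- and under-estimates differently."""
    u = target - theta
    return u * (tau - (1.0 if u < 0 else 0.0))

def quantile_regression_loss(thetas, targets):
    """Sum over quantile estimates of the mean pinball loss against target samples."""
    taus = quantile_midpoints(len(thetas))
    total = 0.0
    for theta, tau in zip(thetas, taus):
        total += sum(pinball_loss(theta, t, tau) for t in targets) / len(targets)
    return total

targets = [1.0, 2.0, 3.0, 4.0]
spread_out = [1.0, 2.0, 3.0, 4.0]  # estimates that follow the target quantiles
collapsed = [2.5, 2.5, 2.5, 2.5]   # every estimate sits at the mean

print(quantile_regression_loss(spread_out, targets))  # 1.25
print(quantile_regression_loss(collapsed, targets))   # 2.0
```

The lower loss for the spread-out estimates illustrates why minimizing this objective pushes each estimate toward its own quantile of the target rather than toward the mean.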
Distributional Policy Evaluation
Process of evaluating a policy by learning the complete distribution of returns rather than just their expected value, providing a richer analysis of performance.
Implicit Quantile Networks (IQN)
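Network architecture that takes a quantile fraction sampled from the interval [0, 1] as an additional input and outputs the corresponding quantile of the return distribution, so it can generate any quantile on the fly and offers a continuous, flexible representation of distributions.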
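One ingredient commonly used in such networks is a cosine feature expansion of the sampled quantile fraction, which is then passed through a learned layer and combined with the state features. The sketch below shows only the sampling and the cosine features; the function names, the feature count, and the index range starting at 0 are assumptions, and implementations differ in these details.

```python
import math
import random

def cosine_features(tau, n=64):
    """Cosine features cos(pi * i * tau) for i = 0 .. n-1 of a quantile fraction."""
    return [math.cos(math.pi * i * tau) for i in range(n)]

def sample_quantile_fractions(k, rng=random):
    """Draw k quantile fractions uniformly from [0, 1)."""
    return [rng.random() for _ in range(k)]

# Each sampled fraction yields one feature vector; a network would map each
# vector, together with the state, to one quantile value of the return.
taus = sample_quantile_fractions(8)
features = [cosine_features(tau, n=4) for tau in taus]
```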
Distributional Bellman Backup
Distributional RL update operation that applies one step of the distributional Bellman equation: the next-state return distribution is scaled by the discount factor, shifted by the reward, and averaged over possible transitions to form the new target value distribution.
Return Distribution Modeling
Fundamental technique of Distributional RL that involves explicitly modeling the probability distribution of cumulative returns rather than just their mathematical expectation.