AI Glossary
The Complete Dictionary of Artificial Intelligence
Value Distribution
Full probability distribution over the random future return in reinforcement learning, as opposed to only its expectation; it represents uncertainty about future outcomes rather than a single point estimate.
Distributional Reinforcement Learning
RL paradigm that explicitly models the full distribution of returns, rather than only the expected return, to capture the uncertainty and variability of future outcomes.
Distributional Q-Function
Extension of the Q-value function that maps each state-action pair to a probability distribution over returns instead of a single scalar value; the classical Q-value is the mean of that distribution.
Atomization Parametrization
Technique for discretizing continuous distributions into finite sets of points (atoms) with associated probabilities, so that a distribution can be represented and learned with finitely many parameters.
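As a minimal sketch of this idea, the following Python snippet builds an equally spaced set of atoms and pairs it with a probability vector. The helper names `make_support` and `expected_value`, the range, and the probabilities are illustrative assumptions, not part of any particular library.

```python
def make_support(v_min, v_max, n_atoms):
    """Return n_atoms equally spaced atom locations on [v_min, v_max]."""
    step = (v_max - v_min) / (n_atoms - 1)
    return [v_min + i * step for i in range(n_atoms)]


def expected_value(atoms, probs):
    """Mean of the discrete distribution: sum of atom * probability."""
    return sum(z * p for z, p in zip(atoms, probs))


# Hypothetical example: five atoms, symmetric probabilities.
atoms = make_support(-10.0, 10.0, 5)
probs = [0.1, 0.2, 0.4, 0.2, 0.1]
mean = expected_value(atoms, probs)
```

The distribution is fully described by the pair (`atoms`, `probs`); a learner would typically adjust only `probs` while the atom locations stay fixed.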
Categorical Distributional RL (C51)
Pioneering algorithm that models the return distribution as a discrete categorical distribution over a fixed support of equally spaced values; the name refers to the 51 atoms used in its original configuration.
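The sketch below illustrates how a categorical representation of this kind can be turned into action values. It assumes a 51-atom support on [-10, 10] and treats the per-action logits as hypothetical network outputs; the function names are chosen for illustration only.

```python
import math

N_ATOMS = 51
V_MIN, V_MAX = -10.0, 10.0  # assumed support range
DELTA_Z = (V_MAX - V_MIN) / (N_ATOMS - 1)
ATOMS = [V_MIN + i * DELTA_Z for i in range(N_ATOMS)]


def softmax(logits):
    """Numerically stable softmax turning logits into atom probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def q_value(logits):
    """Expected return: mean of the categorical distribution over ATOMS."""
    probs = softmax(logits)
    return sum(z * p for z, p in zip(ATOMS, probs))


def greedy_action(logits_per_action):
    """Pick the action whose return distribution has the highest mean."""
    q = [q_value(logits) for logits in logits_per_action]
    return max(range(len(q)), key=q.__getitem__)


# Hypothetical outputs for two actions.
uniform = [0.0] * N_ATOMS                  # symmetric mass, mean near 0
skewed_high = [0.0] * (N_ATOMS - 1) + [5.0]  # extra mass on the top atom
best = greedy_action([uniform, skewed_high])
```

Action selection here still uses only the mean; the full distribution matters during learning, where the predicted probabilities are trained against a projected target distribution.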
Distributional Bellman Operator
Generalization of the classical Bellman operator that applies to full distributions rather than just expected values.
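For comparison, the classical and distributional operators for a fixed policy can be written side by side. The notation is an assumption following the usual convention: $Z$ is the random return, $R$ the random reward, $\gamma$ the discount factor, $P$ the transition kernel, and $\pi$ the policy.

```latex
% Classical Bellman operator: acts on expected values
(\mathcal{T}^{\pi} Q)(s, a) = \mathbb{E}\left[ R(s, a) \right]
    + \gamma \, \mathbb{E}\left[ Q(S', A') \right]

% Distributional Bellman operator: equality holds in distribution
(\mathcal{T}^{\pi} Z)(s, a) \overset{D}{=} R(s, a) + \gamma \, Z(S', A'),
\qquad S' \sim P(\cdot \mid s, a), \quad A' \sim \pi(\cdot \mid S')
```

Taking expectations on both sides of the second equation recovers the first, with $Q(s, a) = \mathbb{E}[Z(s, a)]$.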
Wasserstein Distance
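Metric on probability distributions that measures the minimal cost of transporting probability mass from one distribution to the other; applied to value distributions in the return space, it is sensitive to both the location and the shape of the distributions.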
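For two discrete distributions on the same equally spaced support, the 1-Wasserstein distance reduces to the area between their cumulative distribution functions. The sketch below assumes exactly that setting; `wasserstein_1` and its parameters are illustrative names.

```python
def wasserstein_1(probs_p, probs_q, delta_z):
    """1-Wasserstein distance between two categorical distributions that
    share an equally spaced support with spacing delta_z."""
    cdf_p = 0.0
    cdf_q = 0.0
    total = 0.0
    for p, q in zip(probs_p, probs_q):
        cdf_p += p
        cdf_q += q
        # Area between the two step-function CDFs over one grid cell.
        total += abs(cdf_p - cdf_q) * delta_z
    return total


# All mass on the lowest atom versus all mass on the highest atom,
# on a three-atom grid with spacing 1: the mass must travel 2 units.
d_far = wasserstein_1([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], 1.0)
d_same = wasserstein_1([0.2, 0.3, 0.5], [0.2, 0.3, 0.5], 1.0)
```

Unlike a divergence that compares probabilities atom by atom, this distance grows with how far the mass has to move, which is why it reflects location as well as shape.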
Distributional Projection
Process of projecting a distribution, such as a continuous distribution or a Bellman target whose atoms no longer align with the grid, onto a predefined discrete support; essential for practical implementation of distributional algorithms.
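One common way to carry out such a projection, used in categorical methods, is to clip each target location to the support range and split its probability between the two nearest atoms in proportion to proximity. The following is a sketch under that assumption for a single distribution; `project_onto_support` and the example numbers are hypothetical.

```python
def project_onto_support(atoms, target_atoms, target_probs):
    """Project a discrete distribution (target_atoms, target_probs) onto
    the fixed, equally spaced support `atoms` by linear interpolation."""
    n = len(atoms)
    v_min, v_max = atoms[0], atoms[-1]
    delta = (v_max - v_min) / (n - 1)
    out = [0.0] * n
    for z, p in zip(target_atoms, target_probs):
        z = min(max(z, v_min), v_max)   # clip to the support range
        b = (z - v_min) / delta         # fractional index on the grid
        lo = min(int(b), n - 1)         # lower neighbour (b >= 0 here)
        hi = min(lo + 1, n - 1)         # upper neighbour
        w = b - lo                      # share assigned to the upper atom
        out[lo] += p * (1.0 - w)
        out[hi] += p * w
    return out


# Hypothetical Bellman target: shift and shrink the support by r + gamma * z.
atoms = [0.0, 1.0, 2.0]
next_probs = [0.0, 0.0, 1.0]
reward, gamma = 0.5, 0.5
target_atoms = [reward + gamma * z for z in atoms]  # [0.5, 1.0, 1.5]
projected = project_onto_support(atoms, target_atoms, next_probs)
```

In this example the target mass sits at 1.5, halfway between the atoms 1.0 and 2.0, so each receives half of it. Total probability is preserved by construction.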
Distributional Risk
Measure of the uncertainty and variability in return predictions, quantified through the higher statistical moments of the value distribution.
Higher-Order Moments
Statistics (variance, skewness, kurtosis) describing the shape of the return distribution beyond the mean, capturing its spread, asymmetry, and tail weight.
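For a discrete return distribution these statistics follow directly from the atoms and their probabilities. The sketch below assumes a non-degenerate distribution (variance greater than zero) and reports non-excess kurtosis; the function name `moments` is illustrative.

```python
def moments(atoms, probs):
    """Return (mean, variance, skewness, kurtosis) of a discrete
    distribution. Assumes variance > 0; kurtosis is not excess kurtosis."""
    mean = sum(p * z for z, p in zip(atoms, probs))
    var = sum(p * (z - mean) ** 2 for z, p in zip(atoms, probs))
    std = var ** 0.5
    skew = sum(p * ((z - mean) / std) ** 3 for z, p in zip(atoms, probs))
    kurt = sum(p * ((z - mean) / std) ** 4 for z, p in zip(atoms, probs))
    return mean, var, skew, kurt


# Symmetric three-atom example.
mean, var, skew, kurt = moments([-1.0, 0.0, 1.0], [0.25, 0.5, 0.25])
```

Two distributions with the same mean can differ in all three of the remaining values, which is the information a scalar Q-value discards.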
Distributional Temporal Variation
Temporal evolution of the full shape of the return distribution rather than just its expected value, revealing changing risk patterns.
Discrete Value Support
Finite and ordered set of values on which continuous distributions are approximated in practical distributional algorithms.
Distributional Propagation
Process of updating value distributions via the Bellman operator, preserving uncertainty information at each time step.
Distributional Stability
Property that value distributions converge to a stable form during learning, which keeps the resulting uncertainty estimates consistent.