AI Glossary

The complete glossary of AI

162 categories, 2,032 subcategories, 23,060 terms

State-action distribution

Probabilistic counterpart of the Q(s,a) value function: it models the full distribution of possible returns for a state-action pair rather than only their expectation.

Distributional transition model

Transition model used in model-based reinforcement learning that captures uncertainty in state transitions by predicting a probability distribution over next states.

Probabilistic dynamics model

Predictive model in model-based RL that generates probability distributions over next states or rewards rather than deterministic predictions.

Epistemic uncertainty in RL

Uncertainty due to limited knowledge of the environment. It can be reduced by collecting more data, and distributional model-based RL approaches typically represent it as a distribution over models or model parameters.

Aleatoric uncertainty in RL

Inherent randomness of the environment, such as stochastic transitions or rewards, that cannot be reduced even with more data. Distributional RL models capture it in the learned return distribution.

Distributional policy gradient

Extension of policy gradient methods that directly optimizes over the distribution of returns rather than their expectation, enabling risk-sensitive policies.

Risk-sensitive RL

Reinforcement learning approach that uses the return distribution to optimize a risk measure, such as conditional value at risk (CVaR) or a standard-deviation penalty, instead of only the expected return.
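
To make the contrast with the expectation concrete, here is a minimal sketch of a sample-based CVaR estimate. The function name `cvar`, the sample lists, and the lower-tail convention (averaging the worst `alpha` fraction of returns) are assumptions chosen for illustration, not something the glossary specifies.

```python
import math


def cvar(returns, alpha):
    """Average of the worst alpha-fraction of sampled returns (lower tail)."""
    if not returns:
        raise ValueError("returns must not be empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(returns)
    # Keep at least one sample so the tail is never empty.
    k = max(1, math.ceil(alpha * len(ordered)))
    tail = ordered[:k]
    return sum(tail) / len(tail)


# Two hypothetical actions with the same expected return of 1.0.
safe = [1.0, 1.0, 1.0, 1.0]
risky = [-10.0, 4.0, 5.0, 5.0]

print(cvar(safe, 0.25), cvar(risky, 0.25))
```

Both sample sets have the same mean, so an expectation-only agent is indifferent between them, while an agent ranking actions by CVaR at `alpha = 0.25` prefers the safe one.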

Model ensembles in distributional RL

Technique using multiple independently learned models to capture epistemic uncertainty in distributional model-based RL approaches.
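
One common way to read epistemic uncertainty off an ensemble is the disagreement between its members. The sketch below assumes each learned model is a plain callable `model(state, action) -> next_state` on scalar states; the two lambda "models" are hypothetical stand-ins for independently trained networks.

```python
from statistics import mean, pvariance


def ensemble_predict(models, state, action):
    """Return the mean prediction and the across-member variance."""
    predictions = [model(state, action) for model in models]
    # High variance across members signals that the models disagree,
    # which is used here as a proxy for epistemic uncertainty.
    return mean(predictions), pvariance(predictions)


# Hypothetical ensemble members that learned slightly different dynamics.
models = [
    lambda s, a: s + a,
    lambda s, a: s + a + 2.0,
]

prediction, disagreement = ensemble_predict(models, 1.0, 1.0)
print(prediction, disagreement)
```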

Particle-based distribution models

Distributional modeling approach that represents distributions by a set of weighted particles, useful for complex transitions in model-based RL.

Wasserstein distance in distributional RL

Metric used to measure the dissimilarity between return distributions. The distributional Bellman operator for policy evaluation is a contraction in the (supremum) Wasserstein metric, a convergence guarantee that the KL divergence does not provide.
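
For one-dimensional return samples, the 1-Wasserstein distance has a simple form: sort both samples and average the absolute differences. The sketch below assumes two equally sized, equally weighted sample sets; the function name `wasserstein1` is illustrative.

```python
def wasserstein1(xs, ys):
    """1-Wasserstein distance between two equal-size empirical samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension the optimal coupling pairs the sorted samples.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


print(wasserstein1([0.0, 1.0], [1.0, 2.0]))
print(wasserstein1([0.0], [5.0]))
```

The second call compares two point masses with disjoint support. The Wasserstein distance is finite there and shrinks as the points move closer, whereas the KL divergence between them is infinite regardless of how close they are.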

Moment matching in distributional RL

Optimization technique that adjusts parameters to match statistical moments (mean, variance, etc.) of predicted and target distributions.
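
As a rough illustration, a moment-matching objective can be written as the squared difference between the first two moments of predicted and target samples. The function names and the choice of exactly two moments are assumptions made for this sketch; practical methods may match more moments or use kernel-based variants.

```python
def moments(samples):
    """Return the mean and (population) variance of a list of samples."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    return mean, variance


def moment_loss(predicted, target):
    """Squared mismatch of mean and variance between two sample sets."""
    pred_mean, pred_var = moments(predicted)
    target_mean, target_var = moments(target)
    return (pred_mean - target_mean) ** 2 + (pred_var - target_var) ** 2


print(moment_loss([0.0, 2.0], [1.0, 3.0]))
```

A loss of zero means the two sample sets agree on mean and variance; it does not imply the distributions are identical, since higher moments may still differ.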

Variational inference in RL

Method for approximating complex distributions by optimizing a family of simpler distributions, applied in model-based RL to handle uncertainty.

Bayesian model-based RL

Approach that maintains a distribution over possible environment models, using Bayesian methods to quantify and exploit epistemic uncertainty.

Distributional Bellman operator

Extension of the classic Bellman operator that operates on return distributions rather than scalar values, preserving distributional structure.
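
A minimal sketch of this idea on a sample-based representation: given one observed reward and samples of the return from the next state, the operator shifts and scales every sample instead of averaging them first. The function name `bellman_target` and the single-transition setting are assumptions for illustration.

```python
def bellman_target(reward, gamma, next_return_samples):
    """Apply r + gamma * Z' to each sample of the next-state return Z'."""
    return [reward + gamma * z for z in next_return_samples]


next_returns = [0.0, 2.0, 4.0]
target = bellman_target(1.0, 0.5, next_returns)
print(target)

# The classic scalar backup is recovered by taking the mean of the target.
scalar_backup = 1.0 + 0.5 * (sum(next_returns) / len(next_returns))
print(sum(target) / len(target), scalar_backup)
```

The target keeps the spread of the next-state returns (scaled by the discount factor), which the scalar backup discards.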

Horizon-dependent distributions

Concept in distributional RL where the return distribution changes with the time horizon, capturing the evolution of uncertainty over different time scales.

Categorical atomic projection

Operation used in C51 that projects the Bellman-updated target distribution back onto the fixed, predefined support of atoms, so that the predicted and target distributions share the same support.
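
The projection can be sketched for a single transition as follows. This is an illustrative pure-Python version, assuming evenly spaced atoms in ascending order and a non-terminal transition; `project_categorical` and its parameters are names chosen here, not taken from any library.

```python
import math


def project_categorical(probs, atoms, reward, gamma):
    """Project the shifted distribution r + gamma * z onto the fixed atoms."""
    v_min, v_max = atoms[0], atoms[-1]
    delta = (v_max - v_min) / (len(atoms) - 1)
    projected = [0.0] * len(atoms)
    for p, z in zip(probs, atoms):
        # Shift and scale the atom, then clip it to the supported range.
        tz = min(max(reward + gamma * z, v_min), v_max)
        # Fractional index of the shifted atom on the fixed grid.
        b = min((tz - v_min) / delta, len(atoms) - 1)
        lower, upper = math.floor(b), math.ceil(b)
        if lower == upper:
            projected[lower] += p
        else:
            # Split the mass between the two neighbouring atoms
            # in proportion to proximity.
            projected[lower] += p * (upper - b)
            projected[upper] += p * (b - lower)
    return projected


atoms = [0.0, 1.0, 2.0]
probs = [0.5, 0.0, 0.5]
print(project_categorical(probs, atoms, reward=0.5, gamma=0.5))
```

The shifted atoms 0.5 and 1.5 fall between grid points, so each one splits its probability mass evenly between its two neighbours, and the result still sums to one.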

Distributional uncertainty propagation

Process in model-based RL where the uncertainty of model predictions is propagated through planning steps to evaluate policy robustness.
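
One simple way to picture this propagation is to roll a particle through each candidate model and track how far the predictions drift apart over the planning horizon. The sketch below assumes deterministic scalar models of the form `model(state) -> next_state` under a fixed policy; the two lambda models are hypothetical and the max-minus-min spread is just one possible summary.

```python
def propagate(models, start_state, horizon):
    """Roll one particle per model and record the spread at each step."""
    particles = [start_state for _ in models]
    spreads = []
    for _ in range(horizon):
        # Each particle stays with its own model for the whole rollout.
        particles = [model(s) for model, s in zip(models, particles)]
        spreads.append(max(particles) - min(particles))
    return particles, spreads


# Hypothetical models that disagree about the dynamics.
models = [lambda s: 0.5 * s, lambda s: 1.5 * s]

final_states, spreads = propagate(models, 1.0, 2)
print(final_states, spreads)
```

The spread grows with each step, which shows how small model disagreement compounds over a rollout. A planner could use such a signal to prefer policies whose outcomes stay consistent across models.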
