
AI Glossary

Complete glossary of artificial intelligence

162 categories, 2,032 subcategories, 23,060 terms

Multi-Step Distributional TD

Temporal-difference method that updates return distributions from n-step targets rather than one-step targets, so reward information propagates faster along a trajectory, which often improves learning speed and stability in practice.

Quantile Regression in RL

Distributional approach that directly estimates quantiles of the return distribution by minimizing a quantile (pinball) loss, giving a flexible representation that needs no fixed, predefined support.
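A minimal sketch of the idea, assuming plain Python and a fixed batch of sampled returns rather than a neural network: each quantile estimate is nudged by the subgradient of the pinball loss. The function names, learning rate, and epoch count are illustrative assumptions, not part of any specific algorithm's reference implementation.

```python
def pinball_loss(prediction, target, tau):
    """Quantile (pinball) loss for one prediction/target pair at level tau in (0, 1)."""
    error = target - prediction
    # Under-estimates are weighted by tau, over-estimates by (1 - tau).
    return tau * error if error >= 0 else (tau - 1.0) * error


def fit_quantiles(samples, taus, lr=0.05, epochs=200):
    """Estimate quantiles of `samples` by subgradient descent on the pinball loss.

    Hypothetical helper: lr and epochs are illustrative defaults.
    """
    estimates = [0.0 for _ in taus]
    for _ in range(epochs):
        for target in samples:
            for i, tau in enumerate(taus):
                # Move up by lr * tau when the target lies above the estimate,
                # otherwise move down by lr * (1 - tau).
                if target > estimates[i]:
                    estimates[i] += lr * tau
                else:
                    estimates[i] -= lr * (1.0 - tau)
    return estimates
```

For returns spread evenly over 0 to 100, `fit_quantiles(samples, [0.5, 0.9])` should settle near the median and the 90th percentile, without any predefined grid of return values.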

Wasserstein Metric

Distance between probability distributions, used in distributional RL to compare return distributions; it accounts for how far apart the return values are, not only for how much the probability masses overlap.
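As an illustration, the 1-Wasserstein distance between two one-dimensional empirical distributions with the same number of samples reduces to the mean absolute difference between sorted samples. The sketch below assumes equal sample counts; the function name is illustrative.

```python
def wasserstein_1(samples_a, samples_b):
    """1-Wasserstein distance between two equal-size empirical distributions on the real line."""
    if not samples_a or len(samples_a) != len(samples_b):
        raise ValueError("this sketch assumes two non-empty, equal-size sample sets")
    # In one dimension, the optimal transport plan pairs sorted samples in order.
    pairs = zip(sorted(samples_a), sorted(samples_b))
    return sum(abs(a - b) for a, b in pairs) / len(samples_a)
```

Shifting every return by one unit gives a distance of exactly one, which is the sense in which the metric respects the geometry of the return values.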

N-Step Return Distribution

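Probability distribution of the N-step return: the discounted sum of the next N rewards plus, when bootstrapping, the discounted return from the state reached after N steps. Used to accelerate information propagation in multi-step distributional algorithms.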
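A small sketch of how an N-step target could be formed when the return distribution at step N is represented by samples (or quantile values). It assumes a single observed reward sequence and a discount factor `gamma`; the function name is illustrative.

```python
def n_step_target_samples(rewards, next_samples, gamma):
    """Shift and scale samples of the return at step N by the first N observed rewards."""
    n = len(rewards)
    # Discounted sum of the N observed rewards: r_0 + gamma * r_1 + ...
    discounted_rewards = sum((gamma ** k) * r for k, r in enumerate(rewards))
    # Each sample of the tail return is discounted by gamma ** N and shifted.
    return [discounted_rewards + (gamma ** n) * z for z in next_samples]
```

With `rewards=[1, 1]`, `gamma=0.5`, and tail samples `[0, 4]`, the target samples are `[1.5, 2.5]`: the tail distribution is compressed by `gamma ** 2` and shifted by the discounted rewards.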

Distributional Policy Evaluation

Process of estimating the complete return distribution for a given policy, rather than just its expected value, allowing for finer performance analysis.

Risk-Sensitive RL

Reinforcement learning that optimizes a risk measure of the return (e.g., CVaR or a variance-based criterion) rather than the expectation alone; distributional methods are a natural fit because they model the full return distribution.
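One simple way to make action selection risk-sensitive is to penalize the spread of each action's return samples. The sketch below assumes a mean-minus-standard-deviation criterion with a hypothetical `risk_aversion` weight; it is one possible choice among many, not a prescribed method.

```python
import statistics


def risk_adjusted_value(return_samples, risk_aversion):
    """Mean return minus a penalty proportional to the standard deviation."""
    mean = statistics.mean(return_samples)
    std = statistics.pstdev(return_samples)
    return mean - risk_aversion * std


def select_action(samples_per_action, risk_aversion):
    """Pick the action whose return samples score best under the risk-adjusted value."""
    return max(
        samples_per_action,
        key=lambda action: risk_adjusted_value(samples_per_action[action], risk_aversion),
    )
```

With `risk_aversion=0` the choice is the usual expected-value maximization; increasing it favors actions with narrower return distributions.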

Distributional Policy Gradient

Policy optimization algorithm that uses the complete information of the return distribution to update parameters, enabling explicit risk-reward trade-offs.

Distributional Actor-Critic

Architecture where the critic evaluates the return distribution rather than a single scalar value, providing a richer learning signal to the actor.

Distributional Dynamic Programming

Extension of dynamic programming that applies Bellman-style updates to whole return distributions instead of expected values, retaining information about uncertainty that expected-value methods discard.

Atomic Support in C51

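Fixed, evenly spaced set of values (atoms) between a minimum and a maximum return, used as the support of the categorical return distribution in the C51 algorithm (named for its 51 atoms); the algorithm learns a probability for each atom, allowing efficient approximation of continuous distributions.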
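The sketch below shows how such a support could be built and how probability mass at arbitrary return values could be projected back onto it by splitting each mass between the two nearest atoms. It is a simplified, pure-Python illustration; the function names are assumptions, and real implementations operate on batched tensors.

```python
import math


def make_support(v_min, v_max, num_atoms):
    """Evenly spaced atoms from v_min to v_max inclusive."""
    delta = (v_max - v_min) / (num_atoms - 1)
    return [v_min + i * delta for i in range(num_atoms)]


def project_onto_support(values, probs, support):
    """Distribute probability mass located at `values` onto the fixed atoms."""
    v_min, v_max = support[0], support[-1]
    last = len(support) - 1
    delta = (v_max - v_min) / last
    projected = [0.0] * len(support)
    for value, prob in zip(values, probs):
        # Values outside the support are clipped to its end points.
        clipped = min(max(value, v_min), v_max)
        position = (clipped - v_min) / delta  # fractional atom index
        lower = min(int(math.floor(position)), last)
        upper = min(lower + 1, last)
        weight_upper = position - lower
        # Split the mass linearly between the neighbouring atoms.
        projected[lower] += prob * (1.0 - weight_upper)
        projected[upper] += prob * weight_upper
    return projected
```

For example, with atoms `[0, 1, 2, 3, 4]`, a unit mass at 1.5 is split evenly between the atoms at 1 and 2, and the total probability is preserved.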

Distributional Bootstrap

Estimation technique in which the return distribution of a state is updated from the return distributions of its successor states, preserving the stochastic structure across iterations.

Stability in Distributional RL

Convergence and robustness properties of distributional algorithms; in practice they depend on the chosen projection and metric, and are often improved through the use of multi-step methods and appropriate projections.

Distributional Risk Measures

Functionals of the return distribution (e.g., Value-at-Risk, or Expected Shortfall, also known as CVaR) used to characterize and optimize behavior in the face of uncertainty.
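A minimal sketch of two such functionals computed from return samples, assuming the lower tail is the "bad" side and `alpha` is the tail fraction. Conventions for quantile interpolation and sign vary across the literature, so the exact definitions here are an assumption.

```python
import math


def value_at_risk(return_samples, alpha):
    """Empirical alpha-quantile of the returns (lower tail), alpha in (0, 1]."""
    ordered = sorted(return_samples)
    index = max(math.ceil(alpha * len(ordered)) - 1, 0)
    return ordered[index]


def conditional_value_at_risk(return_samples, alpha):
    """Mean of the worst ceil(alpha * n) returns (Expected Shortfall)."""
    ordered = sorted(return_samples)
    count = max(math.ceil(alpha * len(ordered)), 1)
    return sum(ordered[:count]) / count
```

For the ten returns `[-10, 0, 1, 2, 3, 4, 5, 6, 7, 8]` and `alpha=0.2`, the Value-at-Risk is `0` and the CVaR is `-5.0`, showing how CVaR reacts to the single very bad outcome while VaR does not.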

Multi-Step Uncertainty Propagation

Mechanism by which uncertainty about future returns is effectively propagated across multiple time horizons in the distributional framework.

Distributional Sampling

Technique of drawing samples from predicted return distributions to estimate gradients and update policies in distributional algorithms.
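As a small illustration, samples can be drawn from a categorical return distribution (atoms with probabilities) using the standard library. The function name and the `seed` parameter are assumptions made for reproducibility of the sketch; gradient estimation itself is out of scope here.

```python
import random


def sample_returns(support, probs, num_samples, seed=None):
    """Draw return samples from a categorical distribution over `support`."""
    rng = random.Random(seed)
    # random.Random.choices samples with replacement according to the weights.
    return rng.choices(support, weights=probs, k=num_samples)
```

Averaging many such samples gives a Monte Carlo estimate of the expected return, and the same samples can feed risk measures or sample-based losses.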
