AI Glossary
The Complete Dictionary of Artificial Intelligence
Conservative Q-Learning (CQL)
Offline RL method that penalizes the Q-values of out-of-distribution actions, learning a conservative lower bound on the true value function and keeping the learned policy close to the data distribution.
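For discrete actions, the CQL penalty for a single state contrasts a soft maximum over all actions with the Q-value of the action actually taken in the dataset. A minimal NumPy sketch (the helper name `cql_penalty` is illustrative, not from a specific library):

```python
import numpy as np

def cql_penalty(q_values, data_action):
    """CQL regularizer for one state with discrete actions:
    log-sum-exp over all actions minus the Q-value of the dataset action.
    Minimizing this pushes down the values of actions not in the data."""
    lse = np.log(np.sum(np.exp(q_values)))   # soft maximum over all actions
    return lse - q_values[data_action]

q = np.array([1.0, 3.0, 2.0])
penalty = cql_penalty(q, data_action=1)  # dataset action happens to be the argmax
```

The penalty is always non-negative (log-sum-exp upper-bounds the maximum) and shrinks as the dataset action dominates the others, so in-distribution actions are penalized least.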
Batch Constrained Q-learning (BCQ)
Approach that constrains actions to remain close to those observed in the dataset to avoid distribution shift.
Decision Transformer
Transformer architecture that casts offline reinforcement learning as conditional sequence modeling, autoregressively predicting actions conditioned on past states, actions, and desired returns-to-go.
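The return-to-go tokens that the Decision Transformer conditions on are computed by accumulating rewards backwards through a trajectory. A minimal sketch (the function name is illustrative):

```python
def returns_to_go(rewards, gamma=1.0):
    """Compute the (discounted) sum of future rewards at every timestep,
    i.e. the return-to-go token associated with each state."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg

returns_to_go([1.0, 2.0, 3.0])  # → [6.0, 5.0, 3.0]
```

At inference time, one feeds in a high target return and lets the model generate the actions it associates with achieving it.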
Implicit Q-Learning (IQL)
Method that learns the Q-function via expectile regression on dataset actions only, avoiding the explicit max over actions that would query out-of-distribution values.
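The asymmetric expectile loss at the core of IQL weights positive and negative errors differently, so that a high expectile approximates a maximum without evaluating unseen actions. A minimal sketch:

```python
def expectile_loss(u, tau=0.7):
    """Asymmetric squared loss |tau - 1(u < 0)| * u**2.
    With tau > 0.5, positive residuals are penalized more, biasing the
    fitted value toward an upper expectile (a soft in-sample maximum)."""
    weight = tau if u > 0 else (1.0 - tau)
    return weight * u * u

expectile_loss(2.0)   # positive residual: weighted by tau = 0.7
expectile_loss(-2.0)  # negative residual: weighted by 1 - tau = 0.3
```

With tau = 0.5 this reduces to an ordinary (scaled) squared error; as tau approaches 1 the fitted value approaches the maximum over the data.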
Model-Based Offline RL
Approach that learns a model of the environment's dynamics from the dataset and uses it to generate synthetic rollouts, typically penalizing model uncertainty to limit out-of-distribution errors.
Offline-to-Online Transfer Learning
Techniques for effectively transferring policies pretrained on static datasets to subsequent online fine-tuning.
Distributional Offline RL
Methods modeling the full distribution of returns rather than just their mathematical expectation.
Safe Offline Reinforcement Learning
Approaches ensuring safety when deploying policies learned solely on static data.
Uncertainty-Aware Offline RL
Methods quantifying epistemic uncertainty to avoid out-of-distribution actions.
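One common way to quantify epistemic uncertainty is disagreement across an ensemble of Q-estimates, subtracted from the mean as a pessimism bonus. A minimal NumPy sketch (function name and the scale `beta` are illustrative):

```python
import numpy as np

def pessimistic_q(q_ensemble, beta=1.0):
    """Lower-confidence value estimate: ensemble mean minus beta times the
    ensemble standard deviation. High disagreement (likely an
    out-of-distribution state-action pair) yields a low estimate."""
    q = np.asarray(q_ensemble, dtype=float)
    return q.mean() - beta * q.std()

pessimistic_q([1.0, 1.0, 1.0])  # full agreement → no penalty
pessimistic_q([0.0, 2.0])       # disagreement → value pushed down
```

Acting greedily with respect to this pessimistic estimate steers the policy away from actions the ensemble is uncertain about.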
Trajectory Transformer
Transformer model that generates complete trajectories by learning the distribution of state-action-reward sequences.
Advantage-Weighted Regression (AWR)
Approach that fits the policy by regression toward dataset actions, weighting each sample by its exponentiated advantage so that better-than-average actions are imitated more strongly.
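The per-sample weights in AWR are exponentiated advantages with a temperature, usually clipped for numerical stability. A minimal NumPy sketch (the clipping threshold is an assumed hyperparameter):

```python
import numpy as np

def awr_weights(advantages, beta=1.0, max_weight=20.0):
    """exp(A / beta) weights for advantage-weighted regression.
    beta controls how sharply good actions are favored; clipping
    at max_weight prevents a few samples from dominating the loss."""
    w = np.exp(np.asarray(advantages, dtype=float) / beta)
    return np.minimum(w, max_weight)

w = awr_weights([0.0, 1.0, 30.0])
# zero advantage → weight 1; very large advantage → clipped at max_weight
```

The policy is then trained with a weighted imitation loss, so it never proposes actions absent from the dataset, only reweights the ones observed.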
Offline Multi-Task Reinforcement Learning
Paradigm for simultaneous learning of multiple tasks from shared batch datasets.