Batch Constrained Q-learning (BCQ)
Offline Reinforcement Learning
A learning paradigm in which the agent learns exclusively from a fixed dataset of previously collected experience, with no further interaction with the environment. This approach is essential when online exploration is costly or dangerous.
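To make the idea of learning from a fixed dataset concrete, here is a minimal tabular sketch in Python. It assumes a small, discrete problem and a dataset of `(state, action, reward, next_state, done)` tuples. The function name, the toy dataset, and the hyperparameters are hypothetical and chosen only for illustration. The bootstrap maximum is restricted to actions that actually appear in the dataset for the next state, which is the batch constraint in its simplest tabular form. This is not the deep, generative-model variant of BCQ.

```python
from collections import defaultdict


def batch_constrained_q_learning(dataset, gamma=0.99, alpha=0.5, sweeps=200):
    """Tabular Q-learning over a fixed dataset; the environment is never queried.

    dataset: list of (state, action, reward, next_state, done) tuples.
    The max in the target only ranges over actions seen in the dataset
    for the next state (the batch constraint).
    """
    # Record which actions the dataset contains for each state.
    seen = defaultdict(set)
    for s, a, _, _, _ in dataset:
        seen[s].add(a)

    q = defaultdict(float)  # Q-values start at 0.0
    for _ in range(sweeps):
        for s, a, r, s_next, done in dataset:
            if done or not seen[s_next]:
                # Assumption: a next state with no data is treated like a terminal state.
                target = r
            else:
                target = r + gamma * max(q[(s_next, b)] for b in seen[s_next])
            q[(s, a)] += alpha * (target - q[(s, a)])
    return dict(q)


# Hypothetical toy dataset: a three-state chain logged by some earlier policy.
dataset = [
    (0, "right", 0.0, 1, False),
    (1, "right", 1.0, 2, True),
    (1, "left", 0.0, 0, False),
]

q = batch_constrained_q_learning(dataset, gamma=0.9)
for key in sorted(q):
    print(key, round(q[key], 3))
```

Because the update only touches state-action pairs present in the dataset, the learned values never depend on estimates for actions the data cannot support. That is the main failure mode that ordinary Q-learning runs into in the offline setting.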