Offline Multi-Task Reinforcement Learning
Multi-Task Offline Value Function Factorization
Decomposes the value function into a component shared across tasks and task-specific components, to improve offline multi-task learning.
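As a rough illustration of the idea, here is a minimal tabular sketch in which the action value for task t is the sum of a shared table and a per-task table, fitted from a fixed batch of logged transitions. Everything in it is an assumption made for illustration and is not taken from the source: the additive form, the names `FactorizedQ`, `update`, and `fit`, the Q-learning style target, and the small `task_decay` penalty that nudges common structure into the shared table.

```python
import numpy as np


class FactorizedQ:
    """Hypothetical tabular sketch: Q_t(s, a) = shared[s, a] + task[t, s, a]."""

    def __init__(self, n_tasks, n_states, n_actions):
        self.shared = np.zeros((n_states, n_actions))          # common to all tasks
        self.task = np.zeros((n_tasks, n_states, n_actions))   # per-task residual

    def q(self, t, s):
        # Action values for task t in state s (vector over actions).
        return self.shared[s] + self.task[t, s]

    def update(self, t, s, a, r, s_next, done,
               lr=0.1, gamma=0.99, task_decay=0.01):
        # One-step TD target computed from the factorized value.
        target = r + (0.0 if done else gamma * self.q(t, s_next).max())
        td_error = target - self.q(t, s)[a]
        # Both components receive the same TD signal; the task-specific
        # part is additionally shrunk toward zero (illustrative choice).
        self.shared[s, a] += lr * td_error
        self.task[t, s, a] += lr * (td_error - task_decay * self.task[t, s, a])
        return td_error


def fit(model, dataset, epochs=10):
    # Offline training: sweep a fixed list of logged transitions
    # (t, s, a, r, s_next, done); no new environment interaction.
    for _ in range(epochs):
        for t, s, a, r, s_next, done in dataset:
            model.update(t, s, a, r, s_next, done)
    return model
```

Because the shared table is updated by transitions from every task, a state-action pair seen in one task's data also shifts the estimate for other tasks, while the per-task table absorbs the differences. The sketch omits the conservatism that offline methods typically add to limit overestimation on actions absent from the dataset.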