Offline Multi-Task Reinforcement Learning
Task-Specific Policy Heads
A network architecture with a shared trunk and a separate output head for each task, used in offline multi-task learning.
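The layout described above can be sketched in a few lines. This is a minimal, dependency-free illustration rather than a reference implementation: the class name `MultiHeadPolicy`, the task names `"reach"` and `"push"`, the single tanh trunk layer, and all dimensions are assumptions made for the example, and no training loop or offline dataset handling is shown.

```python
import math
import random


class MultiHeadPolicy:
    """Shared trunk plus one linear output head per task (illustrative only)."""

    def __init__(self, obs_dim, hidden_dim, action_dims, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            # Small random weights; rows = outputs, cols = inputs.
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        # Trunk parameters are shared by every task.
        self.trunk_w = init(hidden_dim, obs_dim)
        self.trunk_b = [0.0] * hidden_dim
        # Each task gets its own head; action_dims maps task name -> action size.
        self.heads = {
            task: (init(dim, hidden_dim), [0.0] * dim)
            for task, dim in action_dims.items()
        }

    @staticmethod
    def _linear(w, b, x):
        # Dense layer: one output per weight row.
        return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

    def features(self, obs):
        # Shared representation: one tanh layer over the observation.
        return [math.tanh(h) for h in self._linear(self.trunk_w, self.trunk_b, obs)]

    def forward(self, obs, task):
        # Route the shared features through the head selected by the task name.
        w, b = self.heads[task]
        return self._linear(w, b, self.features(obs))


# Hypothetical tasks with different action sizes, evaluated on the same observation.
policy = MultiHeadPolicy(obs_dim=4, hidden_dim=8, action_dims={"reach": 2, "push": 3})
obs = [0.1, -0.2, 0.3, 0.0]
reach_action = policy.forward(obs, "reach")
push_action = policy.forward(obs, "push")
```

Both calls reuse the same trunk features and differ only in the head they pass through, so each task can have its own output size while the representation is learned jointly across tasks.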