Meta-Reinforcement Learning
Proximal Meta-Policy Search (ProMP)
A meta-RL algorithm that extends PPO to meta-learning. It optimizes a meta-policy that can be adapted into a task-specific policy for each new task from a small amount of experience.
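As a rough illustration of how a PPO-style objective can sit inside a meta-learning loop, the sketch below shows three pieces: a clipped surrogate, a single inner-loop adaptation step, and a KL-penalized average over tasks. This is a simplified sketch under stated assumptions, not the reference implementation: the function names (`clipped_surrogate`, `adapt`, `meta_objective`) and the default values of `clip_eps`, `inner_lr`, and `kl_coeff` are hypothetical, and gradients are passed in as plain lists rather than computed by an autodiff library.

```python
import math


def clipped_surrogate(logp_new, logp_old, advantages, clip_eps=0.2):
    """PPO-style clipped surrogate, averaged over samples.

    logp_new / logp_old are per-sample action log-probabilities under the
    adapted policy and the sampling policy (assumed inputs).
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # likelihood ratio
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def adapt(theta, grad, inner_lr=0.1):
    """One inner-loop gradient-ascent step: theta' = theta + inner_lr * grad."""
    return [t + inner_lr * g for t, g in zip(theta, grad)]


def meta_objective(per_task_surrogates, per_task_kl, kl_coeff=0.01):
    """Average over tasks of (clipped surrogate - kl_coeff * KL penalty)."""
    terms = [s - kl_coeff * k for s, k in zip(per_task_surrogates, per_task_kl)]
    return sum(terms) / len(terms)
```

In a full training loop, the meta-policy parameters would be adapted once per sampled task with `adapt`, fresh trajectories would be collected with each adapted policy, and the outer update would ascend `meta_objective` with respect to the original meta-policy parameters.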