Centralized-Decentralized MARL
Multi-Agent Proximal Policy Optimization (MAPPO)
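MAPPO extends PPO to multi-agent environments using centralized training with decentralized execution: a centralized critic, which has access to global state during training, evaluates the agents' decentralized policies, each of which acts only on its own local observation. MAPPO keeps the training stability of PPO's clipped updates, and conditioning the critic on global information mitigates the non-stationarity that each agent faces as the other agents' policies change.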
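The split between a centralized critic and decentralized policies can be illustrated with a minimal sketch of the policy objective. It is not a reference implementation: it assumes a cooperative setting with one shared team reward, uses generalized advantage estimation (GAE) as one common way to turn centralized values into advantages, and all function names (`clipped_surrogate`, `centralized_advantages`, `mappo_policy_objective`) are hypothetical. The log-probabilities and value estimates would normally come from neural networks; here they are plain lists of floats.

```python
import math


def clipped_surrogate(logp_new, logp_old, advantage, clip_eps=0.2):
    """PPO clipped objective for one (agent, timestep) sample."""
    ratio = math.exp(logp_new - logp_old)
    clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped * advantage)


def centralized_advantages(rewards, values, gamma=0.99, lam=0.95):
    """GAE over one episode from centralized value estimates V(s_t).

    Assumption: `values` holds len(rewards) + 1 entries, the last one
    being the bootstrap value of the final state, and each entry was
    produced by a critic that saw the global state.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        gae = delta + gamma * lam * gae
        advantages[t] = gae
    return advantages


def mappo_policy_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate over all agents and timesteps.

    logp_new[i][t] and logp_old[i][t] are the log-probabilities of the
    action agent i took at step t, computed from that agent's local
    observation only. `advantages[t]` is shared by all agents because
    it comes from the single centralized critic (shared-reward assumption).
    """
    total, count = 0.0, 0
    for agent_new, agent_old in zip(logp_new, logp_old):
        for lp_new, lp_old, adv in zip(agent_new, agent_old, advantages):
            total += clipped_surrogate(lp_new, lp_old, adv, clip_eps)
            count += 1
    return total / count
```

In this sketch the critic's output is needed only to compute advantages during training; at execution time each agent would need just its own policy and local observation, which is what makes the execution decentralized.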