MARL Continu
Multi-Agent Deep Deterministic Policy Gradient (MADDPG)
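A centralized-training, decentralized-execution (CTDE) algorithm that extends DDPG to multi-agent environments with continuous action spaces. During training, each agent's critic is centralized and conditions on all agents' observations and actions; at execution, each actor is decentralized and acts only on its own local observation.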
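The split between centralized critics and decentralized actors can be illustrated with a minimal sketch. It assumes single linear layers as stand-ins for the neural networks and uses made-up dimensions; every name here (`act`, `q_value`, `N_AGENTS`, and so on) is hypothetical. The training loop itself (replay buffer, target networks, critic and actor updates) is left out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; not taken from any particular environment.
N_AGENTS, OBS_DIM, ACT_DIM = 3, 4, 2


def make_linear(in_dim, out_dim):
    """Random weights for one linear layer (assumed stand-in for a network)."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim))


# Decentralized actors: each one sees only its own observation.
actor_weights = [make_linear(OBS_DIM, ACT_DIM) for _ in range(N_AGENTS)]

# Centralized critics: one per agent, each fed the joint observations and actions.
joint_dim = N_AGENTS * (OBS_DIM + ACT_DIM)
critic_weights = [make_linear(joint_dim, 1) for _ in range(N_AGENTS)]


def act(agent, own_obs):
    """Deterministic policy; tanh bounds the continuous action to [-1, 1]."""
    return np.tanh(own_obs @ actor_weights[agent])


def q_value(agent, all_obs, all_actions):
    """Q-value for one agent, computed from every agent's observation and action."""
    joint = np.concatenate([all_obs.ravel(), all_actions.ravel()])
    return (joint @ critic_weights[agent]).item()


observations = rng.normal(size=(N_AGENTS, OBS_DIM))
actions = np.stack([act(i, observations[i]) for i in range(N_AGENTS)])
q_values = [q_value(i, observations, actions) for i in range(N_AGENTS)]
```

In this sketch, changing another agent's observation leaves agent 0's action untouched but changes agent 0's Q-value, which is the asymmetry that CTDE relies on: the critic is only needed during training, so execution stays fully local.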