Deep Deterministic Policy Gradient (DDPG)
Target Networks
Duplicated neural networks with slowly updated weights to stabilize learning by providing more consistent targets.
← ZurückDuplicated neural networks with slowly updated weights to stabilize learning by providing more consistent targets.
← Zurück