Model Parallelism
Expert Parallelism
A technique specific to sparse mixture-of-experts (MoE) models in which the different expert networks are distributed across separate accelerators, so each device holds only a subset of the experts and tokens are routed to the device that owns their assigned expert, balancing both memory use and computational load.
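A minimal single-process sketch of the routing idea, using NumPy and a top-1 gate; the experts, dimensions, and gating here are illustrative assumptions, and in a real system each expert's matrix multiply would run on the accelerator that owns that expert (with an all-to-all exchange moving tokens between devices):

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 4   # conceptually, one expert per accelerator
D_MODEL = 8

# Each "expert" is a toy linear layer; under expert parallelism each
# weight matrix would live in a different device's memory.
expert_weights = [rng.standard_normal((D_MODEL, D_MODEL))
                  for _ in range(NUM_EXPERTS)]
gate_weights = rng.standard_normal((D_MODEL, NUM_EXPERTS))

def moe_forward(tokens: np.ndarray) -> np.ndarray:
    """Route each token to its top-1 expert and apply that expert."""
    logits = tokens @ gate_weights      # gating score per (token, expert)
    choice = logits.argmax(axis=-1)     # top-1 expert index per token
    out = np.empty_like(tokens)
    for e in range(NUM_EXPERTS):
        mask = choice == e
        if mask.any():
            # On a real cluster, this compute runs on the device
            # holding expert e; here it is just a local matmul.
            out[mask] = tokens[mask] @ expert_weights[e]
    return out

batch = rng.standard_normal((16, D_MODEL))
output = moe_forward(batch)
print(output.shape)  # (16, 8)
```

Because only the chosen expert processes each token, the per-token compute stays constant as more experts (and devices) are added.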