Text generation
Gradient checkpointing
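Memory optimization technique that reduces the training memory footprint (typically accelerator memory) by storing only a subset of activations during the forward pass and recomputing the rest during backpropagation, trading extra compute for memory and enabling the training of larger models.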
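As a rough illustration of the idea, here is a minimal sketch in pure Python. It assumes a toy model made of a chain of scalar layers y = tanh(w * x) with the final output used as the loss. All names (`layer`, `layer_grad`, `backward_full`, `backward_checkpointed`, `segment`) are hypothetical and not taken from any library. The baseline keeps every layer input for the backward pass, while the checkpointed version keeps only one input per segment and recomputes the others when that segment is differentiated.

```python
import math


def layer(w, x):
    """Toy layer: y = tanh(w * x)."""
    return math.tanh(w * x)


def layer_grad(w, x, g_out):
    """Given g_out = dL/dy, return (dL/dw, dL/dx) for y = tanh(w * x)."""
    y = math.tanh(w * x)
    d = (1.0 - y * y) * g_out
    return d * x, d * w


def backward_full(weights, x0):
    """Baseline: store the input of every layer during the forward pass."""
    acts = [x0]
    for w in weights:
        acts.append(layer(w, acts[-1]))
    g = 1.0  # dL/dL, since the loss is the final output
    grads = [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        grads[i], g = layer_grad(weights[i], acts[i], g)
    # Number of layer inputs held in memory for the backward pass.
    return grads, len(weights)


def backward_checkpointed(weights, x0, segment):
    """Store one layer input per segment; recompute the rest on demand."""
    n = len(weights)
    checkpoints = {}
    x = x0
    for i, w in enumerate(weights):
        if i % segment == 0:
            checkpoints[i] = x  # keep only the first input of each segment
        x = layer(w, x)

    peak = len(checkpoints)
    g = 1.0
    grads = [0.0] * n
    # Walk the segments from last to first.
    for start in sorted(checkpoints, reverse=True):
        end = min(start + segment, n)
        # Recompute the layer inputs inside this segment from its checkpoint.
        local = [checkpoints[start]]
        for i in range(start, end - 1):
            local.append(layer(weights[i], local[-1]))
        # Checkpoints plus the recomputed inputs (the checkpoint itself is
        # already counted, hence the -1).
        peak = max(peak, len(checkpoints) + len(local) - 1)
        for i in reversed(range(start, end)):
            grads[i], g = layer_grad(weights[i], local[i - start], g)
    return grads, peak
```

Calling both functions with the same weights and input should give matching gradients, with the checkpointed version holding fewer layer inputs at its peak at the cost of roughly one extra forward pass. In practice, deep learning frameworks provide this as a built-in utility rather than requiring a hand-written loop like the one above.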