GPT (Generative Pre-trained Transformer)
Decoder-Only
Transformer architecture consisting exclusively of decoder blocks with causal masking, optimized for autoregressive language modeling and generation tasks.
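The causal masking mentioned above can be illustrated with a minimal sketch (assumed illustration, not GPT's actual implementation): a lower-triangular mask lets each position attend only to itself and earlier positions, which is what makes autoregressive, left-to-right generation possible.

```python
import numpy as np

def causal_mask(seq_len: int) -> np.ndarray:
    # Lower-triangular boolean mask: position i may attend
    # only to positions j <= i (no peeking at future tokens).
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def apply_causal_mask(scores: np.ndarray) -> np.ndarray:
    # Set disallowed (future) positions to -inf before the softmax,
    # so they receive zero attention weight.
    mask = causal_mask(scores.shape[-1])
    return np.where(mask, scores, -np.inf)

# Example: raw attention scores for a 4-token sequence.
scores = np.zeros((4, 4))
masked = apply_causal_mask(scores)
```

After masking, every entry above the diagonal is `-inf`, so the softmax assigns it zero probability; this is the property that distinguishes decoder-only models from bidirectional encoders.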