Multi-Modal Transformers
CoCa
Contrastive Captioners (CoCa) is a model that combines a contrastive objective for representation learning with a generative objective for captioning in a single unified Transformer architecture.
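As a rough illustration of how the two objectives might be combined into one training loss, here is a minimal pure-Python sketch. The function names, the temperature of 0.07, and the weights `w_con` and `w_cap` are assumptions made for this example and are not taken from the description above. The model itself (encoders and decoder) is left out, so the inputs are plain lists standing in for its outputs.

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax over a list of floats."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric image-to-text plus text-to-image cross-entropy.

    Row i of image_embs is assumed to be paired with row i of text_embs;
    every other row in the batch acts as a negative.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = scaled cosine similarity between image i and text j
    logits = [[dot(i, t) / temperature for t in txts] for i in imgs]
    i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    t2i = -sum(log_softmax(columns[k])[k] for k in range(n)) / n
    return i2t + t2i

def captioning_loss(token_logits, target_ids):
    """Mean negative log-likelihood of the target caption tokens."""
    nll = [-log_softmax(logits)[t] for logits, t in zip(token_logits, target_ids)]
    return sum(nll) / len(nll)

def coca_style_loss(image_embs, text_embs, token_logits, target_ids,
                    w_con=1.0, w_cap=1.0):
    """Weighted sum of the contrastive and captioning terms (weights are assumed)."""
    return (w_con * contrastive_loss(image_embs, text_embs)
            + w_cap * captioning_loss(token_logits, target_ids))
```

In this sketch, correctly paired embeddings give a lower contrastive term than mismatched ones, while the captioning term depends only on the decoder's per-token logits. The weighted sum lets both signals drive a single backward pass.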