Self-Supervised Pre-training
Multi-Modal Pre-training
Pre-training a model on multiple data modalities (text, images, audio) simultaneously to learn unified representations that improve cross-modal transfer.
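A common instance of multi-modal pre-training is a CLIP-style contrastive objective: embeddings of matching text-image pairs are pulled together while mismatched pairs are pushed apart. The sketch below is illustrative, not a specific library's API; the function name, toy embeddings, and temperature value are assumptions:

```python
import numpy as np

def contrastive_loss(text_emb, image_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings (toy sketch)."""
    # Normalize each embedding to unit length so the dot product is cosine similarity
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    v = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    # Pairwise similarity logits; matching pairs lie on the diagonal
    logits = t @ v.T / temperature
    labels = np.arange(len(logits))

    def xent(l):
        # Numerically stable log-softmax, then pick the diagonal (correct-pair) terms
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Average the text-to-image and image-to-text directions
    return (xent(logits) + xent(logits.T)) / 2

# Toy usage: four nearly aligned text/image embedding pairs give a low loss
rng = np.random.default_rng(0)
text = rng.normal(size=(4, 8))
image = text + 0.01 * rng.normal(size=(4, 8))
loss = contrastive_loss(text, image)
```

Because the toy pairs are nearly identical, the diagonal logits dominate and the loss is close to zero; with unrelated embeddings it would be much larger.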