Vision Transformers
Masked Autoencoders (MAE)
A self-supervised pre-training approach in which a Vision Transformer learns by reconstructing masked image patches from the remaining visible patches. The method is simple and computationally efficient: a large fraction of the patches is masked out, and the encoder processes only the visible ones, while still achieving state-of-the-art results among self-supervised pre-training methods.
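The masking and reconstruction objective can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names `random_masking` and `masked_mse` are hypothetical, the 75% mask ratio is an assumed default, and patches are represented as simple lists of floats rather than tensors.

```python
import random


def random_masking(num_patches, mask_ratio=0.75, seed=None):
    """Randomly split patch indices into visible and masked sets.

    mask_ratio is an assumed default for this sketch.
    """
    if not 0.0 < mask_ratio < 1.0:
        raise ValueError("mask_ratio must be strictly between 0 and 1")
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_keep = int(num_patches * (1.0 - mask_ratio))
    visible = sorted(indices[:num_keep])   # fed to the encoder
    masked = sorted(indices[num_keep:])    # to be reconstructed
    return visible, masked


def masked_mse(pred, target, masked):
    """Mean squared error computed over the masked patches only."""
    if not masked:
        raise ValueError("masked must contain at least one patch index")
    total = 0.0
    count = 0
    for i in masked:
        for p, t in zip(pred[i], target[i]):
            total += (p - t) ** 2
            count += 1
    return total / count


# Example: a 224x224 image cut into 16x16 patches gives 14 * 14 = 196 patches.
visible, masked = random_masking(196, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))  # 49 147
```

Only the visible indices would be passed to the encoder, which is where the efficiency comes from; the loss is evaluated only on the masked positions, so the model gets no credit for copying patches it was already shown.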