Vision Transformers for Detection
Transformer Decoder Head
The final module of DETR-style architectures: a set of object queries attends to the encoder features, and each query's output is decoded into a class prediction and a bounding box.
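
As a rough illustration of the description above, here is a minimal PyTorch sketch of such a decoder head. It is not the reference DETR implementation. The class name `DecoderHead`, the default sizes, and the extra "no object" class are assumptions made for the example. It also feeds the query embeddings straight in as the decoder input, which is a simplification.

```python
import torch
from torch import nn


class DecoderHead(nn.Module):
    """Hypothetical DETR-style head: object queries -> (class logits, boxes)."""

    def __init__(self, d_model=256, num_heads=8, num_layers=6,
                 num_queries=100, num_classes=91, dim_feedforward=2048):
        super().__init__()
        # One learned embedding per object query (assumed to be learned).
        self.query_embed = nn.Embedding(num_queries, d_model)
        layer = nn.TransformerDecoderLayer(
            d_model, num_heads, dim_feedforward=dim_feedforward,
            dropout=0.1, batch_first=True,
        )
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        # +1 output for an assumed "no object" class.
        self.class_head = nn.Linear(d_model, num_classes + 1)
        # Small MLP that regresses 4 box coordinates per query.
        self.box_head = nn.Sequential(
            nn.Linear(d_model, d_model), nn.ReLU(),
            nn.Linear(d_model, d_model), nn.ReLU(),
            nn.Linear(d_model, 4),
        )

    def forward(self, memory):
        # memory: encoder features, shape (batch, num_tokens, d_model).
        batch = memory.shape[0]
        queries = self.query_embed.weight.unsqueeze(0).expand(batch, -1, -1)
        # Queries self-attend and cross-attend to the encoder features.
        hidden = self.decoder(tgt=queries, memory=memory)
        logits = self.class_head(hidden)         # (batch, num_queries, num_classes + 1)
        boxes = self.box_head(hidden).sigmoid()  # (batch, num_queries, 4), values in [0, 1]
        return logits, boxes
```

Calling `DecoderHead()(memory)` on a tensor of shape `(batch, num_tokens, 256)` returns one set of class logits and one normalized box per query. How the four box values are interpreted (for example center/size versus corner coordinates) is left open here.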