Transformers for detection - AI Glossary

Terms

DETR (DEtection TRansformer)

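Pioneering architecture that removes the need for hand-designed anchors and non-maximum suppression (NMS) by treating object detection as a direct set prediction problem. A Transformer encoder-decoder models the relationships between objects and the global image context, and a bipartite matching loss ties each prediction to at most one ground-truth object.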
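
To illustrate what set prediction without NMS can look like at inference time, here is a minimal sketch in plain Python. It assumes each query emits class logits whose last entry stands for "no object", plus one box; the function names, the threshold, and the toy values are hypothetical and not taken from any particular implementation.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_predictions(class_logits, boxes, class_names, threshold=0.5):
    """Turn per-query outputs into detections; the last logit is 'no object'."""
    detections = []
    for logits, box in zip(class_logits, boxes):
        probs = softmax(logits)
        object_probs = probs[:-1]  # drop the trailing "no object" slot
        best = max(range(len(object_probs)), key=object_probs.__getitem__)
        if object_probs[best] >= threshold:
            detections.append((class_names[best], round(object_probs[best], 3), box))
    return detections

# Toy outputs for three queries (assumed values, for illustration only).
class_logits = [
    [4.0, 0.0, 0.0],  # query 0: confident "cat"
    [0.0, 0.0, 5.0],  # query 1: confident "no object"
    [0.0, 3.0, 0.0],  # query 2: confident "dog"
]
boxes = [(0.5, 0.5, 0.2, 0.3), (0.1, 0.1, 0.05, 0.05), (0.7, 0.4, 0.3, 0.4)]
detections = decode_predictions(class_logits, boxes, ["cat", "dog"])
```

Each query is kept or dropped on its own score; no step compares overlapping boxes against each other, which is the role NMS plays in anchor-based detectors.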

Bipartite Transformer

Informal name, not a standard term in the literature, for the DETR-style use of the Transformer in which cross-attention is applied between image features and a small, fixed set of learnable object queries, so that all objects are predicted in parallel rather than one at a time.

Object Queries

Learnable positional embedding vectors that act as slots, one per potential object prediction. In the decoder, each query interacts with the image features through cross-attention to gather the information relevant to its prediction.
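
The interaction between queries and image features can be sketched as single-head scaled dot-product cross-attention. This is a simplified illustration: real decoders add learned projections, multiple heads, and positional terms, and the query and feature values below are made up.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(queries, features):
    """Single-head scaled dot-product attention; features act as keys and values."""
    dim = len(features[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in features]
        w = softmax(scores)
        out = [sum(wj * v[d] for wj, v in zip(w, features)) for d in range(dim)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights

# Two object queries attending over three flattened image locations (toy values).
object_queries = [[4.0, 0.0], [0.0, 4.0]]
image_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = cross_attention(object_queries, image_features)
```

Every query is processed independently of the others inside this step, which is why all slots can be decoded in parallel.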

Bipartite Matching Loss

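Loss function that uses the Hungarian algorithm to find an optimal one-to-one matching between model predictions and ground-truth objects, then computes the class and box losses on the matched pairs. This resolves the permutation ambiguity of an unordered set of predictions; predictions left unmatched are trained to output a "no object" class.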
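
A small sketch of the matching step follows. To stay dependency-free it brute-forces all assignments with `itertools.permutations` instead of running the Hungarian algorithm; both return a minimum-cost matching, but brute force is only workable for a handful of boxes (in practice a solver such as SciPy's `linear_sum_assignment` is the usual choice). The cost here is the L1 distance between boxes only, a simplifying assumption; DETR-style costs also include class and overlap terms.

```python
from itertools import permutations

def l1_cost(pred, target):
    return sum(abs(p - t) for p, t in zip(pred, target))

def match(predictions, targets):
    """Return sorted (pred_index, target_index) pairs minimizing total cost.

    Brute force over assignments; requires len(predictions) >= len(targets).
    """
    best_cost, best_pairs = float("inf"), []
    for pred_ids in permutations(range(len(predictions)), len(targets)):
        cost = sum(l1_cost(predictions[p], targets[t]) for t, p in enumerate(pred_ids))
        if cost < best_cost:
            best_cost = cost
            best_pairs = [(p, t) for t, p in enumerate(pred_ids)]
    return sorted(best_pairs), best_cost

# Boxes as (cx, cy, w, h); three predictions, two ground-truth objects (toy values).
predictions = [(0.9, 0.9, 0.1, 0.1), (0.2, 0.2, 0.2, 0.2), (0.5, 0.5, 0.4, 0.4)]
targets = [(0.5, 0.5, 0.5, 0.5), (0.2, 0.25, 0.2, 0.2)]
pairs, cost = match(predictions, targets)
matched = {p for p, _ in pairs}
unmatched = [i for i in range(len(predictions)) if i not in matched]
```

Here prediction 2 is matched to target 0 and prediction 1 to target 1; prediction 0 stays unmatched and would be supervised as "no object".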

Transformer Encoder-Decoder

Structure where the encoder processes image features to create a context-rich representation, and the decoder uses object queries to decode this representation into final box and class predictions.

Multi-Scale Multi-head Attention (MSA)

Attention mechanism that operates on features drawn from multiple levels of the feature hierarchy, allowing the model to capture fine local detail and global context at the same time and so detect objects of widely varying sizes. Note that elsewhere "MSA" usually abbreviates multi-head self-attention.

DETR-ResNet

Variant of DETR that uses a ResNet convolutional neural network as its backbone feature extractor, combining the local feature extraction of CNNs with the global reasoning of Transformers.

Mask2Former

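Unified architecture for panoptic, instance, and semantic segmentation built around masked attention: each query's cross-attention is restricted to the foreground region of the mask it predicted in the previous layer, and a Transformer decoder directly predicts the set of masks. At publication it reported better accuracy than architectures specialized for each individual task, with a single, simpler design.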
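
The idea of restricting attention to a predicted mask can be sketched as a softmax in which locations outside the mask receive a score of negative infinity. The threshold of 0.5, the fallback for an empty mask, and all names and values are assumptions of this sketch.

```python
import math

def masked_attention_weights(scores, mask_probs, threshold=0.5):
    """Softmax over locations, ignoring those whose mask probability is below threshold."""
    keep = [p >= threshold for p in mask_probs]
    if not any(keep):  # empty mask: fall back to attending everywhere
        keep = [True] * len(scores)
    masked = [s if k else float("-inf") for s, k in zip(scores, keep)]
    m = max(masked)
    exps = [math.exp(s - m) for s in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]

# Four image locations; the previous layer's mask covers only the first two.
weights = masked_attention_weights(
    scores=[2.0, 1.0, 3.0, 0.5],
    mask_probs=[0.9, 0.7, 0.2, 0.1],
)
```

Location 2 has the highest raw score, yet it receives zero weight because it lies outside the predicted mask.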

Positional Embeddings

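Vectors added to image features to give the Transformer spatial information. Attention by itself is permutation-invariant, so without them the model could not tell where in the image a feature came from; they are essential for understanding scene geometry and locating objects correctly.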
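
One common choice is a fixed 2D sinusoidal encoding, sketched below in plain Python. Half of each vector encodes the row and half the column; the temperature of 10000 and the interleaved sin/cos layout are conventional defaults assumed here, and `dim` is assumed to be divisible by 4.

```python
import math

def sine_encoding_1d(position, num_feats, temperature=10000.0):
    """Interleaved sin/cos encoding of one coordinate; num_feats must be even."""
    enc = []
    for i in range(0, num_feats, 2):
        freq = temperature ** (i / num_feats)
        enc.append(math.sin(position / freq))
        enc.append(math.cos(position / freq))
    return enc

def position_embedding_2d(height, width, dim):
    """Return a height x width grid of dim-sized vectors: half encodes y, half encodes x."""
    half = dim // 2
    return [[sine_encoding_1d(y, half) + sine_encoding_1d(x, half)
             for x in range(width)]
            for y in range(height)]

pos = position_embedding_2d(height=2, width=3, dim=8)
```

Each grid cell gets a distinct vector, which would be added to the feature at that location before attention is applied.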

Conditional DETR

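Improvement of DETR that accelerates convergence by learning a conditional spatial query from each decoder embedding for use in cross-attention. This lets each attention head concentrate on a distinct region, such as an object extremity or the interior of the box, which eases localization and yields better query specialization.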

Deformable DETR

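Variant of DETR whose deformable attention modules attend only to a small set of sampling points around a reference point instead of every location in the feature map. Combined with multi-scale features, this significantly improves convergence speed and performance, especially for small objects.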
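
The sampling idea can be sketched for a single query on a single-channel feature map. In the actual model the offsets and attention weights are predicted from the query by learned layers; here they are hard-coded, and all names and values are illustrative assumptions.

```python
import math

def bilinear_sample(feature_map, x, y):
    """Bilinearly interpolate a 2D grid at a fractional location, clamped to the border."""
    h, w = len(feature_map), len(feature_map[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = feature_map[y0][x0] * (1 - dx) + feature_map[y0][x1] * dx
    bottom = feature_map[y1][x0] * (1 - dx) + feature_map[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def deformable_attention(feature_map, reference, offsets, weights):
    """Weighted sum of a few sampled values around one reference point."""
    ref_x, ref_y = reference
    return sum(w * bilinear_sample(feature_map, ref_x + ox, ref_y + oy)
               for (ox, oy), w in zip(offsets, weights))

feature_map = [[0.0, 1.0, 2.0],
               [3.0, 4.0, 5.0],
               [6.0, 7.0, 8.0]]
value = deformable_attention(
    feature_map,
    reference=(1.0, 1.0),                            # (x, y) of the reference point
    offsets=[(0.5, 0.0), (0.0, -1.0), (-1.0, 1.0)],  # three sampling offsets
    weights=[0.5, 0.25, 0.25],                       # attention weights, sum to 1
)
```

Only three locations are read regardless of the map's size, whereas full attention would touch all nine cells.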

Sparse R-CNN

Fully sparse detection approach that uses a fixed set of learnable proposal boxes, each paired with a learnable proposal feature, and refines them through a cascade of dynamic interaction heads. It needs no dense anchors, and because it is trained with one-to-one matching it needs no NMS.

Query-to-Attention

Mechanism in which object queries steer the model's attention toward the image regions relevant to each query, rather than spreading it over the whole image, improving prediction efficiency and query specialization.

DINO (DETR with Improved deNoising Anchor Boxes)

DETR-family detector that combines improved (contrastive) denoising training on noised anchor boxes with a Transformer encoder-decoder architecture. It reported state-of-the-art results on detection benchmarks at the time of publication, without requiring NMS.
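
The basic denoising idea can be sketched as follows: ground-truth boxes are jittered and fed to the decoder as extra queries, which are trained to recover the original boxes. The noise model below (center shift bounded by half the box size, multiplicative size noise) and the scale of 0.4 are assumptions of this sketch; the contrastive part, which adds more heavily noised negatives trained toward "no object", is not shown.

```python
import random

def noise_box(box, scale, rng):
    """Jitter a (cx, cy, w, h) box: shift the center and rescale the size."""
    cx, cy, w, h = box
    new_cx = cx + rng.uniform(-1.0, 1.0) * scale * w / 2
    new_cy = cy + rng.uniform(-1.0, 1.0) * scale * h / 2
    new_w = w * (1.0 + rng.uniform(-1.0, 1.0) * scale)
    new_h = h * (1.0 + rng.uniform(-1.0, 1.0) * scale)
    return (new_cx, new_cy, new_w, new_h)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
ground_truth = [(0.5, 0.5, 0.2, 0.4), (0.3, 0.7, 0.1, 0.1)]
denoising_queries = [noise_box(box, scale=0.4, rng=rng) for box in ground_truth]
# Training target: each denoising query should be decoded back to its ground-truth box.
```

Because each noised query has a known target, this auxiliary task needs no bipartite matching, which is the usual explanation for why it stabilizes early training.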

Focal Loss for Transformers

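Classification loss, originally introduced to handle class imbalance in dense detectors, that several DETR variants such as Deformable DETR adopt in place of plain cross-entropy. It focuses training on hard samples by reducing the contribution of well-classified easy samples.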
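
For reference, a minimal sketch of the binary focal loss for a single prediction, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The defaults alpha = 0.25 and gamma = 2 are the commonly used values, assumed here rather than taken from this glossary.

```python
import math

def focal_loss(prob, target, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction; prob is the predicted positive-class probability."""
    p_t = prob if target == 1 else 1.0 - prob
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy = focal_loss(0.95, 1)              # well-classified positive
hard = focal_loss(0.10, 1)              # badly misclassified positive
cross_entropy_easy = -math.log(0.95)    # plain cross-entropy for the easy sample
```

The modulating factor `(1 - p_t) ** gamma` shrinks the easy sample's loss far below its cross-entropy value while leaving the hard sample's loss comparatively large; with `gamma=0` the expression reduces to alpha-weighted cross-entropy.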

Panoptic Segmentation by Transformer

Application of Transformer architectures to the unified task of panoptic segmentation: a single end-to-end model simultaneously predicts masks for countable "things" (such as people or cars) and for amorphous background "stuff" (such as sky or road).
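
One simple way to turn per-query masks into a single panoptic map is a per-pixel argmax over class score times mask probability, sketched below. This merging rule is an assumption chosen for illustration (query-based segmenters use variants of it, often with extra filtering), and the masks and scores are toy values.

```python
def merge_masks(mask_probs, class_ids, class_scores):
    """Assign each pixel to the query with the highest score * mask probability."""
    height, width = len(mask_probs[0]), len(mask_probs[0][0])
    segment_map = []
    for y in range(height):
        row = []
        for x in range(width):
            best = max(range(len(mask_probs)),
                       key=lambda q: class_scores[q] * mask_probs[q][y][x])
            row.append((class_ids[best], best))  # (category, segment id)
        segment_map.append(row)
    return segment_map

# Two queries on a 2 x 3 image: one "thing" and one "stuff" segment.
mask_probs = [
    [[0.9, 0.8, 0.1], [0.7, 0.2, 0.1]],  # query 0: a person (thing)
    [[0.1, 0.3, 0.9], [0.2, 0.9, 0.8]],  # query 1: sky (stuff)
]
panoptic = merge_masks(mask_probs, class_ids=["person", "sky"], class_scores=[0.9, 0.8])
categories = [[category for category, _ in row] for row in panoptic]
```

Every pixel ends up with exactly one category and one segment id, which is what distinguishes panoptic output from overlapping instance masks.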

Mamba-DETR

Detection architecture that replaces attention mechanisms with state space blocks inspired by Mamba. These scale linearly with sequence length, in contrast to the quadratic cost of full attention, and the design targets competitive performance for real-time object detection.
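
To show where the linear complexity comes from, here is a generic scalar state space recurrence. It is not the block used by this architecture: Mamba-style layers make the parameters input-dependent and operate on vectors, and the constants below are arbitrary assumptions.

```python
def ssm_scan(inputs, a=0.5, b=1.0, c=1.0):
    """Scalar linear state space model: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t."""
    h, outputs = 0.0, []
    for x in inputs:  # one pass over the sequence: O(length) time
        h = a * h + b * x
        outputs.append(c * h)
    return outputs

ys = ssm_scan([1.0, 0.0, 0.0, 2.0])
```

Each output depends on the whole history through the running state `h`, yet the loop visits every token once, instead of comparing every token with every other token as full attention does.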
