Two-stage detection
Feature Extractor Backbone
Base CNN (such as ResNet, VGG, or EfficientNet) that extracts visual features from the input image. Its feature maps are computed once and shared between the proposal and classification stages.
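The sharing can be illustrated with a toy sketch. This is not a real detector: the "backbone" here is plain 2x2 average pooling standing in for a CNN, and every function name and threshold (`backbone`, `propose_regions`, `classify_regions`, `detect`, 0.5, 0.75) is a hypothetical choice made for illustration. The only point is the data flow: features are extracted once, and both stages read the same feature map.

```python
# Toy illustration of a shared backbone in two-stage detection.
# All names and thresholds are hypothetical, chosen for this sketch only.


def backbone(image):
    """Stand-in for a CNN backbone: 2x2 average pooling over a 2D grid."""
    h, w = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, w - 1, 2)
        ]
        for r in range(0, h - 1, 2)
    ]


def propose_regions(features, threshold=0.5):
    """Stand-in for the proposal stage: keep feature cells above a threshold."""
    return [
        (r, c)
        for r, row in enumerate(features)
        for c, value in enumerate(row)
        if value > threshold
    ]


def classify_regions(features, regions):
    """Stand-in for the classification stage: label each proposed cell
    by reading the same feature map the proposal stage used."""
    return {
        (r, c): "object" if features[r][c] > 0.75 else "uncertain"
        for (r, c) in regions
    }


def detect(image):
    features = backbone(image)                    # extracted once
    regions = propose_regions(features)           # stage 1 reads the features
    labels = classify_regions(features, regions)  # stage 2 reuses them
    return features, regions, labels


if __name__ == "__main__":
    image = [
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 0, 1],
    ]
    features, regions, labels = detect(image)
    print(features)
    print(regions)
    print(labels)
```

In a real two-stage detector the backbone would be a trained network and the two heads would be learned modules, but the structure of `detect` is the same: one feature-extraction pass feeding both stages.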