AI Glossary
A Comprehensive Dictionary of Artificial Intelligence
R-CNN (Regions with CNN features)
Pioneering two-stage detection algorithm that first extracts candidate regions via Selective Search, then computes features for each region with a pre-trained convolutional neural network and classifies them (with class-specific linear SVMs in the original design).
Selective Search
Hierarchical grouping method that generates candidate region proposals by starting from an over-segmentation of the image and iteratively merging similar neighboring regions based on color, texture, size, and shape compatibility.
RoI Pooling (Region of Interest Pooling)
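Neural network layer that converts variable-sized candidate regions into a fixed-size output (for example 7×7) by dividing each region into a grid of bins and max-pooling within each bin, so the classifier always receives inputs of the same shape.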
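The binning and max-pooling idea can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `roi_max_pool`, the inclusive integer RoI convention `(x0, y0, x1, y1)`, and the 2×2 output size are assumptions chosen to keep the example small.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one RoI into a fixed grid.

    feature_map: 2D list of numbers, indexed as feature_map[row][col].
    roi: (x0, y0, x1, y1) in feature-map cells, inclusive (assumed convention).
    """
    x0, y0, x1, y1 = roi
    out_h, out_w = output_size
    roi_h, roi_w = y1 - y0 + 1, x1 - x0 + 1
    pooled = []
    for i in range(out_h):
        # Bin edges are snapped to whole cells: this is the quantization step.
        r_start = y0 + (i * roi_h) // out_h
        r_end = y0 + ceil_div((i + 1) * roi_h, out_h)
        row = []
        for j in range(out_w):
            c_start = x0 + (j * roi_w) // out_w
            c_end = x0 + ceil_div((j + 1) * roi_w, out_w)
            row.append(max(feature_map[r][c]
                           for r in range(r_start, r_end)
                           for c in range(c_start, c_end)))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
    [13, 14, 15, 16, 17, 18],
    [19, 20, 21, 22, 23, 24],
]

# Two RoIs of different sizes both produce a 2x2 output.
print(roi_max_pool(feature_map, (1, 0, 5, 2)))  # [[10, 12], [16, 18]]
print(roi_max_pool(feature_map, (0, 0, 1, 3)))  # [[7, 8], [19, 20]]
```

Note that neighboring bins may overlap by a cell when the RoI size is not divisible by the output size, which is a side effect of snapping bin edges to integers.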
RPN (Region Proposal Network)
Fully convolutional sub-network that simultaneously predicts candidate bounding boxes and object scores at each spatial location of the feature map.
Anchor Boxes
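Predefined reference boxes with different scales and aspect ratios, placed at every feature-map location; the RPN predicts offsets relative to these anchors rather than absolute coordinates, which simplifies the regression and speeds up convergence.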
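As a rough sketch of how such reference boxes might be generated at a single location, the snippet below builds one anchor per scale and aspect-ratio pair. The function name `make_anchors` is hypothetical, and the default scales (128, 256, 512) and ratios (0.5, 1, 2) are assumed values for illustration; real detectors choose them per dataset.

```python
import math


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors centered at (cx, cy) as (x0, y0, x1, y1) tuples.

    Each anchor has area scale**2; ratio is interpreted as height / width
    (an assumed convention).
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


anchors = make_anchors(0.0, 0.0)
print(len(anchors))  # 9 anchors: 3 scales x 3 ratios
print(anchors[1])    # (-64.0, -64.0, 64.0, 64.0): the square 128x128 anchor
```

In a full detector the same set would be replicated at every feature-map position, shifted by the stride of the feature map.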
Feature Pyramid Network (FPN)
Architecture that builds a multi-scale feature pyramid by combining the bottom-up backbone pathway with a top-down pathway and lateral connections, improving the detection of objects of different sizes in detectors such as Faster R-CNN.
Cascade R-CNN
Multi-stage architecture where detectors are trained sequentially with increasing Intersection over Union (IoU) thresholds, progressively refining box predictions.
Bounding Box Regression
Regression task that refines the coordinates of predicted bounding boxes by learning offsets (shifts of the center and rescaling of width and height) that minimize the difference from the ground-truth boxes.
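One common way to express these transformations, used in the R-CNN family, encodes a target box relative to a reference box as a normalized center shift plus a log-scale change in width and height. The sketch below assumes boxes in `(cx, cy, w, h)` form; the function names `encode_box` and `decode_box` are hypothetical.

```python
import math


def encode_box(reference, target):
    """Compute regression targets (tx, ty, tw, th) for a target box.

    Both boxes are (cx, cy, w, h) with positive w and h.
    """
    xa, ya, wa, ha = reference
    x, y, w, h = target
    return ((x - xa) / wa, (y - ya) / ha, math.log(w / wa), math.log(h / ha))


def decode_box(reference, deltas):
    """Apply predicted deltas to a reference box; inverse of encode_box."""
    xa, ya, wa, ha = reference
    tx, ty, tw, th = deltas
    return (xa + tx * wa, ya + ty * ha, wa * math.exp(tw), ha * math.exp(th))


reference = (50.0, 50.0, 20.0, 40.0)
ground_truth = (55.0, 60.0, 40.0, 20.0)
deltas = encode_box(reference, ground_truth)
print(deltas)                         # center shift 0.25, 0.25; log-scale +/- log(2)
print(decode_box(reference, deltas))  # recovers the ground-truth box up to rounding
```

During training the network would be fit to predict `deltas`; at inference `decode_box` turns its predictions back into box coordinates.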
RoIAlign
Improvement over RoI Pooling that avoids coordinate quantization by sampling features at exact fractional locations with bilinear interpolation, better preserving spatial alignment, which matters most for instance segmentation.
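The difference from quantized pooling can be illustrated with a small pure-Python sketch. It assumes feature values sit at integer coordinates, takes the RoI as fractional `(x0, y0, x1, y1)`, and averages a 2×2 grid of sample points per bin; the names `bilinear_sample` and `roi_align` and these conventions are assumptions for illustration, and production implementations differ in details such as coordinate offsets.

```python
import math


def bilinear_sample(feature_map, y, x):
    """Interpolate feature_map at a fractional (y, x) location."""
    h, w = len(feature_map), len(feature_map[0])
    y = min(max(y, 0.0), h - 1.0)  # clamp to the valid range
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature_map[y0][x0] * (1 - dx) + feature_map[y0][x1] * dx
    bottom = feature_map[y1][x0] * (1 - dx) + feature_map[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feature_map, roi, output_size=(2, 2), samples=2):
    """Average bilinear samples inside each bin; no coordinate is rounded."""
    x0, y0, x1, y1 = roi
    out_h, out_w = output_size
    bin_h, bin_w = (y1 - y0) / out_h, (x1 - x0) / out_w
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    y = y0 + (i + (sy + 0.5) / samples) * bin_h
                    x = x0 + (j + (sx + 0.5) / samples) * bin_w
                    total += bilinear_sample(feature_map, y, x)
            row.append(total / (samples * samples))
        pooled.append(row)
    return pooled


# A feature map that varies linearly (value = 10*y + x) makes the result easy
# to check: each output equals the value at the center of its bin.
linear_map = [[10 * y + x for x in range(5)] for y in range(5)]
print(roi_align(linear_map, (0.5, 0.5, 3.5, 2.5)))
# approximately [[11.25, 12.75], [21.25, 22.75]]
```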
Feature Extractor Backbone
Base CNN (such as ResNet, VGG, or EfficientNet) that extracts visual features from the input image; its feature maps are shared between the proposal and classification stages.
Two-Stage Detector
Detection paradigm that explicitly separates candidate region generation from precise classification and localization, typically offering better accuracy at the cost of speed.
Region Proposal Quality
Measure of how effectively an algorithm generates relevant candidate regions, typically evaluated as the recall of ground-truth boxes at different IoU thresholds.
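A minimal sketch of this evaluation: a ground-truth box counts as recalled if at least one proposal overlaps it with IoU at or above the threshold. Boxes are assumed to be `(x0, y0, x1, y1)` tuples, and the function names and sample boxes are made up for illustration.

```python
def iou(a, b):
    """Intersection over Union of two (x0, y0, x1, y1) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def proposal_recall(gt_boxes, proposals, threshold):
    """Fraction of ground-truth boxes matched by some proposal at the threshold."""
    if not gt_boxes:
        return 0.0
    hits = sum(1 for g in gt_boxes
               if any(iou(g, p) >= threshold for p in proposals))
    return hits / len(gt_boxes)


gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
proposals = [(0, 0, 10, 8), (22, 22, 32, 32), (50, 50, 60, 60)]

for t in (0.4, 0.5, 0.9):
    print(t, proposal_recall(gt_boxes, proposals, t))
# 0.4 1.0
# 0.5 0.5
# 0.9 0.0
```

Recall drops as the threshold rises because the second proposal only overlaps its ground-truth box with IoU of about 0.47.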
Multi-Scale Training
Training strategy that resizes input images to different scales across training iterations to make the detector more robust to variations in object size.
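One simple way to realize this is to pick a random target length for the shorter image side at each iteration while capping the longer side. The sketch below is an assumption-laden illustration: the function name `sample_train_size`, the candidate short sides, and the 1333-pixel cap are hypothetical values, not settings taken from this glossary.

```python
import random


def sample_train_size(width, height, short_sides=(480, 576, 672, 800),
                      max_long_side=1333, rng=None):
    """Pick a random training resolution that preserves the aspect ratio.

    Returns (new_width, new_height, scale). Ground-truth boxes would need to
    be multiplied by the same scale.
    """
    rng = rng or random
    target = rng.choice(short_sides)
    scale = target / min(width, height)
    # Shrink further if the longer side would exceed the cap.
    if max(width, height) * scale > max_long_side:
        scale = max_long_side / max(width, height)
    return round(width * scale), round(height * scale), scale


rng = random.Random(0)
for _ in range(3):
    print(sample_train_size(640, 480, rng=rng))

# A very wide image is limited by the cap on the longer side.
print(sample_train_size(2000, 500, rng=rng)[:2])  # (1333, 333)
```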
Contextual Reasoning Module
Component that models relationships between objects and their global context to improve detection, often integrating attention or graph mechanisms.