AI Glossary
A Comprehensive Dictionary of Artificial Intelligence
FPN (Feature Pyramid Network)
Convolutional neural network architecture that builds a pyramid of semantically strong features at every scale by combining a top-down pathway with lateral connections, improving detection of objects across a wide range of sizes.
PANet (Path Aggregation Network)
Extension of FPN that adds a bottom-up pathway to shorten the information path between lower and upper layers, strengthening feature localization and information propagation through the network.
Top-Down Pathway
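Part of an FPN that upsamples low-resolution, semantically strong feature maps from deeper (more abstract) layers and merges them with higher-resolution maps, allowing smaller objects to be predicted from features that are both fine-grained and semantically rich.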
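As a rough illustration of a single top-down step, the sketch below upsamples a coarse map by nearest-neighbor repetition and adds it to a higher-resolution lateral map. The names `upsample_nearest_2x` and `top_down_merge` are hypothetical, the maps are single-channel nested lists, and the 1x1 lateral and 3x3 smoothing convolutions of a full FPN are left out.

```python
def upsample_nearest_2x(fmap):
    """Double height and width by repeating each value (nearest neighbor)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def top_down_merge(coarse, lateral):
    """Upsample the coarse map and add it element-wise to the lateral map."""
    up = upsample_nearest_2x(coarse)
    if len(up) != len(lateral) or len(up[0]) != len(lateral[0]):
        raise ValueError("lateral map must be exactly 2x the coarse map")
    return [[u + l for u, l in zip(up_row, lat_row)]
            for up_row, lat_row in zip(up, lateral)]


# Toy values (assumed): a 2x2 map from a deeper layer, a 4x4 map from a shallower one.
coarse = [[1.0, 2.0],
          [3.0, 4.0]]
lateral = [[0.5] * 4 for _ in range(4)]
merged = top_down_merge(coarse, lateral)
```

Repeating the step from the coarsest level downward yields one merged map per pyramid level.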
Bottom-Up Pathway
In an architecture like PANet, this path propagates low-level features, which carry precise edge and position information, to the upper layers of the pyramid, improving localization accuracy.
NAS-FPN (Neural Architecture Search FPN)
Feature pyramid whose structure is automatically discovered by neural architecture search, optimizing connections between scales for maximum performance in object detection.
BiFPN (Bidirectional Feature Pyramid Network)
Efficient FPN architecture that uses bidirectional connections (top-down and bottom-up) and weighted feature fusion to improve accuracy while reducing computational complexity.
Weighted Feature Fusion
Mechanism used in architectures like BiFPN in which each input feature map is scaled by a learnable weight before the maps are combined, allowing the network to learn how much each scale should contribute.
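One common form of this idea is fast normalized fusion, where each weight is clipped to be non-negative and divided by the sum of all weights. The sketch below is a minimal version on flat lists; the function name `fast_normalized_fusion` is hypothetical, and the weights are fixed numbers here, whereas in a trained network they would be learnable parameters.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors using non-negative, normalized weights."""
    clipped = [max(0.0, w) for w in weights]  # ReLU keeps every weight >= 0
    total = sum(clipped) + eps                # eps avoids division by zero
    length = len(features[0])
    fused = [0.0] * length
    for feat, w in zip(features, clipped):
        if len(feat) != length:
            raise ValueError("all feature vectors must have the same length")
        for i, v in enumerate(feat):
            fused[i] += (w / total) * v
    return fused


# Toy inputs (assumed): two features already resized to the same resolution.
p_in = [1.0, 2.0, 3.0]
p_td = [3.0, 2.0, 1.0]
fused = fast_normalized_fusion([p_in, p_td], weights=[1.0, 3.0])
```

With weights 1 and 3, the second input contributes roughly three quarters of each fused value.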
Multi-Scale Anchor Box
Use of anchor boxes of different sizes and aspect ratios at each level of the feature pyramid, ensuring better matching between proposals and objects of varying sizes.
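A minimal sketch of anchor generation follows, assuming one base size per pyramid level and aspect ratios defined as height divided by width. The level names, base sizes, and the helper `make_anchors` are illustrative assumptions, not values taken from a specific detector.

```python
import math


def make_anchors(cx, cy, base_size, ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors centered at (cx, cy).

    Every anchor has area base_size**2; ratio is height / width.
    """
    anchors = []
    for r in ratios:
        w = base_size / math.sqrt(r)
        h = base_size * math.sqrt(r)
        anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


# Hypothetical base sizes: finer levels get smaller anchors.
level_sizes = {"P3": 32, "P4": 64, "P5": 128}
anchors_by_level = {lvl: make_anchors(0.0, 0.0, size)
                    for lvl, size in level_sizes.items()}
```

In a detector, the same set would be tiled over every spatial location of the corresponding feature map.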
Multi-Scale RoIAlign
Application of the RoIAlign operation to the features of the pyramid level best matched to the size of a region of interest (RoI), ensuring precise feature extraction for objects of all sizes.
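The level choice is often made with the heuristic from the original FPN formulation, k = floor(k0 + log2(sqrt(w*h) / 224)), clamped to the available levels. The sketch below assumes levels P2 through P5, k0 = 4, and RoI sizes measured in input-image pixels; the function name `roi_to_level` is hypothetical.

```python
import math


def roi_to_level(w, h, k0=4, canonical=224, k_min=2, k_max=5):
    """Map an RoI of width w and height h (in pixels) to a pyramid level index."""
    if w <= 0 or h <= 0:
        raise ValueError("RoI width and height must be positive")
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / canonical))
    return max(k_min, min(k_max, k))  # clamp to the levels that exist


levels = [roi_to_level(w, h) for w, h in [(32, 32), (112, 112), (224, 224), (800, 800)]]
```

Small RoIs land on fine levels and large RoIs on coarse ones, after which RoIAlign is run on the selected level only.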
Multi-Scale Anchor-Free Detection
Detection approach that directly predicts key points or centers of objects across multiple levels of the feature pyramid, eliminating the need for predefined anchor boxes.
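As a simplified illustration of center-based decoding, the sketch below treats each pyramid level as a score heatmap, keeps locations that are local maxima above a threshold, and maps them back to image coordinates using the level's stride. The helper `decode_centers`, the 3x3 neighborhood, the threshold, and the toy heatmaps are all assumptions made for illustration; box size regression is omitted.

```python
def decode_centers(heatmap, stride, threshold=0.5):
    """Return (image_x, image_y, score) for local maxima above the threshold."""
    h, w = len(heatmap), len(heatmap[0])
    centers = []
    for y in range(h):
        for x in range(w):
            score = heatmap[y][x]
            if score < threshold:
                continue
            # 3x3 neighborhood, clipped at the borders (includes the cell itself)
            neighbors = [heatmap[ny][nx]
                         for ny in range(max(0, y - 1), min(h, y + 2))
                         for nx in range(max(0, x - 1), min(w, x + 2))]
            if score >= max(neighbors):
                # map the cell to the center of its receptive patch in the image
                centers.append((x * stride + stride // 2,
                                y * stride + stride // 2,
                                score))
    return centers


# Toy heatmaps keyed by stride (assumed values).
heatmaps = {
    8: [[0.1, 0.2, 0.1],
        [0.2, 0.9, 0.3],
        [0.1, 0.2, 0.1]],
    16: [[0.7, 0.1],
         [0.1, 0.2]],
}
detections = [c for stride, hm in heatmaps.items()
              for c in decode_centers(hm, stride)]
```

No anchor boxes appear anywhere: each detection starts from a predicted point, and the stride alone ties a pyramid level to image coordinates.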
Atrous Spatial Pyramid Pooling (ASPP)
Module that captures context at multiple scales using atrous (dilated) convolutions with different dilation rates, often integrated into detection architectures to handle scale variations.
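To show how the dilation rate widens the context seen by a fixed-size kernel, the sketch below runs a 1-D dilated convolution at several rates over the same input. It is a deliberately reduced, hypothetical version: a real ASPP module works on 2-D feature maps, learns a separate kernel per branch, and typically adds a 1x1 convolution and an image-level pooling branch.

```python
def dilated_conv1d(signal, kernel, rate):
    """1-D cross-correlation with zero padding; kernel length must be odd."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # taps are spaced `rate` samples apart
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def aspp_1d(signal, kernel, rates=(1, 2, 4)):
    """Apply the same kernel at several dilation rates, one output per branch."""
    return [dilated_conv1d(signal, kernel, r) for r in rates]


def effective_kernel_size(k, rate):
    """Span covered by a k-tap kernel at the given dilation rate."""
    return k + (k - 1) * (rate - 1)


# A single impulse makes the widening footprint of each branch visible.
signal = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
branches = aspp_1d(signal, kernel=[1.0, 1.0, 1.0])
```

The branch outputs would then be concatenated along the channel axis, so later layers see the same location described at several context sizes.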
TridentNet
Detection architecture that builds parallel branches with different dilation rates, each specialized for a specific range of object scales; the branches share weights for computational efficiency.
SF-Net (Scale Fusion Network)
Network that explicitly fuses features from different scales using attention mechanisms to highlight the most relevant scales for each detected object.
M2Det (Multi-Level Multi-Scale Detector)
Detector that builds a multi-level feature pyramid network (MLFPN) to learn richer and more discriminative multi-scale representations, improving detection of objects of vastly different sizes.
Multi-Scale Cascade R-CNN
Extension of Cascade R-CNN where each cascade stage operates on a different level of the feature pyramid, progressively refining detections at increasingly precise scales.